Creating Narration Audio

Sunday, May 12, 2024

Software Used

I use the following software:

Software	Description
NaturalReader online	It reads the text aloud in a natural voice.
OBS Studio	It records the desktop screen, including audio.
FFmpeg	It extracts MP3 audio from video files.
Audacity	It edits digital audio.

Note

Detailed setup and operation instructions for each tool are omitted here.

NaturalReader online

NaturalReader screenshot

The free tier of NaturalReader online is enough to read speeches lasting up to one minute.
In practice, the free allowance covers approximately 20,000 characters.
However, downloading audio files directly from NaturalReader is a paid feature, so I use an alternative method to extract the audio.

OBS (Open Broadcaster Software) Studio

OBS Studio screenshot

While NaturalReader reads the text, I use OBS Studio to capture the desktop screen, saving it as an MP4 video file.
The image above shows my script being read by NaturalReader online, with the screen captured using OBS Studio.

FFmpeg

After that, I use FFmpeg to extract an MP3 audio file from the MP4 video.
Here’s the command for extracting the MP3:

❯ ffmpeg -i narration.mp4 -vn narration.mp3
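
If the same extraction is repeated for several recordings, the command above can be wrapped in a small shell function. The following is only a sketch: the function name extract_mp3 is hypothetical, and the explicit encoder and quality options (-c:a libmp3lame -q:a 2) are my assumptions rather than settings from the workflow described here. They also require an FFmpeg build that includes the LAME encoder.

```shell
# Hypothetical helper (not part of the original workflow): wraps the
# extraction command and derives the output name from the input name.
extract_mp3() {
  input="$1"
  output="${input%.*}.mp3"   # e.g. narration.mp4 -> narration.mp3
  # -vn drops the video stream.
  # -c:a libmp3lame selects the LAME MP3 encoder explicitly.
  # -q:a 2 requests high-quality variable bitrate (an assumed setting).
  ffmpeg -i "$input" -vn -c:a libmp3lame -q:a 2 "$output"
}
```

Calling extract_mp3 narration.mp4 should then write narration.mp3 next to the input file. Without the two extra options, FFmpeg chooses the encoder and bitrate from the .mp3 extension, which is what the shorter command relies on.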

Audacity

Audacity screenshot

When recording screen captures with OBS Studio, it’s common to end up with silent sections at the beginning and end of the recording.
To address this, I use Audacity, an audio editor, to trim the unnecessary silent parts (highlighted in red in the image) and export the narration audio file.
The image above shows my script’s narration audio being edited in Audacity.
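
For readers who prefer the command line, the same trimming can be approximated with FFmpeg's silenceremove filter instead of Audacity. This is an alternative technique, not the method used in this post, and the sketch below rests on assumptions: the function name trim_silence is hypothetical, and the -50 dB threshold is a guess that may need tuning for a given recording.

```shell
# Hypothetical helper: trims leading and trailing silence with FFmpeg's
# silenceremove filter (a command-line alternative to editing in Audacity).
trim_silence() {
  input="$1"
  output="$2"
  # Assumed threshold: audio quieter than -50 dB is treated as silence.
  strip="silenceremove=start_periods=1:start_threshold=-50dB"
  # With start_periods=1 the filter strips only leading silence, so the
  # audio is reversed, stripped again, and reversed back to trim the end.
  ffmpeg -i "$input" -af "$strip,areverse,$strip,areverse" "$output"
}
```

For example, trim_silence narration.mp3 narration_trimmed.mp3 would write a trimmed copy. Because areverse buffers the whole clip in memory, this approach suits short narrations; checking the result by ear is still advisable, since a fixed threshold can clip quiet speech.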

The following is a sample audio narration of the script for Exercise 1 from Engoo’s Photo Description Lesson 1.

This is how I turn a speech manuscript into a narration audio file that closely resembles natural speech.