Package aeneas¶
aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment).
Goal¶
aeneas automatically generates a synchronization map between a list of text fragments and an audio file containing the narration of the (same) text. In computer science this task is known as (automatically computing a) forced alignment.
For example, given the verses and a 53.240s-long audio recording of Sonnet I by William Shakespeare, aeneas might compute a map like the following:
1 => [00:00:00.000, 00:00:02.640]
From fairest creatures we desire increase, => [00:00:02.640, 00:00:05.880]
That thereby beauty's rose might never die, => [00:00:05.880, 00:00:09.240]
But as the riper should by time decease, => [00:00:09.240, 00:00:11.920]
His tender heir might bear his memory: => [00:00:11.920, 00:00:15.280]
But thou contracted to thine own bright eyes, => [00:00:15.280, 00:00:18.800]
Feed'st thy light's flame with self-substantial fuel, => [00:00:18.800, 00:00:22.760]
Making a famine where abundance lies, => [00:00:22.760, 00:00:25.680]
Thy self thy foe, to thy sweet self too cruel: => [00:00:25.680, 00:00:31.240]
Thou that art now the world's fresh ornament, => [00:00:31.240, 00:00:34.400]
And only herald to the gaudy spring, => [00:00:34.400, 00:00:36.920]
Within thine own bud buriest thy content, => [00:00:36.920, 00:00:40.640]
And tender churl mak'st waste in niggarding: => [00:00:40.640, 00:00:43.640]
Pity the world, or else this glutton be, => [00:00:43.640, 00:00:48.080]
To eat the world's due, by the grave and thee. => [00:00:48.080, 00:00:53.240]
The above map is just an abstract representation of a sync map. In practice, the sync map will be output to a file with a precise syntax. Currently, the following formats are supported:
- ELAN annotation format (EAF) for research purposes,
- SMIL for EPUB 3 ebooks with Media Overlays,
- SBV/SRT/SUB/TTML/VTT for closed captioning,
- JSON for consumption on the Web, and
- “raw” AUD/CSV/SSV/TSV/TXT/XML for further processing.
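The abstract map shown earlier can be treated as a list of (text, begin, end) fragments. The following is a minimal sketch, assuming the `text => [begin, end]` layout used in the example above; the helper names `to_seconds` and `parse_map` are hypothetical and are not part of aeneas, and the JSON printed at the end is a generic dump, not necessarily the JSON layout that aeneas itself writes.

```python
import json
import re

# One line of the abstract map: "<text> => [<begin>, <end>]"
LINE_RE = re.compile(
    r"^(?P<text>.*?)\s*=>\s*\[(?P<begin>[\d:.]+),\s*(?P<end>[\d:.]+)\]$"
)


def to_seconds(timestamp):
    """Convert 'HH:MM:SS.mmm' into a float number of seconds."""
    hours, minutes, seconds = timestamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def parse_map(lines):
    """Turn abstract map lines into a list of fragment dictionaries."""
    fragments = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a map line
        fragments.append({
            "text": match.group("text"),
            "begin": to_seconds(match.group("begin")),
            "end": to_seconds(match.group("end")),
        })
    return fragments


# Two lines copied from the example map above.
sample = [
    "1 => [00:00:00.000, 00:00:02.640]",
    "From fairest creatures we desire increase, => [00:00:02.640, 00:00:05.880]",
]
fragments = parse_map(sample)
print(json.dumps(fragments, indent=2))
```

Keeping the times as seconds makes it straightforward to re-serialize the same fragments into any of the formats listed above.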
Usage¶
aeneas can be used via the built-in command line tools, or as a Python package inside third-party code.
(If you do not plan to write Python code, just proceed to the next section describing the built-in command line tools.)
Using aeneas Built-in Command Line Tools¶
aeneas provides the following two main command line programs:
aeneas.tools.execute_task
aeneas.tools.execute_job
Note
aeneas contains a dozen other programs, mostly useful for debugging or converting between different file formats. See the Package aeneas.tools section for details.
A Task is a triple (audio file, text file, parameters) for which you want to compute a single sync map file.
Example
A Task might consist of the audio track of a video as an MP3 file, its transcript written as a plain text file, and parameters like “output in SRT format, language is English”.
Example
A Task might consist of the audio file as an MP4/AAC file, the text of an ebook chapter as an XHTML file, and parameters like “output in SMIL format, language is Italian, extract text from the elements in the XHTML file with attribute id matching the regular expression f[0-9]+”.
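To illustrate what selecting elements by an id regular expression means, here is a minimal sketch using only the Python standard library. It is not how aeneas parses XHTML internally; the class name `IdTextExtractor` is hypothetical, and the choice of matching the whole id value (`fullmatch`) is an assumption made for this example.

```python
import re
from html.parser import HTMLParser


class IdTextExtractor(HTMLParser):
    """Collect the text of elements whose id attribute matches a regex.

    Assumes well-formed XHTML, where every element is explicitly closed.
    """

    def __init__(self, id_regex):
        super().__init__()
        self.id_regex = re.compile(id_regex)
        self.fragments = []      # (id, text) pairs, in document order
        self._current_id = None  # id of the matching element being read
        self._depth = 0          # nesting depth inside that element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if self._current_id is not None:
            self._depth += 1     # nested element inside a match
            return
        element_id = dict(attrs).get("id")
        if element_id and self.id_regex.fullmatch(element_id):
            self._current_id = element_id
            self._depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if self._current_id is None:
            return
        self._depth -= 1
        if self._depth == 0:     # the matching element has been closed
            text = "".join(self._chunks).strip()
            self.fragments.append((self._current_id, text))
            self._current_id = None

    def handle_data(self, data):
        if self._current_id is not None:
            self._chunks.append(data)


xhtml = """<html><body>
<h1 id="title">Sonnet I</h1>
<p><span id="f001">From fairest creatures we desire increase,</span>
<span id="f002">That thereby beauty's <em>rose</em> might never die,</span></p>
</body></html>"""

parser = IdTextExtractor(r"f[0-9]+")
parser.feed(xhtml)
parser.close()
for fragment_id, text in parser.fragments:
    print(fragment_id, "=>", text)
```

The heading with id `title` is skipped because it does not match `f[0-9]+`, while the text of nested elements such as `<em>` is kept as part of the enclosing fragment.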
A Job is a container (ZIP or TAR file, or an uncompressed directory) including one or more Tasks; normally a Job is handy for batch processing multiple Tasks that share the same execution parameters.
Example
A Job might consist of fifteen Tasks, each corresponding to an XHTML page inside a Fixed-Layout EPUB 3 file. Fifteen SMIL files, one for each Task (i.e., XHTML page), will be produced.
Run the above commands without arguments to get a help message. aeneas includes some example input files which cover common use cases, enabling the user to run live examples.
The help message for aeneas.tools.execute_job reads:
$ python -m aeneas.tools.execute_job
NAME
execute_job - Execute a Job, passed as a container.
SYNOPSIS
python -m aeneas.tools.execute_job [-h|--help|--help-rconf|--version]
python -m aeneas.tools.execute_job --list-parameters
python -m aeneas.tools.execute_job CONTAINER OUTPUT_DIR [CONFIG_STRING] [OPTIONS]
OPTIONS
--cewsubprocess : run cew in separate process (see docs)
--help : print full help and exit
--help-rconf : list all runtime configuration parameters
--skip-validator : do not validate the given container and/or config string
--version : print the program name and version and exit
-h : print short help and exit
-l[=FILE], --log[=FILE] : log verbose output to tmp file or FILE if specified
-r=CONF, --runtime-configuration=CONF : apply runtime configuration CONF
-v, --verbose : verbose output
-vv, --very-verbose : verbose output, print date/time values
EXAMPLES
python -m aeneas.tools.execute_job aeneas/tools/res/job.zip output/
python -m aeneas.tools.execute_job aeneas/tools/res/job.zip output/ --cewsubprocess
python -m aeneas.tools.execute_job aeneas/tools/res/job_no_config.zip output/ "is_hierarchy_type=flat|is_hierarchy_prefix=assets/|is_text_file_relative_path=.|is_text_file_name_regex=.*\.xhtml|is_text_type=unparsed|is_audio_file_relative_path=.|is_audio_file_name_regex=.*\.mp3|is_text_unparsed_id_regex=f[0-9]+|is_text_unparsed_id_sort=numeric|os_job_file_name=demo_sync_job_output|os_job_file_container=zip|os_job_file_hierarchy_type=flat|os_job_file_hierarchy_prefix=assets/|os_task_file_name=\$PREFIX.xhtml.smil|os_task_file_format=smil|os_task_file_smil_page_ref=\$PREFIX.xhtml|os_task_file_smil_audio_ref=../Audio/\$PREFIX.mp3|job_language=eng|job_description=Demo Sync Job"
The paths in the example might differ, depending on the installation location of aeneas.
Usually, each command line in the EXAMPLES section can be copied and pasted to see the corresponding example running live.
The help message for aeneas.tools.execute_task reads:
$ python -m aeneas.tools.execute_task
NAME
execute_task - Execute a Task.
SYNOPSIS
python -m aeneas.tools.execute_task [-h|--help|--help-rconf|--version]
python -m aeneas.tools.execute_task --list-parameters
python -m aeneas.tools.execute_task --list-values[=PARAM]
python -m aeneas.tools.execute_task AUDIO_FILE TEXT_FILE CONFIG_STRING OUTPUT_FILE [OPTIONS]
python -m aeneas.tools.execute_task YOUTUBE_URL TEXT_FILE CONFIG_STRING OUTPUT_FILE -y [OPTIONS]
OPTIONS
--faster-rate : print fragments with rate > task_adjust_boundary_rate_value
--help : print full help and exit
--help-rconf : list all runtime configuration parameters
--keep-audio : do not delete the audio file downloaded from YouTube (-y only)
--largest-audio : download largest audio stream (-y only)
--list-parameters : list all parameters
--list-values : list all parameters for which values can be listed
--list-values=PARAM : list all allowed values for parameter PARAM
--output-html : output HTML file for fine tuning
--presets-word : apply presets for word-level alignment (MFCC masking)
--rate : print rate of each fragment
--skip-validator : do not validate the given config string
--version : print the program name and version and exit
--zero : print fragments with zero duration
-h : print short help and exit
-l[=FILE], --log[=FILE] : log verbose output to tmp file or FILE if specified
-r=CONF, --runtime-configuration=CONF : apply runtime configuration CONF
-v, --verbose : verbose output
-vv, --very-verbose : verbose output, print date/time values
-y, --youtube : download audio from YouTube video
EXAMPLES
python -m aeneas.tools.execute_task --examples
python -m aeneas.tools.execute_task --examples-all
The --examples switch prints a list of common built-in live examples:
$ python -m aeneas.tools.execute_task --examples
Example 1 (input: plain text, output: EAF)
$ python -m aeneas.tools.execute_task --example-eaf
Example 2 (input: plain text, output: JSON)
$ python -m aeneas.tools.execute_task --example-json
Example 3 (input: multilevel plain text (mplain), output: SMIL)
$ python -m aeneas.tools.execute_task --example-mplain-smil
Example 4 (input: multilevel unparsed text (munparsed), output: SMIL)
$ python -m aeneas.tools.execute_task --example-munparsed-smil
Example 5 (input: unparsed text, output: SMIL)
$ python -m aeneas.tools.execute_task --example-smil
Example 6 (input: subtitles text, output: SRT)
$ python -m aeneas.tools.execute_task --example-srt
Example 7 (input: parsed text, output: TextGrid)
$ python -m aeneas.tools.execute_task --example-textgrid
Example 8 (input: parsed text, output: TSV)
$ python -m aeneas.tools.execute_task --example-tsv
Example 9 (input: single word granularity plain text, output: AUD)
$ python -m aeneas.tools.execute_task --example-words
Example 10 (input: audio from YouTube, output: TXT)
$ python -m aeneas.tools.execute_task --example-youtube
Similarly, the --examples-all switch prints a list of more than twenty built-in examples, covering more peculiar input/output/parameter combinations.
For example, --example-srt produces the following output:
$ python -m aeneas.tools.execute_task --example-srt
[INFO] Running example task with arguments:
Audio file: aeneas/tools/res/audio.mp3
Text file: aeneas/tools/res/subtitles.txt
Config string: task_language=eng|is_text_type=subtitles|os_task_file_format=srt
Sync map file: output/sonnet.srt
[INFO] Creating task...
[INFO] Creating task... done
[INFO] Executing task...
[INFO] Executing task... done
[INFO] Creating output sync map file...
[INFO] Creating output sync map file... done
[INFO] Created file 'output/sonnet.srt'
A new file, named sonnet.srt, is created in the output/ subdirectory of the current working directory. This SRT file contains the subtitles read from subtitles.txt, automatically aligned with the audio file audio.mp3.
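For readers unfamiliar with SRT, the sketch below shows how aligned fragments map onto SRT cues: a running index, a `begin --> end` line with comma-separated milliseconds, and the cue text. It is a standalone illustration, not the writer used by aeneas; the function names are hypothetical, and the timings are borrowed from the Sonnet I map in the Goal section rather than from the actual content of sonnet.srt.

```python
def srt_timestamp(seconds):
    """Format a number of seconds as an SRT timestamp, 'HH:MM:SS,mmm'."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3600000)
    minutes, millis = divmod(millis, 60000)
    secs, millis = divmod(millis, 1000)
    return "%02d:%02d:%02d,%03d" % (hours, minutes, secs, millis)


def to_srt(fragments):
    """Serialize (begin, end, text) triples as SRT cues, numbered from 1."""
    cues = []
    for index, (begin, end, text) in enumerate(fragments, start=1):
        cues.append("%d\n%s --> %s\n%s\n" % (
            index, srt_timestamp(begin), srt_timestamp(end), text))
    # Cues are separated by a blank line.
    return "\n".join(cues)


# Illustrative timings only, taken from the abstract map shown earlier.
fragments = [
    (2.640, 5.880, "From fairest creatures we desire increase,"),
    (5.880, 9.240, "That thereby beauty's rose might never die,"),
]
print(to_srt(fragments))
```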
Example shortcuts also print the actual parameters which are hidden behind the --example-srt shortcut.
Thus, the above example is equivalent to:
$ python -m aeneas.tools.execute_task aeneas/tools/res/audio.mp3 aeneas/tools/res/subtitles.txt "task_language=eng|is_text_type=subtitles|os_task_file_format=srt" output/sonnet.srt
[INFO] Validating config string (specify --skip-validator to bypass)...
[INFO] Validating config string... done
[INFO] Creating task...
[INFO] Creating task... done
[INFO] Executing task...
[INFO] Executing task... done
[INFO] Creating output sync map file...
[INFO] Creating output sync map file... done
[INFO] Created file 'output/sonnet.srt'
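When the same command has to be issued from a script, the config string and argument list can be assembled programmatically. The following is a minimal sketch that only builds the command shown above; the helper names `build_config_string` and `build_execute_task_command` are hypothetical and not part of aeneas. Passing the arguments as a list avoids having to quote the `|` characters for the shell.

```python
import sys


def build_config_string(parameters):
    """Join key/value pairs into a 'key=value|key=value' config string."""
    return "|".join("%s=%s" % (key, value) for key, value in parameters.items())


def build_execute_task_command(audio_path, text_path, parameters, output_path):
    """Return the argument list for 'python -m aeneas.tools.execute_task'."""
    return [
        sys.executable, "-m", "aeneas.tools.execute_task",
        audio_path, text_path, build_config_string(parameters), output_path,
    ]


command = build_execute_task_command(
    "aeneas/tools/res/audio.mp3",
    "aeneas/tools/res/subtitles.txt",
    {
        "task_language": "eng",
        "is_text_type": "subtitles",
        "os_task_file_format": "srt",
    },
    "output/sonnet.srt",
)
print(command[1:])

# To actually run it (requires aeneas to be installed):
#   import subprocess
#   subprocess.run(command, check=True)
```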
Note that a validation of the input files and parameters is performed as the first step. If incorrect or incomplete parameters are specified, an error message is printed:
$ python -m aeneas.tools.execute_task aeneas/tools/res/audio.mp3 aeneas/tools/res/subtitles.txt "task_language=eng|is_text_type=subtitles" output/sonnet.srt
[INFO] Validating config string (specify --skip-validator to bypass)...
[ERRO] The given config string is not valid:
Errors:
Required parameter 'os_task_file_format' not set.
$ python -m aeneas.tools.execute_task aeneas/tools/res/audio.mp3 aeneas/tools/res/subtitles.txt "task_language=eng|is_text_type=subtitles|os_task_file_format=srt" /foo/bar/sonnet.srt
[ERRO] Unable to create file '/foo/bar/sonnet.srt'
[ERRO] Make sure the file path is written/escaped correctly and that you have write permission on it
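Both failures above can be caught before invoking the tool. The sketch below is a hypothetical pre-flight check, not the validator shipped with aeneas: it only looks for the parameters named in `required` (by default just `os_task_file_format`, the one reported in the error above) and checks that the output directory exists and is writable.

```python
import os


def parse_config_string(config_string):
    """Split a 'key=value|key=value' string into a dictionary."""
    pairs = (
        item.split("=", 1)
        for item in config_string.split("|")
        if "=" in item
    )
    return {key: value for key, value in pairs}


def preflight(config_string, output_path, required=("os_task_file_format",)):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    parameters = parse_config_string(config_string)
    for name in required:
        if not parameters.get(name):
            problems.append("Required parameter '%s' not set." % name)
    # The output file can only be created inside an existing, writable directory.
    directory = os.path.dirname(os.path.abspath(output_path))
    if not os.path.isdir(directory):
        problems.append("Output directory '%s' does not exist." % directory)
    elif not os.access(directory, os.W_OK):
        problems.append("Output directory '%s' is not writable." % directory)
    return problems


for problem in preflight("task_language=eng|is_text_type=subtitles",
                         "/foo/bar/sonnet.srt"):
    print(problem)
```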
To learn more, please continue with the aeneas Built-in Command Line Tools Tutorial.
Using aeneas As A Python Package¶
Please consult the aeneas Library Tutorial.
Topics¶
- aeneas Built-in Command Line Tools Tutorial
- aeneas Library Tutorial
- Overview
- Package aeneas
- adjustboundaryalgorithm
- analyzecontainer
- audiofile
- audiofilemfcc
- cewsubprocess
- configuration
- container
- diagnostics
- downloader
- dtw
- exacttiming
- executejob
- executetask
- ffmpegwrapper
- ffprobewrapper
- globalconstants
- globalfunctions
- hierarchytype
- idsortingalgorithm
- job
- language
- logger
- mfcc
- plotter
- runtimeconfiguration
- sd
- syncmap
- synthesizer
- task
- textfile
- vad
- validator
- Package aeneas.extra
- Package aeneas.tests
- Package aeneas.tools
- Package aeneas.ttswrappers
- Changelog
- v1.7.3 (2017-03-15)
- v1.7.2 (2017-03-03)
- v1.7.1 (2016-12-20)
- v1.7.0 (2016-12-07)
- v1.6.0.1 (2016-09-30)
- v1.6.0 (2016-09-26)
- v1.5.1 (2016-07-25)
- v1.5.0.3 (2016-04-23)
- v1.5.0.2 (2016-04-09)
- v1.5.0.1 (2016-04-03)
- v1.5.0 (2016-04-02)
- v1.4.1 (2016-02-13)
- v1.4.0 (2016-01-15)
- v1.3.3 (2015-12-20)
- v1.3.2 (2015-11-11)
- v1.3.1.1 (2015-11-03)
- v1.3.1 (2015-10-28)
- v1.3.0 (2015-10-14)
- v1.2.0 (2015-09-27)
- v1.1.2 (2015-09-24)
- v1.1.1 (2015-08-23)
- v1.1.0 (2015-08-21)
- v1.0.4 (2015-08-09)
- v1.0.3 (2015-06-13)
- v1.0.2 (2015-05-14)
- v1.0.1 (2015-05-12)