runtimeconfiguration¶
This module contains the following classes:
RuntimeConfiguration, representing the runtime configuration.
New in version 1.4.1.
-
class aeneas.runtimeconfiguration.RuntimeConfiguration(config_string=None)[source]¶
A structure representing a runtime configuration, that is, a set of parameters for the algorithms which process jobs and tasks.
Allowed keys are listed below as class members.
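As a rough illustration of what a configuration string might carry, the sketch below splits a string of `key=value` pairs into a dictionary. The `|` separator and `=` assignment symbols are assumptions (this page does not state the string format), and `parse_config_string` is a hypothetical helper, not part of aeneas.

```python
def parse_config_string(config_string):
    """Split an assumed 'key=value|key=value' string into a dict of strings."""
    pairs = {}
    for item in config_string.split("|"):
        item = item.strip()
        if not item:
            continue  # tolerate empty segments such as a trailing separator
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError("missing '=' in %r" % item)
        pairs[key.strip()] = value.strip()
    return pairs


# Keys below are taken from the class members listed on this page.
overrides = parse_config_string("tts=festival|mfcc_window_shift=0.020|c_extensions=False")
print(overrides)
```

Values stay as strings here; converting them to booleans, integers, or time values is left out of the sketch.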
Parameters: config_string (string) – the configuration string
Raises: TypeError: if config_string is not None and it is not a Unicode string
Raises: KeyError: if trying to access a key not listed above
-
ABA_NONSPEECH_TOLERANCE
= 'aba_nonspeech_tolerance'¶ Tolerance, in seconds, for considering a given time value inside a nonspeech interval.
Default: 0.080 seconds.
New in version 1.7.0.
-
ABA_NO_ZERO_DURATION
= 'aba_no_zero_duration'¶ Offset, in seconds, to be added to fragments with zero length.
Default: 0.001 seconds.
New in version 1.7.0.
-
ALLOW_UNLISTED_LANGUAGES
= 'allow_unlisted_languages'¶ If True, allow using a language code not listed in the TTS supported languages list; otherwise, generate an error if the user attempts to use a language not listed.
Default: False.
New in version 1.4.1.
-
CDTW
= 'cdtw'¶ If True and the Python C extension cdtw is available, use it. Otherwise, use the pure Python code.
Default: True.
New in version 1.5.1.
-
CEW
= 'cew'¶ If True and the Python C extension cew is available, use it. Otherwise, use the pure Python code.
Default: True.
New in version 1.5.1.
-
CEW_SUBPROCESS_ENABLED
= 'cew_subprocess_enabled'¶ If True, calls to aeneas.cew will be done via subprocess, using the CEWSubprocess helper class.
Default: False.
New in version 1.5.0.
-
CEW_SUBPROCESS_PATH
= 'cew_subprocess_path'¶ Use the given path to the Python executable when calling aeneas.cew via subprocess.
You might need to use a full path, like /path/to/your/python.
Default: python.
New in version 1.5.0.
-
CFW
= 'cfw'¶ If True and the Python C++ extension cfw is available, use it. Otherwise, use the pure Python code.
Default: True.
New in version 1.6.0.
-
CMFCC
= 'cmfcc'¶ If True and the Python C extension cmfcc is available, use it. Otherwise, use the pure Python code.
Default: True.
New in version 1.5.1.
-
C_EXTENSIONS
= 'c_extensions'¶ If True and the Python C/C++ extensions are available, use them. Otherwise, use the pure Python code.
This option is equivalent to setting CDTW, CEW, CFW, and CMFCC to True or False at once.
Default: True.
New in version 1.4.1.
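The "at once" behaviour can be pictured as one flag fanning out to the four extension keys. This is a minimal sketch over a plain dictionary; `expand_c_extensions` is a hypothetical helper, and letting the umbrella flag overwrite any individually set key is an assumption, not documented behaviour.

```python
# The four per-extension keys named on this page.
C_EXTENSION_KEYS = ("cdtw", "cew", "cfw", "cmfcc")


def expand_c_extensions(settings):
    """Return a copy of settings where 'c_extensions' is copied to each extension key."""
    expanded = dict(settings)
    if "c_extensions" in expanded:
        flag = expanded["c_extensions"]
        for key in C_EXTENSION_KEYS:
            # Assumption: the umbrella flag wins over individually set keys.
            expanded[key] = flag
    return expanded


pure_python = expand_c_extensions({"c_extensions": False})
print(pure_python)
```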
-
DOWNLOADER_RETRY_ATTEMPTS
= 'downloader_retry_attempts'¶ Retry an HTTP POST request generated by the Downloader this number of times before giving up. It must be an integer greater than zero.
Default: 5.
New in version 1.7.2.
-
DOWNLOADER_SLEEP
= 'downloader_sleep'¶ Wait this number of seconds before the next HTTP POST request of the Downloader. This parameter can be used to throttle the HTTP usage. It cannot be a negative value.
Default: 1.000.
New in version 1.7.2.
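The interplay between the retry count and the sleep interval can be sketched as a generic retry loop. This is not the Downloader's actual code: `post_with_retries` and the `send` callable are hypothetical, and treating the attempt count as the total number of tries is an assumption.

```python
import time


def post_with_retries(send, attempts=5, sleep=1.0):
    """Call send() up to `attempts` times, sleeping `sleep` seconds between tries."""
    if attempts < 1:
        raise ValueError("attempts must be an integer greater than zero")
    if sleep < 0:
        raise ValueError("sleep cannot be negative")
    last_error = None
    for attempt in range(attempts):
        if attempt > 0:
            time.sleep(sleep)  # throttle before the next request
        try:
            return send()
        except OSError as exc:  # stand-in for a failed HTTP request
            last_error = exc
    raise last_error


calls = []


def flaky_request():
    """Fail twice, then succeed, to exercise the retry path."""
    calls.append(1)
    if len(calls) < 3:
        raise OSError("temporary failure")
    return "ok"


result = post_with_retries(flaky_request, attempts=5, sleep=0)
print(result, len(calls))
```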
-
DTW_ALGORITHM
= 'dtw_algorithm'¶ DTW aligner algorithm.
Allowed values:
New in version 1.4.1.
-
DTW_MARGIN
= 'dtw_margin'¶ DTW aligner margin, in seconds, for the stripe algorithm.
Default: 60, corresponding to 60 s ahead and behind (i.e., 120 s total margin).
New in version 1.4.1.
-
DTW_MARGIN_L1
= 'dtw_margin_l1'¶ DTW aligner margin, in seconds, for the stripe algorithm at level 1 (paragraph).
Default: 60, corresponding to 60 s ahead and behind (i.e., 120 s total margin).
New in version 1.7.0.
-
DTW_MARGIN_L2
= 'dtw_margin_l2'¶ DTW aligner margin, in seconds, for the stripe algorithm at level 2 (sentence).
Default: 30, corresponding to 30 s ahead and behind (i.e., 60 s total margin).
New in version 1.7.0.
-
DTW_MARGIN_L3
= 'dtw_margin_l3'¶ DTW aligner margin, in seconds, for the stripe algorithm at level 3 (word).
Default: 10, corresponding to 10 s ahead and behind (i.e., 20 s total margin).
New in version 1.7.0.
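To get a feel for these margins, the sketch below converts the total margin (ahead plus behind) into a number of MFCC frames, using the default window shifts documented for each level on this page. Whether the aligner sizes its stripe exactly this way is an assumption; the helper `total_margin_in_frames` is hypothetical.

```python
from decimal import Decimal

# Default (margin in seconds, window shift in seconds) per level, from this page.
LEVEL_DEFAULTS = {
    1: ("60", "0.040"),
    2: ("30", "0.020"),
    3: ("10", "0.005"),
}


def total_margin_in_frames(margin, window_shift):
    """Margin ahead plus margin behind, expressed in frames of `window_shift` seconds."""
    total_seconds = 2 * Decimal(margin)
    return int(total_seconds / Decimal(window_shift))


frames = {
    level: total_margin_in_frames(margin, shift)
    for level, (margin, shift) in LEVEL_DEFAULTS.items()
}
print(frames)
```

Decimal arithmetic is used so that values such as 0.040 divide exactly instead of picking up binary floating-point error.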
-
FFMPEG_PATH
= 'ffmpeg_path'¶ Path to the ffmpeg executable.
You might need to use a full path, like /path/to/your/ffmpeg.
Default: ffmpeg.
New in version 1.4.1.
-
FFMPEG_SAMPLE_RATE
= 'ffmpeg_sample_rate'¶ Sample rate for ffmpeg, in Hertz.
Default: 16000.
New in version 1.4.1.
-
FFPROBE_PATH
= 'ffprobe_path'¶ Path to the ffprobe executable.
You might need to use a full path, like /path/to/your/ffprobe.
Default: ffprobe.
New in version 1.4.1.
-
JOB_MAX_TASKS
= 'job_max_tasks'¶ Maximum number of Tasks of a Job. If a Job has more Tasks than this value, it will not be executed and an error will be raised. Use 0 to disable this check.
Default: 0 (disabled).
New in version 1.4.1.
-
MFCC_EMPHASIS_FACTOR
= 'mfcc_emphasis_factor'¶ Emphasis factor to be applied to MFCCs.
Default: 0.970.
New in version 1.4.1.
-
MFCC_FFT_ORDER
= 'mfcc_fft_order'¶ Order of the RFFT for extracting MFCCs. It must be a power of two.
Default: 512.
New in version 1.4.1.
-
MFCC_FILTERS
= 'mfcc_filters'¶ Number of filters for extracting MFCCs.
Default: 40.
New in version 1.4.1.
-
MFCC_GRANULARITY_MAP
= {1: ('dtw_margin_l1', 'mfcc_mask_nonspeech_l1', 'mfcc_window_length_l1', 'mfcc_window_shift_l1'), 2: ('dtw_margin_l2', 'mfcc_mask_nonspeech_l2', 'mfcc_window_length_l2', 'mfcc_window_shift_l2'), 3: ('dtw_margin_l3', 'mfcc_mask_nonspeech_l3', 'mfcc_window_length_l3', 'mfcc_window_shift_l3')}¶ Map level numbers to DTW_MARGIN_*, MFCC_MASK_NONSPEECH_*, MFCC_WINDOW_LENGTH_*, and MFCC_WINDOW_SHIFT_* keys.
New in version 1.5.0.
-
MFCC_LOWER_FREQUENCY
= 'mfcc_lower_frequency'¶ Lower frequency to be used for extracting MFCCs, in Hertz.
Default: 133.3333.
New in version 1.4.1.
-
MFCC_MASK_EXTEND_SPEECH_INTERVAL_AFTER
= 'mfcc_mask_extend_speech_after'¶ Extend to the right (after/future) a speech interval found by the VAD algorithm, by this many frames, when masking nonspeech out.
Default: 0.
New in version 1.7.0.
-
MFCC_MASK_EXTEND_SPEECH_INTERVAL_BEFORE
= 'mfcc_mask_extend_speech_before'¶ Extend to the left (before/past) a speech interval found by the VAD algorithm, by this many frames, when masking nonspeech out.
Default: 0.
New in version 1.7.0.
-
MFCC_MASK_LOG_ENERGY_THRESHOLD
= 'mfcc_mask_log_energy_threshold'¶ Threshold for the VAD algorithm to decide that a given frame contains speech, when masking nonspeech out. Note that this is the log10 of the energy coefficient.
Default: 0.699 = log10(5), that is, a frame must have an energy at least 5 times higher than the minimum to be considered a speech frame.
New in version 1.7.0.
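The relation between the threshold and the "5 times higher than the minimum" rule can be checked with a few lines of arithmetic. The sketch below assumes per-frame energies are compared on a log10 scale against the minimum plus the threshold; the exact comparison used by the VAD is not stated on this page, and `speech_mask` is a hypothetical helper.

```python
import math


def speech_mask(log_energies, log_threshold=0.699):
    """Mark frames whose log10 energy is at least `log_threshold` above the minimum."""
    floor = min(log_energies)
    return [value >= floor + log_threshold for value in log_energies]


# Adding 0.699 on the log10 scale multiplies the energy by about 5.
ratio = 10 ** 0.699

# Made-up frame energies, relative to the quietest frame (1.0).
energies = [1.0, 2.0, 4.0, 6.0, 50.0]
mask = speech_mask([math.log10(e) for e in energies])
print(round(ratio, 2), mask)
```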
-
MFCC_MASK_MIN_NONSPEECH_LENGTH
= 'mfcc_mask_min_nonspeech_length'¶ Minimum length, in frames, of a nonspeech interval to be masked out.
Default: 1.
New in version 1.7.0.
-
MFCC_MASK_NONSPEECH
= 'mfcc_mask_nonspeech'¶ If True, compute the DTW path ignoring nonspeech frames. Setting this parameter to True might help when aligning at word-level granularity.
Default: False.
New in version 1.7.0.
-
MFCC_MASK_NONSPEECH_L1
= 'mfcc_mask_nonspeech_l1'¶ If True, compute the DTW path ignoring nonspeech frames at level 1 (paragraph).
Default: False.
New in version 1.7.0.
-
MFCC_MASK_NONSPEECH_L2
= 'mfcc_mask_nonspeech_l2'¶ If True, compute the DTW path ignoring nonspeech frames at level 2 (sentence).
Default: False.
New in version 1.7.0.
-
MFCC_MASK_NONSPEECH_L3
= 'mfcc_mask_nonspeech_l3'¶ If True, compute the DTW path ignoring nonspeech frames at level 3 (word).
Default: False.
New in version 1.7.0.
-
MFCC_SIZE
= 'mfcc_size'¶ Number of MFCCs to extract, including the 0th.
Default: 13.
New in version 1.4.1.
-
MFCC_UPPER_FREQUENCY
= 'mfcc_upper_frequency'¶ Upper frequency to be used for extracting MFCCs, in Hertz.
Default: 6855.4976.
New in version 1.4.1.
-
MFCC_WINDOW_LENGTH
= 'mfcc_window_length'¶ Length of the window for extracting MFCCs, in seconds. It is usual to set it between 1.5 and 4 times the value of MFCC_WINDOW_SHIFT.
Default: 0.100.
New in version 1.4.1.
-
MFCC_WINDOW_LENGTH_L1
= 'mfcc_window_length_l1'¶ Length of the window, in seconds, for extracting MFCCs at level 1 (paragraph). It is usual to set it between 1.5 and 4 times the value of MFCC_WINDOW_SHIFT_L1.
Default: 0.100.
New in version 1.5.0.
-
MFCC_WINDOW_LENGTH_L2
= 'mfcc_window_length_l2'¶ Length of the window, in seconds, for extracting MFCCs at level 2 (sentence). It is usual to set it between 1.5 and 4 times the value of MFCC_WINDOW_SHIFT_L2.
Default: 0.050.
New in version 1.5.0.
-
MFCC_WINDOW_LENGTH_L3
= 'mfcc_window_length_l3'¶ Length of the window, in seconds, for extracting MFCCs at level 3 (word). It is usual to set it between 1.5 and 4 times the value of MFCC_WINDOW_SHIFT_L3.
Default: 0.020.
New in version 1.5.0.
-
MFCC_WINDOW_SHIFT
= 'mfcc_window_shift'¶ Shift of the window for extracting MFCCs, in seconds. This parameter is basically the time step of the synchronization maps output.
Default: 0.040.
New in version 1.4.1.
-
MFCC_WINDOW_SHIFT_L1
= 'mfcc_window_shift_l1'¶ Shift of the window, in seconds, for extracting MFCCs at level 1 (paragraph). This parameter is basically the time step of the synchronization map output at level 1.
Default: 0.040.
New in version 1.5.0.
-
MFCC_WINDOW_SHIFT_L2
= 'mfcc_window_shift_l2'¶ Shift of the window, in seconds, for extracting MFCCs at level 2 (sentence). This parameter is basically the time step of the synchronization map output at level 2.
Default: 0.020.
New in version 1.5.0.
-
MFCC_WINDOW_SHIFT_L3
= 'mfcc_window_shift_l3'¶ Shift of the window, in seconds, for extracting MFCCs at level 3 (word). This parameter is basically the time step of the synchronization map output at level 3.
Default: 0.005.
New in version 1.5.0.
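The "between 1.5 and 4 times" guideline for the window length can be checked against the documented defaults with simple arithmetic. The sketch below uses only values listed on this page; `length_to_shift_ratio` is a hypothetical helper.

```python
from decimal import Decimal

# Default (window length, window shift) pairs in seconds, as listed on this page.
DEFAULT_WINDOWS = {
    "base": ("0.100", "0.040"),
    "l1": ("0.100", "0.040"),
    "l2": ("0.050", "0.020"),
    "l3": ("0.020", "0.005"),
}


def length_to_shift_ratio(length, shift):
    """Return window length divided by window shift, computed exactly."""
    return Decimal(length) / Decimal(shift)


ratios = {name: length_to_shift_ratio(*pair) for name, pair in DEFAULT_WINDOWS.items()}
in_usual_range = {
    name: Decimal("1.5") <= ratio <= Decimal("4") for name, ratio in ratios.items()
}
print(ratios, in_usual_range)
```

With these defaults the word-level pair sits at the upper edge of the suggested range, while the other levels use a ratio of 2.5.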
-
NUANCE_TTS_API_ID
= 'nuance_tts_api_id'¶ Your ID value to use the Nuance TTS API.
You will be billed according to your Nuance Developers account plan.
Important: this feature is experimental, use at your own risk. It is recommended not to use this TTS at word-level granularity, as it will create many requests, hence it will be expensive. If you still want to use it, you can enable the TTS caching mechanism by setting TTS_CACHE to True.
New in version 1.5.0.
-
NUANCE_TTS_API_KEY
= 'nuance_tts_api_key'¶ Your KEY value to use the Nuance TTS API.
You will be billed according to your Nuance Developers account plan.
Important: this feature is experimental, use at your own risk. It is recommended not to use this TTS at word-level granularity, as it will create many requests, hence it will be expensive. If you still want to use it, you can enable the TTS caching mechanism by setting TTS_CACHE to True.
New in version 1.5.0.
-
SAFETY_CHECKS
= 'safety_checks'¶ If True, perform safety checks on input files and parameters. If set to False, it disables:
- checks performed by Validator;
- converting the audio file synthesized by the TTS engine so that its sample rate times the MFCC shift is an integer value.
Warning
Setting this parameter to False might result in runtime errors. Please be sure to understand the implications.
Default: True.
New in version 1.7.0.
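The second check concerns whether the sample rate times the MFCC shift is an integer, that is, whether each shift spans a whole number of audio samples. A minimal sketch of that condition, using exact fractions, is given below; `samples_per_shift` is a hypothetical helper and says nothing about how aeneas performs the conversion itself.

```python
from fractions import Fraction


def samples_per_shift(sample_rate, window_shift):
    """Return the number of samples per MFCC shift, or None if it is not an integer.

    `window_shift` is given as a decimal string (e.g. "0.040") to keep it exact.
    """
    product = Fraction(sample_rate) * Fraction(window_shift)
    return int(product) if product.denominator == 1 else None


# Documented defaults: 16000 Hz sample rate, 0.040 s shift.
print(samples_per_shift(16000, "0.040"))
# A combination that does not line up on a sample boundary.
print(samples_per_shift(44100, "0.005"))
```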
-
TASK_MAX_AUDIO_LENGTH
= 'task_max_audio_length'¶ Maximum length of the audio file of a Task, in seconds. If a Task has an audio file longer than this value, it will not be executed and an error will be raised.
Use 0 to disable this check.
Default: 0 seconds.
New in version 1.4.1.
-
TASK_MAX_TEXT_LENGTH
= 'task_max_text_length'¶ Maximum number of text fragments in the text file of a Task. If a Task has more text fragments than this value, it will not be executed and an error will be raised.
Use 0 to disable this check.
Default: 0 (disabled).
New in version 1.4.1.
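The three limits (JOB_MAX_TASKS, TASK_MAX_AUDIO_LENGTH, TASK_MAX_TEXT_LENGTH) share the same convention: a value of 0 disables the check, and otherwise anything above the limit is rejected. A minimal sketch of that rule follows; `exceeds_limit` is a hypothetical helper, not an aeneas function.

```python
def exceeds_limit(value, limit):
    """Return True if `value` is above `limit`; a limit of 0 disables the check."""
    return limit > 0 and value > limit


# With the default of 0, nothing is rejected.
print(exceeds_limit(10000, 0))
# With a limit of 10 text fragments, 11 is too many but 10 is accepted.
print(exceeds_limit(11, 10), exceeds_limit(10, 10))
```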
-
TMP_PATH
= 'tmp_path'¶ Path to the temporary directory to be used.
Default: None, meaning that the default temporary directory will be set by TMP_PATH_DEFAULT_POSIX or TMP_PATH_DEFAULT_NONPOSIX depending on your OS.
New in version 1.4.1.
-
TTS
= 'tts'¶ The TTS engine to use for synthesizing text.
Allowed values are listed in ALLOWED_VALUES.
The default value is ESPEAK (espeak), which will use the built-in eSpeak TTS wrapper. You might need to provide a /full/path/to/your/espeak value to the TTS_PATH parameter if the command espeak is not available in one of the directories listed in your PATH environment variable.
Specify the value ESPEAKNG (espeak-ng) to use the eSpeak-ng TTS wrapper. You might need to provide a /full/path/to/your/espeak-ng value to the TTS_PATH parameter if the command espeak-ng is not available in one of the directories listed in your PATH environment variable.
Specify the value FESTIVAL (festival) to use the built-in Festival TTS wrapper. You might need to provide a /full/path/to/your/text2wave value to the TTS_PATH parameter if the command text2wave is not available in one of the directories listed in your PATH environment variable.
Specify the value AWS (aws) to use the built-in AWS Polly TTS API wrapper; you will need to provide your AWS API Access Key and Secret Access Key by either storing them on disk (e.g., in ~/.aws/credentials and ~/.aws/config) or setting them in environment variables. Please refer to http://boto3.readthedocs.io/en/latest/guide/configuration.html for further details.
Specify the value NUANCE (nuance) to use the built-in Nuance TTS API wrapper; you will need to provide your Nuance Developer API ID and API Key using the NUANCE_TTS_API_ID and NUANCE_TTS_API_KEY parameters. Please note that you will be billed according to your Nuance Developers account plan.
Specify the value CUSTOM (custom) to use a custom TTS; you will need to provide the path to the Python source file containing your TTS wrapper using the TTS_PATH parameter.
New in version 1.5.0.
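Whether a TTS command is reachable through the PATH environment variable can be checked with the standard library before deciding if TTS_PATH needs a full path. This is a sketch only: `needs_explicit_tts_path` is a hypothetical helper, and the command names are the ones mentioned above.

```python
import shutil


def needs_explicit_tts_path(command):
    """Return True if `command` cannot be found in the directories listed in PATH."""
    return shutil.which(command) is None


# Commands named on this page for the subprocess-based TTS wrappers.
for command in ("espeak", "espeak-ng", "text2wave"):
    if needs_explicit_tts_path(command):
        print("%s not found in PATH: TTS_PATH would need a full path" % command)
    else:
        print("%s found at %s" % (command, shutil.which(command)))
```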
-
TTS_API_RETRY_ATTEMPTS
= 'tts_api_retry_attempts'¶ Retry an HTTP POST request to the Nuance TTS API this number of times before giving up. It must be an integer greater than zero.
Note that this parameter was called nuance_tts_api_retry_attempts before v1.7.0.
Default: 5.
New in version 1.5.0.
-
TTS_API_SLEEP
= 'tts_api_sleep'¶ Wait this number of seconds before the next HTTP POST request to the Nuance TTS API. This parameter can be used to throttle the HTTP usage. It cannot be a negative value.
Note that this parameter was called nuance_tts_api_sleep before v1.7.0.
Default: 1.000.
New in version 1.5.0.
-
TTS_CACHE
= 'tts_cache'¶ If set to True, synthesize each distinct text fragment only once, caching the resulting audio data as a file on disk.
The cache files will be removed after the synthesis is completed.
This option is useful when calling TTS engines, via subprocess or remote APIs, on text files with many identical fragments, for example when aligning at word-level granularity.
Enabling this option will create the cache files in TMP_PATH, so make sure that the path has enough free space.
Default: False.
New in version 1.6.0.
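The idea of synthesizing each distinct fragment only once can be sketched with a dictionary keyed by fragment text. Note the swap: the option described above caches audio as files on disk, whereas this sketch keeps results in memory for brevity; `synthesize_all` and `fake_tts` are hypothetical names.

```python
def synthesize_all(fragments, synthesize):
    """Synthesize every fragment, calling `synthesize` once per distinct text."""
    cache = {}
    audio = []
    for text in fragments:
        if text not in cache:
            cache[text] = synthesize(text)  # only reached for unseen fragments
        audio.append(cache[text])
    return audio


calls = []


def fake_tts(text):
    """Stand-in for an expensive TTS call; records each invocation."""
    calls.append(text)
    return "audio:" + text


result = synthesize_all(["the", "cat", "the", "hat", "the"], fake_tts)
print(len(result), calls)
```

At word-level granularity, repeated words such as "the" are common, which is where the saving comes from.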
-
TTS_GRANULARITY_MAP
= {1: ('tts_l1', 'tts_path_l1'), 2: ('tts_l2', 'tts_path_l2'), 3: ('tts_l3', 'tts_path_l3')}¶ Map level numbers to TTS_* and TTS_PATH_* keys.
New in version 1.6.0.
-
TTS_L1
= 'tts_l1'¶ The TTS engine to use for synthesizing text at level 1 (paragraph).
See also TTS.
Default: espeak.
New in version 1.6.0.
-
TTS_L2
= 'tts_l2'¶ The TTS engine to use for synthesizing text at level 2 (sentence).
See also TTS.
Default: espeak.
New in version 1.6.0.
-
TTS_L3
= 'tts_l3'¶ The TTS engine to use for synthesizing text at level 3 (word).
See also TTS.
Default: espeak.
New in version 1.6.0.
-
TTS_PATH
= 'tts_path'¶ Path to the TTS engine executable or the Python CustomTTSWrapper .py source file (see the aeneas/extra directory for examples).
You might need to use a full path, like /path/to/your/ttsengine or /path/to/your/ttswrapper.py.
Default: None, meaning that the default path defined by each TTS wrapper is used, if it calls the TTS engine via subprocess (otherwise it does not matter).
New in version 1.5.0.
-
TTS_PATH_L1
= 'tts_path_l1'¶ Path to the TTS engine executable to use for synthesizing text at level 1 (paragraph).
See also TTS_PATH.
Default: None.
New in version 1.6.0.
-
TTS_PATH_L2
= 'tts_path_l2'¶ Path to the TTS engine executable to use for synthesizing text at level 2 (sentence).
See also TTS_PATH.
Default: None.
New in version 1.6.0.
-
TTS_PATH_L3
= 'tts_path_l3'¶ Path to the TTS engine executable to use for synthesizing text at level 3 (word).
See also TTS_PATH.
Default: None.
New in version 1.6.0.
-
TTS_VOICE_CODE
= 'tts_voice_code'¶ The code of the TTS voice to use. If you specify this value, it will override the default voice code associated with the language of your text.
Default: None.
New in version 1.5.0.
-
VAD_EXTEND_SPEECH_INTERVAL_AFTER
= 'vad_extend_speech_after'¶ Extend to the right (after/future) a speech interval found by the VAD algorithm, by this many seconds.
Default: 0 seconds.
New in version 1.4.1.
-
VAD_EXTEND_SPEECH_INTERVAL_BEFORE
= 'vad_extend_speech_before'¶ Extend to the left (before/past) a speech interval found by the VAD algorithm, by this many seconds.
Default: 0 seconds.
New in version 1.4.1.
-
VAD_LOG_ENERGY_THRESHOLD
= 'vad_log_energy_threshold'¶ Threshold for the VAD algorithm to decide that a given frame contains speech. Note that this is the log10 of the energy coefficient.
Default: 0.699 = log10(5), that is, a frame must have an energy at least 5 times higher than the minimum to be considered a speech frame.
New in version 1.4.1.
-
VAD_MIN_NONSPEECH_LENGTH
= 'vad_min_nonspeech_length'¶ Minimum length, in seconds, of a nonspeech interval.
Default: 0.200 seconds.
New in version 1.4.1.
-
dtw_margin¶
Return the value of the DTW_MARGIN key stored in this configuration object.
Return type: TimeValue
-
mmn¶
Return the value of the MFCC_MASK_NONSPEECH key stored in this configuration object.
Return type: bool
-
mwl¶
Return the value of the MFCC_WINDOW_LENGTH key stored in this configuration object.
Return type: TimeValue
-
mws¶
Return the value of the MFCC_WINDOW_SHIFT key stored in this configuration object.
Return type: TimeValue
-
safety_checks¶
Return the value of the SAFETY_CHECKS key stored in this configuration object.
If False, safety checks are not performed.
Return type: bool
-
sample_rate¶
Return the value of the FFMPEG_SAMPLE_RATE key stored in this configuration object.
Return type: int
-
set_granularity(level)[source]¶
Set the values for MFCC_WINDOW_LENGTH and MFCC_WINDOW_SHIFT matching the given granularity level.
Currently supported levels:
- 1 (paragraph)
- 2 (sentence)
- 3 (word)
Parameters: level (int) – the desired granularity level
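The effect described for set_granularity can be pictured as copying the level-specific window values onto the base keys, using MFCC_GRANULARITY_MAP to find them. The sketch below works on a plain dictionary rather than a RuntimeConfiguration object; copying only the two documented keys and raising ValueError for unsupported levels are assumptions.

```python
# Copied from the MFCC_GRANULARITY_MAP class member documented on this page.
MFCC_GRANULARITY_MAP = {
    1: ("dtw_margin_l1", "mfcc_mask_nonspeech_l1", "mfcc_window_length_l1", "mfcc_window_shift_l1"),
    2: ("dtw_margin_l2", "mfcc_mask_nonspeech_l2", "mfcc_window_length_l2", "mfcc_window_shift_l2"),
    3: ("dtw_margin_l3", "mfcc_mask_nonspeech_l3", "mfcc_window_length_l3", "mfcc_window_shift_l3"),
}


def set_granularity(config, level):
    """Copy the window length and shift for `level` onto the base keys of `config`."""
    if level not in MFCC_GRANULARITY_MAP:
        # Assumption: this sketch rejects unsupported levels explicitly.
        raise ValueError("unsupported granularity level: %r" % (level,))
    _margin_key, _mask_key, length_key, shift_key = MFCC_GRANULARITY_MAP[level]
    config["mfcc_window_length"] = config[length_key]
    config["mfcc_window_shift"] = config[shift_key]


# Documented defaults, kept as strings for simplicity.
config = {
    "mfcc_window_length": "0.100", "mfcc_window_shift": "0.040",
    "mfcc_window_length_l1": "0.100", "mfcc_window_shift_l1": "0.040",
    "mfcc_window_length_l2": "0.050", "mfcc_window_shift_l2": "0.020",
    "mfcc_window_length_l3": "0.020", "mfcc_window_shift_l3": "0.005",
}
set_granularity(config, 3)
print(config["mfcc_window_length"], config["mfcc_window_shift"])
```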
-