When you perform speech-to-text conversion on a live audio stream, you can specify a mode that defines the rate at which analysis is performed. IDOL Speech Server versions 10.8 and later, together with language pack versions 6.0 and later, use DNN acoustic models to improve speech-to-text accuracy. Each language pack contains at least two DNN acoustic models of different sizes. By default, fixed mode uses the largest, most accurate model.
To override the default option, specify a different DNN file as the value of the DnnFile parameter in the task configuration file or at the command line.
Caution: You can use DNN acoustic modelling in live or relative mode only if your DNN files are smaller than a certain size. In addition, you must use Intel (or compatible) processors that support the SIMD extensions SSSE3 and SSE4.1. If this is not possible, you can set the DnnFile parameter to none, which allows non-DNN speech-to-text without these hardware limitations.
For a stream that arrives at a steady rate, use relative mode, which uses a constant rate. If the rate varies, use live mode. In live mode, recognition keeps pace with the rate at which data is sent to the server.
Note: If data streams too fast for the system’s computational resources, recognition accuracy might be impaired.
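The choice between relative and live mode can be sketched in code. The following minimal Python example assumes you can record the arrival time of each audio chunk yourself. The function name choose_mode and the tolerance threshold are hypothetical and are not part of Speech Server.

```python
from statistics import mean, pstdev


def choose_mode(arrival_times, tolerance=0.05):
    """Suggest "relative" for a steady arrival rate, otherwise "live".

    arrival_times: timestamps (in seconds) at which audio chunks arrived.
    tolerance: hypothetical limit on the coefficient of variation of the
    gaps between chunks; this is an illustration, not a Speech Server setting.
    """
    # Gaps between consecutive chunk arrivals.
    gaps = [later - earlier
            for earlier, later in zip(arrival_times, arrival_times[1:])]
    if len(gaps) < 2:
        # Too little evidence of a constant rate, so assume it can vary.
        return "live"
    average_gap = mean(gaps)
    if average_gap <= 0:
        return "live"
    # Relative spread of the gaps: 0 means a perfectly constant rate.
    variation = pstdev(gaps) / average_gap
    return "relative" if variation <= tolerance else "live"
```

The returned string could then be passed as the value of the Mode action parameter.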
Tip: In live or relative mode, it is crucial to allow enough time for the recognition process to be robust and to produce accurate transcriptions. To ensure this:
GetStatus action. Speech Server repeats the warning after each hour of audio processing.
To use live mode in live stream speech-to-text tasks, you must add the Mode configuration parameter to the configuration sections for the stt and stream modules, if it is not already present. For example:
[stt]
Mode=$params.Mode

[stream]
Mode=$params.Mode
This configuration creates a Mode action parameter. To use live mode, set the Mode action parameter to live in a task action that uses the stt and stream modules, such as StreamToText. For example:
http://localhost:13000/action=AddTask&Type=StreamToText&Lang=ENUK&Out=Transcript1.ctm&Mode=Live
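If you generate such actions from a script, the URL can be assembled programmatically. The following minimal Python sketch reproduces the example action above. The helper name build_add_task_url is hypothetical, and the set of accepted mode names is an assumption based on the modes mentioned in this section (fixed, relative, and live).

```python
from urllib.parse import urlencode

# Assumption: the mode names mentioned in this section.
KNOWN_MODES = {"fixed", "relative", "live"}


def build_add_task_url(host, port, task_type, mode, **params):
    """Build an AddTask action URL that ends with the Mode parameter.

    Hypothetical helper for illustration; extra keyword arguments
    (for example Lang and Out) become additional action parameters.
    """
    if mode.lower() not in KNOWN_MODES:
        raise ValueError(f"unexpected mode: {mode!r}")
    # Dictionaries keep insertion order, so Type comes first and Mode last.
    query = {"Type": task_type, **params, "Mode": mode}
    return f"http://{host}:{port}/action=AddTask&{urlencode(query)}"


url = build_add_task_url(
    "localhost", 13000, "StreamToText", "Live",
    Lang="ENUK", Out="Transcript1.ctm",
)
print(url)
```

The Mode action parameter has an effect only if the task configuration maps it with Mode=$params.Mode in the stt and stream sections.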