To run speech-to-text, you must install a language pack. There are more than 60 language packs available for Media Server. Language packs can contain hundreds of megabytes of data, so they are not included in the Media Server installation and must be downloaded separately.
A language pack supports a single language and a single audio sample rate. For example, there is a language pack for processing US English (16kHz) and another for US English (8kHz). The 8kHz language packs are for processing telephony audio. For a list of available language packs, see Speech Analysis Supported Languages.
To install a language pack
1. Download the language pack and extract its files to staticdata/speechtotext/, where staticdata is the folder specified by the StaticDataDirectory parameter in the [Paths] section of the Media Server configuration file. The default value of this parameter is the staticdata folder in the Media Server installation directory.
2. To verify the installation, run the ListSpeechLanguagePacks action. The response lists each language pack that is available, along with its supported sample rate.
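The two steps above can be sketched in Python as a quick sanity check. This is a minimal illustration, not part of Media Server: the helper names, the example installation path, and the host and port (localhost:14000) are assumptions, so substitute the values from your own configuration file.

```python
from pathlib import Path


def speech_pack_dir(install_dir, static_data_dir=None):
    """Return the folder that language packs should be extracted to.

    static_data_dir stands in for the StaticDataDirectory parameter in
    [Paths]. When it is not set, fall back to the staticdata folder in
    the installation directory (the documented default).
    """
    if static_data_dir:
        base = Path(static_data_dir)
    else:
        base = Path(install_dir) / "staticdata"
    return base / "speechtotext"


def list_packs_url(host="localhost", port=14000):
    """Build a request URL for the ListSpeechLanguagePacks action.

    The host and port defaults are assumptions; use the values that
    your Media Server actually listens on.
    """
    return f"http://{host}:{port}/action=ListSpeechLanguagePacks"


if __name__ == "__main__":
    # Hypothetical installation path, used only for illustration.
    print(speech_pack_dir("/opt/mediaserver"))
    print(list_packs_url())
```

Opening the printed URL in a browser (or requesting it with any HTTP client) should return the list of installed language packs if the files were extracted to the printed folder.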