Deprecated: The StreamSpeakerId task is deprecated for IDOL Server version 11.1.0. Use the SpkIdEvalStream task instead.
This task is still available for existing implementations, but it might be incompatible with new functionality. The task might be deleted in future.
The StreamSpeakerId task segments an audio stream by speaker and identifies the speaker in each segment. If the speech does not match a speaker within the classifier file, IDOL Speech Server identifies it as being from an unknown speaker. IDOL Speech Server also identifies periods of non-speech within the audio.
To run the StreamSpeakerId task, you must first train the speakers in IDOL Speech Server.
Parameter | Description | Required |
---|---|---|
Type | The task name. Set to StreamSpeakerId. | Yes |
Ast | The speaker classifier file. | Yes. See Comments. |
CompSelect | The number of components to select for use in scoring. | |
Diag | Whether to generate diagnostic information. | |
DiagFile | The file to write the diagnostic information to. | |
DiscardShort | Exclude segments shorter than a specific duration from further analysis. | |
MinNonSpeech | The minimum size in seconds of non-speech segments. | |
MinSpeech | The minimum size in seconds of speech segments. | |
Out | The file to write the speaker identification results to. | Yes |
Sfreq | The sample frequency of the audio file to process. | |
SidBase | The sid base pack resource that determines which base files to use. | |
Sig | The .sig file to use for speaker identification. | |
SpkSegCoef | Applies a weight to bias the decision about where speaker boundaries occur. | |
SpkThreshCoef | A fixed value to use to adjust the speaker identification threshold, to trade off false acceptances against rejections. | |
USMEnabled | Whether to use the USM for speaker identification. | |
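
The parameter table above can be turned into a small pre-flight check before a task is submitted. The following is a minimal sketch, not part of IDOL Speech Server: the names `STREAM_SPEAKER_ID_PARAMS` and `check_params` are hypothetical, parameter names are compared exactly as written in the table (an assumption), and Ast is treated as optional because the comments for this task describe a fallback to the base ast file.

```python
# Parameter names and "Required" flags transcribed from the table above.
# Assumption: Ast ("Yes. See Comments.") is treated as optional here, because
# the task falls back to the base ast file when Ast is not specified.
STREAM_SPEAKER_ID_PARAMS = {
    "Type": True,
    "Ast": False,
    "CompSelect": False,
    "Diag": False,
    "DiagFile": False,
    "DiscardShort": False,
    "MinNonSpeech": False,
    "MinSpeech": False,
    "Out": True,
    "Sfreq": False,
    "SidBase": False,
    "Sig": False,
    "SpkSegCoef": False,
    "SpkThreshCoef": False,
    "USMEnabled": False,
}


def check_params(params):
    """Return a list of problems found in a StreamSpeakerId parameter dict."""
    problems = []
    # Flag names that do not appear in the table (exact-match comparison).
    for name in params:
        if name not in STREAM_SPEAKER_ID_PARAMS:
            problems.append(f"unknown parameter: {name}")
    # Flag required parameters that are absent.
    for name, required in STREAM_SPEAKER_ID_PARAMS.items():
        if required and name not in params:
            problems.append(f"missing required parameter: {name}")
    # Type, when present, must name this task.
    if params.get("Type") not in (None, "StreamSpeakerId"):
        problems.append("Type must be StreamSpeakerId")
    return problems


print(check_params({"Type": "StreamSpeakerId", "Out": "SpeakID.ctm"}))
print(check_params({"Type": "StreamSpeakerId"}))
```

An empty list means the parameter set is consistent with the table; the check says nothing about whether the values themselves are acceptable to the server.
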
http://localhost:13000/action=AddTask&Type=StreamSpeakerId&Ast=C:\training\speakers.ast&Out=SpeakID.ctm
This action uses port 13000 to instruct IDOL Speech Server, which is located on the local machine, to search the audio stream for speakers contained in the speakers.ast classifier file and to write the identification results to the SpeakID.ctm file.
If you do not specify the Ast parameter, the action uses the base ast file, determined by the SidBase resource. This base file does not contain any speaker information, and cannot identify speakers, but it performs gender detection and speaker segmentation.
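
As an illustration of the example action and of the case where Ast is omitted, the following minimal sketch builds the AddTask URL in both forms. The helper name `build_add_task_url` is hypothetical, and the sketch assumes the server accepts standard percent-encoding of parameter values (so the Windows path appears encoded rather than literal). It only constructs the URL string and does not contact a server.

```python
from urllib.parse import urlencode


def build_add_task_url(out, ast=None, host="localhost", port=13000):
    """Build an AddTask URL for the StreamSpeakerId task (hypothetical helper)."""
    params = [("Type", "StreamSpeakerId")]
    # Ast is left out when not given; the task then uses the base ast file
    # determined by the SidBase resource (segmentation and gender detection only).
    if ast is not None:
        params.append(("Ast", ast))
    params.append(("Out", out))
    # Assumption: values are percent-encoded; the host, port, and
    # "/action=AddTask&..." layout follow the example action.
    return f"http://{host}:{port}/action=AddTask&{urlencode(params)}"


with_ast = build_add_task_url("SpeakID.ctm", ast=r"C:\training\speakers.ast")
without_ast = build_add_task_url("SpeakID.ctm")
print(with_ast)
print(without_ast)
```

The second URL carries no Ast parameter, so, as described above, the results would contain speaker segmentation and gender detection but no speaker identities.
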