The transcript aligner compares the speech-to-text transcript and the original transcript text to produce an aligned transcript. The aligner either uses words as whole units or breaks them down into phonemes or letters. You can therefore select one of three modes:
words
prons (phonemes)
letters
In addition, the alignment algorithm can also work in one of two polarity modes:
To run the transcript alignment task
Send an AddTask action to IDOL Speech Server, and set the following parameters:
| Type | The task name. Set to TranscriptAlign. |
| TxtFile | The normalized transcript file. |
| CtmFile | The speech-to-text transcript produced for the audio file. |
| Out | The file to write the aligned transcript to. |
| MatchType | The alignment mode: words, prons, or letters. |
For example:
http://localhost:13000/action=AddTask&Type=TranscriptAlign&TxtFile=C:\data\transcript.txt&CtmFile=C:\misc\speechtext.ctm&Out=AlignedTranscript.ctm&MatchType=words
This action uses port 13000 to instruct IDOL Speech Server, which is located on the local machine, to compare the original transcript transcript.txt with the speech-to-text transcript speechtext.ctm to produce an aligned transcript, AlignedTranscript.ctm. The action instructs IDOL Speech Server to use the words alignment mode.
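If you generate such requests from a script, the following minimal Python sketch shows one way to assemble the action URL from the parameters described above. The function name build_transcript_align_url and the constant MATCH_TYPES are hypothetical helpers introduced for illustration and are not part of IDOL Speech Server. The sketch assumes the server accepts the colon and backslash characters unencoded, as in the example above, and it only builds the string without sending anything.

```python
from urllib.parse import quote, urlencode

# Alignment modes listed in this section (hypothetical constant name).
MATCH_TYPES = ("words", "prons", "letters")


def build_transcript_align_url(host, port, txt_file, ctm_file, out, match_type):
    """Build an AddTask URL for the TranscriptAlign task (illustrative helper)."""
    if match_type not in MATCH_TYPES:
        raise ValueError(f"MatchType must be one of {MATCH_TYPES}, got {match_type!r}")
    params = {
        "Type": "TranscriptAlign",
        "TxtFile": txt_file,
        "CtmFile": ctm_file,
        "Out": out,
        "MatchType": match_type,
    }
    # Assumption: ':' and '\' are left unencoded to mirror the documented
    # example; any other reserved characters are percent-encoded.
    query = urlencode(params, quote_via=quote, safe=":\\")
    return f"http://{host}:{port}/action=AddTask&{query}"


url = build_transcript_align_url(
    "localhost",
    13000,
    r"C:\data\transcript.txt",
    r"C:\misc\speechtext.ctm",
    "AlignedTranscript.ctm",
    "words",
)
print(url)
```

Validating MatchType before building the URL catches a mistyped mode locally instead of relying on the server to reject it.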
This action returns a token. You can use the token to: