To view the output of a speech-to-text transcription file in XML format, send a GetResults
action. The contents of the .ctm output file are returned as a series of XML tags. For example:
<stt_transcript>
  <stt_record>
    <start>0.000</start>
    <end>3.390</end>
    <label><SIL></label>
    <score>0.987</score>
    <rank>0</rank>
  </stt_record>
  <stt_record>
    <start>3.390</start>
    <end>3.780</end>
    <label>hello</label>
    <score>0.765</score>
    <rank>0</rank>
  </stt_record>
  <stt_record>
    <start>3.780</start>
    <end>3.970</end>
    <label>there</label>
    <score>0.875</score>
    <rank>0</rank>
  </stt_record>
</stt_transcript>
This example shows the XML output for a transcript that contains a silent period (with the label <SIL>) followed by the words hello there.
The <stt_transcript> tag marks the start of a recognition sequence. It contains <stt_record> nodes, each of which holds the following information for one recognized word.
| Tag | Description |
|---|---|
| <start> | The start time (in seconds) of the word. |
| <end> | The end time (in seconds) of the word. |
| <label> | The recognized word. |
| <score> | The confidence value of the recognized word. |
| <rank> | Not used for speech-to-text (used for different operation results). |
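
The record structure described above can be read with a standard XML parser. The following Python sketch is illustrative only and is not part of the product. It assumes that the response is well-formed XML, which in turn assumes that the silence label arrives entity-escaped as `&lt;SIL&gt;`. The names `SttRecord`, `parse_transcript`, and `spoken_words` are hypothetical helpers introduced for this example, and the sample text is modeled on the example output rather than captured from a server.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List

# Sample modeled on the example output. Assumption: the silence label is
# entity-escaped (&lt;SIL&gt;) so that the text is well-formed XML.
SAMPLE = """\
<stt_transcript>
  <stt_record>
    <start>0.000</start><end>3.390</end>
    <label>&lt;SIL&gt;</label><score>0.987</score><rank>0</rank>
  </stt_record>
  <stt_record>
    <start>3.390</start><end>3.780</end>
    <label>hello</label><score>0.765</score><rank>0</rank>
  </stt_record>
  <stt_record>
    <start>3.780</start><end>3.970</end>
    <label>there</label><score>0.875</score><rank>0</rank>
  </stt_record>
</stt_transcript>
"""


@dataclass
class SttRecord:
    """One <stt_record> node (hypothetical helper type)."""
    start: float  # start time of the word, in seconds
    end: float    # end time of the word, in seconds
    label: str    # recognized word, or the silence label
    score: float  # confidence value of the recognized word


def parse_transcript(xml_text: str) -> List[SttRecord]:
    """Parse every <stt_record> found in the given XML text."""
    root = ET.fromstring(xml_text)
    records = []
    # iter() matches stt_record nodes at any depth, so this also works if
    # <stt_transcript> is wrapped inside a larger response document.
    for node in root.iter("stt_record"):
        records.append(
            SttRecord(
                start=float(node.findtext("start")),
                end=float(node.findtext("end")),
                label=node.findtext("label"),
                score=float(node.findtext("score")),
            )
        )
    return records


def spoken_words(records: List[SttRecord], silence_label: str = "<SIL>") -> List[str]:
    """Return the recognized words, skipping silence records."""
    return [r.label for r in records if r.label != silence_label]


if __name__ == "__main__":
    recs = parse_transcript(SAMPLE)
    print(" ".join(spoken_words(recs)))  # prints "hello there"
```

The `<rank>` value is deliberately ignored here, because it is not used for speech-to-text results. If a response contains the literal text `<SIL>` inside `<label>`, the parser rejects it as malformed, and the text would need to be escaped before parsing.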