The following XML shows a single record produced by text detection.
<record>
  ...
  <trackname>DetectText.Result</trackname>
  <TextDetectionResult>
    <id>aa056c5e-3cea-4cfa-a090-713384270424</id>
    <region>
      <left>264</left>
      <top>280</top>
      <width>106</width>
      <height>18</height>
    </region>
    <confidence>100</confidence>
  </TextDetectionResult>
</record>
The record contains the following information:
The id element provides a unique identifier for each example of detected text. The text detection engine does not track text across video frames, so if you process video and the same text appears in consecutive frames there will be records (with different identifiers) in the result track for each frame. There might be multiple records per frame if text is detected in more than one region.
The region element describes the position of the detected text in the image or video frame (as a rectangle). left provides the number of pixels between the left side of the image and the left side of the region. top provides the number of pixels between the top of the image and the top of the region. width and height provide the width and height of the region.
The confidence element describes the confidence score for the detection, from 0 to 100, where higher values represent greater confidence.
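
As an illustration of how these fields might be read by a client application, the following Python sketch parses a record like the one above and derives the right and bottom edges of the region from left, top, width, and height. It is a minimal sketch under stated assumptions, not part of the product: the function name parse_text_detection is hypothetical, the sample omits the elided part of the record, and it assumes the record uses no XML namespace.

```python
import xml.etree.ElementTree as ET

# Sample record copied from the example above. The elided ("...") part of the
# record is omitted here, and no XML namespace is assumed.
RECORD_XML = """
<record>
  <trackname>DetectText.Result</trackname>
  <TextDetectionResult>
    <id>aa056c5e-3cea-4cfa-a090-713384270424</id>
    <region>
      <left>264</left>
      <top>280</top>
      <width>106</width>
      <height>18</height>
    </region>
    <confidence>100</confidence>
  </TextDetectionResult>
</record>
"""


def parse_text_detection(record_xml):
    """Return the id, region edges (in pixels), and confidence of one record.

    Hypothetical helper for illustration only.
    """
    record = ET.fromstring(record_xml.strip())
    result = record.find("TextDetectionResult")
    region = result.find("region")

    left = int(region.findtext("left"))
    top = int(region.findtext("top"))
    width = int(region.findtext("width"))
    height = int(region.findtext("height"))

    return {
        "id": result.findtext("id"),
        "left": left,
        "top": top,
        # Right and bottom edges follow from the origin plus the size.
        "right": left + width,
        "bottom": top + height,
        "confidence": float(result.findtext("confidence")),
    }


detection = parse_text_detection(RECORD_XML)
print(detection)
```

Because the engine does not track text across frames, a client that wants to follow the same text through a video would need to match records itself, for example by comparing regions in consecutive frames.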