Output Records

This section describes the records that are produced when you ingest an image or document file.

Images and Multi-Page Images

The following sample XML shows a record produced when you ingest an image, a multi-page image such as a TIFF file, or a presentation file (.PPT, .PPTX, .ODP).

<record>
  <pageNumber>1</pageNumber>
  <trackname>Image_1</trackname>
  <Page>
    <image>
      <imagedata format="PNG">...</imagedata>
      <width>222</width>
      <height>140</height>
      <pixelAspectRatio>1:1</pixelAspectRatio>
    </image>
    <pagetext/>
  </Page>
</record>

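The record contains the page number (pageNumber), the track name (trackname), and a Page element. The Page element holds the image (the encoded image data and its format, width, height, and pixel aspect ratio) and a pagetext element for any associated text, which is empty in this sample.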
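The fields in the sample record above can be read with any XML parser. The following is a minimal sketch in Python using the standard library. It assumes the record is available as an XML string, and that pageNumber, width, and height are integers, as they are in the sample. The function name summarize_record is hypothetical and is not part of the product.

```python
import xml.etree.ElementTree as ET

# Sample record copied from this section (image data elided as in the original).
SAMPLE_RECORD = """<record>
  <pageNumber>1</pageNumber>
  <trackname>Image_1</trackname>
  <Page>
    <image>
      <imagedata format="PNG">...</imagedata>
      <width>222</width>
      <height>140</height>
      <pixelAspectRatio>1:1</pixelAspectRatio>
    </image>
    <pagetext/>
  </Page>
</record>"""


def summarize_record(xml_text):
    """Return the main fields of one record as a dictionary (illustrative helper)."""
    record = ET.fromstring(xml_text)
    image = record.find("Page/image")
    pagetext = record.find("Page/pagetext")
    return {
        "page_number": int(record.findtext("pageNumber")),
        "track_name": record.findtext("trackname"),
        "image_format": image.find("imagedata").get("format"),
        "width": int(image.findtext("width")),
        "height": int(image.findtext("height")),
        "pixel_aspect_ratio": image.findtext("pixelAspectRatio"),
        # True only when pagetext has at least one child element.
        "has_text": pagetext is not None and len(pagetext) > 0,
    }


summary = summarize_record(SAMPLE_RECORD)
print(summary)
```

For the sample above, has_text is False because the pagetext element is empty.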

Documents

If you ingest a document such as a PDF file, the output might also include the text extracted from text elements:

<record>
  <pageNumber>1</pageNumber>
  <trackname>Image_1</trackname>
  <Page>
    <image>
      <imagedata format="PNG">...</imagedata>
      <width>892</width>
      <height>1260</height>
      <pixelAspectRatio>1:1</pixelAspectRatio>
    </image>
    <pagetext>
      <element>
        <text>Some text</text>
        <region>
          <left>115</left>
          <top>503</top>
          <width>460</width>
          <height>41</height>
        </region>
        <angle>0</angle>
      </element>
      ...
    </pagetext>
  </Page>
</record>

The pagetext element contains information about associated text elements. If the ingested media is a PDF file, each record represents a page. If the ingested media is another type of document, each record represents an embedded image and the text that follows it, up to the next embedded image.

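Each element named element describes one text element. As the sample shows, it contains the text itself (text), the region that the text occupies (region, given as left, top, width, and height), and the angle of the text (angle).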

The OCR analysis engine uses the information about text elements. It automatically combines the text elements with the text extracted from images to produce a complete transcript of the text that appears on the page.
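As an illustration of the pagetext structure described above, the following Python sketch collects the text, region, and angle of each text element. It assumes the pagetext element has been extracted as an XML string, that the region values are integers, and that the angle is numeric. The function name read_text_elements is hypothetical, and the sample is trimmed to the single element shown in this section.

```python
import xml.etree.ElementTree as ET

# One text element, copied from the document sample in this section.
SAMPLE_PAGETEXT = """<pagetext>
  <element>
    <text>Some text</text>
    <region>
      <left>115</left>
      <top>503</top>
      <width>460</width>
      <height>41</height>
    </region>
    <angle>0</angle>
  </element>
</pagetext>"""


def read_text_elements(pagetext_xml):
    """Return a list of dictionaries, one per text element (illustrative helper)."""
    pagetext = ET.fromstring(pagetext_xml)
    results = []
    for element in pagetext.findall("element"):
        region = element.find("region")
        # Assumption: region values are whole numbers, as in the sample.
        box = {
            name: int(region.findtext(name))
            for name in ("left", "top", "width", "height")
        }
        results.append({
            "text": element.findtext("text"),
            "region": box,
            # Assumption: angle is numeric; parsed as float to allow fractions.
            "angle": float(element.findtext("angle")),
        })
    return results


text_elements = read_text_elements(SAMPLE_PAGETEXT)
for item in text_elements:
    print(item["text"], item["region"], item["angle"])
```

A consumer that needs the far edges of a region can derive them as left + width and top + height.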

