RecAPI
LETTER Struct Reference

Public Attributes

WORD left
WORD top
WORD width
WORD height
float pointSize
WORD capHeight
WORD baseLine
WORD zone
WCHAR code
BYTE err
BYTE reserved_b
BYTE cntChoices
BYTE cntSuggestions
DWORD ndxChoices
WORD fontAttrib
WORD ndxFontFace
DWORD info
WORD makeup
BYTE widthULdot
BYTE widthULgap
WORD cellNum
BYTE ndxFGColor
BYTE ndxBGColor
short lang
short lang2
DWORD ndxExt
DWORD ndxSuggestions
LSPC spcInfo

Detailed Description

The LETTER structure.

This structure holds recognition data. The recognition process produces one LETTER structure for each recognized character. This is the most detailed information available about the recognized characters.

See the usage of alternatives and handling of spaces.

Note:
The pointSize field is not a replacement for the fontSize field of the CSDK versions 12.x, which is why the name was changed:
  • pointSize is filled only for textual PDF inputs. For other inputs it is used only internally during page formatting (only at RecApiPlus level).
  • capHeight is always available and can be used in place of fontSize. Rough approximation: fontSize = capHeight * 100 / dpi (fontSize in CSDK 12.7 is calculated in this way).
The position and size information (left, top, width, height, capHeight, baseLine, widthULdot, widthULgap) is expressed in pixel coordinates mapped to the image specified when getting the letters.
The bounding box (left, top, width, height) of a character usually contains that single character only, but sometimes several characters are recognized together in one step, in which case all of those characters have the same bounding box.
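The shared bounding box case described above can be detected by comparing the four box fields. The following is a minimal sketch only: LetterBox is a reduced, hypothetical stand-in for LETTER, the helper names are invented for illustration, WORD is assumed to be a 16-bit unsigned integer, and it is assumed (not stated in this reference) that characters recognized together are adjacent in the letter array.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the CSDK WORD type (assumed 16-bit unsigned). */
typedef uint16_t WORD;

/* Hypothetical reduced copy of LETTER: only the bounding-box fields. */
typedef struct {
    WORD left;
    WORD top;
    WORD width;
    WORD height;
} LetterBox;

/* Returns nonzero when both letters have an identical bounding box. */
static int same_box(const LetterBox *a, const LetterBox *b)
{
    return a->left == b->left && a->top == b->top &&
           a->width == b->width && a->height == b->height;
}

/* Length of the run of consecutive letters, starting at index `start`,
 * that share one bounding box. Returns 0 when `start` is out of range.
 * Assumption: letters recognized together are stored next to each other. */
static size_t shared_box_run(const LetterBox *letters, size_t count,
                             size_t start)
{
    size_t n;

    if (start >= count)
        return 0;
    n = 1;
    while (start + n < count &&
           same_box(&letters[start], &letters[start + n]))
        n++;
    return n;
}
```

A run length greater than 1 suggests that the box covers several characters, so per-character positions inside that box should not be inferred from it.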

Member Data Documentation

WORD LETTER::baseLine

Y coordinate of the baseline in pixels. For vertical text this is the X coordinate. In CCJK vertical text the baseline is by definition in the middle of the characters.

WORD LETTER::capHeight

A measure of the capital letter height in pixels. See the Note in the Detailed Description for more information.
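The rough approximation given in the Note, fontSize = capHeight * 100 / dpi, can be sketched as a small helper. This is an illustration only: approx_font_size is a hypothetical name, WORD is assumed to be a 16-bit unsigned integer, and dpi is assumed to be the resolution of the image the letters were mapped to.

```c
#include <stdint.h>

/* Stand-in for the CSDK WORD type (assumed 16-bit unsigned). */
typedef uint16_t WORD;

/* Rough font size in points, from the cap height in pixels and the image
 * resolution in dots per inch: fontSize = capHeight * 100 / dpi.
 * Returns 0.0 when dpi is 0 to avoid a division by zero. */
static double approx_font_size(WORD capHeight, unsigned dpi)
{
    if (dpi == 0)
        return 0.0;
    return (double)capHeight * 100.0 / (double)dpi;
}
```

For example, a capHeight of 36 pixels on a 300 dpi image gives 36 * 100 / 300 = 12 points.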

WORD LETTER::cellNum

For WT_TABLE zones: index of the cell in the cell list that contains the character. For WT_FORM zones: index of the text-line form-element object that contains the character.

BYTE LETTER::cntChoices

Number of related choices stored contiguously in the external choice string.

BYTE LETTER::cntSuggestions

Number of related suggestions stored contiguously in the external suggestion array.

WCHAR LETTER::code

Character code in UNICODE. This is the first choice of the recognition or UNICODE_REJECTED for rejected characters.

BYTE LETTER::err

Confidence number expressing both the recognition certainty of the first guess (the code member) and the word certainty. For more information see the section confidence reporting.

WORD LETTER::fontAttrib

Font information about the recognized character. Used by the OCR engines. See its possible bits.

WORD LETTER::height

Height of the character rectangle in pixels.

DWORD LETTER::info

Additional information about the character. See its possible bits and the macros that make this information easier to handle.

short LETTER::lang

Declares which language the recognized word belongs to. See Language of a word.

short LETTER::lang2

See the lang field.

WORD LETTER::left

Left boundary of the rectangle containing the character in pixels.

WORD LETTER::makeup

The recognition data does not contain extra characters for marking the ends of lines, paragraphs, pages, etc. Instead, this information is stored in this field of the particular characters. It can be any bitwise OR-ed combination of the possible formatting attributes.
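Because line and paragraph ends are carried as bits on the characters themselves, plain text has to be rebuilt by testing those bits. The sketch below shows the idea under loud assumptions: DemoLetter is a reduced, hypothetical stand-in for LETTER, and DEMO_END_OF_LINE and DEMO_END_OF_PARA are placeholder names with placeholder bit values, not the attribute constants defined by the API. It is also an assumption that each bit should produce its own line break.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the CSDK WORD type (assumed 16-bit unsigned). */
typedef uint16_t WORD;

/* HYPOTHETICAL flag names and values, for illustration only. */
#define DEMO_END_OF_LINE 0x0001u
#define DEMO_END_OF_PARA 0x0002u

/* Hypothetical reduced copy of LETTER: character code and makeup bits. */
typedef struct {
    uint16_t code;   /* stand-in for the WCHAR code member */
    WORD     makeup; /* OR-ed formatting attributes */
} DemoLetter;

/* Writes the letters into `out` as a NUL-terminated string, adding '\n'
 * after a letter flagged as end of line and another '\n' after a letter
 * flagged as end of paragraph. Codes outside ASCII are written as '?'.
 * Output is truncated to fit `cap` bytes. Returns the number of
 * characters written, not counting the terminator. */
static size_t render_text(const DemoLetter *letters, size_t count,
                          char *out, size_t cap)
{
    size_t n = 0;
    size_t i;

    if (cap == 0)
        return 0;
    for (i = 0; i < count; i++) {
        char c = letters[i].code < 0x80 ? (char)letters[i].code : '?';

        if (n + 1 < cap)
            out[n++] = c;
        if ((letters[i].makeup & DEMO_END_OF_LINE) && n + 1 < cap)
            out[n++] = '\n';
        if ((letters[i].makeup & DEMO_END_OF_PARA) && n + 1 < cap)
            out[n++] = '\n';
    }
    out[n] = '\0';
    return n;
}
```

In real code the placeholder flags would be replaced by the formatting attribute constants from the API headers, and the character codes would be converted with a proper Unicode-aware routine.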

BYTE LETTER::ndxBGColor

Index of the background color within the palette of the recognition data. See kRecGetLetterPalette.

DWORD LETTER::ndxChoices

Index of the second choice in the external choice string.

DWORD LETTER::ndxExt

Not documented.

BYTE LETTER::ndxFGColor

Index of the foreground color within the palette of the recognition data. See kRecGetLetterPalette.

WORD LETTER::ndxFontFace

Index of the logical font definition placed in an external font array.

DWORD LETTER::ndxSuggestions

If this LETTER is not a space, this member is the index of the first suggestion in the external suggestion string. (This member forms a union with spcInfo.)

float LETTER::pointSize

Font size in points. See the Note in the Detailed Description for more information.

BYTE LETTER::reserved_b

Internal use only.

LSPC LETTER::spcInfo

If this LETTER is a space, additional information is available here. (This member forms a union with ndxSuggestions.)

WORD LETTER::top

Top boundary of the rectangle containing the character in pixels.

WORD LETTER::width

Width of the character rectangle in pixels.

BYTE LETTER::widthULdot

Width of a dot in pixels when the "underline" is actually a row of underdots. The value is 0 for a simple underline and when there is no underline. The same information is given for dotleaders. (See LSPC.)

BYTE LETTER::widthULgap

Width of a gap in pixels when the "underline" is actually a row of underdots. The value is 0 for a simple underline and when there is no underline. The same information is given for dotleaders. (See LSPC.)

WORD LETTER::zone

Index of the zone in the zone list which contains the character.