RecAPI
Recognition Data Handling Module

Letter handling tools.

Classes

struct  LSPC
 Additional information about the space character.
struct  LETTER
 The LETTER structure.

Typedefs

typedef LETTER * LPLETTER
 Pointer to a structure LETTER.
typedef const LETTER * LPCLETTER
 Const pointer to a structure LETTER.

Enumerations

enum  LETTERSTRENGTH {
  LTS_FINAL,
  LTS_STRONG,
  LTS_MEDIUM,
  LTS_WEAK,
  LTS_SIZE
}
 Possible places where letter array is to be copied to.

Functions

RECERR RECAPIKRN kRecGetLetters (HPAGE hPage, IMAGEINDEX iiImage, LPLETTER *ppLetter, LPLONG pLettersLength)
 Getting recognition result.
RECERR RECAPIKRN kRecGetLetterPalette (HPAGE hPage, REC_COLOR **ppColours, LPLONG pNum)
 Getting palette of recognition data.
RECERR RECAPIKRN kRecGetChoiceStr (HPAGE hPage, WCHAR **ppChoices, LPLONG pLength)
 Getting choices.
RECERR RECAPIKRN kRecGetSuggestionStr (HPAGE hPage, WCHAR **ppSuggestions, LPLONG pLength)
 Getting suggestions.
RECERR RECAPIKRN kRecGetFontFaceStr (HPAGE hPage, char **ppFontFaces, LPLONG pLength)
 Getting font faces.
RECERR RECAPIKRN kRecSetLetters (LETTERSTRENGTH towhere, HPAGE hPage, IMAGEINDEX iiImage, LPCLETTER pLetter, LONG LettersLength)
 Putting a letter buffer onto the input of the PLUS2W and PLUS3W engines or the selected output converter.
RECERR RECAPIKRN kRecFreeRecognitionData (HPAGE hPage)
 Freeing recognition data.

LETTER::fontAttrib field elements

Possible values of LETTER::fontAttrib field.

#define R_NO_ITALIC   0x0001
 Not-Italic character. It is not possible for both R_ITALIC and R_NO_ITALIC to be set. If both are unset we do not know whether it is Italic or not.
#define R_ITALIC   0x0002
 Italic character. See also R_NO_ITALIC.
#define R_NO_BOLD   0x0004
 Not-Bold character. It is not possible for both R_BOLD and R_NO_BOLD to be set. If both are unset we do not know whether it is Bold or not.
#define R_BOLD   0x0008
 Bold character. See also R_NO_BOLD.
#define R_SANSSERIF   0x0010
 Sans Serif character. It is not possible for both R_SANSSERIF and R_SERIF to be set. If both are unset we do not know whether it is Serif or not.
#define R_SERIF   0x0020
 Serif character. See also R_SANSSERIF.
#define R_PROPORTIONAL   0x0040
 Proportional character. It is not possible for both R_PROPORTIONAL and R_MONOSPACED to be set. If both are unset we do not know whether it is Monospaced or not.
#define R_MONOSPACED   0x0080
 Monospaced character. See also R_PROPORTIONAL.
#define R_SMALLCAPS   0x0100
 Character in a Small Caps word. The code is always upper case! See also RR_SMALLCAPS_TALL in the field info.
#define R_UNDERLINE   0x0200
 Underlined character.
#define R_STRIKETHROUGH   0x0400
 Struck through character. It is not used. It is only for future versions.
#define R_SUBSCRIPT   0x0800
 Subscript character.
#define R_SUPERSCRIPT   0x1000
 Superscript character.
#define R_DROPCAP   0x2000
 Dropcap character.
#define R_POPCAP   0x4000
 Popcap character.
#define R_INVERTED   0x8000
 Inverted character.
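
The paired flags above (for example R_ITALIC and R_NO_ITALIC) encode three states: set, not set, and unknown. The following is a minimal sketch of decoding such a pair. The flag values are copied from the list above; the TriState type and the helper functions are hypothetical names introduced only for illustration and are not part of the SDK.

```c
/* Flag values copied from the LETTER::fontAttrib list above. */
#define R_NO_ITALIC 0x0001
#define R_ITALIC    0x0002
#define R_NO_BOLD   0x0004
#define R_BOLD      0x0008

/* Hypothetical helper type: three-valued answer for a paired attribute. */
typedef enum { ATTR_UNKNOWN, ATTR_NO, ATTR_YES } TriState;

/* Hypothetical helper: decode one yes/no flag pair from a fontAttrib value. */
static TriState decode_pair(unsigned fontAttrib, unsigned yesFlag, unsigned noFlag)
{
    if (fontAttrib & yesFlag)
        return ATTR_YES;
    if (fontAttrib & noFlag)
        return ATTR_NO;
    return ATTR_UNKNOWN;    /* neither flag is set: the attribute is not known */
}

static TriState is_italic(unsigned fontAttrib)
{
    return decode_pair(fontAttrib, R_ITALIC, R_NO_ITALIC);
}

static TriState is_bold(unsigned fontAttrib)
{
    return decode_pair(fontAttrib, R_BOLD, R_NO_BOLD);
}
```

Testing only the positive flag would treat "unknown" the same as "not italic", which is why the sketch returns a three-valued result.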

LETTER::info field macros

Macros that can be used with the LETTER::info field.

#define RH_OCRENGINE(info)   ((RECOGNITIONMODULE)(((info) & RH_OCRENGINE_MASK) >> 5))
 Getting the RECOGNITIONMODULE from the field info. This is the module ID of the engine that actually recognized the given character. With the PLUS engines this is usually RM_RESERVED_M.
#define RH_OCRENGINE_SET(oeng)   (((UINT)(oeng)) << 5)
 Setting the RECOGNITIONMODULE into the field info.
#define RH_OCRTYPE(info)   ((FILLINGMETHOD)(((info) & RH_OCRTYPE_MASK) >> 10))
 Getting the FILLINGMETHOD from the field info.
#define RH_OCRTYPE_SET(otype)   (((UINT)(otype)) << 10)
 Setting the FILLINGMETHOD into the field info.
#define RH_BARTYPE(info)   ((BAR_TYPE)(((info) & RH_BARTYPE_MASK) >> 24))
 Getting the BAR_TYPE from the field info.
#define RH_BARTYPE_SET(btype)   (((UINT)(btype)) << 24)
 Setting the BAR_TYPE into the field info.

Info field bits

Possible flags of LETTER::info field.

#define RR_BULLET   0x00000001
 Bullet character at bullet position.
#define RR_SOFTHYPHEN   0x00000004
 Soft hyphen.
#define RH_OCRENGINE_MASK   0x000003E0
 Mask of RECOGNITIONMODULE.
#define RH_OCRTYPE_MASK   0x00007C00
 Mask of FILLINGMETHOD.
#define RH_GTMTCH   0x00008000
 Internal use only.
#define RR_CONFIDENT_CHAR   0x00010000
 Internal use only.
#define RR_DISABLED_CHAR   0x00020000
 Internal use only.
#define RR_VOTED_CHAR   0x00040000
 Internal use only.
#define RR_NOISY_CHAR   0x00080000
 Internal use only.
#define RR_EXPANDED   0x00100000
 Internal use only.
#define RH_MANGO_ISOLATED_CH   0x00200000
 Internal use only.
#define RH_LA_INTERNAL   0x00400000
 Internal use only.
#define RH_LA_EXTERNAL   0x00800000
 NO LONGER USED.
#define RR_PDMERGE_CHAR   0x00800000
 Internal use only.
#define RH_BARTYPE_MASK   0x3F000000
 Mask of BAR_TYPE.
#define RR_DICTIONARY_WORD   0x40000000
 Dictionary word. It is set when the word is in at least one dictionary of the currently used ones. See Language of a word.
#define RR_SMALLCAPS_TALL   0x80000000
 A tall character among the small capitals. See also R_SMALLCAPS in the field fontAttrib.
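
The info field packs several multi-bit values next to single-bit flags. The sketch below shows how the macros and masks listed above fit together. The macro and mask definitions are copied from this page; the typedefs are stand-ins for the SDK types (assumed here so that the block compiles without the SDK headers), replace_engine is a hypothetical helper, and the numeric IDs used with it are arbitrary rather than real enum values.

```c
/* Stand-ins for SDK types, assumed only so this sketch is self-contained. */
typedef unsigned int UINT;
typedef int RECOGNITIONMODULE;
typedef int FILLINGMETHOD;
typedef int BAR_TYPE;

/* Masks and macros copied from the lists above. */
#define RH_OCRENGINE_MASK   0x000003E0
#define RH_OCRTYPE_MASK     0x00007C00
#define RH_BARTYPE_MASK     0x3F000000
#define RR_DICTIONARY_WORD  0x40000000

#define RH_OCRENGINE(info)      ((RECOGNITIONMODULE)(((info) & RH_OCRENGINE_MASK) >> 5))
#define RH_OCRENGINE_SET(oeng)  (((UINT)(oeng)) << 5)
#define RH_OCRTYPE(info)        ((FILLINGMETHOD)(((info) & RH_OCRTYPE_MASK) >> 10))
#define RH_OCRTYPE_SET(otype)   (((UINT)(otype)) << 10)
#define RH_BARTYPE(info)        ((BAR_TYPE)(((info) & RH_BARTYPE_MASK) >> 24))
#define RH_BARTYPE_SET(btype)   (((UINT)(btype)) << 24)

/* Hypothetical helper: overwrite only the engine bits and keep everything
   else in the info value (other packed fields and flags) unchanged. */
static UINT replace_engine(UINT info, UINT engineId)
{
    UINT cleared = info & ~(UINT)RH_OCRENGINE_MASK;
    return cleared | (RH_OCRENGINE_SET(engineId) & RH_OCRENGINE_MASK);
}
```

Because the _SET macros only shift, the old bits have to be cleared with the corresponding mask before a new value is OR-ed in, as replace_engine does.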

LETTER::makeup field elements

Flags of end-position LETTERs (see their usage in the table under End-position letters) and direction/orientation flags (see also the section about vertical text support).

#define R_ENDOFLINE   0x0001
 End of line. In a table zone, the end of all the lines of a cell is marked by this flag.
#define R_ENDOFPARA   0x0002
 End of paragraph. This flag is used by BAR module only.
#define R_ENDOFWORD   0x0004
 End of word.
#define R_ENDOFZONE   0x0008
 End of zone.
#define R_ENDOFPAGE   0x0010
 End of page.
#define R_ENDOFCELL   0x0020
 End of table cell.
#define R_ENDOFROW   0x0040
 End of the last line of the last filled cell in a table row.
#define R_INTABLE   0x0080
 Letter is in a table cell.
#define R_TEXT_DIR_MASK   0x0700
 Mask of text direction in makeup field.
#define R_TEXT_ORIENT_MASK   0x0300
 Mask of text orientation in makeup field.
#define R_NORMTEXT   0
 Horizontal text.
#define R_VERTTEXT   0x0100
 Vertical text (CCJK) or neon text (Latin) or upside-down barcode.
#define R_LEFTTEXT   0x0200
 Left rotated / orientation is upward.
#define R_RIGHTTEXT   0x0300
 Right rotated / orientation is downward.
#define R_RTLTEXT   0x0400
 Character from a right-to-left direction word.
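
The orientation values are not independent bits: R_RIGHTTEXT (0x0300) equals R_VERTTEXT | R_LEFTTEXT, so the orientation has to be read by masking and comparing. The following is a minimal sketch with the values copied from the list above; the function names are hypothetical and not part of the SDK.

```c
/* Values copied from the LETTER::makeup list above. */
#define R_ENDOFWORD         0x0004
#define R_TEXT_ORIENT_MASK  0x0300
#define R_NORMTEXT          0
#define R_VERTTEXT          0x0100
#define R_LEFTTEXT          0x0200
#define R_RIGHTTEXT         0x0300
#define R_RTLTEXT           0x0400

/* Hypothetical helper: isolate the orientation value from the makeup field. */
static unsigned text_orientation(unsigned makeup)
{
    return makeup & R_TEXT_ORIENT_MASK;
}

/* Hypothetical helper: human-readable name of the orientation. */
static const char *orientation_name(unsigned makeup)
{
    switch (text_orientation(makeup)) {
    case R_NORMTEXT: return "horizontal";
    case R_VERTTEXT: return "vertical";
    case R_LEFTTEXT: return "left-rotated";
    default:         return "right-rotated";   /* R_RIGHTTEXT */
    }
}

/* Hypothetical helper: R_RTLTEXT is a separate bit, so a plain bit test works. */
static int is_rtl(unsigned makeup)
{
    return (makeup & R_RTLTEXT) != 0;
}
```

A plain test such as (makeup & R_VERTTEXT) would also be true for right-rotated text, which is why the sketch compares the masked value instead.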

Space type values

Possible space types (LSPC).

#define SPC_SPACE   0
 Real space.
#define SPC_TAB   1
 Tabular.
#define SPC_LEADERDOT   2
 Dot leader.
#define SPC_LEADERLINE   3
 Line leader.
#define SPC_LEADERHYPHEN   4
 Hyphen leader.

Macros of alternatives of the LETTER

Macros that can be used for processing the alternatives of each LETTER. See also usage of alternatives.

#define GETFIRSTALTERN(stringstart, ndx)   (((const WCHAR*)((stringstart)+(ndx)))+1)
 Getting the first alternative.
#define GETALTERNLENGTH(str)   ((str)[-1])
 Getting the length of the alternative.
#define GETNEXTALTERN(str)   ((str)+GETALTERNLENGTH(str)+2)
 Getting the next alternative.

Defines of confidence handling of the LETTER

See confidence handling and LETTER::err.

#define RE_SUSPECT_WORD   0x80
 The word is declared suspicious by the recognition engine if the dictionary (if any) does not contain it. This flag does not necessarily reflect whether the word is a dictionary word or not.
#define RE_SUSPECT_THR   64
 Suspect threshold: if the lower 7 bits of LETTER::err represent a value at or above this (up to 100) it means low confidence.
#define RE_ERROR_LEVEL_MASK   ~RE_SUSPECT_WORD
 Mask for getting the error level of the current letter.
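
The err value combines a word-level flag (the top bit) with a per-letter error level (the lower 7 bits). The sketch below separates the two using the defines above. BYTE is a stand-in typedef assumed here, and the three helper functions are hypothetical names introduced for illustration.

```c
typedef unsigned char BYTE;   /* stand-in for the SDK's BYTE, assumed here */

/* Values copied from the list above. */
#define RE_SUSPECT_WORD      0x80
#define RE_SUSPECT_THR       64
#define RE_ERROR_LEVEL_MASK  ~RE_SUSPECT_WORD

/* Hypothetical helper: error level of the letter (lower 7 bits of err). */
static int error_level(BYTE err)
{
    return err & RE_ERROR_LEVEL_MASK;
}

/* Hypothetical helper: non-zero when the letter itself has low confidence. */
static int is_suspect_char(BYTE err)
{
    return error_level(err) >= RE_SUSPECT_THR;
}

/* Hypothetical helper: non-zero when the engine marked the word as suspicious. */
static int is_suspect_word(BYTE err)
{
    return (err & RE_SUSPECT_WORD) != 0;
}
```

The two tests are independent: a confidently recognized letter can still belong to a word flagged with RE_SUSPECT_WORD.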

Detailed Description

Letter handling tools.

Recognized data is stored in the current HPAGE and it is available as an array of LETTER structures providing significantly more information than the character code itself. This type of output offers the most detailed information on recognition. The information stored in a LETTER structure may belong to the character itself (character code, position, size, confidence level, font attributes, font face, choices, color) or to the word containing the character (suggestions, languages). Word-level information is set in the first LETTER of the word.

NOTE: In both the SDK and its documentation, coordinates are grid coordinates, i.e. they refer to the top or left borders of pixels. Thus a rectangle does not contain the pixels at its right and bottom coordinates.
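
As an illustration of the grid-coordinate convention, the following sketch computes the size of a rectangle and tests whether a pixel lies inside it. GridRect and the helper functions are hypothetical and are not SDK types; only the half-open interpretation (right and bottom excluded) is taken from the note above.

```c
/* Hypothetical rectangle type using grid coordinates. */
typedef struct {
    int left;
    int top;
    int right;    /* grid line after the last contained pixel column */
    int bottom;   /* grid line after the last contained pixel row */
} GridRect;

/* Width and height in pixels: no "+ 1" is needed with grid coordinates. */
static int grid_width(GridRect r)  { return r.right - r.left; }
static int grid_height(GridRect r) { return r.bottom - r.top; }

/* A pixel (x, y) is inside when left <= x < right and top <= y < bottom. */
static int grid_contains_pixel(GridRect r, int x, int y)
{
    return x >= r.left && x < r.right && y >= r.top && y < r.bottom;
}
```

For a rectangle with left = 10 and right = 14, the contained pixel columns are 10 to 13.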

Handling of spaces

Spaces have a special role in the text, thus their handling is also special. There are two kinds of spaces in the recognition result.

One of them is the space-like character. It really appears in the original text and it is represented with a LETTER having a space character in its code field and an LSPC structure containing information about this character. The SPACE and TAB characters and the leaders belong to this type.

The other kind of space is the dummy space. It does not appear in the original text, but it has an individual LETTER object. It indicates the end of the line only when this is also the end of the word (i.e. the last character of the line is not a hyphen). It has a role only when the User writes the recognition result directly from the LETTER array into a pure TXT file without analysing any formatting flags (e.g. font attributes, end of lines, etc.). To handle this case, a space (the dummy space) is inserted between the last word of the line and the first word of the next line.

The LETTER has size information about the represented character. However the width of the dummy space is zero, because it is in fact not in the original text.

The barcode module (BAR) has a special binary recognition mode, in which the recognition result contains binary data (not text). (See the setting Kernel.OcrMgr.BarBinary for more information.) In this case, the content of the barcode is logically one word in one line, and the result gets a dummy space at the end only for uniformity.

The notion of word in CSDK

The last letter (maybe punctuation character or digit) of a word is the LETTER having an R_ENDOFWORD flag. The beginning of a word is the first non-space character after the previous word (or the very first item of the LETTER array). The flag R_ENDOFLINE does not play a role in determining word boundaries (e.g. hyphenation).
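
The word notion described above can be sketched as a simple count of R_ENDOFWORD flags. MiniLetter is a hypothetical, reduced stand-in for LETTER that holds only a character code and the makeup flags (the field types are assumptions), and count_words is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Flag values copied from the LETTER::makeup list. */
#define R_ENDOFLINE 0x0001
#define R_ENDOFWORD 0x0004

/* Hypothetical, reduced stand-in for LETTER: only the two fields used here. */
typedef struct {
    unsigned short code;     /* character code */
    unsigned short makeup;   /* end-position and direction flags */
} MiniLetter;

/* Hypothetical helper: every word ends at exactly one LETTER carrying
   R_ENDOFWORD, so counting those flags counts the words. R_ENDOFLINE is
   deliberately ignored because it does not mark word boundaries. */
static long count_words(const MiniLetter *letters, long n)
{
    long words = 0;
    assert(letters != NULL || n == 0);
    for (long i = 0; i < n; i++) {
        if (letters[i].makeup & R_ENDOFWORD)
            words++;
    }
    return words;
}
```

With this approach a word hyphenated across two lines is counted once: the hyphen carries R_ENDOFLINE only, and R_ENDOFWORD appears on the last letter of the continued word.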

Special cases:

Word-related information (like the language of a word or RE_SUSPECT_WORD, etc.) is specified on all the characters of the word. The only exception is suggestion handling where suggestions are attached to the first character of the word only. (Note that suggestion handling uses a different word notion: space-separated words.)

End-position letters

The letters in ending positions are marked with particular flags. See the section above for details about the end of a word. In flowing text the end-of-line flag is generally on the dummy space mentioned above. However, if the last character of a line is a hyphen in a hyphenated word, the flag R_ENDOFLINE is put on the hyphen and the dummy space is missing from this line.

In a table the situation of end-position flags is more difficult. The next figure shows all the possible situations of the R_ENDOFLINE (L), R_ENDOFCELL (C), R_ENDOFROW (R) and R_ENDOFZONE (Z) flags in a table.

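(Cells are separated by "|"; "/" separates the text lines within a cell; the letters after each text line are the flags set at its end.)

Row 1:  text L,C  |  text L,C  |  text L,C,R
Row 2:  text L,C  |  more L / lines in L / a cell L,C,R
Row 3:  two-line L / text L,C  |  text L,C,R
Row 4:  last filled L / cell L,C,R,Z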

Usage of alternatives

The common name for LETTER choices and word suggestions is 'alternatives'. Both kinds are handled in the same way. They can be accessed through special WCHAR arrays. Every single alternative is a special string with its size in its 0th WCHAR element and a terminating zero WCHAR. You can get WCHAR arrays listing all the alternatives in the recognition data: one for choices and one for suggestions. Use the functions kRecGetChoiceStr and kRecGetSuggestionStr, respectively.

One LETTER contains an index to the list of the alternatives that points to its first alternative and has a counter with the number of its alternatives. All LETTERs can have choices (LETTER::ndxChoices), but only the first LETTER of a word refers to the suggestions (LETTER::ndxSuggestions). The scope of such a suggestion is the space-terminated word. (Note that it can differ from the end of the word notion used by spelling.)

The alternatives of a LETTER can be enumerated using the macros GETFIRSTALTERN, GETNEXTALTERN and GETALTERNLENGTH. See the following sample code on how to use them:

    RECERR err;
    HPAGE hPage;
    LETTER *pLetters;
    WCHAR *pChoices;
    LONG nLetters, choiceStrLen;

    ...
    err = kRecGetLetters(hPage, II_CURRENT, &pLetters, &nLetters);
    if (err != REC_OK)
        ... // Doing some error handling
    ...
    err = kRecGetChoiceStr(hPage, &pChoices, &choiceStrLen);
    if (err != REC_OK)
        ... // Doing some error handling
    for (LONG lettn=0; lettn<nLetters; lettn++)
    {
        ...
        const WCHAR *choice = GETFIRSTALTERN(pChoices, pLetters[lettn].ndxChoices);
        for (BYTE chon=1; chon<pLetters[lettn].cntChoices; chon++)
        {
            ... // Doing some choice handling
            choice = GETNEXTALTERN(choice);
        }
        ...
    }
    ...
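
To make the buffer layout concrete, the following sketch walks a hand-built buffer with the same macros and no SDK calls. The three macros are copied from this page. The WCHAR typedef (assumed 16-bit here), the nth_altern helper and the sample buffer contents are assumptions made only for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned short WCHAR;   /* stand-in for the SDK's WCHAR, assumed 16-bit */

/* Macros copied from "Macros of alternatives of the LETTER". */
#define GETFIRSTALTERN(stringstart, ndx)   (((const WCHAR*)((stringstart)+(ndx)))+1)
#define GETALTERNLENGTH(str)   ((str)[-1])
#define GETNEXTALTERN(str)   ((str)+GETALTERNLENGTH(str)+2)

/* Hypothetical helper: return the n-th (0-based) alternative of the group
   that starts at index ndx of the buffer. The caller must make sure that
   n stays below the number of alternatives stored for that group. */
static const WCHAR *nth_altern(const WCHAR *buffer, long ndx, int n)
{
    const WCHAR *alt;
    assert(buffer != NULL && ndx >= 0 && n >= 0);
    alt = GETFIRSTALTERN(buffer, ndx);   /* skips the length element */
    while (n-- > 0)
        alt = GETNEXTALTERN(alt);        /* skips text, zero and next length */
    return alt;
}
```

In a buffer laid out as { 1,'a',0, 2,'r','n',0, 1,'m',0 }, index 0 leads to the alternative "a", the next one is "rn" (length 2), and the one after that is "m". The "+2" in GETNEXTALTERN accounts for the terminating zero of the current alternative and the length element of the next one.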

Consecutive words can have the same suggestion indices - that is, the given suggestions are common to the group of the given words. This is the case when the suggestion combines two space-separated words into a single one without the space.

Since the first LETTER of a word cannot be a space, spaces do not have suggestions, but they have space information (LSPC) in the same union type (see Handling of spaces above for more information).

Font faces can be accessed in a string of C-type strings. The LETTER indexes into this string at the first character of its font face name.
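
A minimal sketch of the font face lookup, assuming that the buffer is a sequence of zero-terminated names stored back to back and that the length counts all bytes including the terminators (neither detail is confirmed on this page). font_face_at is a hypothetical helper and not an SDK function.

```c
#include <stddef.h>

/* Hypothetical helper: return the font face name that starts at byte index
   ndx of the font face buffer, or NULL when the index is out of range.
   The returned pointer refers into the buffer, so it stays valid only as
   long as the buffer has not been freed. */
static const char *font_face_at(const char *faces, long length, long ndx)
{
    if (faces == NULL || ndx < 0 || ndx >= length)
        return NULL;
    return faces + ndx;   /* the name ends at the next zero byte */
}
```

For a buffer holding "Arial" followed by "Times New Roman", index 0 yields the first name and index 6 the second, because the first name occupies five bytes plus its terminator.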


Enumeration Type Documentation

Possible places where letter array is to be copied to.

Enumerator:
LTS_FINAL 

Letters are put directly onto the input of the output conversion step.

LTS_STRONG 

Letters are put onto the strong input of the PLUS2W and PLUS3W engines.

LTS_MEDIUM 

Letters are put onto the medium input of the PLUS2W and PLUS3W engines.

LTS_WEAK 

Letters are put onto the weak input of the PLUS3W engine.

LTS_SIZE 

Number of LETTER indices (for verifying index validity).


Function Documentation

RECERR RECAPIKRN kRecFreeRecognitionData ( HPAGE  hPage)

Freeing recognition data.

The kRecFreeRecognitionData function destroys the recognized data (memory object) belonging to the hPage page.

Parameters:
[in]  hPage  Handle of the page having the data to be removed.
Return values:
RECERR
Note:
The effect of this call is the same as if the application had not called the kRecRecognize function.
The specification of this function in C# is:
 RECERR kRecFreeRecognitionData(IntPtr hPage); 
RECERR RECAPIKRN kRecGetChoiceStr ( HPAGE  hPage,
WCHAR **  ppChoices,
LPLONG  pLength 
)

Getting choices.

The kRecGetChoiceStr function makes the alternative letter choices data belonging to the hPage page available to the application by creating a new memory object. This function can be called after a successful kRecRecognize call. The retrieved data is available as an array of WCHAR elements. For more about its internal structure see the usage of alternatives. A LETTER contains the number of its choices and the index of its first choice in this array (LETTER::cntChoices, LETTER::ndxChoices).

Parameters:
[in]  hPage  Handle of the page whose recognized data should be accessed.
[out]  ppChoices  Address of a pointer variable to get the array of the recognized alternative characters and ligatures.
[out]  pLength  Pointer to a variable to hold the length of recognized alternative characters.
Return values:
RECERR
Note:
Since this function creates a new memory object, the application should call the kRecFree function to free this memory area after evaluating the result.
The specification of this function in C# is:
 RECERR kRecGetChoiceStr(IntPtr hPage, out char[] ppChoices); 
RECERR RECAPIKRN kRecGetFontFaceStr ( HPAGE  hPage,
char **  ppFontFaces,
LPLONG  pLength 
)

Getting font faces.

The kRecGetFontFaceStr function makes the font face data belonging to the hPage page available to the application by creating a new memory object. This function can be called after a successful kRecRecognize call. The retrieved data is available as an array of char strings. A LETTER contains the index of its font face in this array (LETTER::ndxFontFace).

Parameters:
[in]  hPage  Handle of the page whose recognized data should be accessed.
[out]  ppFontFaces  Address of a pointer variable to get the UTF-8 string of the recognized font faces.
[out]  pLength  Pointer to a variable to hold the length of recognized font face string.
Return values:
RECERR
Note:
Font face information is available only when processing PDF files with an accessible text layer.
Since this function creates a new memory object, after evaluating the result, the application should call the kRecFree function to free this memory area.
The specification of this function in C# is:
 RECERR kRecGetFontFaceStr(IntPtr hPage, out char[] ppFontFaces); 
RECERR RECAPIKRN kRecGetLetterPalette ( HPAGE  hPage,
REC_COLOR **  ppColours,
LPLONG  pNum 
)

Getting palette of recognition data.

This function makes the palette of the recognition data belonging to the hPage page available to the application by creating a new memory object. This function can be called after a successful kRecRecognize call. It contains both the foreground and background colors of the letters. The LETTER structure has indices into this array for foreground and background colors (LETTER::ndxFGColor, LETTER::ndxBGColor).

Parameters:
[in]  hPage  Handle of the page whose recognized data should be accessed.
[out]  ppColours  Address of a pointer variable to get the address of the palette array.
[out]  pNum  Pointer to a variable to hold the number of colors in palette.
Return values:
RECERR
Note:
The palette can contain the special REC_COLOR values REC_DEFAULT_COLOR and REC_UNDEF_COLOR. A background color can be either of them; both mean white. A foreground color can be REC_DEFAULT_COLOR, which means black.
Since this function creates a new memory object, the application should call the kRecFree function to free this memory area after evaluating the result.
The specification of this function in C# is:
 RECERR kRecGetLetterPalette(IntPtr hPage, out uint[] ppColours); 
RECERR RECAPIKRN kRecGetLetters ( HPAGE  hPage,
IMAGEINDEX  iiImage,
LPLETTER *  ppLetter,
LPLONG  pLettersLength 
)

Getting recognition result.

The kRecGetLetters function makes the recognition data belonging to the hPage page available to the application by creating a new memory object containing the recognized data. This function can be called after a successful kRecRecognize call. The recognized data is available as an array of LETTER structures.

Parameters:
[in]  hPage  Handle of the page whose recognized data should be accessed.
[in]  iiImage  Index of the image in the page in whose coordinate system the coordinates are to be given.
[out]  ppLetter  Address of a pointer variable to get the address of the recognized characters.
[out]  pLettersLength  Pointer to a variable to hold the number of recognized characters.
Return values:
RECERR
Note:
Since this function creates a new memory object containing the recognized data, the application should call the kRecFree function to free this memory area after evaluating the result.
The specification of this function in C# is:
 RECERR kRecGetLetters(IntPtr hPage, IMAGEINDEX iiImage, out LETTER[] ppLetter); 
RECERR RECAPIKRN kRecGetSuggestionStr ( HPAGE  hPage,
WCHAR **  ppSuggestions,
LPLONG  pLength 
)

Getting suggestions.

The kRecGetSuggestionStr function makes the word suggestions data belonging to the hPage page available to the application by creating a new memory object. This function can be called after a successful kRecRecognize call. The retrieved data is available as an array of WCHAR elements. For more about its internal structure see the usage of alternatives. The first LETTER of a word contains the number of its suggestions and the index of its first suggestion in this array (LETTER::cntSuggestions, LETTER::ndxSuggestions).

Parameters:
[in]  hPage  Handle of the page whose recognized data should be accessed.
[out]  ppSuggestions  Address of a pointer variable to get the array of the recognized suggestions.
[out]  pLength  Pointer to a variable to hold the length of recognized suggestions.
Return values:
RECERR
Note:
Since this function creates a new memory object, the application should call the kRecFree function to free this memory area after evaluating the result.
If the letter is a space, it does not have suggestions, but only space info (see LETTER::spcInfo and LSPC).
The specification of this function in C# is:
 RECERR kRecGetSuggestionStr(IntPtr hPage, out char[] ppSuggestions); 
RECERR RECAPIKRN kRecSetLetters ( LETTERSTRENGTH  towhere,
HPAGE  hPage,
IMAGEINDEX  iiImage,
LPCLETTER  pLetter,
LONG  LettersLength 
)

Putting a letter buffer onto the input of the PLUS2W and PLUS3W engines or the selected output converter.

This function can affect the recognition results and/or the content of the output file. The PLUS modules are voting engines combining results of two or three other OCR engines. The voting method of RM_OMNIFONT_PLUS2W has strong and medium inputs, RM_OMNIFONT_PLUS3W uses an additional weak one as well. You can replace one input (parameter towhere) with your alternative engine result by calling the function kRecSetLetters. The voting method uses your letter buffer as it generates the final OCR result. Stronger input may have greater effect on the recognition result, so you should consider which level you select for your letter buffer.

When the letter buffer is passed on the LTS_FINAL level, the OCR method does not run: on this level kRecSetLetters works as in previous versions of CSDK, i.e. the letters are given directly to the input of the selected output converter.

Parameters:
[in]  towhere  This parameter specifies one of the three possible inputs of the Voting Engine, on which the engine receives the letter buffer.
[in]  hPage  Handle of the HPAGE the Voting Engine works on.
[in]  iiImage  Index of the image in the page whose coordinate system you have used in defining the boundary box for LETTER.
[in]  pLetter  The letter buffer to be given to the engine.
[in]  LettersLength  Size of the letter buffer.
Return values:
RECERR
Note:
After putting letters on the selected levels (even on LTS_FINAL), you should call kRecRecognize.
More than one input of the PLUS engines can be replaced by subsequent calls to kRecSetLetters. Even in such a case, kRecRecognize should be called only once.
The following fields of input LETTERs are unused during kRecRecognize: cntChoices, ndxChoices, cntSuggestions, ndxSuggestions, reserved_b, ndxFGColor, ndxBGColor, ndxFontFace, ndxExt and OCRENGINE bits (RH_OCRENGINE_MASK) of info. These fields are cleared and the original order of LETTERs may be altered after using this function.
The specification of this function in C# is:
 RECERR kRecSetLetters(LETTERSTRENGTH towhere, IntPtr hPage, IMAGEINDEX iiImg, LETTER[] lpLetter);