RecAPI
Zone Handling Module

Zone handling tools.

Classes

struct  ZONE
 ZONE structure.
struct  ZONEDATA
 ZONEDATA structure.

Modules

 Table Recognition Module
 

Table detection and working with tables.


Typedefs

typedef ZONE *  LPZONE
 Pointer to a structure ZONE.
typedef const ZONE *  LPCZONE
 Const pointer to a structure ZONE.
typedef ZONEDATA *  LPZONEDATA
 Pointer to a structure ZONEDATA.
typedef const ZONEDATA *  LPCZONEDATA
 Const pointer to a structure ZONEDATA.

Enumerations

enum  FILLINGMETHOD {
  FM_DEFAULT = 0,
  FM_OMNIFONT,
  FM_DRAFTDOT9,
  FM_BARCODE,
  FM_OMR,
  FM_HANDPRINT,
  FM_BRAILLE,
  FM_DRAFTDOT24,
  FM_OCRA,
  FM_OCRB,
  FM_MICR,
  FM_BARCODE2D,
  FM_DOTDIGIT,
  FM_DASHDIGIT,
  FM_RESERVED_2,
  FM_CMC7,
  FM_NO_OCR,
  FM_SIZE
}
 Filling methods.
enum  RECOGNITIONMODULE {
  RM_AUTO = 0,
  RM_OMNIFONT_MTX,
  RM_OMNIFONT_MOR,
  RM_DOT,
  RM_BAR,
  RM_OMR,
  RM_HNR,
  RM_RER,
  RM_BRA,
  RM_MAT,
  RM_RESERVED_P,
  RM_OMNIFONT_PLUS2W,
  RM_OMNIFONT_FRX,
  RM_OMNIFONT_PLUS3W,
  RM_ASIAN,
  RM_RESERVED_M,
  RM_RESERVED_A,
  RM_SIZE
}
 Recognition modules (Engines).
enum  ZONETYPE {
  WT_FLOW,
  WT_TABLE,
  WT_GRAPHIC,
  WT_AUTO,
  WT_IGNORE,
  WT_FORM,
  WT_VERTTEXT,
  WT_LEFTTEXT,
  WT_RIGHTTEXT
}
 Basic zone types.
enum  IMG_DECOMP {
  DCM_AUTO,
  DCM_LEGACY,
  DCM_STANDARD,
  DCM_FAST
}
 Page parse method.

Functions

RECERR RECAPIKRN kRecSetDecompMethod (int sid, IMG_DECOMP Algorithm)
 Setting the decomp method.
RECERR RECAPIKRN kRecGetDecompMethod (int sid, IMG_DECOMP *pAlgorithm)
 Getting the decomp method.
RECERR RECAPIKRN kRecSetNongriddedTableDetect (int sid, INTBOOL bEnable)
 Setting the non-gridded table detection.
RECERR RECAPIKRN kRecGetNongriddedTableDetect (int sid, INTBOOL *bEnable)
 Getting the non-gridded table detection setting.
RECERR RECAPIKRN kRecSetForceSingleColumn (int sid, INTBOOL bForceSingle)
 Specify the Force Single Column mode.
RECERR RECAPIKRN kRecGetForceSingleColumn (int sid, INTBOOL *pbForceSingle)
 Getting the setting of Force Single Column mode.
RECERR RECAPIKRN kRecLocateZones (int sid, HPAGE hPage)
 Page parsing.
RECERR RECAPIKRN kRecSetPageDescription (int sid, DWORD PageDesc)
 Setting page description data.
RECERR RECAPIKRN kRecGetPageDescription (int sid, DWORD *pPageDesc)
 Getting page description data.
RECERR RECAPIKRN kRecGetZoneCount (HPAGE hPage, int *pnZones)
 Getting the user zone count.
RECERR RECAPIKRN kRecGetZoneInfo (HPAGE hPage, IMAGEINDEX iiImg, LPZONE pZone, int nZone)
 Getting user zone information.
RECERR RECAPIKRN kRecGetZoneLayout (HPAGE hPage, IMAGEINDEX iiImg, LPRECT *ppRects, int *pnRects, int iZone)
 Getting user zone shape information.
RECERR RECAPIKRN kRecGetZoneNodeArray (HPAGE hPage, IMAGEINDEX iiImg, LPPOINT *ppPoints, int *pnNodes, int iZone)
 Getting the polygon of the user zone.
RECERR RECAPIKRN kRecDeleteAllZones (HPAGE hPage)
 Deleting all user zones.
RECERR RECAPIKRN kRecDeleteZone (HPAGE hPage, int nZone)
 Deleting a user zone.
void RECAPIKRN kRecInitZone (LPZONE pZone)
 Initializing a ZONE variable.
RECERR RECAPIKRN kRecInsertZone (HPAGE hPage, IMAGEINDEX iiImg, LPCZONE pZone, int nZone)
 Inserting a user zone.
RECERR RECAPIKRN kRecAddZoneRect (HPAGE hPage, IMAGEINDEX iiImg, const RECT *pRect, int nZone)
 Adding a rectangle to a user zone.
RECERR RECAPIKRN kRecSubZoneRect (HPAGE hPage, IMAGEINDEX iiImg, const RECT *pRect, int nZone)
 Subtracting a rectangle from a user zone.
RECERR RECAPIKRN kRecCopyOCRZones (HPAGE hPage)
 Copying the OCR zone list to a user zone list.
RECERR RECAPIKRN kRecLoadZones (HPAGE hPage, LPCTSTR pFileName)
 Loading user zones.
RECERR RECAPIKRN kRecSaveZones (HPAGE hPage, LPCTSTR pFileName)
 Saving the user zone list.
RECERR RECAPIKRN kRecUpdateZone (HPAGE hPage, IMAGEINDEX iiImg, LPCZONE pZone, int nZone)
 Updating a user zone.
RECERR RECAPIKRN kRecSetZoneLayout (HPAGE hPage, IMAGEINDEX iiImg, LPCRECT pRects, int nRects, int nZone)
 Updating the user zone shape information.
RECERR RECAPIKRN kRecGetOCRZoneCount (HPAGE hPage, int *pnOCRZones)
 Getting the OCR zone count.
RECERR RECAPIKRN kRecGetOCRZoneInfo (HPAGE hPage, IMAGEINDEX iiImg, LPZONE pOCRZone, int nOCRZone)
 Getting OCR zone information.
RECERR RECAPIKRN kRecGetOCRZoneData (HPAGE hPage, IMAGEINDEX iiImg, LPZONEDATA pOCRZoneData, int nOCRZone)
 Getting additional information about OCR zones.
RECERR RECAPIKRN kRecGetOCRZoneLayout (HPAGE hPage, IMAGEINDEX iiImg, LPRECT *ppRects, int *pnRects, int nZone)
 Getting OCR zone shape information.
RECERR RECAPIKRN kRecGetOCRZoneNodeArray (HPAGE hPage, IMAGEINDEX iiImg, LPPOINT *ppPoints, int *pnNodes, int iZone)
 Getting the polygon of the OCR zone.
RECERR RECAPIKRN kRecSaveOCRZones (HPAGE hPage, LPCTSTR pFileName)
 Saving the OCR zone list.
RECERR RECAPIKRN kRecUpdateOCRZone (HPAGE hPage, IMAGEINDEX iiImg, LPCZONE pZone, int nZone)
 Updating the OCR zone.

Bitmasks of checking control

Defining spell checking behavior by zones. See ZONE::chk_control.

#define CHK_LANGDICT_PROHIBIT   0x00000001
 Prohibit the use of the Language dictionary.
#define CHK_USERDICT_PROHIBIT   0x00000002
 Prohibit the use of the user dictionary.
#define CHK_CHECKCBF_PROHIBIT   0x00000004
 Deprecated.
#define CHK_VERTDICT_PROHIBIT   0x00000008
 Prohibit the use of the Vertical dictionary.
#define CHK_IGNORE_WHITESPACE   0x00000010
 Ignore white space characters (SPACE and TAB characters) during checking. This flag should be used together with the CHK_PASS_LINES flag.
#define CHK_IGNORE_CASE   0x00000020
 Case-insensitive UD-checking.
#define CHK_PASS_LINES   0x00000040
 Instructs the selected RECOGNITIONMODULE to pass entire lines to the checker, instead of words. Do not use this attribute in conjunction with spell checking.
#define CHK_CORRECTION_DISABLED   0x00000080
 Retained only for compatibility.
#define CHK_INCLUDE_PUNCTUATION   0x00000100
 Checking will consider punctuation characters on the boundaries of the strings as well.
#define CHK_CORRECT_PROPERNAMES   0x00000200
 Retained only for compatibility.
#define CHK_LANGDICT_USED   0x00010000
 "After recognition flag": the Language dictionary was enabled during the checking process (spell checking was activated for the zone).
#define CHK_USERDICT_USED   0x00020000
 "After recognition flag": the user dictionary was enabled during the checking process (UD-checking was activated for the zone).
#define CHK_CHECKCBF_USED   0x00040000
 Deprecated.
#define CHK_VERTDICT_USED   0x00080000
 "After recognition flag": a Vertical dictionary was enabled during the checking process.
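
The flags above are plain bitmasks, so they can be combined and tested with bitwise operators. The following is a minimal sketch in C, not SDK code: the macro values are copied from the list above, while chk_control_is_consistent and chk_after_recognition_flags are hypothetical helper names introduced only for illustration. The consistency rule encoded here is the one stated for CHK_IGNORE_WHITESPACE (it should be used together with CHK_PASS_LINES).

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the bitmask list above. */
#define CHK_LANGDICT_PROHIBIT 0x00000001
#define CHK_USERDICT_PROHIBIT 0x00000002
#define CHK_IGNORE_WHITESPACE 0x00000010
#define CHK_IGNORE_CASE       0x00000020
#define CHK_PASS_LINES        0x00000040
#define CHK_LANGDICT_USED     0x00010000
#define CHK_USERDICT_USED     0x00020000
#define CHK_CHECKCBF_USED     0x00040000
#define CHK_VERTDICT_USED     0x00080000

/* Hypothetical helper: CHK_IGNORE_WHITESPACE is only meaningful
   together with CHK_PASS_LINES. */
bool chk_control_is_consistent(uint32_t chk)
{
    if ((chk & CHK_IGNORE_WHITESPACE) && !(chk & CHK_PASS_LINES))
        return false;
    return true;
}

/* Hypothetical helper: keeps only the "after recognition" flags. */
uint32_t chk_after_recognition_flags(uint32_t chk)
{
    return chk & (CHK_LANGDICT_USED | CHK_USERDICT_USED |
                  CHK_CHECKCBF_USED | CHK_VERTDICT_USED);
}
```

A value such as CHK_LANGDICT_PROHIBIT | CHK_PASS_LINES | CHK_IGNORE_WHITESPACE passes the check, while CHK_IGNORE_WHITESPACE on its own does not.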

Page Descriptor defines

Defining behavior of auto-zoning outside user zones. See the usage of page descriptor.

#define LZ_COLUMN_MASK   0x000000ff
 This can be used for masking the LZ_COLUMN flag.
#define LZ_COLUMN_NO   0x00000001
 This does not find text zones on the page.
#define LZ_COLUMN_ONE   0x00000002
 The page contains one column (single column mode).
#define LZ_COLUMN_AUTO   0x00000004
 This finds text zones on the page automatically.
#define LZ_COLUMN_FIND   0x00000008
 Internal use only.
#define LZ_TABLE_MASK   0x0000ff00
 This can be used for masking the LZ_TABLE flag.
#define LZ_TABLE_NO   0x00000100
 This does not find tables on the page.
#define LZ_TABLE_ONE   0x00000200
 The whole page is one table.
#define LZ_TABLE_AUTO   0x00000400
 This finds table zones automatically.
#define LZ_GRAPHICS_MASK   0x00ff0000
 This can be used for masking the LZ_GRAPHICS flag.
#define LZ_GRAPHICS_NO   0x00010000
 This does not find graphics on the page.
#define LZ_GRAPHICS_ONE   0x00020000
 The whole page is one graphic.
#define LZ_GRAPHICS_AUTO   0x00040000
 This finds graphic zones automatically.
#define LZ_FORM   0x01000000
 The page contains an unfilled form. Do not create any user zones if you use LZ_FORM! See Form Recognition Module as well.
#define LZ_FREEFORM   0x02000000
 This can be used for recognition of free forms. This is when a page contains a filled, gridded form, and the best possible OCR accuracy is desired, without creating formatted output. In this case the gridded form is decomposed into smaller text zones optimized for OCR. The zones are not sorted by reading order. It cannot be combined with LZ_FORM. It is used only by DCM_STANDARD method, and not used by the Asian auto-zoning algorithms.

Detailed Description

Zone handling tools.

The zone is a rectangular area or the union of specifically located rectangular areas in the page. The upper limit of its dimensions is full page size. It also contains a feature of interest to the user.

The union of rectangles must have a so-called pizzabox shape: the top of each rectangle in the union must touch the bottom of the rectangle above it (i.e. the bottom of the upper one and the top of the lower one are exactly the same). A rectangle can touch at most one rectangle above and one below.
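
The pizzabox rule can be checked mechanically. The following is a minimal sketch in C, not SDK code: the Rect type is a stand-in for the SDK's RECT, and is_pizzabox is a hypothetical helper. It assumes the rectangles are already listed from top to bottom, and it further assumes that "touching" means the two rectangles share a horizontal edge segment of non-zero length.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the SDK's RECT (assumption: grid coordinates). */
typedef struct {
    int left, top, right, bottom;
} Rect;

/* Hypothetical helper: returns true if rects[0..n-1], listed from top
   to bottom, form a pizzabox shape. */
bool is_pizzabox(const Rect *rects, size_t n)
{
    if (rects == NULL || n == 0)
        return false;
    for (size_t i = 0; i < n; ++i) {
        /* Every rectangle must be non-empty. */
        if (rects[i].left >= rects[i].right ||
            rects[i].top >= rects[i].bottom)
            return false;
        if (i == 0)
            continue;
        /* The top of this rectangle must equal the bottom of the one above. */
        if (rects[i].top != rects[i - 1].bottom)
            return false;
        /* Assumption: the shared edge must have non-zero length. */
        if (rects[i].left >= rects[i - 1].right ||
            rects[i - 1].left >= rects[i].right)
            return false;
    }
    return true;
}
```

Because the rectangles form a single top-to-bottom chain, each one touches at most one rectangle above and one below, as the rule requires.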

Zones that cannot have a pizzabox shape include:

A pizzabox-shaped zone is a compound and irregular zone.
The image data covered by each zone is handled and processed (typically recognized) separately, according to zone-specific parameters.

NOTE: In both the SDK and its documentation coordinates refer to grid-coordinates - i.e. the top or left borders of pixels. Thus a rectangle does not contain the pixels according to its right and bottom coordinates.
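
The grid-coordinate convention means that right and bottom are exclusive bounds. A minimal sketch in C of what this implies for size and containment calculations follows; GridRect and the three helper functions are hypothetical names used only for illustration and are not part of the SDK.

```c
#include <stdbool.h>

/* Stand-in for a rectangle given in grid coordinates. */
typedef struct {
    int left, top, right, bottom;
} GridRect;

/* Width and height in pixels: no "+ 1", because right and bottom
   name the grid line after the last covered pixel. */
int grid_rect_width(GridRect r)  { return r.right - r.left; }
int grid_rect_height(GridRect r) { return r.bottom - r.top; }

/* A pixel (x, y) is covered if it lies left of 'right' and above 'bottom'. */
bool grid_rect_contains_pixel(GridRect r, int x, int y)
{
    return x >= r.left && x < r.right && y >= r.top && y < r.bottom;
}
```

For a rectangle with left = 0, top = 0, right = 10, bottom = 5, the covered pixels run from column 0 to column 9 and from row 0 to row 4.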

Any HPAGE can contain two types of zones in zone lists: user zones and OCR zones.

The user zones are defined by the user. The OCR zones are created by the page parser process, which detects the OCR zones and fills the OCR zone list. When there are user zones, the page parser creates one or more OCR zones from each user zone and it may process the area outside of user zones, as controlled by the page descriptor (see below).

IMPORTANT: The CSDK does not support overlapping non-graphical user zones. A graphical user zone (WT_GRAPHIC) can overlap non-graphical ones. Furthermore, the auto zoning algorithm may create graphical OCR zones overlapping non-graphical ones.

The type of an OCR zone can never be WT_AUTO or WT_IGNORE. The created OCR zones always inherit the attributes (e.g. filter, filling method, etc.; see ZONE) of the user zone inside which they were created. If an OCR zone is created outside user zones, its filling method, recognition module, filters and spell-checking related properties are set to their defaults.
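
As a small illustration of the type restriction, the sketch below (plain C, not SDK code) copies the ZONETYPE enumerators from this module and adds a hypothetical predicate, zone_type_allowed_for_ocr_zone, that encodes only the rule stated in this paragraph.

```c
#include <stdbool.h>

/* Enumerators copied from the ZONETYPE enumeration of this module. */
typedef enum {
    WT_FLOW,
    WT_TABLE,
    WT_GRAPHIC,
    WT_AUTO,
    WT_IGNORE,
    WT_FORM,
    WT_VERTTEXT,
    WT_LEFTTEXT,
    WT_RIGHTTEXT
} ZONETYPE;

/* Hypothetical helper: an OCR zone can never be WT_AUTO or WT_IGNORE. */
bool zone_type_allowed_for_ocr_zone(ZONETYPE type)
{
    return type != WT_AUTO && type != WT_IGNORE;
}
```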

The recognition process (kRecRecognize) works on OCR zones.

The number of zones in the zone lists can be queried at any time using the functions kRecGetZoneCount and kRecGetOCRZoneCount. All functions that use an index to determine the zone to be queried or modified may receive the index -1, which refers to the last zone in the given zone list. The exception is kRecInsertZone: pass -1 to have the new zone inserted at the end of the zone list. From then on, the value -1 refers to this inserted zone.
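
The index convention can be modelled in a few lines. This is a minimal sketch in C of how the -1 index could be interpreted, not the SDK's actual implementation: ZONE_INDEX_INVALID, resolve_zone_index and resolve_insert_index are hypothetical names, and the accepted range for insertion (0 to the current zone count) is an assumption.

```c
/* Hypothetical sentinel for an index that does not name a zone. */
#define ZONE_INDEX_INVALID (-2)

/* Query, update and delete style calls: -1 means the last zone. */
int resolve_zone_index(int index, int zone_count)
{
    if (index == -1)
        return zone_count > 0 ? zone_count - 1 : ZONE_INDEX_INVALID;
    if (index < 0 || index >= zone_count)
        return ZONE_INDEX_INVALID;
    return index;
}

/* Insertion: -1 means "append", i.e. the position after the last zone. */
int resolve_insert_index(int index, int zone_count)
{
    if (index == -1)
        return zone_count;
    /* Assumption: any position from 0 to zone_count is acceptable. */
    if (index < 0 || index > zone_count)
        return ZONE_INDEX_INVALID;
    return index;
}
```

With three zones in the list, inserting at -1 places the new zone at position 3; once the list holds four zones, -1 in a query resolves to that same position.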

Zones can be added to the appropriate zone list of any given HPAGE in three different ways:

  1. add zones automatically (auto-zoning) - use it to add OCR zones
  2. add zones manually (by specifying the zone coordinates and attributes) - use it to add user zones, or
  3. add zones from a zone file (a storage file) - use it to add user zones.

Auto-zoning

Automatic page-layout decomposition process (auto-zoning) can be activated directly by calling the kRecLocateZones function for finding text blocks on the image. It creates an entire OCR zone list for the given HPAGE.

OmniPage Capture SDK v20 offers three different algorithms to be applied during auto-zoning: use the kRecSetDecompMethod function to specify the Page parser algorithm to be applied during auto-zoning. For details, see also IMG_DECOMP.

When you use auto-zoning, each resulting zone is initialized with

All zones created by this function will have

If auto-zoning uses the method DCM_STANDARD, the process will also attempt finding horizontal and vertical rule lines. If there are user zones, auto-zoning searches for rule lines in WT_AUTO and WT_TABLE zones and also outside the user zones - when the page descriptor allows it (see Page Descriptor defines). If there are no user zones, rule line detection is performed. Rule lines are stored in the page, in a line list. The recognition process (PID_RECOGNITION1 et al.) modifies this line list and it retains only the lines that are outside the OCR zones. After zoning, the line list does not contain dotted, dashed or double style lines (RLSTYLE). This information only becomes available after the recognition process.

OCR zones may be changed by the recognition process (kRecRecognize), because some post-processing operations have such effects. One example is non-gridded table detection (kRecSetNongriddedTableDetect), which runs during the recognition process.

Any zone can be locally overridden with the functions kRecUpdateZone and kRecUpdateOCRZone. These allow you to change the attributes of a zone in the zone list. Note that the fields ZONE::rectBBox and ZONE::type cannot be modified by kRecUpdateOCRZone.

Adding zones manually

You can choose to search for zones automatically, and/or create your own zones: user zones. To add simple zones to the zone list manually, use the kRecInsertZone function. To add a rectangle to or subtract a rectangle from an existing user zone, use the functions kRecAddZoneRect or kRecSubZoneRect.

Adding zones from a zone file

The third way of creating zones is to have zones read from a file (called a zone file, or in OmniPage terminology, a zone template file) that contains the attributes of previously saved zones. Zones created this way will also be user zones. An integrating application can save the current user zone definitions to a zone file any time with the kRecSaveZones function. The application can load them from a zone file with the kRecLoadZones function.

NOTE: When a zone file is loaded, any previous zones are removed from the page.

If the application calls the kRecRecognize recognizing function on a page with an empty zone list, the page-layout decomposition function is called automatically.

It is recommended to create homogeneous user zones as much as possible, because they may give better results. It is especially important in the case of Asian languages (either CCJK, Arabic, Thai or Hebrew). WT_AUTO zones can be inhomogeneous.

To get information about any particular zone in the image zone list, invoke the kRecGetZoneInfo and kRecGetOCRZoneInfo functions. These functions are useful to find out more about zones created by auto-zoning.

NOTE: When you update a table-type zone with the kRecUpdateZone function, the cell-detection algorithm will not be activated, resulting in improper table detection within the zone. See the description of creation of table information.

Any changes in user zone list (kRecInsertZone, kRecDeleteZone, kRecDeleteAllZones, kRecAddZoneRect, kRecSubZoneRect, kRecLoadZones, kRecUpdateZone, kRecSetZoneLayout) will make OCR zones invalid; the OCR zone list will be emptied and regenerated.

Usage of Page Descriptor

The page description describes the possible layout elements (text, table, graphics and form) on the page outside of the user zones. These layout elements are found by the page-parse and the recognition processes. The page description has no effect inside the user zones. The LZ_COLUMN / LZ_TABLE / LZ_GRAPHICS flags specify how to find text / table / graphic zones.

A valid page description is either a combination of the LZ_COLUMN_column, LZ_TABLE_table and LZ_GRAPHICS_graphics flags (where column, table and graphics can each be NO, ONE or AUTO) or the LZ_FORM flag alone (LZ_FORM cannot be combined with other flags). The default page description is LZ_COLUMN_NO | LZ_TABLE_NO | LZ_GRAPHICS_NO. This means that page-parse does not create OCR zones outside of the user zones. If no user zones were specified, LZ_COLUMN_NO | LZ_TABLE_NO | LZ_GRAPHICS_NO is equivalent to LZ_COLUMN_AUTO | LZ_TABLE_AUTO | LZ_GRAPHICS_AUTO.

IMPORTANT NOTE: if the page descriptor is set to LZ_FORM, there must not be any zones on the page.

If LZ_TABLE_ONE is set, LZ_COLUMN_column and LZ_GRAPHICS_graphics are not considered. If LZ_GRAPHICS_ONE is set, LZ_COLUMN_column and LZ_TABLE_table are not considered. If both LZ_TABLE_ONE and LZ_GRAPHICS_ONE are set, zoning works as if only LZ_TABLE_ONE were set.
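
The combination rules above can be expressed as a small validity check. The following is a minimal sketch in C under stated assumptions, not the SDK's own validation: the macro values are copied from the Page Descriptor defines, uint32_t stands in for DWORD, and page_desc_is_valid and page_desc_dominant_one are hypothetical helpers. The sketch assumes that exactly one of NO, ONE or AUTO must be present in each of the three groups; the SDK may be more lenient.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the Page Descriptor defines. */
#define LZ_COLUMN_MASK   0x000000ff
#define LZ_COLUMN_NO     0x00000001
#define LZ_COLUMN_ONE    0x00000002
#define LZ_COLUMN_AUTO   0x00000004
#define LZ_TABLE_MASK    0x0000ff00
#define LZ_TABLE_NO      0x00000100
#define LZ_TABLE_ONE     0x00000200
#define LZ_TABLE_AUTO    0x00000400
#define LZ_GRAPHICS_MASK 0x00ff0000
#define LZ_GRAPHICS_NO   0x00010000
#define LZ_GRAPHICS_ONE  0x00020000
#define LZ_GRAPHICS_AUTO 0x00040000
#define LZ_FORM          0x01000000
#define LZ_FREEFORM      0x02000000

static bool is_one_of(uint32_t v, uint32_t a, uint32_t b, uint32_t c)
{
    return v == a || v == b || v == c;
}

/* Hypothetical helper: checks the documented combination rules. */
bool page_desc_is_valid(uint32_t desc)
{
    /* LZ_FORM cannot be combined with any other flag. */
    if (desc & LZ_FORM)
        return desc == LZ_FORM;
    /* Assumption: each group carries exactly one of NO / ONE / AUTO. */
    return is_one_of(desc & LZ_COLUMN_MASK,
                     LZ_COLUMN_NO, LZ_COLUMN_ONE, LZ_COLUMN_AUTO) &&
           is_one_of(desc & LZ_TABLE_MASK,
                     LZ_TABLE_NO, LZ_TABLE_ONE, LZ_TABLE_AUTO) &&
           is_one_of(desc & LZ_GRAPHICS_MASK,
                     LZ_GRAPHICS_NO, LZ_GRAPHICS_ONE, LZ_GRAPHICS_AUTO);
}

/* Hypothetical helper: which whole-page flag takes effect, or 0 if none.
   LZ_TABLE_ONE wins over LZ_GRAPHICS_ONE, as described above. */
uint32_t page_desc_dominant_one(uint32_t desc)
{
    if ((desc & LZ_TABLE_MASK) == LZ_TABLE_ONE)
        return LZ_TABLE_ONE;
    if ((desc & LZ_GRAPHICS_MASK) == LZ_GRAPHICS_ONE)
        return LZ_GRAPHICS_ONE;
    return 0;
}
```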

The DCM_LEGACY and DCM_FAST mode zoning can handle only the following cases:

any other cases cause an error (API_ERROR_ERR).

The page descriptor can be specified with the function kRecSetPageDescription.

Other related information

Table detection and handling by Table Recognition Module are also parts of this module.

The behavior of the Zone Handling Module can also be adjusted in some respects through settings.

Detection of filling method of zones

The filling method of zones can be detected manually by calling the function kRecDetectFillingMethod. It works on zones with FM_DEFAULT. If the default filling method (kRecSetDefaultFillingMethod) is set to FM_DEFAULT, filling method detection is called automatically at the beginning of the recognition process.

If filling method detection cannot determine a type in a given zone, it leaves FM_DEFAULT in the field fm of the zone.

During recognition if both the default filling method and the field ZONE::fm are FM_DEFAULT, the engine supposes FM_OMNIFONT filling method for such zones.
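
The fallback order described in this section can be summarized in code. This is a minimal sketch in C, not SDK code: the enumerators are copied from the FILLINGMETHOD enumeration of this module, and effective_filling_method is a hypothetical helper that models the state after any automatic detection has already run.

```c
/* Enumerators copied from the FILLINGMETHOD enumeration of this module. */
typedef enum {
    FM_DEFAULT = 0,
    FM_OMNIFONT,
    FM_DRAFTDOT9,
    FM_BARCODE,
    FM_OMR,
    FM_HANDPRINT,
    FM_BRAILLE,
    FM_DRAFTDOT24,
    FM_OCRA,
    FM_OCRB,
    FM_MICR,
    FM_BARCODE2D,
    FM_DOTDIGIT,
    FM_DASHDIGIT,
    FM_RESERVED_2,
    FM_CMC7,
    FM_NO_OCR,
    FM_SIZE
} FILLINGMETHOD;

/* Hypothetical helper: the filling method used for a zone at recognition. */
FILLINGMETHOD effective_filling_method(FILLINGMETHOD zone_fm,
                                       FILLINGMETHOD default_fm)
{
    if (zone_fm != FM_DEFAULT)
        return zone_fm;      /* explicitly set, or filled in by detection */
    if (default_fm != FM_DEFAULT)
        return default_fm;   /* value set via kRecSetDefaultFillingMethod */
    return FM_OMNIFONT;      /* both are FM_DEFAULT */
}
```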

Vertical text support

For Western languages, including Greek and languages using the Cyrillic alphabet, automatic detection of left or right rotated vertical text is available, including detection inside table cells.

Auto-detected vertical text zones outside tables take the flags WT_LEFTTEXT or WT_RIGHTTEXT while table cells detected as containing vertical text now include this in the CELL_INFO data.

This auto-detection runs on images with no inserted user zones, or on page portions designated for auto-zoning. Text direction can be forced by inserting user zones into page images containing Western or Cyrillic languages, using the following flags: normal (WT_FLOW), left rotated vertical (WT_LEFTTEXT) or right rotated vertical (WT_RIGHTTEXT). Vertical text user zones must be rectangular; they can be placed anywhere on the page and can cover multi-line texts. To force vertical text handling in a table cell, the required flag should be set in the new text type field inside CELL_INFO.

Automatic left and right text detection can be disabled by switching the Kernel.Decomp.FindRotatedText setting off. Switch this setting off if the processed document does not contain vertical text, because the vertical text detection (and recognition) may increase processing time.

Vertical text can also be auto-detected for CCJK languages. Alternatively, it can be explicitly set by inserting zones with the flags WT_FLOW for horizontal left-to-right text or WT_VERTTEXT for vertical text with top-to-bottom character flow and right-to-left line flow. As for Western languages, CCJK vertical text zones must be rectangular. If an irregular zone is changed to vertical text content, its shape snaps to a bounding rectangle, removing the irregularity. It is not possible to pass text direction information for table cells to the Asian OCR module – it will perform auto-detection.

In the recognition results, the LETTER structure makeup field contains two additional bits to store the text direction. See also which output converters and formatting levels of the RecAPIPlus support the different types of vertical texts.


Enumeration Type Documentation

enum FILLINGMETHOD

Filling methods.

This enumerates the possible content types of the zones from the Engine's perspective. Each zone must have one of the filling methods listed here. It can be done by specifying the fm field of the zones defined on the image.

Note:
Each zone needs to be associated with a RECOGNITIONMODULE (RM) and a FILLINGMETHOD (FM). This is needed because some RMs support more than one FM, and some FMs are accepted by more than one RM. It is important to ensure that the RM-FM pair is suitable. For possible RM-FM pairs see Filling method - Recognition module combinations.
Enumerator:
FM_DEFAULT 

The default zone filling method. The actual zone filling method for all zones of this type will be inquired just before recognition, according to the setting previously specified with a separate call to the kRecSetDefaultFillingMethod function.

FM_OMNIFONT 

The omnifont zone filling method. It denotes a machine printed text with any typeface not highly stylized. All platforms.

FM_DRAFTDOT9 

The 9-pin draft dot-matrix zone filling method. It denotes a 9-pin draft dot-matrix printout. Supported on: Windows.

FM_BARCODE 

The 1D barcode zone filling method. It denotes a one-dimensional barcode within the zone.

FM_OMR 

The optical mark zone filling method. It denotes a zone with one or more checkboxes that are judged to be marked or unmarked.

FM_HANDPRINT 

The hand-written zone filling method. It denotes hand-written text within the zone. Supported on: Windows, Mac OS X.

FM_BRAILLE 

This filling method is NOT available.

FM_DRAFTDOT24 

The 24-pin draft dot-matrix zone filling method. It denotes a 24-pin draft dot-matrix printout. All platforms.

FM_OCRA 

The OCR-A zone filling method.

FM_OCRB 

The OCR-B zone filling method.

FM_MICR 

The magnetic ink character filling method. Supported on: Windows, Mac OS X.

FM_BARCODE2D 

The 2D barcode zone filling method.

FM_DOTDIGIT 

The dot-digit zone filling method. Supported on: Windows.

FM_DASHDIGIT 

The dash-digit zone filling method. Supported on: Windows.

FM_RESERVED_2 

Internal use only.

FM_CMC7 

The CMC7 font zone filling method. Supported on: Windows, Mac OS X.

FM_NO_OCR 

No recognition will be attempted.

FM_SIZE 

Number of zone filling methods.

enum IMG_DECOMP

Page parse method.

This enum lists the possible values of the Page parser algorithm settings of the Engine. This setting makes it possible to specify one of the three different page parser algorithms for Latin-alphabet languages, or one of the two different algorithms for CCJK languages. In the latter case DCM_LEGACY and DCM_FAST are the same. This setting has no effect for Arabic, Thai and Hebrew OCR.

Enumerator:
DCM_AUTO 

Automatic. The Engine determines at runtime which algorithm will be applied. This depends on the trade-off setting: DCM_LEGACY algorithm is applied when trade-off is TO_FAST, otherwise DCM_STANDARD is applied (Default.)

DCM_LEGACY 

Simple page decomposition. This is faster but less accurate than Standard.

DCM_STANDARD 

The most accurate page parsing module.

DCM_FAST 

For non-Asian languages this algorithm can be treated as a simplified variant of DCM_LEGACY: this algorithm does not try to find tables on the page. It is the fastest (but least accurate) parsing algorithm available. Use it only for simple page parsing tasks! For CCJK languages this is the same as DCM_LEGACY.
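
The DCM_AUTO rule can be sketched as follows. This is an illustrative C fragment, not SDK code: the IMG_DECOMP enumerators are copied from this module, resolve_decomp_method is a hypothetical helper, and the trade-off setting is reduced to a single boolean (true when the trade-off is TO_FAST) to avoid guessing at the other trade-off values.

```c
#include <stdbool.h>

/* Enumerators copied from the IMG_DECOMP enumeration of this module. */
typedef enum {
    DCM_AUTO,
    DCM_LEGACY,
    DCM_STANDARD,
    DCM_FAST
} IMG_DECOMP;

/* Hypothetical helper: which algorithm runs for a requested setting. */
IMG_DECOMP resolve_decomp_method(IMG_DECOMP requested, bool tradeoff_is_fast)
{
    if (requested != DCM_AUTO)
        return requested;
    /* DCM_AUTO: legacy parser for the fast trade-off, standard otherwise. */
    return tradeoff_is_fast ? DCM_LEGACY : DCM_STANDARD;
}
```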

enum RECOGNITIONMODULE

Recognition modules (Engines)

This enumerates the different recognition modules of the Engine available to the integrating application. All zones must have an assigned recognition module in their rm fields before processing.

Note:
Only those recognition modules that have been supplied, licensed and successfully installed in the Engine configuration will be available.
The RM_AUTO setting can be specifically placed in the zone's rm field. It is also generated by the kRecLocateZones (auto-zoning) function and, under specific circumstances, by the function kRecDetectFillingMethod.
Each zone needs to be associated with a RECOGNITIONMODULE (RM) and a FILLINGMETHOD (FM). This is needed because some RMs support more than one FM, and some FMs are accepted by more than one RM. It is important to ensure that the RM-FM pair is suitable. For possible RM-FM pairs see Filling method - Recognition module combinations.
Enumerator:
RM_AUTO 

Specifies that the Engine will automatically attempt to select the most suitable recognition module for the zone. This will be determined just before recognition, according to the zone's filling method and, if necessary, other settings (most typically the Character Set).

RM_OMNIFONT_MTX 

Specifies the M/TEXT omnifont recognition module. Supported on: Windows.

RM_OMNIFONT_MOR 

Specifies the multi-lingual MOR omnifont recognition module. Supported on: Windows, Linux.

RM_DOT 

Specifies the DOT 9-pin draft dot-matrix recognition module. Supported on: Windows.

RM_BAR 

Specifies the BAR barcode recognition module. Supported on: Windows, Linux, Embedded Linux, Android.

RM_OMR 

Specifies the OMR optical mark recognition module. Supported on: Windows, Linux, Embedded Linux, Android.

RM_HNR 

Specifies the HNR handprinted numeral recognition module. Supported on: Windows.

RM_RER 

Specifies the third-party RER reRecognition Handprint recognition module. Supported on: Windows, Linux, Mac OS X.

RM_BRA 

This recognition module is NOT supported.

RM_MAT 

Specifies the MAT matrix matching recognition module. Supported on: Windows.

RM_RESERVED_P 

Internal use only. Must NOT be selected.

RM_OMNIFONT_PLUS2W 

2-way voting engine. All platforms.

RM_OMNIFONT_FRX 

FRX engine. All platforms.

RM_OMNIFONT_PLUS3W 

3-way voting engine. Supported on: Windows, Linux.

RM_ASIAN 

Asian engine from Tsinghua. All platforms.

RM_RESERVED_M 

Internal use only. Must NOT be selected.

RM_RESERVED_A 

Internal use only. Must NOT be selected.

RM_SIZE 

Number of available recognition modules.

enum ZONETYPE

Basic zone types.

Enumerator:
WT_FLOW 

Flowed text. This zone type means that the zone contains textual information arranged horizontally without a table type structure inside. Inside a user zone of this type kRecLocateZones creates one OCR zone of the same type. It can be in OCR zones and user zones. It can also be used for horizontally appearing CCJK characters.

WT_TABLE 

Table type zone. This type means that the zone contains a table, i.e. rows and columns, with or without a grid. Such zones will be handled differently from flowed text type zones. Inside a user zone of this type kRecLocateZones creates one OCR zone of the same type. The Engine will try to reconstruct as much of the original table text layout of the zone as the final output document format supports.

WT_GRAPHIC 

Graphic type zone. This type of zone contains graphics, i.e. this zone will not be recognized at all and all other recognition related settings will be ignored. The only reason to have such a zone is to save or export the image inside it. Inside a user zone of this type kRecLocateZones creates one OCR zone of the same type.

WT_AUTO 

Inside a user zone of this type kRecLocateZones performs a parsing algorithm and it may create several OCR zones of any types except WT_AUTO, WT_IGNORE and WT_FORM. This type can only be in user zones.

WT_IGNORE 

Ignore zone. kRecLocateZones does not create OCR zones inside a user zone of this type.

WT_FORM 

Form zone. Logical Form Recognition will run within this zone. It indicates an unfilled form and it should be set in the user zone before running kRecLocateZones. kRecLocateZones creates one OCR zone of the same type, the created OCR zone contains a description of the form objects. See also Form Recognition Module.

WT_VERTTEXT 

Vertical text. For CCJK characters only.

WT_LEFTTEXT 

Left rotated text. For Latin, Greek and Cyrillic characters only.

WT_RIGHTTEXT 

Right rotated text. For Latin, Greek and Cyrillic characters only.


Function Documentation

RECERR RECAPIKRN kRecAddZoneRect ( HPAGE  hPage,
IMAGEINDEX  iiImg,
const RECT *  pRect,
int  nZone 
)

Adding a rectangle to a user zone.

This function adds a new rectangle to an existing user zone, creating their union. Because the new rectangle can overlap previous rectangles, the function recalculates the rectangle list of the zone. The resulting union must be pizzabox-shaped, except in the case of OMR zones. Table zones and vertical text zones cannot receive new rectangles.

Parameters:
[in] hPage  Handle of the page.
[in] iiImg  The index of the image whose coordinate system you have used in defining the rectangle to be added. It is recommended to use II_CURRENT coordinates if possible.
[in] pRect  The rectangle to be added to the given user zone.
[in] nZone  The index of the user zone the new rectangle is added to.
Return values:
RECERR
Note:
An irregular zone can only be built from rectangles. The first rectangle must be inserted by kRecInsertZone; additional rectangles must then be added by kRecAddZoneRect or removed by kRecSubZoneRect. The rectangle list of the given zone is updated automatically (overlapping rectangles are recalculated; on subtraction, the remaining, possibly non-rectangular, area is broken into rectangular areas; and so on). The ZONE passed to kRecInsertZone must therefore hold only the bounding box of the irregular zone's first rectangle, not the bounding box of the whole zone.
The CSDK does not support overlapping user zones, so a newly added rectangle must be disjoint from all other zones (except, of course, zone nZone itself).
Note that the index of the first zone in the zone list is zero (0).
Any change in the user zone list causes the deletion of the OCR zone list.
The specification of this function in C# is:
 RECERR kRecAddZoneRect(IntPtr hPage, IMAGEINDEX iiImg, RECT pRect, int nZone);

RECERR RECAPIKRN kRecCopyOCRZones ( HPAGE  hPage)

Copying the OCR zone list to a user zone list.

This function replaces the user zones with copies of the OCR zones. It leaves the OCR zones intact, but deletes the former user zones. It can be used to delete, insert or change user zones based on the OCR zones detected by a previous kRecLocateZones call.

Parameters:
[in] hPage  Handle of the page.
Return values:
RECERR
Note:
The specification of this function in C# is:
 RECERR kRecCopyOCRZones(IntPtr hPage);

RECERR RECAPIKRN kRecDeleteAllZones ( HPAGE  hPage)

Deleting all user zones.

This function deletes all zones from both the user zone list and the OCR zone list of the page.

Parameters:
[in] hPage  Handle of the page.
Return values:
RECERR
Note:
The specification of this function in C# is:
 RECERR kRecDeleteAllZones(IntPtr hPage);

RECERR RECAPIKRN kRecDeleteZone ( HPAGE  hPage,
int  nZone 
)

Deleting a user zone.

This function deletes a zone from the user zone list of the page.

Parameters:
[in] hPage  The handle of the page.
[in] nZone  Index of the user zone to be deleted.
Return values:
RECERR
Note:
Note that the index of the first zone in the zone list is zero (0).
Any change in the user zone list causes the deletion of the OCR zone list.
The specification of this function in C# is:
 RECERR kRecDeleteZone(IntPtr hPage, int nZone);

RECERR RECAPIKRN kRecGetDecompMethod ( int  sid,
IMG_DECOMP *  pAlgorithm 
)

Getting the decomp method.

This function queries the current page parser algorithm setting of the Engine.

Parameters:
[in] sid  Settings Collection ID.
[out] pAlgorithm  The current page parser algorithm.
Return values:
RECERR
Note:
This function gets the value of the setting Kernel.Decomp.Method. This setting can be changed by kRecSetDecompMethod.
The specification of this function in C# is:
 RECERR kRecGetDecompMethod(int sid, out IMG_DECOMP decompAlg);

RECERR RECAPIKRN kRecGetForceSingleColumn ( int  sid,
INTBOOL *  pbForceSingle 
)

Getting the setting of Force Single Column mode.

This function queries the current setting of the Force Single Column mode.

Parameters:
[in] sid  Settings Collection ID.
[out] pbForceSingle  Address of a Boolean variable to hold the current Force Single Column mode setting.
Return values:
RECERR
Note:
This function gets the value of the setting Kernel.Decomp.ForceSingleColumn. This setting can be changed by kRecSetForceSingleColumn.
The specification of this function in C# is:
 RECERR kRecGetForceSingleColumn(int sid, out bool bEnable);

RECERR RECAPIKRN kRecGetNongriddedTableDetect ( int  sid,
INTBOOL *  bEnable 
)

Getting the non-gridded table detection setting.

This function reports whether the non-gridded table detection feature of the Engine is enabled.

Parameters:
[in]sidSettings Collection ID.
[out]bEnableThe value of the current non-gridded table detection setting.
Return values:
RECERR
Note:
This function gets the value of the setting Kernel.Decomp.EnableNonGrid. This setting can be changed by kRecSetNongriddedTableDetect.
The specification of this function in C# is:
 RECERR kRecGetNongriddedTableDetect(int sid, out bool bEnable); 
RECERR RECAPIKRN kRecGetOCRZoneCount ( HPAGE  hPage,
int *  pnOCRZones 
)

Getting the OCR zone count.

This function gets the number of zones in the OCR zone list of the page.

Parameters:
[in]  hPage  Handle of the page.
[out]  pnOCRZones  Address of an integer variable that receives the number of zones.
Return values:
RECERR
Note:
The specification of this function in C# is:
 RECERR kRecGetOCRZoneCount(IntPtr hPage, out int ZoneCount); 
RECERR RECAPIKRN kRecGetOCRZoneData ( HPAGE  hPage,
IMAGEINDEX  iiImg,
LPZONEDATA  pOCRZoneData,
int  nOCRZone 
)

Getting additional information about OCR zones.

This function can be used for getting additional information about any OCR zone in the OCR zone list of the page.

Parameters:
[in]  hPage  Handle of the page.
[in]  iiImg  Index of the image in the page whose coordinate system is used to report the zone's bounding box.
[out]  pOCRZoneData  Pointer to a variable that receives the requested zone-data information.
[in]  nOCRZone  Index of the zone in the OCR zone list whose information is requested.
Return values:
RECERR
Note:
The index of the first zone in the zone list is zero (0).
The specification of this function in C# is:
 RECERR kRecGetOCRZoneData(IntPtr hPage, IMAGEINDEX iiImg, out ZONEDATA pOCRZoneData, int nOCRZone); 
RECERR RECAPIKRN kRecGetOCRZoneInfo ( HPAGE  hPage,
IMAGEINDEX  iiImg,
LPZONE  pOCRZone,
int  nOCRZone 
)

Getting OCR zone information.

This function can be used for getting information about any zone in the OCR zone list of the page.

Parameters:
[in]  hPage  Handle of the page.
[in]  iiImg  Index of the image in the page whose coordinate system is used to report the zone's bounding box.
[out]  pOCRZone  Pointer to a variable that receives the requested zone information.
[in]  nOCRZone  Index of the zone in the OCR zone list whose information is requested.
Return values:
RECERR
Note:
The index of the first zone in the zone list is zero (0).
The specification of this function in C# is:
 RECERR kRecGetOCRZoneInfo(IntPtr hPage, IMAGEINDEX iiImage, out ZONE pZone, int nZone); 
RECERR RECAPIKRN kRecGetOCRZoneLayout ( HPAGE  hPage,
IMAGEINDEX  iiImg,
LPRECT *  ppRects,
int *  pnRects,
int  nZone 
)

Getting OCR zone shape information.

This function retrieves the shape of any zone in the OCR zone list of the page hPage. For more information about possible zone shapes, see the definition of pizza-box.

Parameters:
[in]  hPage  Handle of the page.
[in]  iiImg  Index of the image in the page whose coordinate system is used to report the shape information.
[out]  ppRects  Address of a pointer that receives the array of RECTs describing the shape. The function allocates this array.
[out]  pnRects  Address of an integer variable that receives the number of rectangles in the ppRects array.
[in]  nZone  Index of the zone in the OCR zone list whose information is requested.
Return values:
RECERR
Note:
The index of the first zone in the zone list is zero (0).
Because this function allocates a new memory object containing the shape RECTs, the application should call the kRecFree function to free this memory once it has evaluated the result.
The specification of this function in C# is:
 RECERR kRecGetOCRZoneLayout(IntPtr hPage, IMAGEINDEX iiImg, out RECT[] ppRects, int nZone); 
RECERR RECAPIKRN kRecGetOCRZoneNodeArray ( HPAGE  hPage,
IMAGEINDEX  iiImg,
LPPOINT *  ppPoints,
int *  pnNodes,
int  iZone 
)

Getting the polygon of the OCR zone.

This function retrieves the polygon made up of the OCR zone's vertices. This can be useful for an application with a GUI for drawing irregular zones.

Parameters:
[in]  hPage  The handle of the page.
[in]  iiImg  Index of the image in the page whose coordinate system is used to report the points.
[out]  ppPoints  Address of a pointer that receives the array of polygon points. The function allocates this array; free it by calling kRecFree.
[out]  pnNodes  Address of an integer variable that receives the number of polygon vertices.
[in]  iZone  The index of the OCR zone in question.
Return values:
RECERR
Note:
The index of the first zone in the zone list is zero (0).
You cannot call this function on OMR zones (see FM_OMR, RM_OMR), because they may have non-pizza-box shapes.
If iiImg is II_ORIGINAL, the polygon may have slanting edges.
Since this function creates a new memory object containing the nodes, the application should call the kRecFree function to free this memory area after evaluating the result.
The specification of this function in C# is:
 RECERR kRecGetOCRZoneNodeArray(IntPtr hPage, IMAGEINDEX iiImg, out POINT[] ppPoints, int nZone); 
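As an illustration of how a GUI application might use the returned vertices, the following sketch tests whether a point (for example, a mouse click) falls inside a zone polygon, using the standard even-odd ray-casting rule. The Point type and the point_in_polygon helper are hypothetical and not part of RecAPI; Point only stands in for the SDK's POINT so that the sketch compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the SDK's POINT type. */
typedef struct { int x, y; } Point;

/* Returns true if (x, y) lies inside the polygon given by pts[0..n-1].
 * Even-odd rule: count how many edges a horizontal ray from the point
 * towards +x crosses; an odd count means the point is inside.
 * Works for slanting edges as well as axis-aligned ones. */
static bool point_in_polygon(const Point *pts, int n, int x, int y)
{
    assert(pts != NULL && n >= 3);
    bool inside = false;
    for (int i = 0, j = n - 1; i < n; j = i++) {
        /* Does the edge pts[j] -> pts[i] straddle the horizontal line at y? */
        if ((pts[i].y > y) != (pts[j].y > y)) {
            /* x coordinate at which the edge meets that line. */
            double xcross = pts[j].x +
                (double)(y - pts[j].y) * (pts[i].x - pts[j].x) /
                (double)(pts[i].y - pts[j].y);
            if (x < xcross)
                inside = !inside;
        }
    }
    return inside;
}
```

In an application, pts and n would be the array and count returned through ppPoints and pnNodes; the array is then released with kRecFree as described in the notes above.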
RECERR RECAPIKRN kRecGetPageDescription ( int  sid,
DWORD *  pPageDesc 
)

Getting page description data.

This function gets the current page description data.

Parameters:
[in]  sid  Settings Collection ID.
[out]  pPageDesc  Address of a variable that receives the current Page Descriptor.
Return values:
RECERR
Note:
This function gets the value of the setting Kernel.Decomp.PageDesc. This setting can be changed by kRecSetPageDescription.
The specification of this function in C# is:
 RECERR kRecGetPageDescription(int sid, out PAGEDESCRIPTION pPageDesc); 
RECERR RECAPIKRN kRecGetZoneCount ( HPAGE  hPage,
int *  pnZones 
)

Getting the user zone count.

This function gets the number of zones in the user zone list for the page.

Parameters:
[in]  hPage  Handle of the page.
[out]  pnZones  Address of an integer variable that receives the number of zones.
Return values:
RECERR
Note:
The specification of this function in C# is:
 RECERR kRecGetZoneCount(IntPtr hPage, out int pnZones); 
RECERR RECAPIKRN kRecGetZoneInfo ( HPAGE  hPage,
IMAGEINDEX  iiImg,
LPZONE  pZone,
int  nZone 
)

Getting user zone information.

This function can be used for getting information about any zone in the user zone list of the page.

Parameters:
[in]  hPage  Handle of the page.
[in]  iiImg  Index of the image in the page whose coordinate system is used to report the zone's bounding box.
[out]  pZone  Pointer to a variable that receives the requested zone information.
[in]  nZone  Index of the zone in the user zone list whose information is requested.
Return values:
RECERR
Note:
Note that the index of the first zone in the zone list is zero (0).
The specification of this function in C# is:
 RECERR kRecGetZoneInfo(IntPtr hPage, IMAGEINDEX iiImg, out ZONE pZone, int nZone); 
RECERR RECAPIKRN kRecGetZoneLayout ( HPAGE  hPage,
IMAGEINDEX  iiImg,
LPRECT *  ppRects,
int *  pnRects,
int  iZone 
)

Getting user zone shape information.

This function retrieves the shape of any zone in the user zone list of the page hPage. For more information about possible zone shapes, see the definition of pizza-box.

Parameters:
[in]  hPage  Handle of the page.
[in]  iiImg  Index of the image in the page whose coordinate system is used to report the requested zone shape.
[out]  ppRects  Address of a pointer that receives the array of RECTs describing the shape. The function allocates this array.
[out]  pnRects  Address of an integer variable that receives the number of rectangles in the ppRects array.
[in]  iZone  Index of the zone in the user zone list whose information is requested.
Return values:
RECERR
Note:
Note that the index of the first zone in the zone list is zero (0).
Because this function allocates a new memory object containing the shape RECTs, the application should call the kRecFree function to free this memory once it has evaluated the result.
The specification of this function in C# is:
 RECERR kRecGetZoneLayout(IntPtr hPage, IMAGEINDEX iiImg, out RECT[] ppRects, int nZone); 
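As a hedged illustration of what an application might do with the rectangle array returned through ppRects, the sketch below computes the bounding box of a rectangle list and the sum of the rectangle areas. The Rect type and both helper functions are hypothetical and not part of RecAPI; Rect stands in for the SDK's RECT. The area sum equals the zone area only under the assumption that the returned rectangles do not overlap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the SDK's RECT type. */
typedef struct { int left, top, right, bottom; } Rect;

/* Returns the smallest rectangle that encloses all n rectangles. */
static Rect rects_bounding_box(const Rect *rects, size_t n)
{
    assert(rects != NULL && n > 0);
    Rect box = rects[0];
    for (size_t i = 1; i < n; ++i) {
        if (rects[i].left   < box.left)   box.left   = rects[i].left;
        if (rects[i].top    < box.top)    box.top    = rects[i].top;
        if (rects[i].right  > box.right)  box.right  = rects[i].right;
        if (rects[i].bottom > box.bottom) box.bottom = rects[i].bottom;
    }
    return box;
}

/* Sums width * height over all rectangles (assumes no overlaps). */
static long rects_total_area(const Rect *rects, size_t n)
{
    long area = 0;
    for (size_t i = 0; i < n; ++i)
        area += (long)(rects[i].right - rects[i].left) *
                (long)(rects[i].bottom - rects[i].top);
    return area;
}
```

For an L-shaped zone made of the rectangles (0, 0, 100, 50) and (0, 50, 50, 100), the bounding box is (0, 0, 100, 100), while the summed area covers only three quarters of that box.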
RECERR RECAPIKRN kRecGetZoneNodeArray ( HPAGE  hPage,
IMAGEINDEX  iiImg,
LPPOINT *  ppPoints,
int *  pnNodes,
int  iZone 
)

Getting the polygon of the user zone.

This function retrieves the polygon made up of the vertices of the user zone. This can be useful for an application with a GUI when drawing irregular zones.

Parameters:
[in]  hPage  The handle of the page.
[in]  iiImg  Index of the image in the page whose coordinate system is used to report the points.
[out]  ppPoints  Address of a pointer that receives the array of polygon points. The function allocates this array; free it by calling kRecFree.
[out]  pnNodes  Address of an integer variable that receives the number of polygon vertices.
[in]  iZone  The index of the user zone in question.
Return values:
RECERR
Note:
Note that the index of the first zone in the zone list is zero (0).
You cannot call this function on OMR zones (see FM_OMR, RM_OMR), because they may have non-pizza-box shapes.
If iiImg is II_ORIGINAL, the polygon may have slanting edges due to the deskew operation.
Since this function creates a new memory object containing the nodes, the application should call the kRecFree function to free this memory area after evaluating the result.
The specification of this function in C# is:
 RECERR kRecGetZoneNodeArray(IntPtr hPage, IMAGEINDEX iiImg, out POINT[] ppPoints, int nZone); 
void RECAPIKRN kRecInitZone ( LPZONE  pZone)

Initializing a ZONE variable.

This function initializes a ZONE variable to default values.

Parameters:
[in]  pZone  Pointer to the zone structure to be initialized.
Return values:
none
Note:
All the fields of the structure are set to a default value:
        type = WT_FLOW;
        fm = FM_DEFAULT;
        rm = RM_AUTO;
        filter = FILTER_DEFAULT;
        chk_control = 0;
        chk_fn = NULL;
        chk_sect = "";
        userdata = 0;
The bounding box (ZONE::rectBBox) is set to 0; the real value must be set after this call.
In a C++ program it might be useful to create a CZone class based on ZONE and call kRecInitZone in the constructor.
The specification of this function in C# is:
 RECERR kRecInitZone([In, Out] ZONE zone); 
RECERR RECAPIKRN kRecInsertZone ( HPAGE  hPage,
IMAGEINDEX  iiImg,
LPCZONE  pZone,
int  nZone 
)

Inserting a user zone.

This function inserts a new zone in the user zone list of the page. After inserting the zone, the zone list will be recalculated automatically. For information about insertion of irregular zones see notes.

Parameters:
[in]  hPage  Handle of the page.
[in]  iiImg  Index of the image in the page whose coordinate system you have used in defining the bounding box for the new zone. It is recommended to use II_CURRENT coordinates if possible.
[in]  pZone  Pointer to the zone data to be inserted.
[in]  nZone  Index in the user zone list where the new zone should be inserted. Use zero (0) to insert the zone as the first element of the zone list. To insert a zone as the last element of the zone list, use -1.
Return values:
RECERR
Note:
An irregular zone can be built up only from rectangles. The first rectangle must be inserted by kRecInsertZone; additional rectangles must be added by kRecAddZoneRect or removed by kRecSubZoneRect. The rectangle list of the zone is updated automatically (overlapping rectangles are recalculated; when a rectangle is subtracted, the remaining, possibly non-rectangular, area is broken into rectangular areas, and so on). Therefore, the ZONE passed to kRecInsertZone must contain only the bounding box of the irregular zone's first rectangle, not the bounding box of the whole zone.
When inserting a zone, the application should initialize all the fields of the ZONE structure. The kRecInitZone function can be used for this purpose. The ZONE::rectBBox field does not have a meaningful default value, so it must always be set explicitly after calling kRecInitZone. If no other fields are changed, a WT_FLOW zone with default features is created.
The ZONE::type field of the zone to be inserted can be any ZONETYPE.
For successful zone insertion:
  • the whole zone area must lie within the page's OCR image area,
  • the whole zone area must lie within the given image's area,
  • the top-left coordinates of the zone must be lower than its bottom-right coordinates,
  • the zone must not overlap another zone.
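The conditions above can be checked before calling kRecInsertZone. The following is a minimal sketch for simple rectangular zones only; the Rect type and all helper names are hypothetical and not part of RecAPI. It assumes that the image origin is (0, 0), that right and bottom are exclusive edges, and that zones whose edges merely touch do not count as overlapping. The SDK performs its own validation, which may differ in these details.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the SDK's RECT type. */
typedef struct { int left, top, right, bottom; } Rect;

/* Top-left must be strictly above and to the left of bottom-right. */
static bool rect_is_ordered(const Rect *r)
{
    return r->left < r->right && r->top < r->bottom;
}

/* The rectangle must lie entirely inside a width x height image. */
static bool rect_in_image(const Rect *r, int width, int height)
{
    return r->left >= 0 && r->top >= 0 &&
           r->right <= width && r->bottom <= height;
}

/* True if the two rectangles share any area. */
static bool rects_overlap(const Rect *a, const Rect *b)
{
    return a->left < b->right && b->left < a->right &&
           a->top < b->bottom && b->top < a->bottom;
}

/* Checks a candidate zone rectangle against the listed conditions. */
static bool zone_rect_is_insertable(const Rect *cand, int width, int height,
                                    const Rect *existing, size_t n_existing)
{
    assert(cand != NULL && width > 0 && height > 0);
    if (!rect_is_ordered(cand) || !rect_in_image(cand, width, height))
        return false;
    for (size_t i = 0; i < n_existing; ++i)
        if (rects_overlap(cand, &existing[i]))
            return false;
    return true;
}
```

Here existing would hold the bounding boxes of the zones already in the user zone list, for example as collected with kRecGetZoneInfo.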
It is the programmer's responsibility to specify a valid recognition module-filling method pair; otherwise, incorrectly set zones will have no recognition result. For more details, see the topic Filling method - Recognition module combinations.
Any change in the user zone list causes the OCR zone list to be deleted.
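Insertion of a zone containing machine-printed flowed text, restricted to uppercase letters and digits:
    HPAGE hPage;    /* must hold a valid page handle obtained before this point */
    ZONE zone;
    RECERR rc;
    kRecInitZone(&zone);            /* set every field to its default value */
    zone.rectBBox.left = 0;         /* rectBBox has no meaningful default */
    zone.rectBBox.top = 0;
    zone.rectBBox.right = 100;
    zone.rectBBox.bottom = 200;
    zone.fm = FM_OMNIFONT;
    zone.rm = RM_OMNIFONT_MOR;
    zone.filter = (CHR_FILTER)(FILTER_UPPERCASE | FILTER_DIGIT);  /* uppercase letters and digits */
    rc = kRecInsertZone(hPage, II_CURRENT, &zone, -1);  /* -1 appends the zone; check rc before continuing */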
The specification of this function in C# is:
 RECERR kRecInsertZone(IntPtr hPage, IMAGEINDEX iiImg, [In] ZONE pZone, int nZone); 
RECERR RECAPIKRN kRecLoadZones ( HPAGE  hPage,
LPCTSTR  pFileName 
)

Loading user zones.

This function loads the user zone list from a zone file. The function attaches the zone list to the page.

Parameters:
[in]  hPage  Handle of the page.
[in]  pFileName  Name of the zone file to be loaded.
Return values:
ZONE_SIZE_WARN  At least one zone has been truncated because it extends beyond the image.
ZONE_SIZE_ERR  At least one zone has not been loaded because it extends beyond the image.
RECERR  Other errors.
Note:
Calling this function removes any zone list already attached to the page, including the OCR zone list.
This function can load zone files created by versions of OmniPage CSDK earlier than v20.
The specification of this function in C# is:
 RECERR kRecLoadZones(IntPtr hPage, string pFileName); 
RECERR RECAPIKRN kRecLocateZones ( int  sid,
HPAGE  hPage 
)

Page parsing.

This function analyzes the page layout structure of the image (auto-zoning). It finds text or graphic blocks on the page, builds an OCR zone list and then connects it to the page. The user zone list is not changed. It activates the PID_DECOMPOSITION process.

Parameters:
[in]  sid  Settings Collection ID.
[in]  hPage  Handle of the page containing the OCR image to be analyzed.
Return values:
RECERR
Note:
OmniPage Capture SDK uses different auto-zoning algorithms for Latin-alphabet and Asian text. Ensure that the correct language has been set before calling kRecLocateZones (see: kRecSetLanguages, kRecManageLanguages).
OmniPage Capture SDK offers three different algorithms to be applied during auto-zoning of Latin-alphabet text, and two different algorithms during auto-zoning of CCJK text; use the kRecSetDecompMethod function to specify the Page parser algorithm. This setting has no effect for Arabic, Thai and Hebrew languages. For details, see also IMG_DECOMP.
If hPage does not contain an II_BW image, an implicit secondary image conversion step is performed automatically to convert the image to a despeckled B/W image (stored as the II_BW image of hPage), on which the auto-zoning function operates. The parameters for this conversion can be specified through the functions kRecSetImgBinarizationMode, kRecSetImgDespeckleMode, kRecSetImgBrightness, kRecSetImgThreshold, kRecSetImgResolEnhancement and kRecSetRMTradeoff.
If user zones are used together with kRecLocateZones, they should be inserted before calling this function.
kRecLocateZones creates the OCR zones and puts them into the OCR zone list. Any previously existing OCR zone is deleted first. The generated OCR zones are used by the recognition process and may be modified by it.
The type of user zone determines how kRecLocateZones finds OCR zones inside them. See ZONETYPE for a description of individual types.
The OCR zones detected outside User zones take the value FILTER_DEFAULT, meaning that the zone takes on the global filter setting (which defaults to FILTER_ALL, i.e. no filtering).
The OCR zones detected outside User zones will always have the FM_DEFAULT and RM_AUTO values in their ZONE::fm and ZONE::rm fields, respectively. The "meaning" of the FM_DEFAULT filling method can be specified with the kRecSetDefaultFillingMethod function, while the "meaning" of the RM_AUTO recognition module can be specified with the kRecSetDefaultRecognitionModule function.
Since the function's zone location algorithm is optimized for searching text blocks, it is not recommended for locating barcodes. This situation can be handled with user zones.
The kRecGetOCRZoneCount function returns the number of OCR zones in the zone list, and the kRecGetOCRZoneInfo function returns the parameters of an individual OCR zone. After modifying zone attributes, call the kRecUpdateOCRZone function to apply the changes.
kRecDetectFillingMethod is a useful function when you do not know the type of page to be processed.
This function can fill the line list of the HPAGE, but not all line situations are handled in the same way. See RLINE for more information.
The specification of this function in C# is:
 RECERR kRecLocateZones(int sid, IntPtr hPage); 
RECERR RECAPIKRN kRecSaveOCRZones ( HPAGE  hPage,
LPCTSTR  pFileName 
)

Saving the OCR zone list.

This function saves the current OCR zone list of the page into a zone file. The zone file can be loaded later by kRecLoadZones as user zones.

Parameters:
[in]  hPage  Handle of the page.
[in]  pFileName  Name of the zone file to be created.
Return values:
RECERR
Note:
This function overwrites a pre-existing zone file with the same name.
The zone coordinates are saved into the zone file in twips (1/1440 inches).
The specification of this function in C# is:
 RECERR kRecSaveOCRZones(IntPtr hPage, string pFileName); 
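Because zone files store coordinates in twips, an application that inspects or produces such values needs to convert between pixels and twips. The sketch below shows the arithmetic only; the helper names are hypothetical, and it assumes that the image resolution in dots per inch is known and that coordinates are non-negative. How the SDK itself rounds is not stated here.

```c
#include <assert.h>

/* 1 twip = 1/1440 inch, as stated in the note above. */
#define TWIPS_PER_INCH 1440L

/* Converts a pixel coordinate to twips, rounding to the nearest twip. */
static long pixels_to_twips(long pixels, long dpi)
{
    assert(dpi > 0 && pixels >= 0);
    return (pixels * TWIPS_PER_INCH + dpi / 2) / dpi;
}

/* Converts a twip coordinate back to pixels, rounding to the nearest pixel. */
static long twips_to_pixels(long twips, long dpi)
{
    assert(dpi > 0 && twips >= 0);
    return (twips * dpi + TWIPS_PER_INCH / 2) / TWIPS_PER_INCH;
}
```

For example, at 300 dpi a coordinate of 100 pixels is one third of an inch, which corresponds to 480 twips.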
RECERR RECAPIKRN kRecSaveZones ( HPAGE  hPage,
LPCTSTR  pFileName 
)

Saving the user zone list.

This function saves the current user zone list of the page into a zone file.

Parameters:
[in]  hPage  Handle of the page.
[in]  pFileName  Name of the zone file to be created.
Return values:
RECERR
Note:
This function overwrites a pre-existing zone file with the same name.
The zone coordinates are saved into the zone file in twips (1/1440 inches).
The specification of this function in C# is:
 RECERR kRecSaveZones(IntPtr hPage, string pFileName); 
RECERR RECAPIKRN kRecSetDecompMethod ( int  sid,
IMG_DECOMP  Algorithm 
)

Setting the decomp method.

This function specifies the Page parser algorithm setting of the Engine. This setting is applied whenever the auto-zoning algorithm is activated (PID_DECOMPOSITION process).

Parameters:
[in]  sid  Settings Collection ID.
[in]  Algorithm  The page parser algorithm to be set.
Return values:
RECERR
Note:
If this function is not called, the default value, DCM_AUTO, is applied.
This function sets the value of the setting Kernel.Decomp.Method. This setting can be retrieved by kRecGetDecompMethod.
The specification of this function in C# is:
 RECERR kRecSetDecompMethod(int sid, IMG_DECOMP decompAlg); 
RECERR RECAPIKRN kRecSetForceSingleColumn ( int  sid,
INTBOOL  bForceSingle 
)

Specify the Force Single Column mode.

This function specifies the Force Single Column mode for the page-layout PID_DECOMPOSITION process. It prevents the Engine's de-columnization from detecting columns and placing their contents one below the other. It is useful for preserving the columnar structure in tables.

Parameters:
[in]  sid  Settings Collection ID.
[in]  bForceSingle  Force Single Column mode to be set (default is FALSE).
Return values:
RECERR
Note:
It is highly recommended to switch this on sparingly, only for cases where the image has a single tabular structure and the automatic decomposition is likely to fail in analyzing the layout properly.
Single Column mode is handled in a different way in DCM_STANDARD mode. In this case, the "single column" is only a recommendation to the Decomposition Module, which may still generate multi-column results if this seems necessary (e.g. if the page parser finds more than one column, text and a picture are side by side, or texts are separated by vertical lines).
This mode takes effect only when the page descriptor has its default value (see kRecSetPageDescription).
This function sets the value of the setting Kernel.Decomp.ForceSingleColumn. This setting can be retrieved by kRecGetForceSingleColumn.
The specification of this function in C# is:
 RECERR kRecSetForceSingleColumn(int sid, bool bEnable); 
RECERR RECAPIKRN kRecSetNongriddedTableDetect ( int  sid,
INTBOOL  bEnable 
)

Setting the non-gridded table detection.

This function sets the Non-gridded table detection setting of the Engine. Tables with visible grid lines (gridded tables) in an original page can usually be detected successfully by the auto-zoning function. In contrast, tables without visible cell separators in the original are harder to identify as tables, because they might also be word lists or data arranged in columns. The OmniPage CSDK offers an algorithm for detecting such non-gridded tables more confidently. This feature of the Engine can only be used in conjunction with an auto-zoning step. The algorithm is based on the result of the character recognition and runs on the OCR zones created by auto-zoning (including the zones created from a WT_AUTO User zone).

Parameters:
[in]  sid  Settings Collection ID.
[in]  bEnable  The value to be set for the non-gridded table detection setting (the default is TRUE).
Return values:
RECERR
Note:
When this feature is enabled, a special algorithm tries to find non-gridded table(s) on the image. This algorithm is based on the existing OCR zone list of the image and also on the recognized characters, i.e. this algorithm is activated as one of the last steps in the character recognition. Because of this, its effect might be visible only after a kRecRecognize or kRecProcessPages call.
The running of this algorithm is strongly controlled, since it may change the OCR zone list belonging to the image.
It is recommended to disable the Non-gridded table detection algorithm when the DCM_FAST page parser algorithm has been selected (see: kRecSetDecompMethod).
This function sets the value of the setting Kernel.Decomp.EnableNonGrid. This setting can be retrieved by kRecGetNongriddedTableDetect.
The specification of this function in C# is:
 RECERR kRecSetNongriddedTableDetect(int sid, bool bEnable); 
RECERR RECAPIKRN kRecSetPageDescription ( int  sid,
DWORD  PageDesc 
)

Setting page description data.

The page description data controls how the page parser (see kRecLocateZones and kRecRecognize) runs on the page. The Page Descriptor constants (the LZ_ flags) describe the different behaviors of the page parser. If the program has information about the image, it can use them to help the page parser achieve better layout results.

Parameters:
[in]  sid  Settings Collection ID.
[in]  PageDesc  The Page Descriptor. It contains a set of LZ_ flags.
Return values:
RECERR
Note:
This function sets the value of the setting Kernel.Decomp.PageDesc. This setting can be retrieved by kRecGetPageDescription.
The specification of this function in C# is:
 RECERR kRecSetPageDescription(int sid, PAGEDESCRIPTION PageDesc); 
RECERR RECAPIKRN kRecSetZoneLayout ( HPAGE  hPage,
IMAGEINDEX  iiImg,
LPCRECT  pRects,
int  nRects,
int  nZone 
)

Updating the user zone shape information.

This function updates the shape information of any zone in the user zone list. See also the definition of pizza-box for more information about zone shapes.

Parameters:
[in]  hPage  Handle of the page.
[in]  iiImg  Index of the image in the page whose coordinate system you have used in defining the shape to be updated.
[in]  pRects  Array of RECTs containing the new shape information.
[in]  nRects  The number of RECTs in the shape information array.
[in]  nZone  Index of the zone to be updated.
Return values:
RECERR
Note:
The index of the first zone in the zone list is zero (0).
Any change in the user zone list causes the OCR zone list to be deleted and regenerated.
The specification of this function in C# is:
 RECERR kRecSetZoneLayout(IntPtr hPage, IMAGEINDEX iiImg, RECT[] pRects, int nZone); 
RECERR RECAPIKRN kRecSubZoneRect ( HPAGE  hPage,
IMAGEINDEX  iiImg,
const RECT *  pRect,
int  nZone 
)

Subtracting a rectangle from a user zone.

This function subtracts a rectangle from an existing user zone. The function recalculates the rectangle list of the zone. The resulting list must describe a pizza-box shape. Subtraction cannot be performed on table zones and vertical text zones.

Parameters:
[in]  hPage  Handle of the page.
[in]  iiImg  The index of the image whose coordinate system you have used in defining the rectangle to be subtracted. It is recommended to use II_CURRENT coordinates if possible.
[in]  pRect  The rectangle to be subtracted from the given user zone.
[in]  nZone  The index of the user zone the rectangle is subtracted from.
Return values:
RECERR
Note:
An irregular zone can be built up only from rectangles. The first rectangle must be inserted by kRecInsertZone; additional rectangles must be added by kRecAddZoneRect or removed by kRecSubZoneRect. The rectangle list of the zone is updated automatically (overlapping rectangles are recalculated; when a rectangle is subtracted, the remaining, possibly non-rectangular, area is broken into rectangular areas, and so on). Therefore, the ZONE passed to kRecInsertZone must contain only the bounding box of the irregular zone's first rectangle, not the bounding box of the whole zone.
Note that the index of the first zone in the zone list is zero (0).
Any change in the user zone list causes the OCR zone list to be deleted.
The specification of this function in C# is:
 RECERR kRecSubZoneRect(IntPtr hPage, IMAGEINDEX iiImg, RECT pRect, int nZone); 
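To make the idea of breaking the remaining area into rectangles concrete, the sketch below subtracts one rectangle from another and returns the uncovered area as up to four rectangles. It is one possible decomposition, offered as an illustration only; the way the Engine actually splits the area is not documented here. The Rect type and rect_subtract are hypothetical and not part of RecAPI, and right and bottom are assumed to be exclusive edges.

```c
/* Hypothetical stand-in for the SDK's RECT type. */
typedef struct { int left, top, right, bottom; } Rect;

/* Stores the parts of a that are not covered by b in out[] (room for 4)
 * and returns how many rectangles were written. */
static int rect_subtract(Rect a, Rect b, Rect out[4])
{
    int n = 0;

    /* Clip b to a; the clipped rectangle is the area actually removed. */
    int l   = b.left   > a.left   ? b.left   : a.left;
    int t   = b.top    > a.top    ? b.top    : a.top;
    int r   = b.right  < a.right  ? b.right  : a.right;
    int btm = b.bottom < a.bottom ? b.bottom : a.bottom;

    /* No common area: a is left unchanged. */
    if (l >= r || t >= btm) {
        out[n++] = a;
        return n;
    }

    if (a.top < t)      out[n++] = (Rect){ a.left, a.top, a.right, t };        /* band above   */
    if (btm < a.bottom) out[n++] = (Rect){ a.left, btm,   a.right, a.bottom }; /* band below   */
    if (a.left < l)     out[n++] = (Rect){ a.left, t,     l,       btm };      /* strip left   */
    if (r < a.right)    out[n++] = (Rect){ r,      t,     a.right, btm };      /* strip right  */
    return n;
}
```

Subtracting the bottom-right quarter (50, 50, 100, 100) from the square (0, 0, 100, 100) leaves an L shape, which this sketch returns as the two rectangles (0, 0, 100, 50) and (0, 50, 50, 100).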
RECERR RECAPIKRN kRecUpdateOCRZone ( HPAGE  hPage,
IMAGEINDEX  iiImg,
LPCZONE  pZone,
int  nZone 
)

Updating the OCR zone.

This function updates the zone data of any zone in the OCR zone list. The fields ZONE::rectBBox, ZONE::type, ZONE::chk_fn (must be NULL) and ZONE::chk_sect (must be an empty string) cannot be modified.

Parameters:
[in]  hPage  Handle of the page.
[in]  iiImg  Index of the image in the page whose coordinate system you have used in defining the zone's bounding box.
[in]  pZone  Pointer to a zone structure containing the updated zone data.
[in]  nZone  Index of the zone to be updated.
Return values:
RECERR
Note:
The index of the first zone in the zone list is zero (0).
This function can modify only the following fields of the specified OCR zone: userdata, fm, rm, filter, chk_control. Other fields of the passed zone structure are ignored.
The preferred method of updating a zone is the following:
  • call the kRecGetOCRZoneInfo function to retrieve the current parameters of the zone,
  • modify the zone parameters,
  • call the kRecUpdateOCRZone function.
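The three steps above can be sketched as follows. Everything between the stand-in markers is hypothetical and exists only so that the sketch compiles without the SDK headers: the reduced ZONE structure, the single fake zone, the stub bodies and the REC_OK success code are all assumptions. In a real program those lines are replaced by the SDK's own header. The helper set_zone_engine is likewise illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* ---- Stand-ins for the SDK (hypothetical, for compilation only) ---- */
typedef int RECERR;
enum { REC_OK = 0 };                        /* assumed success code */
typedef void *HPAGE;
typedef enum { II_CURRENT } IMAGEINDEX;
typedef enum { FM_DEFAULT = 0, FM_OMNIFONT } FILLINGMETHOD;
typedef enum { RM_AUTO = 0, RM_OMNIFONT_MTX, RM_OMNIFONT_MOR } RECOGNITIONMODULE;
typedef struct { FILLINGMETHOD fm; RECOGNITIONMODULE rm; } ZONE; /* two fields only */

static ZONE fake_zone;                      /* pretend OCR zone list with one entry */

static RECERR kRecGetOCRZoneInfo(HPAGE hPage, IMAGEINDEX iiImg,
                                 ZONE *pOCRZone, int nOCRZone)
{
    (void)hPage; (void)iiImg;
    if (nOCRZone != 0 || pOCRZone == NULL) return 1;
    *pOCRZone = fake_zone;
    return REC_OK;
}

static RECERR kRecUpdateOCRZone(HPAGE hPage, IMAGEINDEX iiImg,
                                const ZONE *pZone, int nZone)
{
    (void)hPage; (void)iiImg;
    if (nZone != 0 || pZone == NULL) return 1;
    fake_zone = *pZone;
    return REC_OK;
}
/* ---- End of stand-ins ---- */

/* Get, modify, update: assigns a filling method and recognition module
 * to one OCR zone while leaving all other zone parameters as they were. */
static RECERR set_zone_engine(HPAGE hPage, int nZone,
                              FILLINGMETHOD fm, RECOGNITIONMODULE rm)
{
    ZONE zone;
    RECERR rc;

    assert(nZone >= 0);                     /* zone indexes start at zero */
    rc = kRecGetOCRZoneInfo(hPage, II_CURRENT, &zone, nZone);  /* 1. read   */
    if (rc != REC_OK)
        return rc;
    zone.fm = fm;                                              /* 2. modify */
    zone.rm = rm;
    return kRecUpdateOCRZone(hPage, II_CURRENT, &zone, nZone); /* 3. update */
}
```

Reading the zone first means that fields the application does not intend to change are written back with their current values instead of uninitialized ones.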
It is the programmer's responsibility to specify a valid recognition module-filling method pair; otherwise, incorrectly set zones will have no recognition result. For more details, see the topic Filling method - Recognition module combinations.
The specification of this function in C# is:
 RECERR kRecUpdateOCRZone(IntPtr hPage, IMAGEINDEX iiImg, [In] ZONE pZone, int nZone); 
RECERR RECAPIKRN kRecUpdateZone ( HPAGE  hPage,
IMAGEINDEX  iiImg,
LPCZONE  pZone,
int  nZone 
)

Updating a user zone.

This function updates the zone data of any zone in the user zone list.

Parameters:
[in]  hPage  Handle of the page.
[in]  iiImg  Index of the image in the page whose coordinate system you have used in defining the zone's bounding box.
[in]  pZone  Pointer to a zone structure containing the updated zone data.
[in]  nZone  Index of the zone to be updated.
Return values:
RECERR
Note:
The index of the first zone in the zone list is zero (0).
When updating a zone, the application should initialize all the fields of pZone by calling either kRecInitZone or, in most cases, kRecGetZoneInfo.
If the new ZONE::rectBBox is equal to the old one, the layout of the zone (kRecGetZoneLayout) is not changed. Otherwise, it is replaced by a simple rectangular zone equal to the new rectBBox (i.e. the one or more rectangles that previously made up the zone are deleted and only the single rectBBox is stored).
The preferred method of updating a zone is the following:
  • call the kRecGetZoneInfo function to retrieve the current parameters of the zone,
  • modify the zone parameters,
  • call the kRecUpdateZone function.
It is the programmer's responsibility to specify a valid recognition module-filling method pair; otherwise, incorrectly set zones will have no recognition result. For more details, see the topic Filling method - Recognition module combinations.
When updating a table-type zone, the "cell-detection" algorithm is not activated, which results in improper table detection within the zone. You can detect cells within a table zone by calling kRecLocateTable.
Any change in the user zone list causes the OCR zone list to be deleted.
The specification of this function in C# is:
 RECERR kRecUpdateZone(IntPtr hPage, IMAGEINDEX iiImg, [In] ZONE pZone, int nZone);