RecAPI
The Engine can load a number of recognition modules. It is license-dependent whether a given recognition engine is accessible or not. For information about licensing, please see the General Information help system. The User can control the engine running in a given ZONE. This allows the integrating application to perform "multi-module" recognition on any page.
NOTE: The modules MTX, DOT, MAT and HNR are supported on Windows; RER is supported on Windows, Linux and Mac OS X.
The enum RECOGNITIONMODULE lists all the possible recognition modules; these are tightly integrated to the Engine:
The RER recognition module is a third-party component.
The rm field of any user zone should contain one of the elements of the above-mentioned enum type. There is a special one: RM_AUTO means that the Engine will choose the module most likely to be appropriate. It does this first of all by consulting the filling method set for the zone.
The FILLINGMETHOD describes the type of data expected in the zone, e.g. a barcode, handprinted text or machine-generated text. A degree of auto-detection is available for the filling method through the kRecDetectFillingMethod function, which is useful when the precise filling method used on incoming documents is not known in advance. It is the User's responsibility to specify a valid pair of recognition module and filling method. Any incorrectly set zone will have no recognition result.
RM_AUTO reads the filling method; if only one recognition module is suitable, it is used. When there is a choice, RM_AUTO uses various checks (character set, image size, etc.) to select the best one. Thus, it protects against an invalid FM-RM pair.
If the recognition module is not present when it is needed, the recognition function (kRecRecognize) returns API_MODULEMISSING_ERR, and there will be no recognized data for the zone concerned. To avoid this risk, we recommend checking the presence and correct installation of the necessary recognition modules by calling kRecGetModulesInfo right after the Engine's initialization.
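The startup check recommended above can be sketched as follows. This is a minimal illustration only: it does not call kRecGetModulesInfo, and the DemoModuleStatus record and the first_unavailable helper are hypothetical stand-ins for whatever the application builds from the module information the Engine reports.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record for one module as seen at startup.
   It is NOT the SDK's own module information structure. */
typedef struct {
    const char *name;   /* module name, e.g. "MTX" */
    int         usable; /* non-zero if present and correctly installed */
} DemoModuleStatus;

/* Returns the name of the first required module that is either absent
   from the reported list or marked unusable; returns NULL if every
   required module is available. */
const char *first_unavailable(const char *const *required, size_t n_required,
                              const DemoModuleStatus *reported, size_t n_reported)
{
    assert(required != NULL || n_required == 0);
    assert(reported != NULL || n_reported == 0);

    for (size_t i = 0; i < n_required; i++) {
        int found = 0;
        for (size_t j = 0; j < n_reported; j++) {
            if (strcmp(required[i], reported[j].name) == 0 && reported[j].usable) {
                found = 1;
                break;
            }
        }
        if (!found)
            return required[i];
    }
    return NULL;
}
```

An application could run such a check once after initialization and refuse to start processing when the result is not NULL, instead of discovering API_MODULEMISSING_ERR zone by zone.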
The recognition process is typically initiated by the function kRecRecognize. The other way is to call the function kRecProcessPages. The former processes an already loaded page, the latter gets one or more image files as input.
Both methods work on the zones of the page. Internally, the recognition process operates on B/W images. If the loaded image is not a B/W one, the required conversion is performed during the loading or preprocessing.
The kRecProcessPages function is a so-called one-touch (or one-step) processing function, because a single call performs the image loading (either from files or from a scanner), the preprocessing, the loading of user zones (if specified), the recognition and the output conversion. Furthermore, this single call is enough to process more than one image. Of course, the one-touch function uses the usual settings.
The document level processing also has one-touch solutions.
IMPORTANT NOTES
The default settings of OmniPage 20 (Nuance's desktop application) and OmniPage Capture SDK 20 are not the same. By default, the RecAPI of the CSDK does not run in the most accurate mode but in a faster, less accurate mode, which is a good compromise between speed and accuracy. It can easily be switched to the most accurate mode by setting Kernel.OcrMgr.PreferAccurateEngine to true. This most accurate mode of the CSDK is equivalent to the default of the desktop application. See also kRecSetDefaultRecognitionModule and its notes.
The recognition result is stored in the letter array of the HPAGE. The Recognition Data Handling Module is available for its maintenance. Each recognized character is represented by a LETTER structure containing all the accessible information about the given character (character code, position, size, confidence of the recognition, additional possible tips, font information, formatting information, etc.). This recognition data is directly accessible by kRecGetLetters; it is also the input for the output conversion.
The LETTER structure contains an err field for reporting the confidence of the recognized character. This is a combined value; if it is 64 or greater, the character is considered suspicious. Of course, this is only a recommended value. For more information, see confidence issues. Another tool for accurately reporting the opinion of the recognition engine(s) is the use of alternatives. The running recognition engine may have more than one tip for each character. In this case, LETTER provides access to the higher-order choices of the character code. For more information, see the usage of alternatives.
    RECERR rc;
    ...
    // Load image.
    HPAGE hPage;
    rc = kRecLoadImgF(0, "testimage.tif", &hPage, 0);
    // Preprocess image.
    rc = kRecPreprocessImg(0, hPage);
    // Locate zones.
    rc = kRecLocateZones(0, hPage);
    // Recognize image.
    rc = kRecRecognize(0, hPage, NULL);
    // Get recognition result.
    LETTER *pLetters;
    long nLetters;
    rc = kRecGetLetters(hPage, II_CURRENT, &pLetters, &nLetters);
    // Print recognition result.
    for (int i = 0; i < nLetters; i++)
    {
        if (pLetters[i].code == UNICODE_REJECTED)
            putwchar(L'~');
        else if (pLetters[i].code == UNICODE_MISSING)
            putwchar(L'^');
        else
            putwchar(pLetters[i].code);
        if (pLetters[i].err >= RE_SUSPECT_THR)
            putwchar(L'*');
    }
    // Free up recognition results given back by the kRecGetLetters function.
    rc = kRecFree(pLetters);
    // Free up page.
    rc = kRecFreeImg(hPage);
    ...
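The confidence test on the err field can also be used to summarize a whole page, for example to decide whether it needs manual review. The sketch below assumes a simplified stand-in type, DemoLetter, that carries only a character code and an err value; it is not the SDK's LETTER structure. DEMO_SUSPECT_THR is a local constant set to the recommended value of 64, and count_suspicious is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the SDK's LETTER structure (hypothetical):
   only the two fields needed for this illustration. */
typedef struct {
    unsigned short code; /* recognized character code */
    unsigned char  err;  /* combined confidence value; higher is less certain */
} DemoLetter;

/* Recommended threshold from the text: 64 or greater is suspicious. */
enum { DEMO_SUSPECT_THR = 64 };

/* Returns the number of letters whose err value reaches the threshold. */
size_t count_suspicious(const DemoLetter *letters, size_t n, unsigned threshold)
{
    assert(letters != NULL || n == 0);

    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (letters[i].err >= threshold)
            count++;
    }
    return count;
}
```

Since 64 is only a recommended value, the threshold is passed as a parameter so that an application can tighten or relax it per document type.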
There are different issues to be taken into account when you want to improve accuracy. Typically they also have consequences for the processing speed.
Character set limitation and the checking module both influence accuracy, but in different ways. Both, either or neither of them can be used; the integrator should decide which balance is best. Their effects, when used separately, can be summarized as follows:
Limiting the character set gives the program the greatest decision power. Using the checking module only to flag errors is safest, but it requires more post-processing outside the Engine to check all non-conforming cases.
A typical balance would be to impose broad restrictions by limiting the character set, e.g. specifying the permissible languages, but using the checking module for detailed control over parts of the recognized text where it's important that the original data be recognized and passed for checking precisely as it was written (e.g. for an ID Code incorporating a check-digit function). This later checking should make it possible to determine whether any error is due to optical recognition errors or was originally invalid.
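As an illustration of such post-processing outside the Engine, the sketch below validates a recognized ID code with a check digit. It assumes, purely for the example, that the ID uses the Luhn (mod 10) scheme; real ID codes may use a different check-digit function. The IdCheckResult type and check_id_luhn function are hypothetical and not part of the SDK.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    ID_OK,            /* all digits and the check digit matches */
    ID_BAD_CHARACTER, /* a non-digit character was found */
    ID_BAD_CHECKSUM   /* all digits (or empty), but the check fails */
} IdCheckResult;

/* Validates a digit string with the Luhn (mod 10) algorithm.
   The rightmost character is treated as the check digit. */
IdCheckResult check_id_luhn(const char *id)
{
    assert(id != NULL);

    size_t len = strlen(id);
    if (len == 0)
        return ID_BAD_CHECKSUM;

    unsigned sum = 0;
    int double_it = 0; /* the check digit itself is not doubled */

    /* Walk from the rightmost character to the leftmost. */
    for (size_t i = len; i-- > 0; ) {
        unsigned char ch = (unsigned char)id[i];
        if (!isdigit(ch))
            return ID_BAD_CHARACTER;

        unsigned d = (unsigned)(ch - '0');
        if (double_it) {
            d *= 2;
            if (d > 9)
                d -= 9; /* same as adding the two digits of the product */
        }
        sum += d;
        double_it = !double_it;
    }
    return (sum % 10 == 0) ? ID_OK : ID_BAD_CHECKSUM;
}
```

A non-digit result such as a letter O in place of a zero points towards an optical recognition error, whereas a clean digit string with a failing checksum may be either a misread digit or an ID that was invalid on the original; combining the result with the per-character confidence values can help tell these cases apart.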
The Language, Character Set and Code Page Handling Module is responsible for this area.
You can improve text recognition accuracy by narrowing the range of characters valid for recognition. This way the Engine does not always have to choose its solutions from more than 550 characters in the Engine's Total Character Set. (The multi-language omnifont MOR recognition module supports all of these characters; other recognition modules recognize fewer of them.) The character set concept is documented in detail in the topic Character Set in the Engine. Broadly, the set is compiled as follows:
FILTER_PLUS) can be set in the zone's filter field. The possible local filter values are the same as the global ones, with an extra one: FILTER_DEFAULT. If this is the only one set, the zone inherits the global filter setting.
The integrating application can retrieve timing and other statistical information about the last processed image. This may include:
The application calls kRecGetStatistics for this purpose. The fields of the STATISTIC structure will be filled with the relevant information on return.
The structure contains the latest accessible information about each listed statistics field, i.e. if recognition has not run yet on the current HPAGE, but has run on the previous one, the structure contains the data of the previous recognition.
All timing data is measured in milliseconds.