RecAPI
OMR optical mark recognition module
Module name: OMR
Module identifier: RM_OMR
Filling methods supported: FM_OMR
Filters supported: ignores all filter settings
Trade-off supported: none
Knowledge base file: none
Training file supported: no

This module is included only in the Professional Recognition Kit (not the OCR kit). To make this technology available in your application, it must be covered by your distribution licensing. See the topic on Licensing in the General Information help system.

Application areas

This recognition module is used for recognizing optical marks (checkmarks). Typical application areas are in questionnaires, ballot papers, educational tests and reporting or ordering sheets, where the documents to be processed are form-like and filled by respondents, usually by hand.

IMPORTANT NOTE: Auto-zoning cannot find OMR zones. Use manually (programmatically) created user zones (see FM_OMR and RM_OMR) or pre-defined form templates (see how to use form templates) instead.

Accuracy issues

Checkmark zones are bounded by printed frames, which are visible on the input document, but may be visible or invisible in the image passed to the recognition module, due to the use of dropout colors during scanning. The accuracy of this module can be improved by

  • specifying in advance how the module will handle frame detection (frames visible, frames invisible, auto-detect)
  • specifying the "marking sensitivity" of the module, i.e. how strongly an OMR zone must be marked to count as filled. There are four choices.

The values "frames visible" and "frames invisible" give higher accuracy than "auto-detect". This recognition module is not influenced by the recognition trade-off setting.

User-defined OMR zones may sometimes cut through the printed frames, for example when the same zone file is applied to every image of a given form. The module can extend processing beyond the zone border so that the whole frame is handled, but this is not the default behavior. To enable it, set Kernel.Ocr.OMR.ZoneCorrection to true.

See also the topic Instructions to respondents below.

Conditions

The frame can be a rectangle, a circle, an ellipse, etc.; it can be shaded. It may be visible or invisible in the image sent for recognition. Each dimension of the frame should be at least 45-50 pixels, which is roughly 3.8 to 4.2 mm (about 0.15 to 0.17 inch) at 300 dpi resolution.

  • If there is a frame visible in the image and the filled-in-error feature is disabled:
    it can be filled in with any shape (an X, a tick, non-solid hatching, horizontal or vertical lines, etc.). The recommended filling shape is an X or a tick. A small number of contiguous black pixels falling within the checkmark area will lead to the value "filled". The recommended scanner brightness setting is slightly darker than 50%.
  • If there is no frame visible in the image or the filled-in-error feature is enabled:
    it should be filled in so that the checkmark shape cannot be mistaken for a frame or half-frame, i.e. with no lines parallel to the invisible zone borders. The filling shape should be a clear X or a tick. The OMR mark should be at least 45-50 pixels in both directions, which is roughly 3.8 to 4.2 mm (about 0.15 to 0.17 inch) at 300 dpi resolution.
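
The pixel minimums above are quoted for 300 dpi. The following is a minimal sketch, not part of the SDK, of how such a minimum could be converted to millimetres or scaled to another scan resolution. The function names are hypothetical, and the assumption that the minimum scales linearly with resolution is ours rather than a statement from this documentation.

```python
import math

MM_PER_INCH = 25.4


def pixels_to_mm(pixels, dpi):
    """Convert a length in pixels to millimetres at the given resolution."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return pixels / dpi * MM_PER_INCH


def scale_min_pixels(pixels_at_300dpi, dpi):
    """Scale a pixel minimum quoted for 300 dpi to another resolution.

    Assumption: the physical size is what matters, so the pixel count
    grows or shrinks in proportion to the resolution. Rounded up.
    """
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return math.ceil(pixels_at_300dpi * dpi / 300)


if __name__ == "__main__":
    for px in (45, 50):
        print(px, "px at 300 dpi =", round(pixels_to_mm(px, 300), 2), "mm")
    print("50 px at 300 dpi corresponds to",
          scale_min_pixels(50, 600), "px at 600 dpi")
```

A check like this can be run when designing a form, before any image is scanned, to confirm that the printed checkbox size is adequate for the intended scanner resolution.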

This module has been tested on an image with more than 1300 OMR zones.

Output

An OMR (optical mark) zone is unique in that its output always consists of precisely one digit. It can be defined to be one of two or one of three values. When there are two possible values, these are zero (0) for unfilled, one (1) for filled. When three values are possible, the additional value is two (2) for "filled-in-error" (see below).
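
As an illustration of the value scheme just described, the sketch below maps the single output digit of an OMR zone to a named state. It is application-side Python, not SDK code; OmrState and parse_omr_output are hypothetical names introduced only for this example.

```python
from enum import IntEnum


class OmrState(IntEnum):
    """The three possible results of an OMR zone, as listed above."""
    UNFILLED = 0
    FILLED = 1
    FILLED_IN_ERROR = 2  # only produced when three-value detection is in use


def parse_omr_output(text):
    """Map the one-digit output of an OMR zone to an OmrState.

    Raises ValueError for anything that is not exactly '0', '1' or '2',
    which helps catch output that came from a non-OMR zone.
    """
    digit = text.strip()
    if digit not in ("0", "1", "2"):
        raise ValueError("not an OMR result: %r" % text)
    return OmrState(int(digit))


if __name__ == "__main__":
    for raw in ("0", "1", "2"):
        print(raw, "->", parse_omr_output(raw).name)
```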

The safest way to link the output values with the checkboxes that generated them is through the LETTER structure output, which contains the zone number and the coordinates (zone, left, top, width, height). This can also help prevent checkmark data from being confused with barcode values or other non-checkmark data coming from the same page.
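
The linking step can be sketched as follows. This is a simplified Python model, not the SDK's LETTER structure: the LetterRecord class, its code field, and the omr_values_by_zone helper are hypothetical and only mirror the fields named above (zone, left, top, width, height). The idea is to keep only the records that come from zones known to be OMR zones and index them by zone number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LetterRecord:
    """Hypothetical stand-in for one recognized output character."""
    zone: int    # index of the zone that produced the character
    left: int
    top: int
    width: int
    height: int
    code: str    # the recognized character (assumed field name)


def omr_values_by_zone(letters, omr_zones):
    """Return {zone index: 0/1/2} for records from the given OMR zones.

    Records from other zones (barcodes, text) are skipped, even if
    they happen to contain the characters '0', '1' or '2'.
    """
    values = {}
    for rec in letters:
        if rec.zone in omr_zones and rec.code in ("0", "1", "2"):
            values[rec.zone] = int(rec.code)
    return values


if __name__ == "__main__":
    letters = [
        LetterRecord(0, 100, 200, 50, 50, "1"),  # OMR zone, filled
        LetterRecord(1, 100, 300, 50, 50, "0"),  # OMR zone, unfilled
        LetterRecord(2, 400, 50, 20, 30, "1"),   # digit from a barcode zone
    ]
    print(omr_values_by_zone(letters, omr_zones={0, 1}))
```

Filtering by zone number rather than by character value is what keeps a "1" read from a barcode or text zone from being counted as a filled checkbox.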

If a page contains mainly or only checkmark data, the output converters "Text - Tab Delimited", "Text - Comma Delimited" or "Excel 97, 2000" can be used to load the data into a spreadsheet program for further analysis and presentation.

Filled-in-error

The filled-in-error feature allows the application to handle checkboxes that were filled by mistake. This feature is available only with the KernelAPI.

The respondent in this case should completely blacken the frame or checkmark area before marking a new choice. It is not essential that the area be completely blackened, but it must be significantly darker and denser than a "filled" (checked) zone. A "filled-in-error" zone generates an output value 2. This feature functions only if two conditions are met:

  • kRecSetOmrParams must be called with the value pFill set to TRUE.
  • Sets of OMR zones relating to one question must be grouped.

The filled-in-error feature functions only on grouped zones. Recognition results for each zone in a group will be 0, 1 or 2. There should be only one filled zone per group plus optionally one filled-in-error. When designing a checkmark document, all zones in a group should have the same checkbox style and size. Up to 32 OMR zones can be grouped.
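
The group rules above can be checked on the application side after recognition. The sketch below is a hypothetical helper (check_group is not an SDK function) that takes the 0/1/2 results of one group and reports any departure from the stated rules: at most 32 zones, at most one filled zone, and at most one filled-in-error zone. Treating a group with no filled zone as acceptable (an unanswered question) is an assumption of this example.

```python
MAX_GROUP_SIZE = 32  # upper limit on grouped OMR zones stated above


def check_group(results):
    """Return a list of problems found in one group's results.

    `results` is a sequence of integers, one per zone in the group:
    0 = unfilled, 1 = filled, 2 = filled-in-error.
    An empty list means the group follows the rules.
    """
    problems = []
    if len(results) > MAX_GROUP_SIZE:
        problems.append("more than 32 zones in group")
    if any(r not in (0, 1, 2) for r in results):
        problems.append("unexpected result value")
    if list(results).count(1) > 1:
        problems.append("more than one filled zone")
    if list(results).count(2) > 1:
        problems.append("more than one filled-in-error zone")
    return problems


if __name__ == "__main__":
    print(check_group([0, 1, 0, 2]))  # one answer, one correction
    print(check_group([1, 1, 0, 0]))  # ambiguous answer
```

Groups that fail the check are good candidates for manual review rather than automatic acceptance.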

Grouping OMR zones

In CSDK versions earlier than v15, OMR zones could be grouped by modifying the seq field in the zone structure. From v15 onward this is no longer necessary, thanks to the notion of pizzabox zones. OMR zones can be grouped by collecting them in one pizzabox zone even if they are not touching (i.e. even if one criterion of the pizzabox shape is not fulfilled).

Instructions to respondents

OMR processing requires a high degree of accuracy. The two-value detection is inherently accurate; three-value detection is more difficult. Good document design and clear instructions to respondents are very important in getting high accuracy. Printing model samples of ideally filled and filled-in-error checkboxes in the instructions is recommended. Respondents should be urged to fill in the document with a dark blue or black pen. Pencils are to be avoided, as are pens with an ink color close to a dropout color on the scanner to be used.

Note:
See OMR Recognition Engine Module.