RecAPI
Module name: | FRX |
Module identifier: | RM_OMNIFONT_FRX |
Filling methods supported: | FM_OMNIFONT |
Filters supported: | all filter elements |
Trade-off supported: | none |
Knowledge base files: | none |
Training file supported: | yes (supported on: Windows, Linux) |
The PLUS2W and PLUS3W recognition modules require the presence of this module. This module is supplied in both the Professional Recognition Kit and the OCR Kit. Its inclusion in your application must be covered by a distribution license. See the topic on Licensing in the General Information help system.
Its associated files are:
baltic.shp | Frx shape pack (code page) file. |
cyrillic.shp | Frx shape pack (code page) file. |
greek.shp | Frx shape pack (code page) file. |
latin1.shp | Frx shape pack (code page) file. |
latin2.shp | Frx shape pack (code page) file. |
turkish.shp | Frx shape pack (code page) file. |
charsettable.chr | |
asciieng.lng | Frx language dictionary. Used in case of multi-language selection. |
czech.lng | Frx language dictionary data file. |
danish.lng | Frx language dictionary data file. |
dutch.lng | Frx language dictionary data file. |
english.lng | Frx language dictionary data file. |
finnish.lng | Frx language dictionary data file. |
french.lng | Frx language dictionary data file. |
german.lng | Frx language dictionary data file. |
greek.lng | Frx language dictionary data file. |
hungar.lng | Frx language dictionary data file. |
italian.lng | Frx language dictionary data file. |
norsk.lng | Frx language dictionary data file. |
polish.lng | Frx language dictionary data file. |
port.lng | Frx language dictionary data file. |
russian.lng | Frx language dictionary data file. |
spanish.lng | Frx language dictionary data file. |
swedish.lng | Frx language dictionary data file. |
turkish.lng | Frx language dictionary data file. |
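As a deployment aid, the file list above can be turned into a simple presence check. The following Python snippet is only a minimal sketch: the function name `missing_frx_files` and the idea of a single directory holding all FRX files are assumptions made for illustration, not part of RecAPI. Consult the distribution documentation for the actual install layout.

```python
from pathlib import Path

# File names transcribed from the table above.
FRX_SHAPE_PACKS = [
    "baltic.shp", "cyrillic.shp", "greek.shp",
    "latin1.shp", "latin2.shp", "turkish.shp",
]
FRX_CHARSET_TABLE = ["charsettable.chr"]
FRX_DICTIONARIES = [
    "asciieng.lng", "czech.lng", "danish.lng", "dutch.lng", "english.lng",
    "finnish.lng", "french.lng", "german.lng", "greek.lng", "hungar.lng",
    "italian.lng", "norsk.lng", "polish.lng", "port.lng", "russian.lng",
    "spanish.lng", "swedish.lng", "turkish.lng",
]
FRX_FILES = FRX_SHAPE_PACKS + FRX_CHARSET_TABLE + FRX_DICTIONARIES


def missing_frx_files(directory):
    """Return the sorted FRX file names that are absent from `directory`.

    Hypothetical helper: the comparison is case-insensitive so the same
    check behaves identically on Windows and Linux file systems.
    """
    present = {p.name.lower() for p in Path(directory).iterdir() if p.is_file()}
    return sorted(name for name in FRX_FILES if name not in present)
```

An empty result means that every file listed above was found in the directory.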
This module recognizes machine-printed text, such as output from printed publications, laser or ink-jet printers, and electric typewriters. Output from mechanical typewriters in good condition may also be acceptable. The module should also be used for letter-quality or near-letter-quality (LQ, NLQ) output from dot-matrix printers.
This module supports the recognition of the Latin, Greek and Cyrillic alphabets, with enough accented letters to recognize 54 languages (see Languages and modules).
The characters are listed in category and alphanumeric order, together with their Code Page values, in Characters and Code Pages.
The language support of this module is based on the module's internal code pages, which contain characters from a related group of languages. The internal code pages of this module are American/European (Latin 1, 1252), Baltic (1257), Central-European (Latin 2, 1250), Cyrillic (1251), Greek (1253) and Turkish (1254).
The module supports multi-language selection for recognition, but only reliably for language combinations within the same code page; it may not properly recognize languages from different language groups. For example, it properly processes the combination of English, German and Italian, since all of these languages belong to the Latin 1 (1252) code page. However, when both French and Czech are specified, RM_OMNIFONT_FRX may fail to recognize some accented characters of the Czech alphabet, since these two languages are not in the same code page. The following table lists the languages supported by FRX, grouped by code page.
Latin 2 (1250) | Polish, Czech, Hungarian, Romanian, Albanian, Croatian, Wend (Sorbian), Slovak, Slovenian |
Cyrillic (1251) | Russian, Ukrainian, Byelorussian, Bulgarian, Macedonian, Serbian |
Latin 1 (1252) | English, German, French, Spanish, Italian, Dutch, Swedish, Norwegian, Finnish, Danish, Portuguese, Portuguese (Brazilian), Catalan, Afrikaans, Aymara, Basque, Breton, Faroese, Friulian, Gaelic, Galician, Eskimo, Icelandic, Indonesian, Latin, Malaysian, Pidgin English, Swahili, Tahitian, Welsh, Frisian, Zulu |
Greek (1253) | Greek |
Turkish (1254) | Turkish, Kurdish (written in Latin alphabet) |
Baltic (1257) | Estonian, Hawaiian, Latvian, Lithuanian |
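The same-code-page rule can be checked before languages are passed to the engine. The following Python snippet is a minimal sketch of such a check, using only the data from the table above. The names `FRX_CODE_PAGES`, `code_pages_for` and `is_single_code_page` are hypothetical helpers introduced for illustration; they are not RecAPI functions, and Kurdish is listed here without its "written in Latin alphabet" qualifier.

```python
# Languages grouped by FRX internal code page, transcribed from the table above.
FRX_CODE_PAGES = {
    "Latin 2 (1250)": [
        "Polish", "Czech", "Hungarian", "Romanian", "Albanian", "Croatian",
        "Wend (Sorbian)", "Slovak", "Slovenian",
    ],
    "Cyrillic (1251)": [
        "Russian", "Ukrainian", "Byelorussian", "Bulgarian", "Macedonian",
        "Serbian",
    ],
    "Latin 1 (1252)": [
        "English", "German", "French", "Spanish", "Italian", "Dutch",
        "Swedish", "Norwegian", "Finnish", "Danish", "Portuguese",
        "Portuguese (Brazilian)", "Catalan", "Afrikaans", "Aymara", "Basque",
        "Breton", "Faroese", "Friulian", "Gaelic", "Galician", "Eskimo",
        "Icelandic", "Indonesian", "Latin", "Malaysian", "Pidgin English",
        "Swahili", "Tahitian", "Welsh", "Frisian", "Zulu",
    ],
    "Greek (1253)": ["Greek"],
    "Turkish (1254)": ["Turkish", "Kurdish"],
    "Baltic (1257)": ["Estonian", "Hawaiian", "Latvian", "Lithuanian"],
}

# Reverse lookup: language name -> code page name.
LANGUAGE_TO_CODE_PAGE = {
    language: code_page
    for code_page, languages in FRX_CODE_PAGES.items()
    for language in languages
}


def code_pages_for(languages):
    """Return the set of code pages that the given languages belong to."""
    unknown = [lang for lang in languages if lang not in LANGUAGE_TO_CODE_PAGE]
    if unknown:
        raise ValueError(f"Languages not listed for FRX: {unknown}")
    return {LANGUAGE_TO_CODE_PAGE[lang] for lang in languages}


def is_single_code_page(languages):
    """True if every selected language lies within one code page."""
    return len(code_pages_for(languages)) == 1
```

With this sketch, `is_single_code_page(["English", "German", "Italian"])` is true, while `is_single_code_page(["French", "Czech"])` is false because the selection spans Latin 1 (1252) and Latin 2 (1250).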
The omnifont recognition module can detect and transmit character attributes: bold, italic or underlined text (or any combination of them). It can also detect and transmit character size, and can classify font types into three broad categories: serif, sans serif and monospaced.
Please consult the topic Performance comparison for information on the balance between speed and accuracy for the most common engine combinations and trade-off settings.