RecAPI
The following table summarizes all the Code Pages supported by the Engine. One of these must be specified as the Code Page of the final output document.
On Windows, the default code page is the code page of the current OS; on Linux and Mac, it is UTF-8. See also auto code page.
CODE PAGE NAME | WIDTH | DESCRIPTION | IMPLEMENTATION |
Windows ANSI | 8-bit | Code Page 1252 | Hard-coded |
Windows Greek | 8-bit | Code Page 1253 | Hard-coded |
Windows Eastern | 8-bit | Code Page 1250 | Hard-coded |
Windows Turkish | 8-bit | Code Page 1254 | Hard-coded |
Windows Baltic | 8-bit | Code Page 1257 | Hard-coded |
Windows Cyrillic | 8-bit | Code Page 1251 | Hard-coded |
Windows Esperant | 8-bit | Non Standard Win | From file, derived from CP 1252 |
Windows Sami | 8-bit | Sami | From file, derived from CP 1252 |
Latin 1 | 8-bit | ISO 8859-1 | From file, derived from CP 1252 |
Code Page 437 | 8-bit | DOS Latin US | Hard-coded |
Greek-ELOT | 8-bit | DOS Greek | Hard-coded |
Greek-MEMOTEK | 8-bit | DOS Greek | Hard-coded |
Code Page 850 | 8-bit | DOS Latin 1 | From file, derived from CP 437 |
Code Page 852 | 8-bit | DOS Latin 2 | From file, derived from CP 437 |
Code Page 860 | 8-bit | DOS Portuguese | From file, derived from CP 437 |
Code Page 863 | 8-bit | DOS French-Canadian | From file, derived from CP 437 |
Code Page 865 | 8-bit | DOS Nordic | From file, derived from CP 437 |
Code Page 866 | 8-bit | DOS Cyrillic CIS | From file, derived from CP 437 |
CWI Magyar | 8-bit | DOS Hungarian | From file, derived from CP 437 |
Magyar Ventura | 8-bit | DOS Hungarian | From file, derived from CP 437 |
IVKAM C-S | 8-bit | Czech & Slovak | From file, derived from CP 437 |
Mazowia Polish | 8-bit | DOS Polish | From file, derived from CP 437 |
Sloven & Croat | 8-bit | 7 bits used | From file, derived from CP 437 |
Turkish | 8-bit | DOS Turkish | From file, derived from CP 437 |
Icelandic | 8-bit | DOS Icelandic | From file, derived from CP 437 |
Macintosh | 8-bit | Mac Western | Hard-coded |
Mac INSO Latin 2 | 8-bit | MAC CE | Hard-coded |
Mac Central EU | 8-bit | PT 202 | Hard-coded |
Mac Primus CE u | 8-bit | MAC CE | Hard-coded |
Maltese | 8-bit | Malta; 7 bits used | From file, derived from CP 437 |
OCR | 8-bit | Non Standard Win | From file, derived from CP 437 |
Unicode | 16-bit | multilingual | Hard-coded |
WordPerfect | 16-bit | multilingual | Hard-coded |
WordPerfect Old | 16-bit | multilingual | Hard-coded |
Roman 8 | 8-bit | For HP printers | Hard-coded |
UTF-8 | 16-bit | multilingual | Hard-coded |
Big5 | 16-bit | Traditional Chinese (supports ETen extension and HKscs non-standard mode) | From file |
EUC-CN | 16-bit | Simplified Chinese | From file |
EUC-JP | 16-bit | Japanese | From file |
EUC-TW | 16-bit | Traditional Chinese | From file |
GB 18030 | 16-bit | Simplified Chinese | From file |
GBK | 16-bit | Simplified Chinese | From file |
HKSCS-2004 | 16-bit | Traditional Chinese (HKscs standard mode, supports ETen extension) | From file |
Shift_JIS | 16-bit | Japanese | From file |
UHC | 16-bit | Korean (extended EUC-KR) | From file |
Programmatically, the current Code Page setting of the Engine is set with kRecSetCodePage and queried with kRecGetCodePage. The exact list of available Code Pages can be enumerated with kRecGetFirstCodePage and kRecGetNextCodePage.
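The first/next enumeration pattern described above can be sketched as follows. This is only an illustration in Python: the real functions are C calls whose signatures and return conventions are not given here, so `get_first` and `get_next` are hypothetical stand-ins, and signalling the end of the list with `None` is an assumption.

```python
def iter_code_pages(get_first, get_next):
    """Yield names from a first/next style enumeration until it is exhausted."""
    name = get_first()
    while name is not None:
        yield name
        name = get_next()


def make_stub_enumeration(names):
    """Hypothetical stand-in for the Engine: a (get_first, get_next) pair over a fixed list."""
    state = {"index": 0}

    def get_next():
        # Assumption: end of list is reported as None
        if state["index"] >= len(names):
            return None
        name = names[state["index"]]
        state["index"] += 1
        return name

    def get_first():
        # Restart the enumeration, then return the first entry
        state["index"] = 0
        return get_next()

    return get_first, get_next


# Stubbed list; a real Engine would report its own set of names
get_first, get_next = make_stub_enumeration(["Windows ANSI", "Code Page 437", "UTF-8"])
available = list(iter_code_pages(get_first, get_next))

# Only select a Code Page that the enumeration actually reported
wanted = "UTF-8"
selected = wanted if wanted in available else None
```

Checking the requested name against the enumerated list before setting it avoids relying on a fixed table that may differ between installations.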
Many derivative Code Pages are offered. Their definitions can be found in the Code Page Definition files, RECOGN.SET, LATIN1.SET and SAMI.SET. Each derivative is based on CP 1252 or CP 437, and the appropriate section in the Code Page Definition file specifies only the changed character positions.
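The idea of a derivative Code Page, a base table plus a list of changed positions, can be sketched in a few lines. This is a minimal illustration, not the Engine's implementation and not the .SET file layout: the base table comes from Python's built-in `cp1252` codec, and the helper names `base_table` and `derive` are hypothetical. It uses the known fact that ISO 8859-1 differs from CP 1252 only at positions 0x80 to 0x9F.

```python
def base_table(codec):
    """Return the 256 Unicode characters a built-in Python codec assigns to bytes 0..255."""
    # errors="replace" maps positions the codec leaves undefined to U+FFFD
    return list(bytes(range(256)).decode(codec, errors="replace"))


def derive(base, changes):
    """Copy the base table and overwrite only the listed byte positions."""
    table = list(base)
    for position, char in changes.items():
        if not 0 <= position <= 0xFF:
            raise ValueError(f"position {position} is outside the 8-bit range")
        table[position] = char
    return table


# ISO 8859-1 holds the C1 control characters at 0x80..0x9F; everything else matches CP 1252
latin1_changes = {pos: chr(pos) for pos in range(0x80, 0xA0)}
latin1 = derive(base_table("cp1252"), latin1_changes)
```

Only 32 positions are listed, yet the result is a complete 256-entry table, which is why a derivative definition can stay short.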
"Unicode" or "UTF-8".
If no offered Code Page fulfils your needs, you can develop your own derivative 8-bit Code Page by adding a new Code Page Definition file to the Engine Binary directory. This new file should have a .SET file extension and it should contain a separate section for your custom Code Page. Five Code Pages are available as the basis for customized Code Pages:
The characters belonging to the custom Code Page should be given in Unicode and follow the other layout conventions found in the basic Code Page Definition file, RECOGN.SET.
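Because a custom 8-bit Code Page is expressed as Unicode characters at byte positions, it can be useful to check whether a given text fits into it. The sketch below is an assumption-laden illustration: the functions `build_encoder` and `encode` are hypothetical, the base table comes from Python's built-in `cp437` codec, and the single changed position (0x9B, cent sign replaced by the euro sign) is invented for the example and does not come from any .SET file.

```python
def build_encoder(table):
    """Invert a 256-entry list of Unicode characters into a character -> byte value map."""
    if len(table) != 256:
        raise ValueError("an 8-bit code page needs exactly 256 entries")
    encoder = {}
    for position, char in enumerate(table):
        # Keep the lowest position if a character appears more than once
        encoder.setdefault(char, position)
    return encoder


def encode(text, encoder):
    """Return (encoded bytes, list of characters the code page cannot represent)."""
    out = bytearray()
    missing = []
    for char in text:
        if char in encoder:
            out.append(encoder[char])
        else:
            missing.append(char)
    return bytes(out), missing


# Illustrative custom page: CP 437 with position 0x9B changed to the euro sign
custom = list(bytes(range(256)).decode("cp437"))
custom[0x9B] = "\u20ac"
encoder = build_encoder(custom)

data, missing = encode("5 \u20ac", encoder)
```

A non-empty `missing` list indicates that the text needs characters outside the custom page, in which case a multilingual Code Page such as "Unicode" or "UTF-8" is the safer choice.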
In some cases, the Code Page setting of the Engine must be specified together with the Output Text Format for the final output document. With other output formats, specifying the Code Page is superfluous, since these output converters ignore the Code Page setting (e.g. MS Word).