Character codes do not always map directly to a standard symbol set, so plain text extracted from a document may not match the characters in the original document.
Sometimes this is because the font has been downloaded by the type of printer driver that generates arbitrary character codes. In this case, use the RedTitan EEfonts program to set up a character recognition database so that EscapeE can do the text conversion.
If there are just a few "wrong" characters extracted from a document, it may be because the original document used some non-standard character codes that EscapeE has not recognized. In this situation you can assign appropriate character codes manually: see below.
| 1. | Open the original document and right-click on the first character to be assigned a new code. |
| 2. | Select Character recognition... (if all characters in the document are recognized this option is grayed-out). |
| 3. | The 'Character mapping' dialog opens, showing the code currently assigned as a hexadecimal number. Type the new hex code to be assigned to the character in the left-hand 'Translated code' box. The new character is shown in the right-hand 'Translated code' box. |
| 4. | Click OK to accept the assignment and close the dialog, or click Next: the dialog then shows the current code of the next character on the page. |
| o | You may enter a new code for this character as above. |
When you have finished assigning characters, you may either
| o | click OK to close the dialog and accept all the assignments made while it was open, or |
| o | click Cancel to close the dialog; any assignments made while it was open will be ignored. |
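Before typing a value into the dialog, it can help to check which character a given hex code stands for. The following is a minimal sketch in Python, not part of EscapeE, and it assumes the hex code is a Unicode code point; codes interpreted under another symbol set may represent different characters. The function name `describe_hex_code` is hypothetical.

```python
import unicodedata


def describe_hex_code(hex_code: str) -> str:
    """Describe a hex code, ASSUMING it is a Unicode code point."""
    code_point = int(hex_code, 16)          # e.g. "20AC" -> 8364
    char = chr(code_point)                  # the character at that code point
    name = unicodedata.name(char, "UNNAMED")
    return f"U+{code_point:04X} {char} {name}"


if __name__ == "__main__":
    print(describe_hex_code("24"))    # dollar sign
    print(describe_hex_code("20AC"))  # euro sign
```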
A new character code must be assigned to one instance of each unrecognized character in each font used by any text that you would like to extract.
Suppose, for example, that you need to change a dollar sign to a euro sign in an invoice. If the 'prices' were listed using 10pt Courier but 'totals' in 14pt Arial bold, a dollar character must be assigned in both fonts to extract 'prices' and 'totals' data in plain text.
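The per-font rule can be illustrated with a small model. This is only a sketch under stated assumptions, not EscapeE's implementation: assignments are modelled as a lookup keyed by font and original code, the font labels and the `extract_text` helper are hypothetical, and codes are treated as Unicode code points.

```python
from typing import Dict, List, Tuple

# (font label, original code) -> assigned code. Hypothetical model only.
Assignments = Dict[Tuple[str, int], int]

DOLLAR, EURO = 0x24, 0x20AC


def extract_text(runs: List[Tuple[str, List[int]]], assignments: Assignments) -> str:
    """Convert runs of (font, codes) to text, applying per-font assignments."""
    out = []
    for font, codes in runs:
        for code in codes:
            # An assignment made in one font does not affect other fonts.
            out.append(chr(assignments.get((font, code), code)))
    return "".join(out)


# One 'price' run in Courier and one 'total' run in Arial bold.
runs = [("Courier 10pt", [DOLLAR, 0x35]), ("Arial 14pt bold", [DOLLAR, 0x39])]

only_courier = {("Courier 10pt", DOLLAR): EURO}
both_fonts = {("Courier 10pt", DOLLAR): EURO, ("Arial 14pt bold", DOLLAR): EURO}

if __name__ == "__main__":
    print(extract_text(runs, only_courier))  # total still shows a dollar sign
    print(extract_text(runs, both_fonts))    # both show a euro sign
```

With only the Courier assignment in place, the 'total' run keeps its dollar sign, which is why the character has to be assigned once in each font.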
Links
About symbol sets
TXT export options