|•||Choose Plain text output format from the Options|Configuration dialog (F8), then click the Options... button on the 'General' page, or |
choose Plain text format from the File|Export dialog (Ctrl E) – see TXT file export. Then click Options...
|•||Click Apply to accept the 'Options' page configuration.|
EscapeE resumes displaying the 'General' page or
the Export dialog.
|1.||In the 'Space Fill for Text Export' section, to fill fixed pitch text with spaces you may set up the options:|
|o||Left align or Right align|
|o||Space width: type in the width (in current units) of the column to be filled.|
|2.||In the 'Text extraction' section, you may choose to:|
|o||Define the inter-line spacing to be used when outputting the extracted text. Enter a number in the Line height box (units are set up on the 'Viewing' page – see Configuring the view). If this box is left blank then the vertical spacing is taken from the font of the text found in the original document.|
|o||Define the maximum vertical difference between the baselines of two words for them to be deemed on the same line: enter a value in the Maximum same line Y difference box.|
|o||Define the minimum horizontal distance between two characters for the gap to be deemed a word break: enter a value in the Minimum space width box. If the gap is more than this value then one or more spaces will be inserted in the extracted text. The default is 70% of the 'Space width'.|
|o||To ignore the downloaded space character's width and use the cell width instead, tick Space width = cell width.|
|o||To ignore downloaded character widths and use widths calculated to fit the raster instead, select Calculate character widths.|
|o||To use the top of the character cell as the vertical reference point rather than the baseline, check Align using top of cell. This may be advisable when the baseline reference changes mid-string, as is sometimes the case, for example, with superscript characters.|
|o||When sweeping out areas for fields and clips, text is normally selected if any part of a character's shape falls inside the area. Ticking 'Criterion is text baseline rather than text extent when selecting' changes this so that text is included only when the baseline of a character is in the swept area.|
|3.||Ensure a suitable Symbol set is chosen to output the text; choose from Windows (19U), 16-bit Unicode, UTF8 Unicode, or Unchanged (i.e. same as input file). See also About Symbol sets.|
|4.||To use the character recognition database to convert the text back to a readable form from files where arbitrary character encodings have been used, select Assign character codes using the TTLIB database: see EEfonts.|
|o||Use converted codes when exporting: using the character codes as translated by EEfonts means that the text in such a file, exported to PDF or PCL, is searchable. See also the Note below.|
|o||Select Use glyph number if character is unrecognized to perform character code assignment using the glyph IDs in a downloaded TrueType font.|
|5.||Pure TXT format documents consist only of lines of text, so if you export a multi-page document as plain text, the pagination is normally lost. You may opt to retain the page structure instead by selecting Insert a Form Feed for each page.|
|6.||The underlining of text may be ignored by ticking Ignore underlining. This is the default; clear the check-box to retain underlining instead.|
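The same-line, word-break and Form Feed rules in steps 2 and 5 can be pictured with a short sketch. This is not EscapeE's code. The `Glyph` record, the function names, the assumption that Y grows down the page, the way the number of inserted spaces is chosen and the placing of the form feed between pages are all hypothetical. Only the 70% default and the rules themselves come from the options described above.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    text: str        # the character(s) drawn
    x: float         # left edge, in current units
    width: float     # advance width, in current units
    baseline: float  # baseline Y (assumed to grow down the page)


def group_lines(glyphs, max_same_line_dy):
    """Put glyphs on one line when their baseline is within
    max_same_line_dy of the first glyph of that line."""
    lines = []
    for g in sorted(glyphs, key=lambda g: (g.baseline, g.x)):
        if lines and abs(g.baseline - lines[-1][0].baseline) <= max_same_line_dy:
            lines[-1].append(g)
        else:
            lines.append([g])
    return lines


def line_text(line, space_width, min_space_width):
    """Join one line, inserting spaces where the horizontal gap
    exceeds min_space_width."""
    line = sorted(line, key=lambda g: g.x)
    out = [line[0].text]
    for prev, cur in zip(line, line[1:]):
        gap = cur.x - (prev.x + prev.width)
        if gap > min_space_width:
            # Assumption: one space per 'Space width' of gap, at least one.
            out.append(" " * max(1, round(gap / space_width)))
        out.append(cur.text)
    return "".join(out)


def export_pages(pages, space_width, max_same_line_dy,
                 min_space_width=None, form_feed=False):
    """Turn a list of pages (each a list of Glyph) into plain text."""
    if min_space_width is None:
        min_space_width = 0.7 * space_width  # documented default: 70%
    texts = []
    for glyphs in pages:
        lines = group_lines(glyphs, max_same_line_dy)
        texts.append("\n".join(
            line_text(l, space_width, min_space_width) for l in lines))
    # Assumption: the form feed sits between pages; otherwise pages
    # simply run on and the pagination is lost.
    return ("\f" if form_feed else "\n").join(texts)
```

For example, with a 'Space width' of 10 units the default word-break threshold is 7 units, so a 10-unit gap between two characters yields one space, while characters that touch are joined into one word.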
When you have finished setting up the options, you may create a Shortcut icon that uses all the options you have set by clicking Shortcut... (see Shortcuts - the easy way to construct a command line), or click the Save button to retain these settings after you close the program.
Note: Some printer drivers use arbitrary character codes when downloading fonts, so that any text extracted directly from the PCL file would not be readable. By using EEfonts to set up a character recognition database, EscapeE is able to convert the characters back into usable text; see Assign character codes using TTLIB database. On export to PCL or PDF, ticking Use converted codes when exporting enables such files to be read and their text searched. Problems could arise if any character codes in these files are not present in the database. Ticking Use glyph number if character is unrecognized assigns the IDs of the glyphs in downloaded TrueType fonts to such characters.
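The order of fallbacks described here can be sketched as follows. The dictionary standing in for the character recognition database, the `glyph_key` used to look a character up, and the function name are all hypothetical. The sketch shows the decision order only, not how EscapeE or EEfonts store or match characters.

```python
def assign_code(original_code, glyph_key, glyph_id, database,
                use_glyph_number=False):
    """Pick the character code to output for one downloaded character.

    database  -- hypothetical mapping from a recognition key to a
                 readable character code
    glyph_id  -- glyph ID in the downloaded TrueType font, or None
    """
    recognised = database.get(glyph_key)
    if recognised is not None:
        return recognised            # character found in the database
    if use_glyph_number and glyph_id is not None:
        return glyph_id              # unrecognized: fall back to glyph ID
    return original_code             # otherwise leave the code as it was


# Usage: one recognized character and one that is not in the database.
db = {"shape-A": ord("A")}
print(assign_code(0x21, "shape-A", 36, db))                         # recognized
print(assign_code(0x22, "shape-x", 37, db, use_glyph_number=True))  # fallback
```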
Exporting files to plain TeXT