mftraining • man page

mftraining (1)

Leading comments

    Title: mftraining
   Author: [see the "AUTHOR" section]
Generator: DocBook XSL Stylesheets v1.78.1 <http://docbook.sf.net/>
     Date: 06/12/2015
 Language: English

(The comments found at the beginning of the groff file "man1/mftraining.1".)

NAME

mftraining - feature training for Tesseract

SYNOPSIS

mftraining -U unicharset -O lang.unicharset FILE...

DESCRIPTION

mftraining takes a list of .tr files, from which it generates the files inttemp (the shape prototypes), shapetable, and pffmtable (the number of expected features for each character). (A fourth file called Microfeat is also written by this program, but it is not used.)

OPTIONS

-U FILE

(Input) The unicharset generated by unicharset_extractor(1).

-F font_properties_file

(Input) Font properties file. Each line has the following form, where every field other than the font name is 0 or 1:

*font_name* *italic* *bold* *fixed_pitch* *serif* *fraktur*
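The line format above can be checked mechanically before training. The following is a minimal sketch, not part of mftraining itself: it assumes the fields are separated by whitespace and that font names contain no spaces, and the function name `parse_font_properties_line` and the font name `timesitalic` are hypothetical.

```python
from typing import NamedTuple


class FontProperties(NamedTuple):
    font_name: str
    italic: bool
    bold: bool
    fixed_pitch: bool
    serif: bool
    fraktur: bool


def parse_font_properties_line(line: str) -> FontProperties:
    """Parse one line of the form: font_name italic bold fixed_pitch serif fraktur."""
    fields = line.split()  # assumption: fields are whitespace-separated
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    name, *flags = fields
    for flag in flags:
        # Every field other than the font name must be 0 or 1.
        if flag not in ("0", "1"):
            raise ValueError(f"flag must be 0 or 1, got {flag!r}")
    return FontProperties(name, *(flag == "1" for flag in flags))


# Hypothetical entry: an italic serif font named "timesitalic".
entry = parse_font_properties_line("timesitalic 1 0 0 1 0")
print(entry)
```

Rejecting any flag other than 0 or 1 catches misaligned columns early, before the file is handed to mftraining.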

-X xheights_file

(Input) x-heights file. Each line has the following form, where xheight is the x-height in pixels of a character drawn at 32pt at 300 dpi. [That is, if base x-height + ascenders + descenders = 133, how much is the x-height?]

*font_name* *xheight*
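The figure 133 in the note above appears to correspond to the pixel height of 32pt text at 300 dpi, since one point is 1/72 inch. The sketch below works through that arithmetic. The x-height ratio is font-specific and must be measured from the actual font; the value 0.5 and both function names are hypothetical, used only for illustration.

```python
POINTS_PER_INCH = 72  # one typographic point is 1/72 inch


def em_height_pixels(point_size: float, dpi: float) -> float:
    """Pixel height of the full type size at the given resolution."""
    return point_size * dpi / POINTS_PER_INCH


def x_height_pixels(x_height_ratio: float,
                    point_size: float = 32,
                    dpi: float = 300) -> int:
    """x-height in pixels, given the font's x-height as a fraction of the full size."""
    return round(x_height_ratio * em_height_pixels(point_size, dpi))


total = em_height_pixels(32, 300)
print(round(total))          # 133
print(x_height_pixels(0.5))  # 67, for a hypothetical ratio of 0.5
```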

-D dir

Directory to write output files to.

-O FILE

(Output) The output unicharset that will be given to combine_tessdata(1).
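As an illustration of how the options above combine with the SYNOPSIS, the following sketch assembles an argument list using only the flags documented here. The helper name `build_mftraining_command` and all file names are hypothetical, and treating -F, -X and -D as optional is an assumption of this sketch.

```python
from typing import List, Optional, Sequence


def build_mftraining_command(unicharset: str,
                             output_unicharset: str,
                             tr_files: Sequence[str],
                             font_properties: Optional[str] = None,
                             xheights: Optional[str] = None,
                             output_dir: Optional[str] = None) -> List[str]:
    """Build an argv list: options first, then the .tr files."""
    if not tr_files:
        raise ValueError("at least one .tr file is required")
    cmd = ["mftraining"]
    if font_properties is not None:
        cmd += ["-F", font_properties]
    if xheights is not None:
        cmd += ["-X", xheights]
    if output_dir is not None:
        cmd += ["-D", output_dir]
    cmd += ["-U", unicharset, "-O", output_unicharset]
    cmd += list(tr_files)
    return cmd


# Hypothetical file names, for illustration only.
cmd = build_mftraining_command(
    "unicharset", "eng.unicharset", ["eng.arial.exp0.tr"],
    font_properties="font_properties",
)
print(" ".join(cmd))
```

On a system where mftraining is installed, the resulting list could be passed to `subprocess.run(cmd, check=True)`.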

COPYING

AUTHOR

The Tesseract OCR engine was written by Ray Smith and his research groups at Hewlett Packard (1985-1995) and Google (2006-present).
