Question

I have run into a problem while setting up the font_properties file to train the Tesseract 3.01 OCR engine. According to the 3.01 documentation, you are required to set up a font_properties file. Each line of the file gives a font name followed by flags, and 0 or 1 must be used to indicate each property. Does anyone know what fixed, serif, or fraktur means?

When I run it with my font_properties file, it throws the following error:

[image: screenshot of the error]

Thank you

Solution

No input files to Tesseract training should have spaces in their names.

The entry in font_properties should match the fontname part of the name of the image file; e.g., if font_properties has uknumberplate, then the filename of your image should be eng.uknumberplate.exp0.tif.
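As a rough way to check the naming rule above before training, here is a minimal Python sketch. It assumes image files follow the `<lang>.<fontname>.exp<N>.tif` pattern from the example; the helper names `font_name_from_image` and `image_matches_font_properties` are made up for illustration and are not part of Tesseract.

```python
def font_name_from_image(filename):
    """Return the fontname part of '<lang>.<fontname>.exp<N>.tif'.

    Hypothetical helper: it only checks the naming pattern described above.
    """
    if " " in filename:
        raise ValueError(f"file name must not contain spaces: {filename!r}")
    parts = filename.split(".")
    if len(parts) != 4 or not parts[2].startswith("exp"):
        raise ValueError(f"expected <lang>.<fontname>.exp<N>.tif, got {filename!r}")
    return parts[1]


def image_matches_font_properties(filename, font_properties_text):
    """True if the image's fontname is the first field of some font_properties line."""
    known_fonts = {
        line.split()[0]
        for line in font_properties_text.splitlines()
        if line.strip()
    }
    return font_name_from_image(filename) in known_fonts
```

For example, `font_name_from_image("eng.uknumberplate.exp0.tif")` yields `uknumberplate`, which should then appear as the first field of a line in font_properties.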

OTHER TIPS

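Fixed (monospaced: every character has the same width), Serif (letters have small finishing strokes at their ends), and Fraktur (a blackletter style of typeface) are standard font descriptors; you can look up more about each on Wikipedia.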

Regarding your error, make sure your font_properties file is formatted correctly, as outlined in the Training Tesseract 3 tutorial linked below. If you're only training one font, the file should contain a single line, in your case:

times_new_roman 0 0 0 1 0

You haven't included what you've put in your font_properties file, but note that your font name should not have spaces!
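To sanity-check a line like the one above, a small Python sketch along these lines might help. It assumes the layout described in the Training Tesseract 3 tutorial, that is, a font name followed by five 0/1 flags in the order italic, bold, fixed, serif, fraktur. The function name `parse_font_properties_line` is hypothetical and only illustrates the check.

```python
# Flag order assumed from the Training Tesseract 3 tutorial.
FLAG_NAMES = ("italic", "bold", "fixed", "serif", "fraktur")


def parse_font_properties_line(line):
    """Parse '<fontname> <italic> <bold> <fixed> <serif> <fraktur>'.

    Returns (fontname, {flag_name: bool}). Raises ValueError on a malformed
    line; a font name containing spaces fails because it adds extra fields.
    """
    parts = line.split()
    if len(parts) != 1 + len(FLAG_NAMES):
        raise ValueError(
            f"expected a font name and {len(FLAG_NAMES)} flags, "
            f"got {len(parts)} fields: {line!r}"
        )
    name, flags = parts[0], parts[1:]
    for flag_name, value in zip(FLAG_NAMES, flags):
        if value not in ("0", "1"):
            raise ValueError(f"{flag_name} flag must be 0 or 1, got {value!r}")
    return name, {n: v == "1" for n, v in zip(FLAG_NAMES, flags)}
```

With this, `times_new_roman 0 0 0 1 0` parses as a serif-only font, while `times new roman 0 0 0 1 0` is rejected because the spaces split the name into extra fields.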

http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3

You have to pass font_properties.txt in the command. On Windows an exception is then thrown, but it does find the font properties file.

Licensed under: CC-BY-SA with attribution