Question

I'm following these instructions for training the Tesseract OCR engine for a new font.

However, when trying to make the box file, I get an error. This is the command I use:

H:\Documents\TesseractTraining>tesseract eng.helvetica.exp0.tif eng.helvetica.exp0 batch.nochop makebox

And here is the error message:

Tesseract Open Source OCR Engine v3.02 with Leptonica
TIFFstream: Sorry, can not handle image.
Unsupported image type.

Some googling suggests that there might be a problem with the Leptonica installation. I don't even know whether Leptonica is installed on my computer, and its web page is confusing: it has several READMEs (one called "README" and one called "Documentation"), and none of them is simple enough for me to work out how to make it work on Windows. I have the Express Edition of Visual Studio 2008, so I can't use the command prompt suggested.

So, my question is: does anybody know what might be wrong and how I can fix it?


Solution

It looks like the image is in a format that Tesseract cannot read. You can use the jTessBoxEditor tool to create TIFF images suitable for training.
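To see what kind of TIFF the failing file actually is, it may help to read a few header tags before recreating the image. The following is only a diagnostic sketch, not part of the answer above: the function name `read_tiff_tags` is hypothetical, it uses just the Python standard library, and it assumes a classic (non-BigTIFF) file. The tag numbers come from the TIFF 6.0 specification, where a compression value of 1 means uncompressed and 5 means LZW. Which combinations a given Tesseract/Leptonica build accepts is not stated here, so treat unusual values only as a hint.

```python
import struct

# Tag IDs as defined in the TIFF 6.0 specification.
TAGS = {256: "width", 257: "height", 258: "bits_per_sample",
        259: "compression", 262: "photometric", 277: "samples_per_pixel"}

# Byte sizes of the two field types handled here: SHORT (3) and LONG (4).
TYPE_FORMATS = {3: ("H", 2), 4: ("I", 4)}


def read_tiff_tags(data: bytes) -> dict:
    """Return selected tags from the first image directory of a classic TIFF."""
    if data[:2] == b"II":
        endian = "<"   # little-endian ("Intel") byte order
    elif data[:2] == b"MM":
        endian = ">"   # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF (BigTIFF is not handled here)")

    (count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    result = {}
    for i in range(count):
        entry = ifd_offset + 2 + 12 * i          # each directory entry is 12 bytes
        tag, ftype, n = struct.unpack_from(endian + "HHI", data, entry)
        if tag not in TAGS or ftype not in TYPE_FORMATS:
            continue
        code, size = TYPE_FORMATS[ftype]
        if size * n <= 4:
            value_pos = entry + 8                # values stored inline in the entry
        else:
            # values stored elsewhere; the entry holds an offset to them
            (value_pos,) = struct.unpack_from(endian + "I", data, entry + 8)
        values = struct.unpack_from(endian + code * n, data, value_pos)
        result[TAGS[tag]] = list(values)
    return result


# Example use (file name taken from the question):
# with open("eng.helvetica.exp0.tif", "rb") as f:
#     print(read_tiff_tags(f.read()))
```

If the reported values look unusual (for example a compression scheme or bit depth you did not expect), re-saving the image with a tool such as jTessBoxEditor, as suggested above, is the simpler route.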

Licensed under: CC-BY-SA with attribution