Question

I am new to the Tesseract library and have set it up on Ubuntu 12.04.

I am using this data set for recognition. When I fed these images to Tesseract as they are (without any preprocessing) using this code, I got approximately 70-75% accuracy.

I want the accuracy to be above 90%, so I applied the following preprocessing steps to enhance the images.

Steps for Preprocessing

  1. Applied the bottom-hat operator with a circular structuring element of radius 12
  2. Complemented the image to make the background white and the text black
  3. Enhanced the contrast of the resulting image
  4. Eroded the image.
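
For readers who want to reproduce the four steps, here is a minimal pure-Python sketch. It is not the original code: the function names, the list-of-rows image representation, the linear contrast stretch, and the erosion radius of 1 are all assumptions made for illustration. The bottom-hat is computed as the morphological closing minus the original image.

```python
from typing import List

Image = List[List[int]]  # grayscale rows, values 0..255 (assumed representation)


def disk_offsets(radius: int):
    """(dy, dx) offsets covered by a disk-shaped structuring element."""
    r = range(-radius, radius + 1)
    return [(dy, dx) for dy in r for dx in r
            if dy * dy + dx * dx <= radius * radius]


def _morph(img: Image, offsets, pick) -> Image:
    """Take min or max over each pixel's neighbourhood; positions that
    fall outside the image are ignored."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [img[y + dy][x + dx] for dy, dx in offsets
                    if 0 <= y + dy < h and 0 <= x + dx < w]
            row.append(pick(vals))
        out.append(row)
    return out


def erode(img: Image, offsets) -> Image:
    return _morph(img, offsets, min)


def dilate(img: Image, offsets) -> Image:
    return _morph(img, offsets, max)


def bottom_hat(img: Image, radius: int) -> Image:
    """Closing (dilate, then erode) minus the original image. Dark details
    smaller than the structuring element come out bright."""
    se = disk_offsets(radius)
    closed = erode(dilate(img, se), se)
    return [[c - p for c, p in zip(crow, prow)]
            for crow, prow in zip(closed, img)]


def complement(img: Image) -> Image:
    return [[255 - p for p in row] for row in img]


def stretch_contrast(img: Image) -> Image:
    """Linear stretch to the full 0..255 range (one possible enhancement)."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:
        return [row[:] for row in img]
    return [[(p - lo) * 255 // (hi - lo) for p in row] for row in img]


def preprocess(img: Image, radius: int = 12, erode_radius: int = 1) -> Image:
    """Steps 1 to 4 in order. erode_radius is an assumed value."""
    step1 = bottom_hat(img, radius)
    step2 = complement(step1)
    step3 = stretch_contrast(step2)
    return erode(step3, disk_offsets(erode_radius))
```

Note that eroding an image with black text on a white background makes the strokes thicker, because erosion takes the minimum over each neighbourhood. The result is still a grayscale image, not a binary one.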

After these steps I get fairly clear images, which can be seen here. But now, when I feed these images to Tesseract using the same code, the accuracy drops below 50%, and I don't know why. Is it because Tesseract does some preprocessing of its own? If so, how can I stop Tesseract from doing that preprocessing? If not, why does it give worse results when the images are now much clearer? Pardon me if this is a basic question.


Solution 2

Well, I was feeding a grayscale (8 bpp) image to Tesseract after preprocessing. Given a grayscale image, Tesseract tries to binarize it, i.e. convert it to black and white, and that was giving me bad results. I still don't know why.

But then I tried first converting my grayscale image into a black-and-white (1 bpp) image and feeding that to Tesseract, and I got much better results.
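
The answer does not say how the image was converted to black and white. One common choice is a global Otsu threshold, sketched below in pure Python. The function names and the list-of-rows image representation are assumptions for illustration, and a different method (for example an adaptive threshold) may suit other images better.

```python
def otsu_threshold(pixels):
    """Return the threshold t (0..255) that maximizes the between-class
    variance. Pixels <= t form the dark class."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the dark class so far
    sum0 = 0    # sum of their gray values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # dark class still empty
        w1 = total - w0
        if w1 == 0:
            break               # bright class is empty from here on
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(img):
    """Map a grayscale image (list of rows) to pure black (0) and white (255)."""
    flat = [p for row in img for p in row]
    t = otsu_threshold(flat)
    return [[255 if p > t else 0 for p in row] for row in img]
```

Saving the result as an actual 1 bpp file needs an image library and is left out here. The point is only that the thresholding is done before Tesseract sees the image, so its internal binarization has nothing left to decide.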

OTHER TIPS

Regarding your question of why Tesseract delivers better results with a binary image than with a grayscale image as input:

Tesseract binarizes a grayscale image internally. I haven't yet figured out exactly which method it uses; sources on the internet mention sometimes a local adaptive threshold and sometimes a global Otsu threshold. What is certain is that Tesseract performs character recognition on a binary image, and that its own preprocessing can still fail on specific problems (its layout analysis is not very good, for example). So if you do the preprocessing yourself, give Tesseract a binary image containing only text, and disable all layout analysis in Tesseract, you may achieve better results than by letting Tesseract do everything for you. Since it is a free, open-source utility, it has some known drawbacks that have to be accepted.

If you use Tesseract as a command-line tool, this thread is very useful for the parameters: tesseract command line page segmentation
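
As a rough sketch of what setting the page segmentation mode looks like, the helper below builds and runs a command line from Python. The helper names and file names are hypothetical. Older 3.x releases spell the option `-psm`, while newer releases use `--psm`, so check `tesseract --help` for your installed version. Mode 7 treats the image as a single text line and mode 6 as a single uniform block of text.

```python
import subprocess


def build_tesseract_command(image_path, output_base, psm, legacy_flag=True):
    """Build the argument list for the tesseract CLI.

    legacy_flag=True uses the older "-psm" spelling; set it to False
    for releases that expect "--psm".
    """
    flag = "-psm" if legacy_flag else "--psm"
    return ["tesseract", image_path, output_base, flag, str(psm)]


def run_tesseract(image_path, output_base, psm, legacy_flag=True):
    """Run tesseract and return the recognized text.

    Assumes the tesseract binary is on PATH. The CLI writes its
    result to <output_base>.txt.
    """
    cmd = build_tesseract_command(image_path, output_base, psm, legacy_flag)
    subprocess.run(cmd, check=True)
    with open(output_base + ".txt", encoding="utf-8") as f:
        return f.read()
```

For example, `run_tesseract("line.png", "out", 7)` would treat the hypothetical file `line.png` as one line of text and skip most of the layout analysis.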

If you use the Tesseract source code in developing your own C++ code, you have to initialize Tesseract with the correct parameters. The parameters are described on the Tesseract API site: tesseract API

Licensed under: CC-BY-SA with attribution