Question

In short, I want to implement the pre-processing steps before OCR that are suggested by ABBYY's technology. The article describes two parts:

  • Background Filtering: separate text strings from the background.
  • Adaptive Binarization: make sure lines and words are detected correctly, so that higher recognition accuracy is reached. It also appears to affect the characters themselves.

Is there any way to achieve these using OpenCV? Any suggestions or sample code would be appreciated.

Solution

I would encourage you to use this code: http://liris.cnrs.fr/christian.wolf/software/binarize/ In particular, Wolf's binarization works really well in practice, and the C++ code needs very little change if you want to use it with OpenCV. Basically, you have to pass the pointer to your image data to this function.
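To give a feel for what this kind of binarization does, here is a minimal pure-Python sketch of the Wolf-Jolion thresholding rule as I understand it: T = (1 - k)·m + k·M + k·(s / R)·(m - M), where m and s are the local mean and standard deviation, M is the darkest gray level in the image and R is the largest local standard deviation. This is not the linked C++ code. The function name `wolf_binarize`, the parameters `win` and `k`, their default values and the way windows are clipped at the image border are all assumptions made for illustration.

```python
import math

def wolf_binarize(img, win=15, k=0.5):
    """Binarize a grayscale image given as a list of rows of ints (0..255).

    Returns an image of the same shape with 0 (text) and 255 (background).
    Hypothetical helper: windows are simply clipped at the image border.
    """
    h, w = len(img), len(img[0])
    r = win // 2

    # Integral images of pixel values and squared values (one extra
    # zero row/column) so that any window sum costs four lookups.
    s1 = [[0] * (w + 1) for _ in range(h + 1)]
    s2 = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row1 = row2 = 0
        for x in range(w):
            v = img[y][x]
            row1 += v
            row2 += v * v
            s1[y + 1][x + 1] = s1[y][x + 1] + row1
            s2[y + 1][x + 1] = s2[y][x + 1] + row2

    # Local mean and standard deviation around every pixel.
    mean = [[0.0] * w for _ in range(h)]
    std = [[0.0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - r), min(h, y + r + 1)
        for x in range(w):
            x0, x1 = max(0, x - r), min(w, x + r + 1)
            n = (y1 - y0) * (x1 - x0)
            a = s1[y1][x1] - s1[y0][x1] - s1[y1][x0] + s1[y0][x0]
            b = s2[y1][x1] - s2[y0][x1] - s2[y1][x0] + s2[y0][x0]
            m = a / n
            mean[y][x] = m
            std[y][x] = math.sqrt(max(b / n - m * m, 0.0))

    min_gray = min(min(row) for row in img)           # M in the formula
    max_std = max(max(row) for row in std) or 1.0     # R; avoid dividing by 0

    out = []
    for y in range(h):
        out_row = []
        for x in range(w):
            m, s = mean[y][x], std[y][x]
            t = (1 - k) * m + k * min_gray + k * (s / max_std) * (m - min_gray)
            out_row.append(255 if img[y][x] >= t else 0)
        out.append(out_row)
    return out
```

With OpenCV you would normally hold the image in a `cv::Mat` (or a NumPy array in Python) and vectorize the window statistics instead of looping over pixels; the loops here are only meant to keep the rule readable.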

OTHER TIPS

Here are a couple of papers that I hope will be useful:

Paper from XEROX: http://www.xrce.xerox.com/content/download/6708/51560/file/Binarising-camera-images-for-OCR.pdf

And another good paper about image preprocessing for OCR: http://wbieniec.kis.p.lodz.pl/research/files/07_memstech_ocr.pdf

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow