
In short, I want to implement the pre-processing steps before OCR that are suggested by ABBYY's technology. The article describes two parts:

  • Background Filtering: separate the text strings from the background.
  • Adaptive Binarization: make sure lines and words are detected correctly so that higher recognition accuracy is reached. It also affects the characters themselves.

Is there any way to achieve these with OpenCV? Any suggestions or sample code would be appreciated.



I would encourage you to use this code: In particular, Wolf's binarization works really well in practice, and the C++ code needs very little change if you want to use it with OpenCV. Basically, you have to pass the pointer to your image data to this function.
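To give an idea of what such a function looks like, here is a minimal sketch of Wolf-style local thresholding. It is not the linked code; it is my own illustration, assuming the usual form of the rule, T = m + k * (s / R - 1) * (m - M), where m and s are the local mean and standard deviation, R is the largest local standard deviation in the image, and M is the darkest gray value. The name wolfBinarize and the defaults window = 15 and k = 0.5 are assumptions chosen for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: Wolf-style binarization of an 8-bit grayscale image.
// `data` points to width*height pixels stored row by row without padding.
// Returns 0 for dark (text) pixels and 255 for background pixels.
std::vector<std::uint8_t> wolfBinarize(const std::uint8_t* data, int width, int height,
                                       int window = 15, double k = 0.5) {
    assert(data != nullptr && width > 0 && height > 0 && window > 0);
    const std::size_t W = static_cast<std::size_t>(width) + 1;
    const std::size_t n = static_cast<std::size_t>(width) * height;

    // Integral images of values and squared values, padded with a zero row/column.
    std::vector<double> sum(W * (height + 1), 0.0), sq(W * (height + 1), 0.0);
    std::uint8_t minGray = 255;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::uint8_t p = data[static_cast<std::size_t>(y) * width + x];
            minGray = std::min(minGray, p);
            const double v = p;
            const std::size_t i = (y + 1) * W + (x + 1);
            sum[i] = v + sum[i - 1] + sum[i - W] - sum[i - W - 1];
            sq[i] = v * v + sq[i - 1] + sq[i - W] - sq[i - W - 1];
        }
    }

    // First pass: local mean and standard deviation, tracking the maximum deviation.
    std::vector<double> mean(n), dev(n);
    double maxDev = 0.0;
    const int r = window / 2;
    for (int y = 0; y < height; ++y) {
        const std::size_t y0 = std::max(y - r, 0), y1 = std::min(y + r, height - 1) + 1;
        for (int x = 0; x < width; ++x) {
            const std::size_t x0 = std::max(x - r, 0), x1 = std::min(x + r, width - 1) + 1;
            const double count = static_cast<double>((x1 - x0) * (y1 - y0));
            const double s = sum[y1 * W + x1] - sum[y0 * W + x1] - sum[y1 * W + x0] + sum[y0 * W + x0];
            const double q = sq[y1 * W + x1] - sq[y0 * W + x1] - sq[y1 * W + x0] + sq[y0 * W + x0];
            const double m = s / count;
            const double d = std::sqrt(std::max(q / count - m * m, 0.0));
            const std::size_t i = static_cast<std::size_t>(y) * width + x;
            mean[i] = m;
            dev[i] = d;
            maxDev = std::max(maxDev, d);
        }
    }

    // Second pass: compare each pixel against its local threshold.
    std::vector<std::uint8_t> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        const double ratio = maxDev > 0.0 ? dev[i] / maxDev : 0.0;
        const double t = mean[i] + k * (ratio - 1.0) * (mean[i] - minGray);
        out[i] = data[i] >= t ? 255 : 0;
    }
    return out;
}
```

With OpenCV, for a single-channel CV_8U cv::Mat named gray (a hypothetical variable) for which gray.isContinuous() is true, the call would be wolfBinarize(gray.data, gray.cols, gray.rows). If you would rather stay with built-in functions, cv::adaptiveThreshold offers simpler mean-based and Gaussian-weighted local thresholds, though it is a different method from Wolf's.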


Here are a couple of papers; I hope they are useful:

Paper from XEROX:

And another good paper about image preprocessing for ocr:

Licensed under: CC-BY-SA with attribution