Finding a word's bounding box in a low-quality image

https://stackoverflow.com/questions/21397246

03-10-2022

Question

I'm trying to get a bounding box for the word "ЛИЛИЯ" in this image, using OpenCV.

I have been experimenting with cv::findContours() and different thresholding algorithms for a couple of days, but cannot get any satisfying results.

Here is what I know about this word:

the letters are of similar size;
the letters' height is in the range 40 px to 90 px;
the word is oriented horizontally (±5°);
there is one and only one word in the image;
the word does not intersect the image's border (it is fully visible);
different parts of the image may have different luminosity;
hotspots (totally white areas) may be present in the image.

English is not my native language, so I'm sorry if the question is not properly explained. If someone needs more images to answer this question, I have at least a dozen more.

Solution

Check out the stroke width transform (SWT), which is used for text detection.

OTHER TIPS

You can preprocess your image with adaptiveThreshold. Use a block size a little larger than your biggest character (it must be an odd number); I tried 91 on your image and it gave good results. Then you can run findContours and filter the blobs/contours by their height. Note that the letters will still be connected to one another, so you cannot really filter by width.

Licensed under: CC-BY-SA with attribution
