Question

What is a fast and reliable way to threshold images with possible blurring and non-uniform brightness?

Example (blurring but uniform brightness):

[Image: example photo of the Sudoku grid, blurred but uniformly lit]

Because the image is not guaranteed to have uniform brightness, it's not feasible to use a fixed threshold. An adaptive threshold works alright, but because of the blurriness it creates breaks and distortions in the features (here, the important features are the Sudoku digits):

[Image: adaptive threshold result]

I've also tried using Histogram Equalization (using OpenCV's equalizeHist function). It increases contrast without reducing differences in brightness.

The best solution I've found is to divide the image by its morphological closing (credit to this post) to make the brightness uniform, then renormalize, then use a fixed threshold (using Otsu's algorithm to pick the optimal threshold level):

[Image: result of dividing by the closing, renormalizing and applying Otsu's threshold]

Here is code for this in OpenCV for Android:

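Mat kernel = Imgproc.getStructuringElement(Imgproc.MORPH_ELLIPSE, new Size(19,19));
Mat closed = new Mat();
// Closing removes the dark digits, leaving an estimate of the background brightness
Imgproc.morphologyEx(image, closed, Imgproc.MORPH_CLOSE, kernel);
// Divide by the background to even out brightness; closed now has type CV_32F
Core.divide(image, closed, closed, 1, CvType.CV_32F);
// Rescale to 0..255 and convert back to 8-bit
Core.normalize(closed, image, 0, 255, Core.NORM_MINMAX, CvType.CV_8U);
// With THRESH_OTSU the threshold argument (-1) is ignored and the level is chosen automatically
Imgproc.threshold(image, image, -1, 255, Imgproc.THRESH_BINARY_INV
    +Imgproc.THRESH_OTSU);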
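For reference, here is a rough plain-Java sketch of what Otsu's method does when it picks the threshold level: it tries every grey level and keeps the one that best separates the two classes (maximum between-class variance). This is not OpenCV's implementation; the class and method names (OtsuSketch, otsuThreshold) are made up for illustration, and the input is assumed to be 8-bit grey values in 0..255.

```java
// Sketch only: Otsu's threshold on 8-bit grey values, without OpenCV.
final class OtsuSketch {
    /**
     * Returns the level t in [0, 255] that maximizes the between-class variance,
     * where class 0 holds values <= t and class 1 holds values > t.
     * Assumes every entry of pixels is in 0..255.
     */
    static int otsuThreshold(int[] pixels) {
        int[] hist = new int[256];
        for (int p : pixels) hist[p]++;

        long total = pixels.length;
        long sumAll = 0;
        for (int v = 0; v < 256; v++) sumAll += (long) v * hist[v];

        long count0 = 0;      // number of pixels in class 0 so far
        long sum0 = 0;        // sum of grey values in class 0 so far
        double bestScore = -1;
        int bestT = 0;
        for (int t = 0; t < 256; t++) {
            count0 += hist[t];
            sum0 += (long) t * hist[t];
            long count1 = total - count0;
            if (count0 == 0) continue;   // class 0 still empty
            if (count1 == 0) break;      // class 1 empty from here on
            double mean0 = (double) sum0 / count0;
            double mean1 = (double) (sumAll - sum0) / count1;
            // Proportional to the between-class variance
            double score = (double) count0 * count1 * (mean0 - mean1) * (mean0 - mean1);
            if (score > bestScore) {
                bestScore = score;
                bestT = t;
            }
        }
        return bestT;
    }
}
```

Because the method only looks at the global histogram, it needs the brightness to be roughly uniform first, which is why the division by the closing comes before it.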

This works great but the closing operation is very slow. Reducing the size of the structuring element increases speed but reduces accuracy.

Edit: based on DCS's suggestion I tried using a high-pass filter. I chose the Laplacian filter, but I would expect similar results with Sobel and Scharr filters. The filter picks up high-frequency noise in the areas which do not contain features, and suffers from similar distortion to the adaptive threshold due to blurring. It also takes about as long as the closing operation. Here is an example with a 15x15 filter:

[Image: Laplacian filter result with a 15x15 filter]

Edit 2: Based on AruniRC's answer, I used Canny edge detection on the image with the suggested parameters:

double mean = Core.mean(image).val[0];
Imgproc.Canny(image, image, 0.66*mean, 1.33*mean);

I'm not sure how to reliably automatically fine-tune the parameters to get connected digits.

[Image: Canny edge detection result]


Solution

Using Vaughn Cato and Theraot's suggestions, I scaled down the image before closing it, then scaled the closed image up to regular size. I also reduced the kernel size proportionately.

Mat kernel = Imgproc.getStructuringElement(Imgproc.MORPH_ELLIPSE, new Size(5,5));
Mat temp = new Mat(); 

Imgproc.resize(image, temp, new Size(image.cols()/4, image.rows()/4));
Imgproc.morphologyEx(temp, temp, Imgproc.MORPH_CLOSE, kernel);
Imgproc.resize(temp, temp, new Size(image.cols(), image.rows()));

Core.divide(image, temp, temp, 1, CvType.CV_32F); // temp will now have type CV_32F
Core.normalize(temp, image, 0, 255, Core.NORM_MINMAX, CvType.CV_8U);

Imgproc.threshold(image, image, -1, 255, 
    Imgproc.THRESH_BINARY_INV+Imgproc.THRESH_OTSU);

The image below shows the results side-by-side for 3 different methods:

Left - regular size closing (432 pixels), size 19 kernel

Middle - half-size closing (216 pixels), size 9 kernel

Right - quarter-size closing (108 pixels), size 5 kernel

[Image: side-by-side results of the three methods]

The image quality deteriorates as the size of the image used for closing gets smaller, but the deterioration isn't significant enough to affect feature recognition algorithms. The speed increases slightly more than 16-fold for the quarter-size closing, even with the resizing, which suggests that closing time is roughly proportional to the number of pixels in the image.
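The arithmetic behind these numbers can be sketched as below. Rounding the scaled kernel to the nearest odd size is my assumption (it happens to reproduce the 19, 9 and 5 used above), and the names KernelScaling, scaledKernelSize and pixelReduction are made up for illustration.

```java
// Sketch only: how kernel size and pixel count change with the downscale factor.
final class KernelScaling {
    /** Nearest odd kernel size to fullSize / scale, never below 1. */
    static int scaledKernelSize(int fullSize, int scale) {
        double exact = (double) fullSize / scale;
        long nearestOdd = 2 * Math.round((exact - 1) / 2.0) + 1;
        return (int) Math.max(1, nearestOdd);
    }

    /** Factor by which the pixel count shrinks when both sides are divided by scale. */
    static int pixelReduction(int scale) {
        return scale * scale;
    }
}
```

For a scale of 4 this gives a size 5 kernel and 16 times fewer pixels to close, which is in line with the measured speed-up.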

Any suggestions on how to further improve upon this idea (either by further increasing the speed, or reducing the deterioration in image quality) are very welcome.

OTHER TIPS

Alternative approach:

Assuming your intention is to have the numerals clearly binarized ... shift your focus to components instead of the whole image.

Here's a pretty easy approach:

  1. Compute a Canny edge map of the image. As a starting point, set the low threshold to 0.66*[mean value] and the high threshold to 1.33*[mean value], where [mean value] is the mean of the greylevel values.
  2. You will need to fiddle with the parameters a bit to get an image where the major components/numerals are clearly visible as separate components. Near perfect is good enough at this stage.
  3. Treating each Canny edge as a connected component (i.e. using cvFindContours() or its C++ counterpart, whichever), one can estimate the foreground and background greylevels and arrive at a threshold.

    For the last bit, do take a look at sections 2. and 3. of this paper. Skipping most of the non-essential theoretical parts it shouldn't be too difficult to have it implemented in OpenCV.

    Hope this helped!

Edit 1:

Here's a very rough idea of how the Canny edge thresholds work, just sufficient to fine-tune the values. The high_threshold controls how strong an edge must be before it is detected. Basically, an edge must have gradient magnitude greater than high_threshold to be detected in the first place. So this does the initial detection of edges.

Now, the low_threshold deals with connecting nearby edges. It controls how readily nearby disconnected edges get combined into a single edge. For a better idea, read "Step 6" of this webpage. Try setting a very small low_threshold and see how things come out. You could discard that 0.66*[mean value] thing if it doesn't work on these images - it's just a rule of thumb anyway.
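To make the two thresholds concrete, here is a minimal plain-Java sketch of just the hysteresis step, assuming you already have a gradient magnitude array. It is not OpenCV's Canny (there is no smoothing or non-maximum suppression here), and the names HysteresisSketch and hysteresis are made up for illustration.

```java
// Sketch only: the hysteresis step of Canny on a precomputed gradient magnitude.
final class HysteresisSketch {
    /**
     * Pixels with magnitude > high seed the edges; pixels with magnitude > low
     * are kept only if they connect (8-neighbourhood) to a seeded pixel.
     */
    static boolean[][] hysteresis(double[][] mag, double low, double high) {
        int rows = mag.length, cols = mag[0].length;
        boolean[][] edge = new boolean[rows][cols];
        java.util.ArrayDeque<int[]> queue = new java.util.ArrayDeque<>();

        // Initial detection: strong pixels only
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                if (mag[r][c] > high) {
                    edge[r][c] = true;
                    queue.add(new int[] {r, c});
                }
            }
        }

        // Linking: grow each edge through neighbouring weak pixels
        while (!queue.isEmpty()) {
            int[] p = queue.poll();
            for (int dr = -1; dr <= 1; dr++) {
                for (int dc = -1; dc <= 1; dc++) {
                    int r = p[0] + dr, c = p[1] + dc;
                    if (r < 0 || r >= rows || c < 0 || c >= cols) continue;
                    if (!edge[r][c] && mag[r][c] > low) {
                        edge[r][c] = true;
                        queue.add(new int[] {r, c});
                    }
                }
            }
        }
        return edge;
    }
}
```

Lowering low lets an edge continue across weaker stretches of a blurred stroke, which is the effect to look for when trying to get connected digits.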

We use Bradley's algorithm for a very similar problem (segmenting letters from the background, with uneven light and uneven background color). It is described here: http://people.scs.carleton.ca:8008/~roth/iit-publications-iti/docs/gerh-50002.pdf, with C# code here: http://code.google.com/p/aforge/source/browse/trunk/Sources/Imaging/Filters/Adaptive+Binarization/BradleyLocalThresholding.cs?r=1360. It works on an integral image, which can be calculated using the integral function of OpenCV. It is very reliable and fast. It is not itself implemented in OpenCV, but it is easy to port.
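A rough plain-Java port of the idea might look like the sketch below. It is not the AForge code and not an OpenCV API: the names BradleySketch and threshold, and the choice to clip the window at the image border, are my own assumptions. A pixel is marked dark when it is more than the given percentage below the mean of the window around it.

```java
// Sketch only: Bradley-style local thresholding using an integral image.
final class BradleySketch {
    /**
     * gray:    rows x cols grey values in 0..255
     * window:  side length of the local window, in pixels
     * percent: how far below the local mean (in percent) a pixel must be to count as dark
     * Returns true where the pixel is classified as dark (foreground).
     */
    static boolean[][] threshold(int[][] gray, int window, int percent) {
        int rows = gray.length, cols = gray[0].length;

        // integral[r][c] = sum of gray over rows 0..r-1 and columns 0..c-1
        long[][] integral = new long[rows + 1][cols + 1];
        for (int r = 0; r < rows; r++) {
            long rowSum = 0;
            for (int c = 0; c < cols; c++) {
                rowSum += gray[r][c];
                integral[r + 1][c + 1] = integral[r][c + 1] + rowSum;
            }
        }

        int half = window / 2;
        boolean[][] dark = new boolean[rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                // Window clipped to the image
                int r0 = Math.max(0, r - half), r1 = Math.min(rows - 1, r + half);
                int c0 = Math.max(0, c - half), c1 = Math.min(cols - 1, c + half);
                long count = (long) (r1 - r0 + 1) * (c1 - c0 + 1);
                long sum = integral[r1 + 1][c1 + 1] - integral[r0][c1 + 1]
                         - integral[r1 + 1][c0] + integral[r0][c0];
                // gray < mean * (1 - percent/100), written with integers only
                dark[r][c] = (long) gray[r][c] * count * 100 < sum * (100 - percent);
            }
        }
        return dark;
    }
}
```

The integral image makes the cost per pixel independent of the window size, which is where the speed comes from.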

Another option is the adaptiveThreshold method in OpenCV, but we did not give it a try: http://docs.opencv.org/modules/imgproc/doc/miscellaneous_transformations.html#adaptivethreshold. The MEAN version is the same as Bradley's, except that it uses a constant to modify the mean value instead of a percentage, which I think is better.

Also, a good article is here: https://dsp.stackexchange.com/a/2504

You could try working on a per-tile basis if you know you have a good crop of the grid. Working on 9 subimages rather than the whole picture will most likely lead to more uniform brightness within each subimage. If your cropping is perfect you could even try going for each digit cell individually, but it all depends on how reliable your crop is.
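A minimal sketch of the tiling arithmetic in plain Java is below. The names TileSketch and tileBounds are made up for illustration; each rectangle would then be thresholded on its own (in OpenCV for Android, presumably through something like Mat.submat).

```java
// Sketch only: split a width x height image into grid x grid tiles.
final class TileSketch {
    /** Returns {x, y, width, height} for each tile, in row-major order. */
    static int[][] tileBounds(int width, int height, int grid) {
        int[][] tiles = new int[grid * grid][];
        for (int ty = 0; ty < grid; ty++) {
            for (int tx = 0; tx < grid; tx++) {
                // Integer division spreads any remainder over the last tiles
                int x0 = tx * width / grid, x1 = (tx + 1) * width / grid;
                int y0 = ty * height / grid, y1 = (ty + 1) * height / grid;
                tiles[ty * grid + tx] = new int[] {x0, y0, x1 - x0, y1 - y0};
            }
        }
        return tiles;
    }
}
```

With grid = 3 this gives the 9 subimages; with grid = 9 it gives one tile per digit cell, assuming the crop is tight.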

An ellipse is more expensive to compute with than a rectangular shape. Try changing:

Mat kernel = Imgproc.getStructuringElement(Imgproc.MORPH_ELLIPSE, new Size(19,19));

to:

Mat kernel = Imgproc.getStructuringElement(Imgproc.MORPH_RECT, new Size(19,19));

This can speed up your solution enough, with little impact on accuracy.
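One plausible reason a rectangle is cheaper: a k x k rectangular max or min filter can be split into a 1 x k pass along the rows followed by a k x 1 pass along the columns, so each pixel costs about 2k comparisons instead of k*k. I believe OpenCV uses a similar split for full rectangular kernels, but treat that as an assumption. The plain-Java sketch below (names RectClosingSketch, pass and close are made up) shows a closing done that way, with the window clipped at the image border.

```java
// Sketch only: grey-level closing with a k x k rectangle, done as separable passes.
final class RectClosingSketch {
    /** One 1-D pass: running max (dilate) or min (erode) along rows or columns. */
    private static int[][] pass(int[][] src, int k, boolean horizontal, boolean takeMax) {
        int rows = src.length, cols = src[0].length, half = k / 2;
        int[][] dst = new int[rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                int best = src[r][c];
                for (int d = -half; d <= half; d++) {
                    int rr = horizontal ? r : r + d;
                    int cc = horizontal ? c + d : c;
                    // Ignore positions outside the image
                    if (rr < 0 || rr >= rows || cc < 0 || cc >= cols) continue;
                    int v = src[rr][cc];
                    best = takeMax ? Math.max(best, v) : Math.min(best, v);
                }
                dst[r][c] = best;
            }
        }
        return dst;
    }

    /** Closing = dilation followed by erosion, k odd. */
    static int[][] close(int[][] gray, int k) {
        int[][] dilated = pass(pass(gray, k, true, true), k, false, true);
        return pass(pass(dilated, k, true, false), k, false, false);
    }
}
```

Dark details narrower than the kernel disappear in the result, which is what makes the closing usable as a background estimate.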

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow