Question

I have an image, taken from a live webcam, and I want to be able to detect a specific object in the image and extract that portion of it to do some further processing.

Specifically, the image would be of a game board, let's say for the purposes of this question that it's a Sudoku game board.

My initial approach was to look for contrasting areas and work it out from there, but I seem to end up with a lot of potential edges (many erroneous) and no real clue as to how to work out which ones are the ones I actually want!

Are there any algorithms, libraries, code samples, or even just bright ideas out there, as to how I would go about finding and extracting the relevant part of the image?

Solution

Use the free AForge.NET image processing library for this. There's a ton of cool stuff to play with.

OTHER TIPS

You need to apply filter and mask operations to the image.

I don't think there is a simple way to just fetch an object from an image; you need to use edge-detection algorithms and clipping, and set criteria for what counts as a valid object.

You can also use image thresholding to detect the object. You may want to look at the image processing libraries below.

  1. Filters API for C, C++, C#, Visual Basic .NET, Delphi, Python
  2. http://www.catenary.com/
  3. CImg: richer than the library above, but written in C++
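To make the thresholding idea a bit more concrete, here is a minimal pure-Python sketch of Otsu's method, which picks the grey level that best separates dark pixels from bright ones. It is not taken from any of the libraries above; the function names `otsu_threshold` and `binarize` are made up for illustration, and the image is assumed to be a list of rows of 8-bit grey values.

```python
def otsu_threshold(pixels):
    """Return a grey level (0-255) separating dark from bright pixels (Otsu's method)."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark = 0      # number of pixels with value <= t
    sum_dark = 0    # sum of those pixel values
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        w_bright = total - w_dark
        if w_dark == 0:
            continue
        if w_bright == 0:
            break
        mean_dark = sum_dark / w_dark
        mean_bright = (sum_all - sum_dark) / w_bright
        # Between-class variance (up to a constant factor).
        var = w_dark * w_bright * (mean_dark - mean_bright) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(image, threshold):
    """Mark dark pixels (ink, grid lines) as 1 and bright pixels (paper) as 0."""
    return [[1 if p <= threshold else 0 for p in row] for row in image]
```

For a webcam image with uneven lighting, a single global threshold like this may not be enough, and a locally adaptive threshold is probably a better fit.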

One of the (I guess many possible) approaches:

  1. Find a filter that "gets/calculates" straight lines (edges, etc.) from a given image.

  2. Now you have the collection (array) of all the lines (xStart,yStart & xEnd,yEnd). You can easily calculate all the line-lengths from the coordinates.

  3. Now, considering that you can always (!) expect "one-biggest-square / rectangle" inside the image, it would be quite easy to find and calculate the wanted-sudoku-rectangle region and crop it from the image to do some further processing.
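A rough sketch of steps 2 and 3, assuming some line detector has already produced the segments as pairs of integer `(x, y)` end points. The helper names and the `keep_ratio` parameter are hypothetical, and the sketch relies on the same assumption as the steps above: the board outline is made of the longest lines in the picture.

```python
import math


def segment_length(seg):
    """Euclidean length of a segment given as ((x_start, y_start), (x_end, y_end))."""
    (x0, y0), (x1, y1) = seg
    return math.hypot(x1 - x0, y1 - y0)


def board_bounding_box(segments, keep_ratio=0.9):
    """Bounding box (left, top, right, bottom) of the segments that are
    nearly as long as the longest one; shorter segments are treated as noise."""
    longest = max(segment_length(s) for s in segments)
    kept = [s for s in segments if segment_length(s) >= keep_ratio * longest]
    xs = [x for s in kept for (x, _) in s]
    ys = [y for s in kept for (_, y) in s]
    return min(xs), min(ys), max(xs), max(ys)


def crop(image, box):
    """Cut the box (inclusive bounds) out of an image stored as a list of rows."""
    left, top, right, bottom = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]
```

If the board is tilted, the bounding box will include some background, so this is only a first cut before any rotation correction.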

EDIT: Solving/programming these kinds of problems is always challenging BUT really interesting at the same time :).

You might try using the Hough Transform.
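For anyone unfamiliar with it, here is a bare-bones sketch of the line-detecting Hough transform in pure Python. Every edge pixel votes for all the lines (rho, theta) that could pass through it, and real lines show up as peaks in the vote table. The function names and the one-degree angular resolution are my own choices for illustration; a real library implementation will be much faster.

```python
import math


def hough_lines(points, width, height, n_theta=180):
    """Accumulate votes in (rho, theta) space for a list of (x, y) edge points.

    A line is described by rho = x*cos(theta) + y*sin(theta), with theta
    sampled in n_theta steps over [0, 180) degrees.
    """
    max_rho = int(math.hypot(width, height)) + 1
    # Row index is rho + max_rho so that negative rho values fit as well.
    acc = [[0] * n_theta for _ in range(2 * max_rho + 1)]
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = int(round(x * math.cos(theta) + y * math.sin(theta)))
            acc[rho + max_rho][t] += 1
    return acc, max_rho


def strongest_line(acc, max_rho):
    """Return (rho, theta_index, votes) of the accumulator cell with most votes."""
    best = (0, 0, -1)
    for r, row in enumerate(acc):
        for t, votes in enumerate(row):
            if votes > best[2]:
                best = (r - max_rho, t, votes)
    return best
```

For a Sudoku grid you would expect two families of strong peaks about 90 degrees apart, one per grid direction.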

I would start by using a corner detector (the Harris detector works nicely) to find the intersections and corners of the Sudoku grid.

Then I would use those points to do an image rectification to transform the image to have the grid as rectangular as possible. Now you should have no trouble finding each square to do OCR.

Image rectification is not simple and entails quite a lot of math.

Be prepared to do some reading :)
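To give a flavour of the math, here is a small sketch of the core of rectification: estimating the 3x3 perspective transform (homography) that maps the four detected grid corners onto an upright square. It assumes the four corners are already known and in matching order; the function names are made up for this example, and it solves the resulting 8x8 linear system with plain Gauss-Jordan elimination rather than a library call.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """Eight homography coefficients mapping four src points onto four dst points.

    The mapping is u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1) and
    v = (h3*x + h4*y + h5) / (h6*x + h7*y + 1).
    """
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    return solve_linear(a, b)


def warp_point(h, x, y):
    """Apply the homography h to a single point."""
    w = h[6] * x + h[7] * y + 1
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)
```

To rectify the whole image you would normally compute the inverse mapping (swap `src` and `dst`) and, for every pixel of the output square, look up the corresponding pixel in the camera image. The sketch fails with a division by zero if three of the four points are collinear.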

If the images of the game boards are already close to rectangular you can of course skip the rectification part and directly use the corner points to find your squares for OCR.

A lot of people have been suggesting Neural Networks. I am quite certain that throwing a neural network at this problem is totally unnecessary. NNs are (sometimes) good if you need to classify objects where the definition of the object is vague. "Find cars in image" is a problem that could have use for a Neural Network, since cars can look very different but share some features. Thus, given enough data, you can train your NN to detect cars. In this problem you have something that is very regular and always looks almost the same, so a NN will not make anything easier or better.

Use AForge color filtering

There are many filtering methods available for C#. I mainly prefer the AForge filters; for this purpose it has a few:

* ColorFiltering
* ChannelFiltering
* HSLFiltering
* YCbCrFiltering
* EuclideanColorFiltering
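The idea behind a plain RGB range filter can be sketched in a few lines. This is a pure-Python illustration of the concept only, not AForge code; the function name `color_filter` and its parameters are invented for the example, and the image is assumed to be a list of rows of `(r, g, b)` tuples.

```python
def color_filter(image, red, green, blue, fill=(0, 0, 0)):
    """Keep pixels whose channels all fall inside the given (low, high)
    ranges and replace every other pixel with the fill colour."""
    ranges = (red, green, blue)

    def inside(pixel):
        return all(low <= value <= high
                   for value, (low, high) in zip(pixel, ranges))

    return [[p if inside(p) else fill for p in row] for row in image]


# Usage: keep only dark pixels, such as printed grid lines and digits.
page = [[(250, 250, 245), (30, 28, 35)],
        [(20, 25, 22), (240, 238, 241)]]
dark_only = color_filter(page, (0, 80), (0, 80), (0, 80), fill=(255, 255, 255))
```

The other filters in the list apply the same kind of range test in a different colour space (for example HSL or YCbCr), which tends to be less sensitive to lighting changes.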

Take a look at: https://github.com/dajuric/accord-net-extensions

The library "joins" the free AForge.NET and Accord.NET libraries and adds image-processing and object-tracking algorithms. Samples included :)

You could try first to find the bold line intersections and use them as registration marks.

This would be a good start because:

  • They're pretty uniformly shaped
  • You know how many there are
  • You know where (roughly) they should be in relation to each other
  • Can tolerate scale variations

So:

  1. Apply an edge filter
  2. Scan a mask* of what the ideal + should look like across the image, recording all that are a good match
  3. Choose the set that matches your expectations best, according to location relative to one another
  4. You now also know where the numbers should be, so you can easily extract them.

* A more sophisticated solution would be to use a Neural Net instead of a mask to recognise the intersections. This might be worth it since you're probably going to use one for the OCR of the numbers.
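Step 2 could look roughly like this, assuming the edge filter has already produced a binary image (a list of rows of 0/1 values). The mask size, the `min_score` cut-off and the function names are all placeholders for illustration; real photos would need a larger mask and a more forgiving score.

```python
def plus_mask(size):
    """Ideal '+' shape: ones on the middle row and middle column, zeros elsewhere."""
    mid = size // 2
    return [[1 if r == mid or c == mid else 0 for c in range(size)]
            for r in range(size)]


def match_score(image, mask, top, left):
    """Fraction of mask cells that agree with the image window at (top, left)."""
    size = len(mask)
    hits = sum(1
               for r in range(size)
               for c in range(size)
               if image[top + r][left + c] == mask[r][c])
    return hits / (size * size)


def find_crosses(image, size=5, min_score=0.95):
    """Slide the mask over the image and return the (row, col) centres
    of all windows that match well enough."""
    mask = plus_mask(size)
    height, width = len(image), len(image[0])
    found = []
    for top in range(height - size + 1):
        for left in range(width - size + 1):
            if match_score(image, mask, top, left) >= min_score:
                found.append((top + size // 2, left + size // 2))
    return found
```

Step 3 would then filter these candidates by their relative positions, for example by keeping the set that forms the most regular grid.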

Without rejecting any of the other ideas, step 1 really should be the detection of the image rotation. You can do this by determining the local gradient at each point and creating a histogram of the gradient directions. This will have 4 major components at 90 degree offsets. Ideally, these would be at 0, 90, 180 and 270 degrees, but if they're not you should rotate your image. E.g. in the sample image you should start with a rotation of about 8 degrees CW.
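A minimal sketch of that histogram, in pure Python on a greyscale image stored as a list of rows. Since the four peaks are 90 degrees apart, the sketch folds all directions into the range [0, 90) and returns the strongest bin. The function name and the one-degree bin width are my own assumptions, and whether the result means clockwise or counter-clockwise depends on whether your y axis points down or up.

```python
import math


def dominant_rotation(image, bins=90):
    """Estimate the grid rotation in degrees, in [0, 90), from a histogram
    of gradient directions weighted by gradient magnitude."""
    hist = [0.0] * bins
    height, width = len(image), len(image[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central differences approximate the local gradient.
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            # Fold the four peaks (90 degrees apart) onto a single one.
            angle = math.degrees(math.atan2(gy, gx)) % 90.0
            hist[int(angle * bins / 90.0) % bins] += magnitude
    best_bin = max(range(bins), key=hist.__getitem__)
    return best_bin * 90.0 / bins
```

A result near 0 means the grid is already axis-aligned; otherwise rotate the image back by the returned angle before looking for the cells.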

You should google for CamShift, blob tracking, or particle filters. They are all useful for your problem. Most of them ship with OpenCV, which can be used from C# through a wrapper such as Emgu CV (AForge.NET is a separate library, not an OpenCV wrapper). You will find some nice demos on YouTube showing how they work.

Good luck

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow