What characteristics of a binary image of a number 0-9 should be used with the k nearest neighbour algorithm?

https://stackoverflow.com/questions/22161705

19-10-2022

Question

Fooling around with OCR.

I have a set of binary images of numbers 0-9 that can be used as training data, and another set of unknown numbers within the same range. I want to be able to classify the numbers in the unknown set by using the k nearest neighbour algorithm.

I've done some studying on the algorithm, and I've read that the best approach is to take quantitative characteristics and plot each training image in a feature space with those characteristics as the axes. Each image in the unknown set is plotted the same way, and the k nearest neighbour algorithm then finds the closest training points, something like what is done here.

What characteristics would be best suited to something like this?

Solution

In a simple case, as phs mentioned in his comment, pixel intensities are used. Each image is resized to a standard size such as 20x20 or 10x10, and the whole image is then expressed as a vector of 400 or 100 elements respectively.

Such an example is shown here: Simple Digit Recognition OCR in OpenCV-Python
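
To make the pixel-intensity idea concrete, here is a minimal pure-Python sketch, not the code from the linked example. The helper names (resize_nearest, to_vector, knn_classify, flip), the 5x5 toy glyphs and the choice of squared Euclidean distance are all assumptions made for illustration; in practice you would load real images and probably use a library implementation of kNN.

```python
from collections import Counter

def resize_nearest(img, size):
    """Resize a 2D list of 0/1 pixels to size x size by nearest-neighbour sampling."""
    h, w = len(img), len(img[0])
    return [[img[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]

def to_vector(img, size=10):
    """Resize to a standard size and flatten row by row (size*size elements)."""
    return [p for row in resize_nearest(img, size) for p in row]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def knn_classify(train, query, k=3):
    """train is a list of (vector, label); majority vote among the k nearest."""
    nearest = sorted(train, key=lambda item: sq_dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def flip(img, r, c):
    """Return a copy of img with one pixel inverted (hypothetical noise helper)."""
    out = [row[:] for row in img]
    out[r][c] = 1 - out[r][c]
    return out

# Toy 5x5 glyphs standing in for real training images (illustrative only).
ZERO = [[0, 1, 1, 1, 0],
        [1, 0, 0, 0, 1],
        [1, 0, 0, 0, 1],
        [1, 0, 0, 0, 1],
        [0, 1, 1, 1, 0]]
ONE = [[0, 0, 1, 0, 0],
       [0, 1, 1, 0, 0],
       [0, 0, 1, 0, 0],
       [0, 0, 1, 0, 0],
       [0, 1, 1, 1, 0]]

train = [(to_vector(ZERO), 0), (to_vector(flip(ZERO, 0, 0)), 0),
         (to_vector(ONE), 1), (to_vector(flip(ONE, 4, 4)), 1)]

query = to_vector(flip(ZERO, 2, 2))      # a "0" with one noisy pixel
print(knn_classify(train, query, k=3))   # -> 0
```

With a 10x10 target size every image becomes a point in a 100-dimensional space, so "nearest" simply means the training images whose pixels differ least from the query.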

Or you can look for features like moments, centroid, area, perimeter, Euler number etc.
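
As a rough sketch of what such a feature vector could look like, the function below computes area, centroid, a simple perimeter estimate and the Euler number for a binary image stored as a list of lists of 0/1. The name shape_features and the toy glyphs are assumptions for illustration. The perimeter here is just the count of foreground pixels that touch background (one of several possible definitions), and the Euler number uses the bit-quad counting formula for 8-connectivity.

```python
def shape_features(img):
    """Return [area, centroid_row, centroid_col, perimeter, euler_number]."""
    h, w = len(img), len(img[0])
    on = [(r, c) for r in range(h) for c in range(w) if img[r][c]]
    area = len(on)
    if area == 0:
        raise ValueError("image has no foreground pixels")

    # Centroid: mean row and column of the foreground pixels.
    cy = sum(r for r, _ in on) / area
    cx = sum(c for _, c in on) / area

    def px(r, c):
        """Pixel value, treating everything outside the image as background."""
        return img[r][c] if 0 <= r < h and 0 <= c < w else 0

    # Perimeter estimate: foreground pixels with a background 4-neighbour.
    perimeter = sum(
        1 for r, c in on
        if not all(px(r + dr, c + dc)
                   for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))))

    # Euler number (components minus holes, 8-connectivity) via bit quads:
    # count 2x2 windows with one pixel set (q1), three set (q3),
    # or two set on a diagonal (qd), over the zero-padded image.
    q1 = q3 = qd = 0
    for r in range(-1, h):
        for c in range(-1, w):
            quad = (px(r, c), px(r, c + 1), px(r + 1, c), px(r + 1, c + 1))
            n = sum(quad)
            if n == 1:
                q1 += 1
            elif n == 3:
                q3 += 1
            elif n == 2 and quad[0] == quad[3]:
                qd += 1
    euler = (q1 - q3 - 2 * qd) // 4

    return [area, cy, cx, perimeter, euler]

# Toy 5x5 glyphs (illustrative only).
ZERO = [[0, 1, 1, 1, 0],
        [1, 0, 0, 0, 1],
        [1, 0, 0, 0, 1],
        [1, 0, 0, 0, 1],
        [0, 1, 1, 1, 0]]
ONE = [[0, 0, 1, 0, 0],
       [0, 1, 1, 0, 0],
       [0, 0, 1, 0, 0],
       [0, 0, 1, 0, 0],
       [0, 1, 1, 1, 0]]

print(shape_features(ZERO))  # -> [12, 2.0, 2.0, 12, 0]
print(shape_features(ONE))   # -> [8, 2.375, 1.875, 8, 1]
```

The Euler number alone already separates digits with holes (0, 6, 9 have one, 8 has two) from those without. Since these features live on very different scales, it is usually worth normalising each one before measuring distances in kNN.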

If your image is grayscale, you can go for Histogram of Oriented Gradients (HOG). Here is an example with SVM. You can try adapting it to kNN: http://docs.opencv.org/trunk/doc/py_tutorials/py_ml/py_svm/py_svm_opencv/py_svm_opencv.html#svm-opencv
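
To give an idea of what a HOG descriptor contains, here is a deliberately simplified pure-Python sketch. It is not the code from the OpenCV tutorial and leaves out parts of full HOG such as overlapping block normalisation and bin interpolation. The function name hog_features, the cell size of 4 and the 9 unsigned orientation bins are assumptions chosen for illustration.

```python
import math

def hog_features(img, cell=4, bins=9):
    """Simplified HOG: one magnitude-weighted orientation histogram per cell,
    concatenated and L2-normalised. img is a 2D list of grayscale values."""
    h, w = len(img), len(img[0])

    def px(r, c):
        """Pixel value with border replication for out-of-range coordinates."""
        return img[min(max(r, 0), h - 1)][min(max(c, 0), w - 1)]

    feats = []
    for r0 in range(0, h - cell + 1, cell):
        for c0 in range(0, w - cell + 1, cell):
            hist = [0.0] * bins
            for r in range(r0, r0 + cell):
                for c in range(c0, c0 + cell):
                    # Central-difference gradients.
                    gx = px(r, c + 1) - px(r, c - 1)
                    gy = px(r + 1, c) - px(r - 1, c)
                    mag = math.hypot(gx, gy)
                    # Unsigned orientation in [0, 180) degrees.
                    angle = math.degrees(math.atan2(gy, gx)) % 180.0
                    hist[int(angle / (180.0 / bins)) % bins] += mag
            feats.extend(hist)

    # L2-normalise the whole descriptor (guard against an all-zero image).
    norm = math.sqrt(sum(v * v for v in feats)) or 1.0
    return [v / norm for v in feats]

# Toy 8x8 grayscale image with a vertical edge: left half dark, right half bright.
edge = [[0] * 4 + [255] * 4 for _ in range(8)]
features = hog_features(edge)
print(len(features))  # -> 36  (2x2 cells, 9 bins each)
```

The resulting list can be fed to kNN exactly like the pixel vector: compute it for every training image and for each unknown image, then compare distances.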

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow