Question

I'm working on a C# OCR program (project for my own learning purposes, nothing commercial-quality) that will recognize Hebrew characters. I plan to do this by separating the glyphs from the images and then applying template matching methods.

Where I'm at

I've got it now so that I can separate individual glyphs out of images. Each glyph is represented with a 2D array of pixels. For instance, the character "bet" looks something like:

..........
.*******..
.......*..
.......*..
.********.
..........

where "." represents an empty space and "*" represents a filled-in pixel.

I'm now to the point where I'm going to apply a template matching algorithm to identify what glyph this 2D array of pixels represents (in this case, it should match the "bet" template).

The issue

I'm having trouble finding a simple explanation of a good template matching algorithm (most of what I find are theses or links to code libraries), and was wondering if someone knew of any I might study.

I'd like to emphasize that I want to do this by hand and not simply use a library. I am willing to study how a library solves the problem, however, if it's not split into fifteen bajillion different pieces. :)

I'd also be willing to hear if there are any better methods for doing what I'm trying to do.

Solution

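Generate a number for each template. Since a template is an array of pixels, you can associate each pixel position with a distinct power of two (1, 2, 4, 8, 16, etc.), treat an empty pixel as 0 and a filled pixel as 1, and add up the values of the filled pixels.

Then calculate the same total for each glyph and compare it against the template totals. Two arrays produce the same total only when every pixel agrees, so this finds exact matches only.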
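
The number-per-template idea above could be sketched in C# roughly as follows. This is only an illustration under some assumptions: the names `GlyphMatcher`, `Parse`, `Encode` and `Match` are made up for this example, every glyph and template is assumed to have been scaled to the same width and height, and `BigInteger` is used because a glyph with more than 64 pixels will not fit in a `ulong`.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

// Hypothetical helper class, named for this example only.
public static class GlyphMatcher
{
    // Parses rows of '.' (empty) and '*' (filled) into a 2D pixel array.
    // Assumes all rows have the same length.
    public static bool[,] Parse(params string[] rows)
    {
        var pixels = new bool[rows.Length, rows[0].Length];
        for (int y = 0; y < rows.Length; y++)
            for (int x = 0; x < rows[y].Length; x++)
                pixels[y, x] = rows[y][x] == '*';
        return pixels;
    }

    // Adds a distinct power of two (1, 2, 4, 8, ...) for every filled pixel,
    // walking the array row by row.
    public static BigInteger Encode(bool[,] pixels)
    {
        BigInteger total = BigInteger.Zero;
        BigInteger weight = BigInteger.One;
        for (int y = 0; y < pixels.GetLength(0); y++)
        {
            for (int x = 0; x < pixels.GetLength(1); x++)
            {
                if (pixels[y, x]) total += weight;
                weight *= 2;
            }
        }
        return total;
    }

    // Returns the name of the template whose number equals the glyph's
    // number, or null when no template matches exactly.
    public static string Match(bool[,] glyph, IDictionary<string, BigInteger> templates)
    {
        BigInteger code = Encode(glyph);
        foreach (var pair in templates)
        {
            if (pair.Value == code) return pair.Key;
        }
        return null;
    }
}
```

To use it, encode each template once (for example `templates["bet"] = GlyphMatcher.Encode(betPixels)`) and then call `GlyphMatcher.Match(glyph, templates)` for every glyph cut out of the image. Because a single differing pixel changes the number, a scanned glyph with any noise will return null here.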

Licensed under: CC-BY-SA with attribution