Question

I have extracted features from many images of isolated characters (such as gradient, neighbouring pixel weight and geometric properties). How can I use HMMs as a classifier trained on this data? All the literature I have read about HMMs refers to states and state transitions, but I can't connect that to features and class labelling. The example on JAHMM's home page doesn't relate to my problem. I need to use HMMs not because they will work better than other approaches for this problem, but because of constraints on the project topic.

There was an answer to this question for online recognition, but I want the same for offline recognition and in a little more detail.

EDIT: I partitioned each character into a grid with a fixed number of squares. Now I am planning to perform feature extraction on each grid block and thus obtain a sequence of features for each sample by moving from left to right and top to bottom.

  1. Would this represent an adequate "sequence" for an HMM, i.e. would an HMM be able to guess the temporal variation of the data, even though the character is not drawn from left to right and top to bottom? If not, please suggest an alternative.

  2. Should I feed in a lot of features or start with a few? How do I know whether the HMM is underperforming or the features are bad? I am using JAHMM.

  3. Extracting stroke features is difficult. Am I right that they can't be logically combined with grid features (since an HMM expects a sequence generated by some random process)?

Solution

I've usually seen neural networks used for this sort of recognition task, e.g. here, here, here, and here. Since a simple Google search turns up so many hits for neural networks in OCR, I'll assume you are set on using HMMs (a project limitation, correct?). Regardless, these links can offer some insight into gridding the image and obtaining image features.

Your approach for turning a grid into a sequence of observations is reasonable. In this case, be sure you do not confuse observations and states. The features you extract from one block should be collected into one observation, i.e. a feature vector. (In comparison to speech recognition, your block's feature vector is analogous to the feature vector associated with a speech phoneme.) You don't really have much information regarding the underlying states. This is the hidden aspect of HMMs, and the training process should inform the model how likely one state is to follow another for a character (the transition probabilities) and how likely each feature vector is to be observed in a given state (the emission probabilities).
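To make the link between states, observations and class labels concrete, here is a minimal sketch in plain Java. It assumes the usual one-model-per-class setup, which I haven't spelled out above: train one HMM per character, score an unknown sequence against every model with the forward algorithm, and pick the class whose model gives the highest likelihood. For brevity it also assumes each block's feature vector has already been quantized to a symbol index. The `Model` class and all the numbers are made up for illustration; this is not JAHMM's API.

```java
/** Sketch: classify a symbol sequence by scoring it against one discrete HMM per class. */
final class HmmClassifierSketch {

    /** Parameters of one discrete HMM (hypothetical container, not a JAHMM type). */
    static final class Model {
        final double[] pi;   // pi[i]: probability of starting in hidden state i
        final double[][] a;  // a[i][j]: probability of moving from state i to state j
        final double[][] b;  // b[i][k]: probability of observing symbol k in state i

        Model(double[] pi, double[][] a, double[][] b) {
            this.pi = pi;
            this.a = a;
            this.b = b;
        }
    }

    /** Scaled forward algorithm: returns log P(obs | model). */
    static double logLikelihood(Model m, int[] obs) {
        int n = m.pi.length;
        double[] alpha = new double[n];
        double logProb = 0.0;
        for (int t = 0; t < obs.length; t++) {
            double[] next = new double[n];
            double scale = 0.0;
            for (int j = 0; j < n; j++) {
                double reach = 0.0; // probability mass arriving in state j
                if (t == 0) {
                    reach = m.pi[j];
                } else {
                    for (int i = 0; i < n; i++) {
                        reach += alpha[i] * m.a[i][j];
                    }
                }
                next[j] = reach * m.b[j][obs[t]];
                scale += next[j];
            }
            if (scale == 0.0) {
                return Double.NEGATIVE_INFINITY; // sequence is impossible under this model
            }
            for (int j = 0; j < n; j++) {
                next[j] /= scale; // renormalise to avoid underflow on long sequences
            }
            logProb += Math.log(scale);
            alpha = next;
        }
        return logProb;
    }

    /** Returns the index of the class model that gives the sequence the highest likelihood. */
    static int classify(Model[] classModels, int[] obs) {
        int best = 0;
        double bestScore = logLikelihood(classModels[0], obs);
        for (int c = 1; c < classModels.length; c++) {
            double score = logLikelihood(classModels[c], obs);
            if (score > bestScore) {
                best = c;
                bestScore = score;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] sticky = {{0.9, 0.1}, {0.1, 0.9}};
        // Illustrative parameters only: class 0 mostly emits symbol 0, class 1 mostly symbol 1.
        Model class0 = new Model(new double[] {0.5, 0.5}, sticky,
                new double[][] {{0.8, 0.2}, {0.7, 0.3}});
        Model class1 = new Model(new double[] {0.5, 0.5}, sticky,
                new double[][] {{0.2, 0.8}, {0.3, 0.7}});
        int[] sequence = {0, 0, 1, 0};
        System.out.println(classify(new Model[] {class0, class1}, sequence)); // prints 0
    }
}
```

In a real project the model parameters would come from training (e.g. Baum-Welch on each character's sequences) rather than being typed in by hand, and continuous feature vectors would need a continuous emission model instead of the symbol table `b`.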

Since this is an off-line process, don't be concerned with the temporal aspects of how characters are actually drawn. For the purposes of your task, you've imposed a temporal order on the sequence of observations with your left-to-right, top-to-bottom block sequence. This should work fine.
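As a rough illustration of that imposed ordering, the sketch below walks a binary character image block by block, left to right and then top to bottom, and emits one feature vector per block. It is only a sketch under assumptions: the image is a 0/1 array whose dimensions divide evenly by the grid size, and the single feature is ink density, standing in for whatever gradient or geometric features you actually extract. The class and method names are hypothetical.

```java
/** Sketch: turn a binary character image into a row-major sequence of block feature vectors. */
final class GridSequence {

    /**
     * Splits the image into rows x cols blocks and returns one feature vector per block,
     * ordered left to right, then top to bottom.
     * Assumptions: pixels are 0 (background) or 1 (ink), and the image height and width
     * are exact multiples of rows and cols.
     */
    static double[][] toObservationSequence(int[][] image, int rows, int cols) {
        int blockH = image.length / rows;
        int blockW = image[0].length / cols;
        double[][] sequence = new double[rows * cols][];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                int ink = 0;
                for (int y = r * blockH; y < (r + 1) * blockH; y++) {
                    for (int x = c * blockW; x < (c + 1) * blockW; x++) {
                        ink += image[y][x];
                    }
                }
                double density = (double) ink / (blockH * blockW);
                // One observation per block; further features would be appended to this vector.
                sequence[r * cols + c] = new double[] {density};
            }
        }
        return sequence;
    }

    public static void main(String[] args) {
        int[][] image = {
            {1, 1, 0, 0},
            {1, 1, 0, 0},
            {0, 0, 0, 1},
            {0, 0, 1, 1},
        };
        for (double[] observation : toObservationSequence(image, 2, 2)) {
            System.out.println(observation[0]); // prints 1.0, 0.0, 0.0, 0.75 on separate lines
        }
    }
}
```

Every sample then yields a sequence of the same length (rows times cols), and each element of the sequence is one observation for the HMM.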

As for HMM performance: choose a reasonable vector of salient features. In speech recognition, the dimensionality of a feature vector can be high (>10). (This is also where the cited literature can assist.) Set aside a percentage of the training data so that you can properly test the model. First, train the model, and then evaluate the model on the training dataset. How well does it classify your characters? If it does poorly, re-evaluate the feature vector. If it does well on the training data, test the generality of the classifier by running it on the reserved test data.
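The hold-out procedure can be sketched as follows, assuming each sample is referred to by its index and class labels are integers. The split fraction, the seed and the class name are illustrative choices of mine, not anything prescribed by JAHMM.

```java
/** Sketch: reserve part of the data for testing and measure classification accuracy. */
final class HoldOutEvaluation {

    /** Returns {trainIndices, testIndices} after a seeded Fisher-Yates shuffle of 0..n-1. */
    static int[][] split(int n, double testFraction, long seed) {
        int[] order = new int[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        java.util.Random rng = new java.util.Random(seed);
        for (int i = n - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1);
            int tmp = order[i];
            order[i] = order[j];
            order[j] = tmp;
        }
        int testCount = (int) Math.round(n * testFraction);
        int[] test = java.util.Arrays.copyOfRange(order, 0, testCount);
        int[] train = java.util.Arrays.copyOfRange(order, testCount, n);
        return new int[][] {train, test};
    }

    /** Fraction of samples whose predicted label matches the true label. */
    static double accuracy(int[] predicted, int[] actual) {
        int correct = 0;
        for (int i = 0; i < actual.length; i++) {
            if (predicted[i] == actual[i]) {
                correct++;
            }
        }
        return (double) correct / actual.length;
    }

    public static void main(String[] args) {
        int[][] parts = split(10, 0.2, 42L);
        System.out.println(parts[0].length + " training samples, " + parts[1].length + " held out");
        // Made-up labels: three of four predictions are right.
        System.out.println(accuracy(new int[] {0, 1, 1, 0}, new int[] {0, 1, 0, 0})); // prints 0.75
    }
}
```

Low accuracy on the training indices points at the features (or the model size) rather than at generalisation. High training accuracy with much lower accuracy on the held-out indices suggests the model has overfit.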

As for the number of states, I would start with a heuristically derived number. Assuming your character images are scaled and normalized, perhaps something like 40%(?) of the blocks are occupied? This is a crude guess on my part, since a source image was not provided. For an 8x8 grid (64 blocks), this would imply that about 25 blocks are occupied. We could then start with 25 states, but that's probably naive: empty blocks can convey information (meaning the number of states might increase), but some feature sets may be observed in similar states (meaning the number of states might decrease). If it were me, I would probably pick something like 20 states. Having said that: be careful not to confuse features and states. Your feature vector is a representation of things observed in a particular state. If the tests described above show your model is performing poorly, tweak the number of states up or down and try again.
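For reference, the arithmetic behind that guess is 0.40 × 64 = 25.6. If you would rather measure the occupancy than guess it, something like the sketch below could give a starting point. It assumes you have a per-block ink density for every training sample and treats a block as occupied when its density exceeds a threshold. The threshold value and all names are hypothetical, and the result is only an initial number to tune from.

```java
/** Sketch: derive a starting state count from the average number of occupied blocks. */
final class StateCountHeuristic {

    /** Counts the blocks whose ink density exceeds the threshold. */
    static int occupiedBlocks(double[] blockDensities, double threshold) {
        int count = 0;
        for (double density : blockDensities) {
            if (density > threshold) {
                count++;
            }
        }
        return count;
    }

    /** Mean occupied-block count over all samples, rounded, and never less than one state. */
    static int initialStateCount(double[][] samples, double threshold) {
        long total = 0;
        for (double[] sample : samples) {
            total += occupiedBlocks(sample, threshold);
        }
        long mean = Math.round((double) total / samples.length);
        return (int) Math.max(1L, mean);
    }

    public static void main(String[] args) {
        // Two made-up samples on a 2x2 grid: 3 and 1 occupied blocks respectively.
        double[][] samples = {
            {1.0, 0.6, 0.4, 0.0},
            {0.9, 0.0, 0.0, 0.0},
        };
        System.out.println(initialStateCount(samples, 0.1)); // prints 2
    }
}
```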

Good luck.

Licensed under: CC-BY-SA with attribution