I have worked on a similar dynamic hand gesture recognition project (although using a simple webcam rather than a Kinect). In my case, I categorized my gestures into classes such as Left, Right, Circular-Clockwise, Circular-AntiClockwise, etc. Since you are taking the angles between consecutive points into account, those angles form your observation sequence. As for the states, there need not be a direct logical relation between your states and your observations. I was working with 8 gestures. I had about 12 observation symbols for each input pattern, but the number of states differed from class to class, for example:

Left: 2 states
Right: 3 states
Circle-Clockwise: 4 states

and so on.
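As a minimal sketch of how such an angle-based observation sequence might be extracted (the function name, the 2-D centroid input, and the choice of 12 equal angular bins are my own assumptions, mirroring my setup rather than any standard API):

```python
import numpy as np

def quantize_angles(points, n_symbols=12):
    """Turn a trajectory of (x, y) positions into discrete observation symbols.

    points: (N, 2) array-like of hand positions, one per frame (assumed input)
    n_symbols: number of equal angular bins around the circle
    Returns an (N-1,) array of symbol indices in [0, n_symbols).
    """
    pts = np.asarray(points, dtype=float)
    d = np.diff(pts, axis=0)                   # displacement between consecutive frames
    angles = np.arctan2(d[:, 1], d[:, 0])      # direction of motion, in [-pi, pi)
    # Map each angle onto one of n_symbols equal bins around the circle.
    symbols = ((angles + np.pi) / (2 * np.pi) * n_symbols).astype(int) % n_symbols
    return symbols
```

For instance, a rightward step followed by an upward step yields two different symbols, which is exactly the kind of discrete sequence a discrete-emission HMM expects.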
The advantage was that, from the state sequence output by the Viterbi algorithm, I could directly read off the maximum state number and hence my class. Also, during the learning phase, my Baum-Welch implementation learnt the classes automatically, depending on the number of states. You could refer to my blog post [which has a description of my approach to recognizing gestures using HMMs in that project] for additional information. I hope it helps.
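For reference, the Viterbi step can be sketched as follows. This is a minimal log-space NumPy implementation for a discrete-observation HMM; the function name and parameter layout are my own and not tied to any particular library:

```python
import numpy as np

def viterbi(obs, log_pi, log_A, log_B):
    """Most likely state path for a discrete-observation HMM (log-space).

    obs:    sequence of observation symbol indices, length T
    log_pi: (S,)   log initial-state probabilities
    log_A:  (S, S) log transition probabilities, A[i, j] = P(state j | state i)
    log_B:  (S, K) log emission probabilities,  B[i, k] = P(symbol k | state i)
    Returns (path, log_prob) for the best state sequence.
    """
    T, S = len(obs), len(log_pi)
    delta = np.empty((T, S))            # best log-prob of any path ending in each state
    psi = np.zeros((T, S), dtype=int)   # backpointers
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A   # scores[i, j]: come from i, land in j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]
    # Backtrack from the best final state.
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return path, delta[-1].max()
```

To classify, you would typically run the observation sequence through each class's trained HMM and pick the class whose model yields the highest log-probability; with a topology like mine, you can instead read the class off the maximum state index in the returned path.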