Question

I have an application that decides whether a human is hand-waving, running, or walking. The idea is that I segment each action, say a handwave, into its poses.

Example:

for human1: pose7-pose3-pose7-..... represents handwave
for human3: pose1-pose7-pose1-..... represents handwave
for human7: pose1-pose1-pose7-..... represents handwave
for human20: pose3-pose7-pose7-..... represents handwave

for human1: pose11-pose33-pose77-..... represents walking
for human2: pose31-pose33-pose77-..... represents walking
for human3: pose11-pose77-pose77-..... represents walking
for human20: pose11-pose33-pose11-..... represents walking

I used the vectors above to train an SVM and a neural network in MATLAB.

Now I test it with test images. Again, I have segmented the poses for the test images.

In MATLAB, the SVM and the neural network require the training and test vectors to have the same size. To make it work:
If I append 0 (think of it as pose0, an invalid pose) until the sizes are equal, I get really good performance.
If I copy the initial poses from the beginning and append them to the end until the sizes are equal, performance decreases.

For example:

train set: pose1-pose2-pose4-pose7-pose2-pose4-pose7
(1st method)test set: pose3-pose1-pose4-0-0-0-0 or
(2nd method)test set: pose3-pose1-pose4-pose3-pose1-pose4-pose3

I would expect better classification with the 2nd method, since the appended values are actual pose values, whereas pose0 is not a real pose.

Do you have any ideas? Regards


Solution

In your case, the data consists of a collection of instances, each with a number of features (pose slots, as in PoseSlot1, PoseSlot2, ..., PoseSlotN) and a class value (hand-waving, running, or walking).

Your problem is that the number of features is not the same for every instance, e.g. a running sequence might have 7 poses while a walking sequence has only 3.

The standard way of dealing with this sort of issue is to mark the empty slots as missing values, assuming that your machine learning algorithm can handle missing values.

f1    f2    f3    f4    f5    f6    f7    class
-----------------------------------------------
pose1,pose2,pose4,pose7,pose2,pose4,pose7,running
pose3,pose1,pose4,    ?,    ?,    ?,    ?,walking

Now, the first method you used, appending pose0, is a simplification of using ? for missing values: it adds a new pose to denote a missing value instead of an explicit ? marker.
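As a rough illustration of the two encodings, here is a minimal Python sketch (not the asker's MATLAB code). It assumes poses are stored as integer ids; `pad_sequence` and `PAD_POSE` are hypothetical names introduced only for this example, and `math.nan` stands in for the ? marker.

```python
import math

PAD_POSE = 0  # hypothetical id for "pose0", i.e. an empty slot


def pad_sequence(poses, length, fill):
    """Right-pad a list of pose ids with `fill` until it has `length` slots."""
    if len(poses) > length:
        raise ValueError("sequence is longer than the target length")
    return list(poses) + [fill] * (length - len(poses))


train = [1, 2, 4, 7, 2, 4, 7]  # pose1-pose2-pose4-pose7-pose2-pose4-pose7
test = [3, 1, 4]               # pose3-pose1-pose4

# Explicit missing-value marker (the "?" in the table above)
with_missing = pad_sequence(test, len(train), math.nan)

# pose0 padding (the first method from the question)
with_pose0 = pad_sequence(test, len(train), PAD_POSE)

print(with_pose0)  # [3, 1, 4, 0, 0, 0, 0]
```

Whether the NaN variant is usable depends on the learner: it only helps if the algorithm (or a preprocessing step) actually handles missing values.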

The other approach, repeating values, actually creates a problem rather than solving one if you think about it: you are in effect creating correlated features, and, as you know, most machine learning algorithms work best on an independent set of features (correlated features are usually dealt with by performing feature selection as a pre-processing step).
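To make the correlation concrete, here is a small Python sketch of the repeat-from-the-start padding. The three-pose sequences are made up for illustration, and `pad_cyclic` is a hypothetical helper, not part of any library.

```python
from itertools import cycle, islice


def pad_cyclic(poses, length):
    """Repeat the sequence from its start until it fills `length` slots."""
    return list(islice(cycle(poses), length))


# Hypothetical 3-pose sequences, padded to 7 slots (the second method)
short_sequences = [[3, 1, 4], [7, 3, 7], [1, 7, 1]]
padded = [pad_cyclic(s, 7) for s in short_sequences]

# For every 3-pose sequence, slot k+3 is an exact copy of slot k,
# so slots 4 to 7 add no information beyond slots 1 to 3.
duplicated = all(row[k] == row[k + 3] for row in padded for k in range(4))

print(padded[0])   # [3, 1, 4, 3, 1, 4, 3]
print(duplicated)  # True
```

For sequences of the same original length, the padded columns are therefore perfectly correlated with the leading ones.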

OTHER TIPS

I don't think it's unreasonable to get better performance from your first method. I assume that by better performance you mean better classification. The reason, I presume, is that the handwaving sequences are normally shorter. Thus, when you fill with "invalid" poses, you make it much easier to distinguish the different actions by whether they include invalid poses than by which actual poses they include.
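A toy Python sketch of this effect, under the assumption that handwave sequences are shorter than walking ones. The data below is invented purely for illustration (it is not the asker's data or a measured result), and `classify_by_padding` is a hypothetical rule that ignores the actual poses entirely.

```python
PAD_POSE = 0  # hypothetical id for the invalid "pose0"


def count_padding(features):
    """Count how many slots hold the padding pose."""
    return sum(1 for p in features if p == PAD_POSE)


def classify_by_padding(features):
    """Label a sample using only the presence of padding, not the poses."""
    return "handwave" if count_padding(features) > 0 else "walking"


# Invented samples: short handwaves padded with pose0, full-length walking
samples = [
    ([7, 3, 7, 0, 0, 0, 0], "handwave"),
    ([1, 7, 1, 0, 0, 0, 0], "handwave"),
    ([11, 33, 77, 11, 33, 77, 11], "walking"),
    ([31, 33, 77, 33, 11, 77, 77], "walking"),
]

correct = sum(classify_by_padding(f) == label for f, label in samples)
accuracy = correct / len(samples)
print(accuracy)  # 1.0 on this toy data
```

If something like this is happening, the classifier is effectively learning sequence length, which may or may not generalise to test sequences of other lengths.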

Licensed under: CC-BY-SA with attribution