Question

I want to detect objects using the method proposed by Navneet Dalal, Bill Triggs, and Cordelia Schmid in 2006 ("Human Detection Using Oriented Histograms of Flow and Appearance").

They first compute an appearance descriptor using the HOG algorithm and a motion descriptor using optical flow. As I understand it, they then combine these two descriptors into the final descriptor, but I couldn't find how they combine them.

So my question is: how do I combine the appearance and motion descriptors to get the final descriptor? (I'm going to use a linear SVM for training and OpenCV for the implementation.)

Solution

It is mentioned in the paper, page 12:

The combined-feature detectors above are monolithic – they concatenate the motion and appearance features into a single large feature vector and train a combined classifier on it.
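A minimal sketch of what that concatenation could look like, assuming you already have one appearance descriptor and one motion descriptor per detection window. The function names and the toy numbers are made up for illustration, and plain Python lists are used so the snippet runs without dependencies; with NumPy arrays the same step would typically be `np.concatenate`.

```python
def combine_descriptors(appearance, motion):
    """Return one feature vector: appearance values followed by motion values."""
    return list(appearance) + list(motion)


def build_training_set(samples):
    """samples: iterable of (appearance, motion, label) tuples, one per window."""
    features, labels = [], []
    for appearance, motion, label in samples:
        features.append(combine_descriptors(appearance, motion))
        labels.append(label)
    # A linear SVM needs every row to have the same length.
    if len({len(row) for row in features}) > 1:
        raise ValueError("all combined descriptors must have the same length")
    return features, labels


# Toy values standing in for real HOG and flow descriptors (hypothetical data).
samples = [
    ([0.2, 0.5, 0.1, 0.7], [0.3, 0.0, 0.9], 1),   # window containing a person
    ([0.6, 0.1, 0.4, 0.0], [0.0, 0.1, 0.0], -1),  # background window
]
features, labels = build_training_set(samples)
print(len(features[0]))  # 7 = 4 appearance values + 3 motion values
```

Each row of `features` is then one training sample for the single combined classifier, with the matching entry of `labels` as its class.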

So you simply build one feature vector by concatenating the two descriptors. The other possibility the paper mentions is a Mixture of Experts:

In our experiments these effects mitigate the losses due to separate training and the linear Mixture of Experts classifier actually performs slightly better than the best monolithic detector. For now the differences are marginal (less than 1%), but the Mixture of Experts architecture provides more flexibility and may ultimately be preferable. The component classifiers could also be combined in a more sophisticated way, for example using a rejection cascade [1, 22, 21] to improve the runtime.

You can read about this method, for example, here.
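For the Mixture of Experts route, here is a rough sketch under one plausible reading of the quote: two linear classifiers are trained separately (one on appearance, one on motion), and a second linear stage combines their scores. All weights below are invented placeholders, and the names `linear_score`, `mixture_score`, `experts` and `gate` are illustrative only; the paper's exact formulation may differ.

```python
def linear_score(weights, bias, features):
    """Decision value of a linear classifier: w . x + b."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def mixture_score(appearance, motion, experts, gate):
    """Linearly combine the scores of two separately trained experts."""
    app = experts["appearance"]
    mot = experts["motion"]
    s_app = linear_score(app["w"], app["b"], appearance)
    s_mot = linear_score(mot["w"], mot["b"], motion)
    return gate["w_app"] * s_app + gate["w_mot"] * s_mot + gate["b"]


# Hypothetical parameters; in practice each set would come from training.
experts = {
    "appearance": {"w": [1.0, -0.5, 0.25, 0.5], "b": -0.25},
    "motion": {"w": [0.5, 0.0, 1.0], "b": -0.5},
}
gate = {"w_app": 0.5, "w_mot": 0.5, "b": 0.0}

score = mixture_score([0.2, 0.5, 0.1, 0.7], [0.3, 0.0, 0.9], experts, gate)
is_person = score > 0  # positive score means "person" in this sketch
```

Unlike the monolithic detector, the two descriptors never share a feature vector here; only their scalar scores are combined.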
