What you could do is add a processing step to find the locally strongest response from the SVM. Let me explain.
What you appear to be doing right now:
for each sliding window W, record category[W] = SVM.hardDecision(W)
Hard decision means it returns a boolean or integer; for 2-category classification it could be written like this:
hardDecision(W) = bool( softDecision(W) > 0 )
Since you mentioned OpenCV: in CvSVM::predict you should set returnDFVal to true:
returnDFVal – Specifies a type of the return value. If true and the problem is 2-class classification then the method returns the decision function value that is signed distance to the margin, else the function returns a class label (classification) or estimated function value (regression).
from the documentation.
What you could do is:
- for each sliding window W, record score[W] = SVM.softDecision(W)
- for each W, compute and record:
neighbors = max(score[W_left], score[W_right], score[W_up], score[W_bottom])
local[W] = score[W] > neighbors
powerful[W] = score[W] > threshold
- for each W, you have a positive if local[W] && powerful[W]
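The steps above could be sketched roughly like this in Python. This is only an illustration, not OpenCV code: it assumes the soft scores have already been collected into a 2-D grid indexed by window position, and the function name local_maxima_detections and the sample values are made up for the example.

```python
def local_maxima_detections(score, threshold=0.0):
    """Return (row, col) grid positions that are strong local maxima.

    score: 2-D list of soft SVM decision values, one per sliding window,
           indexed by the window's row/column position in the scan grid
           (an assumed layout, for illustration only).
    """
    rows, cols = len(score), len(score[0])
    detections = []
    for r in range(rows):
        for c in range(cols):
            # Left, right, up and bottom neighbours that exist in the grid.
            neighbors = [
                score[nr][nc]
                for nr, nc in ((r, c - 1), (r, c + 1), (r - 1, c), (r + 1, c))
                if 0 <= nr < rows and 0 <= nc < cols
            ]
            local = all(score[r][c] > n for n in neighbors)
            powerful = score[r][c] > threshold
            if local and powerful:
                detections.append((r, c))
    return detections


# Made-up scores: several windows around the centre respond positively,
# but only the centre one is a local maximum.
scores = [
    [-1.0, 0.2, -0.5],
    [0.3, 1.5, 0.4],
    [-0.8, 0.1, -1.2],
]
print(local_maxima_detections(scores))  # [(1, 1)]
```

Border windows are simply compared against the neighbours they have. If you also scan over several scales, you would probably want to include the neighbouring scales in the comparison as well.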
Since your classifier will have a positive response for windows close (in space and/or appearance) to your true positive, the idea is to record the scores for each window, and then only keep positives that
- are a local maximum of the score (greater than their neighbors) --> local
- are strong enough --> powerful
You could start with threshold set to 0 (the same boundary the hard decision uses) and adjust it until you get satisfactory results. Or you could calibrate it automatically using your training set.
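One possible way to do the automatic calibration, as a sketch only: try each training score as a candidate threshold and keep the one with the best F1 on labelled training windows. The choice of F1 as the criterion and the name calibrate_threshold are my assumptions; any other criterion (a target false positive rate, for instance) could be swapped in.

```python
def calibrate_threshold(scores, labels):
    """Pick the score threshold with the best F1 on labelled training windows.

    scores: soft decision values of training windows.
    labels: True where the window is a real positive.
    A window counts as predicted positive when score > threshold.
    """
    best_threshold, best_f1 = 0.0, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s > t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s > t and not y)
        fn = sum(1 for s, y in zip(scores, labels) if s <= t and y)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:
            best_threshold, best_f1 = t, f1
    return best_threshold


# Made-up training scores and labels, for illustration only.
train_scores = [-1.2, -0.3, 0.4, 0.9, 1.7]
train_labels = [False, False, False, True, True]
print(calibrate_threshold(train_scores, train_labels))  # 0.4
```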