Question

In their paper describing the Viola-Jones object detection framework (Robust Real-Time Face Detection by Viola and Jones), the authors state:

All example sub-windows used for training were variance normalized to minimize the effect of different lighting conditions.

My question is "What kind of tool did they use to normalize the images?"

I'm NOT looking for the specific tool that Viola & Jones used, but for a similar one that produces almost the same output. I've been following a lot of Haar-training tutorials (trying to detect a hand) but have not yet been able to produce a good detector (XML).

I've tried contacting the authors, but have received no response yet.


Solution

One possible way is plain standardization: treat the pixel values of each sub-window as if they were normally distributed, and rescale them to zero mean and unit variance.

First find the average (Mu) and standard deviation (S):

Mu = 1/N * Sum(a[i][j]) for each i,j
S  = sqrt(1/(N-1) * Sum((a[i][j] - Mu)^2)) for each i,j
     (here N is the number of pixels in the sub-window, 24*24 in the Viola-Jones paper)

From this, we can normalize the value of each pixel using the standard-score formula (that is, by standardizing all values):

a'[i][j] = (a[i][j] - Mu) / S
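
The formulas above can be sketched in plain Python as follows. This is only an illustration, not the tool Viola and Jones used: the function name variance_normalize, the list-of-lists image representation, and the choice to return zeros for a perfectly flat window are all assumptions made for this example.

```python
import math

def variance_normalize(window):
    """Standardize a 2-D list of pixel values: subtract Mu, divide by S."""
    pixels = [p for row in window for p in row]
    n = len(pixels)
    mu = sum(pixels) / n
    # Sample standard deviation (divides by N - 1), as in the formula above.
    s = math.sqrt(sum((p - mu) ** 2 for p in pixels) / (n - 1))
    if s == 0:
        # Assumption: a flat window has no contrast, so return all zeros.
        return [[0.0 for _ in row] for row in window]
    return [[(p - mu) / s for p in row] for row in window]

# A tiny 2x2 "window" and a brighter, higher-contrast copy of it.
window = [[10, 20], [30, 40]]
brighter = [[2 * p + 5 for p in row] for row in window]

normalized = variance_normalize(window)
normalized_brighter = variance_normalize(brighter)
```

Both normalized and normalized_brighter should come out (numerically) the same, which is the point of the step: a uniform change in brightness or contrast no longer affects the values the detector sees.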

Another method is vector normalization, which treats the window as one long vector and scales it to unit length:

  • Find the length of the vector: |a| = sqrt(sum (a[i][j]*a[i][j])) for each i,j
  • Assign: a'[i][j] = a[i][j] / |a|
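
The two steps in the list above could look like this in plain Python. Again this is a sketch under assumptions: the name l2_normalize and the handling of an all-zero window are made up for the example.

```python
import math

def l2_normalize(window):
    """Scale a 2-D list of pixel values so the whole window has unit length."""
    # Length of the window viewed as one long vector.
    norm = math.sqrt(sum(p * p for row in window for p in row))
    if norm == 0:
        # Assumption: an all-zero window is returned unchanged (as zeros).
        return [[0.0 for _ in row] for row in window]
    return [[p / norm for p in row] for row in window]

unit = l2_normalize([[3, 4], [0, 0]])
print(unit)  # [[0.6, 0.8], [0.0, 0.0]]
```

Note that this removes a multiplicative change in lighting (a gain) but not an additive one (an offset), because the mean is never subtracted. The standardization shown earlier handles both.
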
Licensed under: CC-BY-SA with attribution