So the numbers from the feature vector should be no larger than 0.2 value.
No. The paper says that SIFT descriptors are:
- normalized (with the L2 norm),
- truncated using 0.2 as a threshold (i.e. each normalized component larger than 0.2 is clamped to 0.2),
- normalized again.

So in theory any SIFT descriptor component lies in [0, 1], even though in practice the effective range observed is smaller (see below).
The question is: how have these values been converted into a Mat object? They are converted from floating-point values to unsigned chars.
Here's the relevant section of the calcSIFTDescriptor method from OpenCV's modules/nonfree/src/sift.cpp:
```cpp
// first pass: compute the L2 norm of the raw descriptor
float nrm2 = 0;
len = d*d*n;
for( k = 0; k < len; k++ )
    nrm2 += dst[k]*dst[k];

// truncate components above the threshold and re-accumulate the norm
float thr = std::sqrt(nrm2)*SIFT_DESCR_MAG_THR;
for( i = 0, nrm2 = 0; i < k; i++ )
{
    float val = std::min(dst[i], thr);
    dst[i] = val;
    nrm2 += val*val;
}

// renormalize and quantize to unsigned char
nrm2 = SIFT_INT_DESCR_FCTR/std::max(std::sqrt(nrm2), FLT_EPSILON);
for( k = 0; k < len; k++ )
{
    dst[k] = saturate_cast<uchar>(dst[k]*nrm2);
}
```
With:

```cpp
static const float SIFT_INT_DESCR_FCTR = 512.f;
```
This is because classical SIFT implementations quantize the normalized floating-point values into unsigned char integers through a multiplying factor of 512, which amounts to assuming that any SIFT component varies within [0, 1/2], and thus avoids losing precision by trying to encode the full [0, 1] range.