So the numbers from the feature vector should be no larger than 0.2 value.
No. The paper says that SIFT descriptors are:
- normalized (with the L2 norm),
- truncated using 0.2 as a threshold (i.e. each normalized component larger than 0.2 is clamped to 0.2),
- normalized again.

So in theory any SIFT descriptor component lies in [0, 1], even though in practice the effective range observed is smaller (see below).
The question is: how have these values been converted into a Mat object? They are converted from floating-point values to unsigned chars.
Here's the relevant section of the calcSIFTDescriptor method from OpenCV's modules/nonfree/src/sift.cpp:
```cpp
// first pass: compute the L2 norm of the raw descriptor
float nrm2 = 0;
len = d*d*n;
for( k = 0; k < len; k++ )
    nrm2 += dst[k]*dst[k];

// truncate components above the threshold and re-accumulate the norm
float thr = std::sqrt(nrm2)*SIFT_DESCR_MAG_THR;
for( i = 0, nrm2 = 0; i < k; i++ )
{
    float val = std::min(dst[i], thr);
    dst[i] = val;
    nrm2 += val*val;
}

// renormalize and quantize to unsigned char
nrm2 = SIFT_INT_DESCR_FCTR/std::max(std::sqrt(nrm2), FLT_EPSILON);
for( k = 0; k < len; k++ )
{
    dst[k] = saturate_cast<uchar>(dst[k]*nrm2);
}
```
With:

```cpp
static const float SIFT_INT_DESCR_FCTR = 512.f;
```
This is because classical SIFT implementations quantize the normalized floating-point values into unsigned char integers through a multiplying factor of 512, which amounts to assuming that any SIFT component varies within [0, 1/2], and thus avoids losing precision by trying to encode the full [0, 1] range.