Question

I have the following function in my program:

template <typename _Tp>
void lbp::OLBP_(const Mat& src, Mat& dst) {
    assert(src.rows > 3);
    dst = Mat::zeros(src.rows-2, src.cols-2, CV_8UC1);

    _Tp
        *row_m1,
        *row = (_Tp*)src.ptr<_Tp>(0),
        *row_p1 = (_Tp*)src.ptr<_Tp>(1);

    for(int i=1; i<src.rows-1; i++) {
        unsigned char *dst_row = dst.ptr<unsigned char>(i-1);
        row_m1 = row;
        row = row_p1;
        row_p1 = (_Tp*)src.ptr<_Tp>(i+1);

        for(int j=1;j<src.cols-1;j++) {
            _Tp center = row[j];
            unsigned char code = 0;
            code |= (row_m1 [j-1] > center) << 7;
            code |= (row_m1 [j]   > center) << 6;
            code |= (row_m1 [j+1] > center) << 5;
            code |= (row    [j+1] > center) << 4;
            code |= (row_p1 [j+1] > center) << 3;
            code |= (row_p1 [j]   > center) << 2;
            code |= (row_p1 [j-1] > center) << 1;
            code |= (row    [j-1] > center) << 0;
            dst_row[j-1] = code;
        }
    }
}

Basically, for each non-border pixel in src it generates a code in dst. That code is an 8-bit number in which each bit corresponds to one of the pixel's 8 neighbors: 1 if the neighbor is larger than the center pixel and 0 otherwise (see Local Binary Patterns on Wikipedia).
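To make the bit layout concrete, here is a minimal sketch that computes the code for a single 3x3 patch. The helper name lbp_code_3x3 and the flat 9-element patch layout are assumptions made for illustration; they are not part of the function above. The bit order mirrors it: the top-left neighbor is bit 7, then clockwise, ending with the left neighbor at bit 0.

```cpp
#include <cstdint>

// Hypothetical helper (illustration only): LBP code of the center pixel of a
// 3x3 patch stored row by row in p[0..8], so p[4] is the center.
// Bit order follows the question's function: top-left = bit 7, then
// clockwise around the center, left neighbor = bit 0.
inline std::uint8_t lbp_code_3x3(const std::uint8_t p[9]) {
    const std::uint8_t center = p[4];
    unsigned code = 0;
    code |= static_cast<unsigned>(p[0] > center) << 7; // top-left
    code |= static_cast<unsigned>(p[1] > center) << 6; // top
    code |= static_cast<unsigned>(p[2] > center) << 5; // top-right
    code |= static_cast<unsigned>(p[5] > center) << 4; // right
    code |= static_cast<unsigned>(p[8] > center) << 3; // bottom-right
    code |= static_cast<unsigned>(p[7] > center) << 2; // bottom
    code |= static_cast<unsigned>(p[6] > center) << 1; // bottom-left
    code |= static_cast<unsigned>(p[3] > center) << 0; // left
    return static_cast<std::uint8_t>(code);
}
```

For the patch {6, 5, 2, 7, 6, 1, 9, 8, 7} the center is 6, and only the bottom-right, bottom, bottom-left and left neighbors are larger, so bits 3 to 0 are set and the code is 00001111 in binary (15).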

The code works fine, and it's going to be called a very large number of times. Running a profiler, I found that most of the execution time is spent in this function, particularly on the lines where I access the pixel values:

//The code is templated but it should compile to something similar to this
unsigned char *pointer = src.ptr<unsigned char>(row); // THIS CONSUMES MOST OF THE TIME!
pointer[column]; // get the actual pixel value

Is there any room for improvement in the way pixels are accessed? I'm particularly interested in scenarios like this one, where the function needs random access to the image rather than sequential access.


Solution

In fact, yes. You can avoid calling Mat::ptr when writing data into the dst image:

...
for(int i=1; i<src.rows-1; i++) {
  unsigned char *dst_row = dst.ptr<unsigned char>(i-1); // <-- THIS CAN BE AVOIDED
...

Simply initialize a pointer to dst's data buffer right after creating it, and then fill the buffer without calling ptr:

dst = Mat::zeros(src.rows-2, src.cols-2, CV_8UC1);
unsigned char *dst_data = dst.data;
int pix_ctr = 0;

// .. and then, inside the `for` loop, after calculating the LBP code.

dst_data[pix_ctr++] = code;

This is possible only if the data stored in dst is contiguous (here you know it is, because you just created it with Mat::zeros) and because the loops visit the output pixels in row-major order. You can check contiguity by calling dst.isContinuous().

(In fact, you could get rid of all the Mat::ptr calls inside the loop if the input data in src is contiguous as well. In that case, advance your pixel pointers yourself, using the image dimensions, instead of calling ptr.)
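As a rough sketch of that idea, the version below works on a raw, contiguous 8-bit buffer and derives every row pointer from the image width instead of calling ptr. It is not the original templated function: the name lbp_contiguous is hypothetical, the std::vector return value stands in for dst, and the code assumes a single-channel 8-bit image with no padding between rows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sketch only: `lbp_contiguous` is a hypothetical name. `src` is assumed to
// point at rows * cols tightly packed 8-bit pixels (no row padding), which is
// what a continuous single-channel 8-bit image would provide.
std::vector<std::uint8_t> lbp_contiguous(const std::uint8_t* src, int rows, int cols) {
    assert(rows >= 3 && cols >= 3);
    std::vector<std::uint8_t> dst(
        static_cast<std::size_t>(rows - 2) * static_cast<std::size_t>(cols - 2));
    std::uint8_t* out = dst.data(); // running output pointer, never re-derived

    for (int i = 1; i < rows - 1; ++i) {
        // Row pointers come from plain pointer arithmetic on the row stride.
        const std::uint8_t* row    = src + static_cast<std::size_t>(i) * cols;
        const std::uint8_t* row_m1 = row - cols;
        const std::uint8_t* row_p1 = row + cols;

        for (int j = 1; j < cols - 1; ++j) {
            const std::uint8_t center = row[j];
            unsigned code = 0;
            code |= static_cast<unsigned>(row_m1[j - 1] > center) << 7;
            code |= static_cast<unsigned>(row_m1[j]     > center) << 6;
            code |= static_cast<unsigned>(row_m1[j + 1] > center) << 5;
            code |= static_cast<unsigned>(row[j + 1]    > center) << 4;
            code |= static_cast<unsigned>(row_p1[j + 1] > center) << 3;
            code |= static_cast<unsigned>(row_p1[j]     > center) << 2;
            code |= static_cast<unsigned>(row_p1[j - 1] > center) << 1;
            code |= static_cast<unsigned>(row[j - 1]    > center) << 0;
            *out++ = static_cast<std::uint8_t>(code); // output is filled in row-major order
        }
    }
    return dst;
}
```

With OpenCV, one would presumably call this with src.data, src.rows and src.cols, after checking src.isContinuous() and that the type is CV_8UC1. If the image is not continuous (for example, a region of interest inside a larger Mat), the row stride would have to come from src.step rather than cols. Whether this actually beats the ptr-based loop is worth confirming with the profiler.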

Licensed under: CC-BY-SA with attribution