How do you marshal Python cv2.cv.LoadImage tostring data into C IplImage->imageData struct

https://stackoverflow.com/questions/22585184

19-06-2023

Question

The root of this question is: What is the bit-by-bit format of the OpenCV IplImage->imageData property?

Background: I'm using Python's ctypes to provide Pythonic access to a low-level C library that uses OpenCV. I've been able to make almost all of its functions accessible from Python, but I'm stuck on one that requires the data of the old OpenCV struct IplImage, specifically its imageData property. I can't figure out how IplImage->imageData is organized compared with the iplimage type returned by Python's cv2.cv.LoadImage, which ostensibly holds the same data as the C struct but appears to be organized differently.

So for example, I have a 4-pixel image that is 2x2 pixels. Top left pixel is 100% RED. Top right pixel is 100% GREEN. Bottom left pixel is 100% BLUE, Bottom right pixel is 100% white.

In python the information looks like this:

import cv2

img = cv2.cv.LoadImage('rgbw.png')
pixels = []
for ch in img.tostring():
  pixels.append(ord(ch))
print pixels

[0, 0, 255, 0, 255, 0, 255, 0, 0, 255, 255, 255]

This makes sense to me: the first three values [0, 0, 255] represent B:0, G:0, R:255, the red pixel. The second triple is the green pixel, the third is the lower-left blue pixel, and the last is the lower-right white pixel.
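
The grouping described above can be sketched as follows. This is a minimal illustration, assuming an 8-bit, 3-channel image stored in BGR order; the helper name `to_bgr_pixels` is hypothetical, and the byte list is copied from the printed output above.

```python
def to_bgr_pixels(flat, channels=3):
    """Group a flat byte sequence into per-pixel tuples (B, G, R)."""
    data = bytearray(flat)  # bytearray yields ints on both Python 2 and 3
    if len(data) % channels != 0:
        raise ValueError("byte count is not a multiple of the channel count")
    return [tuple(data[i:i + channels]) for i in range(0, len(data), channels)]


# Bytes printed for the 2x2 test image.
flat = [0, 0, 255, 0, 255, 0, 255, 0, 0, 255, 255, 255]
pixels = to_bgr_pixels(flat)
# pixels[0] is the red pixel (B=0, G=0, R=255); pixels[3] is the white pixel.
```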

I marshal this into the library and the library behaves fine, but it doesn't appear to "see" anything in the imageData: I get a return code that means "I saw nothing", even though the same image is clearly comprehensible when I pass it to the library using the C API directly.

So of course I suspect the C IplImage->imageData has the data organized completely differently. I look in the debugger and find, to my surprise, that not only is the data different, but I can't understand it. Here it is, after loading the file with cvLoadImage("rgbw.png") and assigning the result to an IplImage pointer called 'image':

Breakpoint 1, main (argc=2, argv=0x7fffffffe418) at IplImageInfo.cpp:44
44          printf("imageData %s\n", image->imageData);
(gdb) x/16ub image->imageData
0x618c90:   0   0   255 0   255 0   0   0
0x618c98:   255 0   0   255 255 255 0   0
(gdb)

So comparing it byte-by-byte, adding zeros for comparison's sake:

Python:

000 000 255 | 000 255 000 | 255 000 000 | 255 255 255

C: (printing the first 16 bytes, not 12, which is what I'd expect, see below)

000 000 255 | 000 255 000 | 000 000 255 | 000 000 255 | 255 255 000 | 000

Notice the first six bytes are the same in both. But then, what's going on? We have another TWO RED pixels, then ... a cyan pixel? Another thing: the pixel data should be 12 bytes in size (4 pixels, 3 bytes each), yet when I print out the image->imageSize property from C, I get 16, not 12. Clearly there's something wrong with my model of what's in imageData. Can you explain it?
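
One way to pin down where the two dumps part ways is to compare them programmatically. The sketch below is only a diagnostic aid; the helper name `first_mismatch` is hypothetical, and the two byte lists are copied from the Python output and the gdb dump above.

```python
def first_mismatch(a, b):
    """Return the first index where a and b differ, or None if the
    shorter sequence is a prefix of the longer one."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return None


python_bytes = [0, 0, 255, 0, 255, 0, 255, 0, 0, 255, 255, 255]
c_bytes = [0, 0, 255, 0, 255, 0, 0, 0, 255, 0, 0, 255, 255, 255, 0, 0]

idx = first_mismatch(python_bytes, c_bytes)    # index of first difference
extra = len(c_bytes) - len(python_bytes)       # surplus bytes on the C side
```

For these two dumps the first difference is at index 6, which is exactly the end of one 2-pixel row, and the C buffer holds 4 more bytes than the Python string.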

Solution

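The Python code I was using was missing a required step. That step isn't needed within the Python interface, and nothing in Python hints at how the C library handles it. Basically, IplImage pads each row of pixels in the imageData property with zero-valued bytes so that the number of bytes per row (the widthStep field) is divisible by 4. For the 2x2 image, each row holds 2 pixels x 3 bytes = 6 bytes, which is padded to 8, giving an imageSize of 16 rather than 12. (Mat, the C++ successor to the old IplImage struct, also tracks a per-row stride in its step field, although a Mat that allocates its own data is normally stored without row padding.) So the code I had, which was this: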

import cv2

img = cv2.cv.LoadImage('rgbw.png')
pixels = []
for ch in img.tostring():
  pixels.append(ord(ch))
print pixels

[0, 0, 255, 0, 255, 0, 255, 0, 0, 255, 255, 255]

was missing this logic. I solved it as follows:

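import cv2
from ctypes import c_ubyte

img = cv2.cv.LoadImage('rgbw.png')
height = img.height
# bytearray yields integer byte values, so no ord() calls are needed
raw_data = bytearray(img.tostring())

# IplImage->imageData requires each row to be padded with zero bytes at the end
# so that the row length in bytes is divisible by 4
row_bytes = len(raw_data) // height  # unpadded bytes per row (width * channels for an 8-bit image)
pad_bytes_per_row = (4 - row_bytes % 4) % 4
padded_row_bytes = row_bytes + pad_bytes_per_row

# create the ctypes array; ctypes zero-initializes it, so the padding bytes
# are already 0 and only the pixel bytes need to be copied, row by row
ubyte_array_type = c_ubyte * (height * padded_row_bytes)
ubyte_array = ubyte_array_type()
for row in range(height):
    src = row * row_bytes
    dst = row * padded_row_bytes
    ubyte_array[dst:dst + row_bytes] = raw_data[src:src + row_bytes]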

Now ubyte_array is populated with correctly padded data from OpenCV's Python API. Note that the same row padding applies if the data comes from a numpy array's tostring() method and the C side expects rows aligned to 4 bytes. Hope this helps someone.
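
The row-padding rule can also be written as a small standalone helper, which makes it easy to check against the gdb dump from the question. This is a sketch under stated assumptions rather than OpenCV's own code: the names `pad_rows` and `unpad_rows` are hypothetical, the 4-byte alignment is taken from the behaviour described above, and the input is assumed to be tightly packed rows of equal length.

```python
from ctypes import c_ubyte


def pad_rows(raw, height, align=4):
    """Return a bytearray in which each of `height` rows is zero-padded
    to a multiple of `align` bytes."""
    data = bytearray(raw)
    if height <= 0 or len(data) % height != 0:
        raise ValueError("data length must be a multiple of height")
    row_bytes = len(data) // height
    pad = (align - row_bytes % align) % align
    out = bytearray()
    for r in range(height):
        out += data[r * row_bytes:(r + 1) * row_bytes]
        out += bytearray(pad)  # bytearray(n) is n zero bytes
    return out


def unpad_rows(padded, height, row_bytes):
    """Inverse of pad_rows: keep the first `row_bytes` bytes of each row."""
    data = bytearray(padded)
    stride = len(data) // height
    out = bytearray()
    for r in range(height):
        out += data[r * stride:r * stride + row_bytes]
    return out


# The 2x2 test image: 12 tightly packed bytes in, 16 padded bytes out.
packed = [0, 0, 255, 0, 255, 0, 255, 0, 0, 255, 255, 255]
padded = pad_rows(packed, height=2)

# Copy into a ctypes array suitable for passing to a C function.
c_array = (c_ubyte * len(padded)).from_buffer_copy(padded)
```

For the test image this reproduces the 16 bytes shown by gdb, which matches the image->imageSize value of 16 reported from C.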

Licensed under: CC-BY-SA with attribution
