Create UIImage from Leptonica's Pix structure

https://stackoverflow.com/questions/9050010

20-04-2021

Question

I want to use Leptonica library in my iOS app to process images.

Does anybody know how I can create a UIImage from the raw data in Leptonica's Pix structure:

/*-------------------------------------------------------------------------*
 *                              Basic Pix                                  *
 *-------------------------------------------------------------------------*/
struct Pix
{
    l_uint32             w;           /* width in pixels                   */
    l_uint32             h;           /* height in pixels                  */
    l_uint32             d;           /* depth in bits                     */
    l_uint32             wpl;         /* 32-bit words/line                 */
    l_uint32             refcount;    /* reference count (1 if no clones)  */
    l_int32              xres;        /* image res (ppi) in x direction    */
                                      /* (use 0 if unknown)                */
    l_int32              yres;        /* image res (ppi) in y direction    */
                                      /* (use 0 if unknown)                */
    l_int32              informat;    /* input file format, IFF_*          */
    char                *text;        /* text string associated with pix   */
    struct PixColormap  *colormap;    /* colormap (may be null)            */
    l_uint32            *data;        /* the image data                    */
};
typedef struct Pix PIX;

Thanks!

Solution

First, you might want to check out: Convert Leptonica Pix Object to QPixmap ( or other image object )

What we want is to find common formats that both Pix and UIImage support, convert from Pix to that common format, and then convert from the common format to UIImage.

From looking at the Leptonica library, it looks like the common supported formats are GIF, JPEG, TIFF, BMP, and PNG. JPEG will be lossy, and GIF and PNG will both result in additional work by the CPU (there will be an additional encode/decode cycle when we convert from Pix to UIImage). For these reasons, I chose TIFF in the example below. If it doesn't work, I would go with PNG.

The plan is as follows:

1) Convert from Pix to a byte buffer
2) Take the byte buffer and store it into an NSData
3) Pass that data into UIImage

It looks like the pixWriteMem() function is what we need for #1 (provided that support for it was compiled into the library).

From looking at the example code included with the library, it looks like we are responsible for freeing the output of pixWriteMem() - hence, we will pass YES into NSData's freeWhenDone: argument.

Something like this (warning: untested code):

UIImage *GetImageFromPix(Pix *thePix)
{
    UIImage *result = nil;

    l_uint8 *bytes = NULL;
    size_t size = 0;

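    // pixWriteMem() returns 0 on success and hands us a buffer that we own.
    if (0 == pixWriteMem(&bytes, &size, thePix, IFF_TIFF)) {
        // freeWhenDone:YES makes NSData free() the buffer when it is deallocated.
        NSData *data = [[NSData alloc] initWithBytesNoCopy:bytes length:(NSUInteger)size freeWhenDone:YES];
        result = [UIImage imageWithData:data];
        [data release]; // Manual reference counting only; delete this line under ARC.
    }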

    return result;
}

OTHER TIPS

Writing out to an intermediate file format and reading it back in is a simple but inefficient way to convert from a Pix in-memory data structure to a UIImage (or to any other of the many in-memory image containers).

It is particularly inefficient computationally if the intermediate file representation is compressed (e.g., png), because the image data has to undergo compression before writing it out and decompression to an uncompressed raster after reading it back in.

The efficient method to convert a struct Pix to a struct X is to fill in the metadata fields in X (the image size, depth, resolution, text, etc), generate a colormap for struct X if the image is colormapped, and convert the image raster data from the Pix convention to the X convention. This last is the only tricky part, because you need to consider the following for each of the two in-memory raster representations:

(1) Padding for raster lines (Pix is padded to 4 bytes)
(2) Storage of multi-component pixels (Pix stores each component sequentially within each pixel)
(3) Size of 3-component pixels, such as rgb (Pix uses 4 bytes: rgba)
(4) Byte order for multi-byte pixels (Pix uses macros that determine the rgba byte order)
(5) Pixel order: for Pix, from left to right in the image, they are stored in order from the MSB to the LSB in each 32 bit word

A specification for struct Pix is given in the leptonica src file pix.h.
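
To make points (1) through (4) concrete, here is a minimal sketch in plain C of the raster step for a 32 bpp image. It does not call Leptonica. The function names (words_per_line, words_to_rgba) and the shift constants are illustrative assumptions; the shifts are chosen to mirror the layout described above, with red in the most significant byte of each 32-bit word. Check them against the macros in your copy of pix.h before relying on them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed 32 bpp layout: red in the most significant byte of each word,
 * then green, blue, and alpha in the least significant byte. */
enum { RED_SHIFT = 24, GREEN_SHIFT = 16, BLUE_SHIFT = 8, ALPHA_SHIFT = 0 };

/* Number of 32-bit words per raster line when each line is padded to a
 * 4-byte boundary (hypothetical helper, not a Leptonica function). */
static uint32_t words_per_line(uint32_t width, uint32_t depth)
{
    return (uint32_t)(((uint64_t)width * depth + 31) / 32);
}

/* Copy a 32 bpp word raster into tightly packed R,G,B,A bytes.
 * `wpl` is the source stride in words; `dst` must hold width * height * 4
 * bytes. Any padding words at the end of a source line are skipped. */
static void words_to_rgba(const uint32_t *src, uint32_t wpl,
                          uint32_t width, uint32_t height, uint8_t *dst)
{
    assert(wpl >= width); /* at 32 bpp every pixel occupies one word */

    for (uint32_t y = 0; y < height; y++) {
        const uint32_t *line = src + (size_t)y * wpl;
        for (uint32_t x = 0; x < width; x++) {
            uint32_t pixel = line[x];
            *dst++ = (uint8_t)(pixel >> RED_SHIFT);
            *dst++ = (uint8_t)(pixel >> GREEN_SHIFT);
            *dst++ = (uint8_t)(pixel >> BLUE_SHIFT);
            *dst++ = (uint8_t)(pixel >> ALPHA_SHIFT);
        }
    }
}
```

Because each sample is extracted with shifts rather than by reinterpreting the word pointer as bytes, the output order is R, G, B, A regardless of host endianness. The output buffer has no row padding, so its bytes-per-row value is simply width * 4.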

Here is an implementation (32 bpp -> UIImage):

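- (UIImage *)imageFromPix:(Pix *)pix
{
    // This method handles 32 bpp images only: four 8-bit samples per pixel.
    if (pix == NULL || pixGetDepth(pix) != 32) {
        return nil;
    }

    l_uint32 width = pixGetWidth(pix);
    l_uint32 height = pixGetHeight(pix);
    l_uint32 bitsPerPixel = 32;
    l_uint32 bitsPerComponent = 8;
    l_uint32 bytesPerRow = pixGetWpl(pix) * 4; // Pix lines are padded to whole 32-bit words

    // Copy the raster so the image stays valid after the Pix is destroyed.
    CFDataRef raster = CFDataCreate(NULL, (const UInt8 *)pixGetData(pix), (CFIndex)bytesPerRow * height);
    CGDataProviderRef provider = CGDataProviderCreateWithCFData(raster);
    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();

    // Pix stores each pixel as one 32-bit word with red in the most significant
    // byte, so read the words in host byte order and ignore the fourth sample.
    CGBitmapInfo bitmapInfo = kCGBitmapByteOrder32Host | (CGBitmapInfo)kCGImageAlphaNoneSkipLast;

    CGImageRef cgImage = CGImageCreate(width, height,
                                       bitsPerComponent, bitsPerPixel, bytesPerRow,
                                       colorSpace, bitmapInfo,
                                       provider, NULL, NO, kCGRenderingIntentDefault);

    CGDataProviderRelease(provider);
    CGColorSpaceRelease(colorSpace);
    CFRelease(raster);

    UIImage *image = [UIImage imageWithCGImage:cgImage];
    CGImageRelease(cgImage); // UIImage retains the CGImage; release our reference
    return image;
}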

If you want to convert a 1 bpp image (a thresholded one, for example):

- (UIImage *)imageFrom1bppPix:(Pix *)pix
{
    Pix *pix32 = pixUnpackBinary(pix, 32, 0);

    UIImage *image = [self imageFromPix:pix32];

    pixDestroy(&pix32);

    return image;
}
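
For readers curious what unpacking a 1 bpp raster involves, here is a rough sketch in plain C that relies on the MSB-to-LSB pixel order described earlier. It does not call Leptonica and is not a reimplementation of pixUnpackBinary(). The names get_bit and unpack_1bpp are hypothetical, and the on/off output values are left as parameters because which bit value means black depends on the convention of the image you are handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read pixel x from a 1 bpp raster line. Pixel 0 is the most significant
 * bit of word 0, pixel 31 its least significant bit, pixel 32 the most
 * significant bit of word 1, and so on. */
static unsigned get_bit(const uint32_t *line, uint32_t x)
{
    return (line[x >> 5] >> (31 - (x & 31))) & 1u;
}

/* Expand a 1 bpp raster to one byte per pixel, tightly packed.
 * Set bits become `on_value`, clear bits become `off_value`.
 * `wpl` is the source stride in 32-bit words; `dst` must hold
 * width * height bytes. */
static void unpack_1bpp(const uint32_t *src, uint32_t wpl,
                        uint32_t width, uint32_t height,
                        uint8_t on_value, uint8_t off_value, uint8_t *dst)
{
    assert((uint64_t)wpl * 32 >= width); /* each line must hold `width` bits */

    for (uint32_t y = 0; y < height; y++) {
        const uint32_t *line = src + (size_t)y * wpl;
        for (uint32_t x = 0; x < width; x++) {
            *dst++ = get_bit(line, x) ? on_value : off_value;
        }
    }
}
```

The resulting byte buffer could then be wrapped as an 8-bit grayscale image, with bytes-per-row equal to the width.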

There's an implementation for conversion between UIImage and Pix objects in the Tesseract-OCR-iOS repo.

See the Pix/UIImage conversion methods in G8Tesseract.m.

Licensed under: CC-BY-SA with attribution
