Question

I'm running into failures when scanning laser or ink-jet printed QR codes, which I'm guessing is due to a combination of less-than-solid ink/toner coverage and the image processing parameters I'm using in each of the two conversion steps that I have control over. I've come up with a "solution" of sorts, but I don't know how robust it is, and I'd like to understand both what's going on and whether there's a more reliable approach.

I don't have control over the PDF files I'll be receiving as input and I don't know of any tools to reliably let me examine images within the PDF samples I have, but I'm pretty sure the samples are grey level images of some sort. Here, for example, is a screen snapshot of a corner of one of the QR code cells using Acrobat Reader with its magnification cranked way up. I'm assuming that the fuzzy circles inside the black represent a single "light" pixel within the image.

enter image description here

To turn the PDF into a PNG file, I use Ghostscript with `-sDEVICE=pnggray`. After ZXing was unable to recognize the QR codes at an output resolution of 200 dpi, I changed the output resolution to 100 dpi and it succeeded. Presumably this is because each light pixel gets averaged with a greater number of surrounding black pixels, so the resulting level falls below the threshold that is subsequently used to turn the image into the monochrome image that ZXing works on.
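For reference, the conversion step described above can be sketched roughly as follows. This is only an illustration: the helper names `build_gs_command` and `render_pdf`, and the file names in the usage note, are hypothetical, and it assumes a `gs` executable is on the `PATH`. The flags themselves are standard Ghostscript options.

```python
import subprocess


def build_gs_command(pdf_path, png_path, dpi):
    """Build a Ghostscript command line that renders a PDF to a grayscale PNG.

    Hypothetical helper for illustration; only the resolution varies.
    """
    return [
        "gs",
        "-dNOPAUSE",                   # do not pause between pages
        "-dBATCH",                     # exit when the input is processed
        "-sDEVICE=pnggray",            # 8-bit grayscale PNG output device
        f"-r{dpi}",                    # output resolution in dpi
        f"-sOutputFile={png_path}",
        pdf_path,
    ]


def render_pdf(pdf_path, png_path, dpi):
    """Run Ghostscript; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_gs_command(pdf_path, png_path, dpi), check=True)
```

Calling `render_pdf("input.pdf", "qr_100.png", 100)` and `render_pdf("input.pdf", "qr_200.png", 200)` would produce the two resolutions compared in this question.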

The subsequent image processing steps are from the ZXing.rb Ruby gem and consist of calls to LuminanceSource and GlobalHistogramBinarizer from ZXing. The latter uses a single, global black point for binarization and suggests a HybridBinarizer for desktop applications, but since the images were failing on the online version of ZXing as well, I didn't bother experimenting with `HybridBinarizer`.

200 dpi (failing) image:

200 dpi image

100 dpi (succeeding) image:

enter image description here

Update: I tried scan converting the PDF at 600 dpi and that succeeded as well. Intrigued, I then took a picture of the printed image with my mobile phone and here's a blown up version of one of the modules in the resulting JPEG:

enter image description here

As expected, the mobile phone QR code app had no problem recognizing the image from just about any angle and distance.

Update 2: Here is a look at a couple of codes from the (succeeding) 600 dpi PNG image:

enter image description here

No correct solution

OTHER TIPS

The problem is really the difference between dithering and grayscale. The library has a deep assumption that the images are photo-like, and that black-white boundaries look like a continuous change from black to white through shades of gray. Images where the black-white border has been dithered into a checkerboard pattern of black and white to represent gray don't work well, since the pattern is interpreted as many tiny black-white transitions where there is really just one.

The best thing is to avoid the dithering stage wherever it occurs in the pipeline, but it may be out of your control. If so, then your best bet is down-sampling or applying a blur filter, which is what you've already seen. The 100 dpi image works because each speckled region collapses into a single gray level.
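The effect can be illustrated with a toy one-dimensional example. This is a simplified sketch, not ZXing's actual binarizer: the pixel values, the fixed threshold of 128, and the helper names are assumptions chosen for illustration. It counts how many black/white flips a single global threshold sees along one scan line, before and after averaging neighbouring pixels.

```python
def count_transitions(row, threshold=128):
    """Count black/white flips along a row after a global threshold."""
    bits = [1 if value < threshold else 0 for value in row]  # 1 = dark
    return sum(1 for a, b in zip(bits, bits[1:]) if a != b)


def box_downsample(row, factor):
    """Average each run of `factor` pixels into one output pixel."""
    usable = len(row) - len(row) % factor
    return [sum(row[i:i + factor]) // factor
            for i in range(0, usable, factor)]


# One scan line: a dark module with two stray light pixels,
# followed by a clean light module (0 = black, 255 = white).
dark_module = [0, 0, 255, 0, 0, 0, 255, 0]
light_module = [255] * 8
row = dark_module + light_module

speckled = count_transitions(row)                     # several spurious flips
smoothed = count_transitions(box_downsample(row, 4))  # a single real edge
print(speckled, smoothed)
```

In this toy row the speckled line shows five transitions while the averaged line shows one, which is the single module boundary the decoder needs. Halving the resolution in both dimensions averages four source pixels into one, which is roughly what the `factor` of 4 stands in for here.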

I don't think the binarizers will make much difference to this situation.

Licensed under: CC-BY-SA with attribution