Perceptual Image Downsampling

https://stackoverflow.com/questions/1780277

21-09-2019

Question

So here is my problem:

I have a large (high-resolution) image that needs to become small (much lower resolution).

So I do the naive thing (kill every other pixel) and the result looks poor.

So I try to do something more intelligent (low pass filtering using a Fourier transform and re-sampling in Fourier space) and the result is a little better but still fairly poor.

So my question: is there a perceptually motivated image downsampling algorithm (or implementation)?

edit: While I am aware of a number of resampling techniques, my application is more concerned with preserving the perceptual features, rather than producing smooth images.

edit2: It is safe to assume I have some level of familiarity with digital signal processing, convolutions, wavelet transforms, etc.

Solution

Bicubic interpolation is generally regarded as good enough, but there is no perfect solution: the best choice depends on the viewer and on the properties of the picture being resampled.

Related links:

I didn't even know that sharpness was also called acutance.

Aliasing is a problem that can occur when downsampling naively.
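
To make the aliasing point concrete, here is a minimal 1-D sketch (the helper name `tone` and the chosen frequencies are only illustrative assumptions, not something from the answer). A tone at 0.4 cycles per sample is below the Nyquist limit of 0.5, but after dropping every other sample it sits at 0.8 cycles per new sample, which is indistinguishable from a 0.2 tone:

```python
import math

def tone(freq, length):
    """Sample a cosine with `freq` cycles per sample at integer positions."""
    return [math.cos(2 * math.pi * freq * n) for n in range(length)]

original = tone(0.4, 40)    # representable: 0.4 is below the Nyquist limit 0.5
decimated = original[::2]   # naive decimation: keep every other sample, no filter
alias = tone(0.2, 20)       # the lower-frequency tone it now masquerades as

# Largest sample-wise difference between the decimated tone and its alias.
max_diff = max(abs(a - b) for a, b in zip(decimated, alias))
print(max_diff)             # only floating-point noise remains
```

In an image the same effect shows up as moire patterns and jagged edges, which is why a low-pass filter is normally applied before samples are discarded.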

OTHER TIPS

Read this:

http://www.dspguide.com/

OK, that's quite a read. But understanding filter design would be handy.

In general, the process for scaling an image from W1 x H1 to W2 x H2, where W1, W2, H1 and H2 are integers, is to find new W3, H3 so that W1 and W2 are integer factors of W3 and H1 and H2 are integer factors of H3, and then pad the original image with zeros (used to space out the pixels of the original image) so that it is now W3 x H3 in size. This introduces high frequencies due to the discontinuities in the image, so you apply a low-pass filter to the image and then decimate the filtered image to its new size (W2 x H2). It sounds like you might be trying to do this already, but the filtering can be done in the spatial domain (the image analogue of the time domain), so the Fourier transform isn't really necessary.

In practice, the process I just described is optimized. For example, when applying a convolution filter to the upscaled image, most of the terms will be 0, so you can avoid most of the multiplication operations in your algorithm. And since you end up throwing away many of the filtered results, you don't need to calculate those either, so you end up with basically a handful of multiplications and additions for each pixel in the target image. The trick is to figure out which coefficients to use.
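
The scheme described above can be sketched in one dimension as follows. This is only a rough illustration under stated assumptions: the function names, the Hamming-windowed sinc filter and the `support` parameter are my own choices, not what libswscale or any particular library does. It takes W3 as the least common multiple of the two lengths, never builds the zero-padded signal, and only evaluates the outputs that survive decimation. For an image you would run it over every row and then over every column.

```python
import math

def windowed_sinc(cutoff, half_width):
    """Hamming-windowed sinc low-pass taps; cutoff is in cycles per sample."""
    taps = []
    for n in range(-half_width, half_width + 1):
        x = 2 * cutoff * n
        sinc = 1.0 if n == 0 else math.sin(math.pi * x) / (math.pi * x)
        window = 0.54 + 0.46 * math.cos(math.pi * n / half_width)
        taps.append(2 * cutoff * sinc * window)
    return taps

def resample_1d(signal, new_len, support=4):
    """Resample `signal` to `new_len` samples via zero-stuff, filter, decimate."""
    g = math.gcd(len(signal), new_len)
    up, down = new_len // g, len(signal) // g   # W3 = len(signal) * up
    half = support * max(up, down)
    taps = windowed_sinc(0.5 / max(up, down), half)
    gain = up / sum(taps)                       # make up for the inserted zeros
    taps = [t * gain for t in taps]
    out = []
    for m in range(new_len):
        center = m * down                       # position on the W3 grid
        lo = -(-(center - half) // up)          # ceiling division
        hi = (center + half) // up
        acc = 0.0
        # Only grid positions i * up hold real samples; the zeros are skipped.
        for i in range(max(lo, 0), min(hi, len(signal) - 1) + 1):
            acc += signal[i] * taps[center - i * up + half]
        out.append(acc)
    return out

flat = resample_1d([10.0] * 30, 20)   # 30 -> 20 samples: up = 2, down = 3
```

Away from the borders a constant input stays very nearly constant. Near the borders the filter sees implicit zeros outside the signal, so a real implementation would need some edge handling (for example mirroring), which this sketch leaves out.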

libswscale in the ffmpeg project does something like this, I believe. Check it out:

http://gitorious.org/libswscale

As others pointed out (and you apparently noticed), decimating the image introduces aliasing artifacts. I can't be sure about your resampling implementation, but the technique has interesting gotchas depending on the window size you use and other implementation details.

Pascal is right. It depends on the image, and on what you want. Some factors:

preserving sharp edges
preserving colours
algorithm speed

This is your method.

Some others:

Note that sometimes resampling down can get you a sharper result than, say, using a lower resolution camera, because there will be edges in the high-resolution image that cannot be detected by a lower-res device.

Side note: Many algorithms (especially Nearest Neighbour) can be optimised if you are scaling down by an integer (e.g. dividing by 4 or 6).

Recommended ImageMagick "general purpose" downsampling methods are discussed here: http://www.imagemagick.org/Usage/filter/nicolas/#downsample

You could try a content aware resizing algorithm. See: http://www.seamcarving.com/

Paint Mono (an open-source fork of Paint.NET) implements a supersampling algorithm for image downsampling here: http://code.google.com/p/paint-mono/source/browse/trunk/src/PdnLib/Surface.cs?spec=svn59&r=59#1313
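
For a rough idea of what supersampling-style downscaling does, here is a minimal sketch that averages every factor x factor block of a grayscale image stored as a list of rows. It is not Paint Mono's code; the function name and the restriction to integer factors are assumptions made to keep the example short.

```python
def downsample_box(image, factor):
    """Shrink a 2-D grayscale image by averaging factor x factor pixel blocks."""
    # Ignore any leftover rows or columns that do not fill a whole block.
    height = len(image) - len(image) % factor
    width = len(image[0]) - len(image[0]) % factor
    result = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            block_sum = sum(
                image[y][x]
                for y in range(top, top + factor)
                for x in range(left, left + factor)
            )
            row.append(block_sum / (factor * factor))
        result.append(row)
    return result

# A 4 x 4 checkerboard of 0 and 255: the finest pattern the grid can hold.
checker = [[255 * ((x + y) % 2) for x in range(4)] for y in range(4)]
small = downsample_box(checker, 2)
print(small)   # every block holds two 0s and two 255s, so each output is 127.5
```

Dropping every other pixel of the same checkerboard would give a solid black image, whereas the block average gives mid-gray, which is closer to what the eye sees from a distance.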

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow