Question

I'm trying to develop an image-focusing algorithm for some test automation work. I've chosen to use AForge.NET, since it seems like a nice, mature, .NET-friendly framework.

Unfortunately, I can't seem to find information on building autofocus algorithms from scratch, so I've given it my best try:

Take an image. Apply a Sobel edge-detection filter, which generates a greyscale edge outline. Generate a histogram from it and save the standard deviation. Move the camera one step closer to the subject and take another picture. If the standard deviation is smaller than the previous one, we're getting more in focus; otherwise, we've passed the optimal distance for taking pictures.
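For reference, a minimal sketch of that metric with AForge.NET (SobelEdgeDetector and ImageStatistics are AForge classes; the grayscale conversion is my assumption, since the Sobel filter needs an 8bpp input):

```csharp
using System.Drawing;
using AForge.Imaging;
using AForge.Imaging.Filters;

// Focus metric as described above: Sobel edge image -> histogram std. dev.
static double FocusMeasure(Bitmap image)
{
    // Sobel works on 8bpp grayscale, so convert first.
    Bitmap gray = Grayscale.CommonAlgorithms.BT709.Apply(image);
    Bitmap edges = new SobelEdgeDetector().Apply(gray);

    // Standard deviation of the gray histogram of the edge image.
    ImageStatistics stats = new ImageStatistics(edges);
    double measure = stats.Gray.StdDev;

    gray.Dispose();
    edges.Dispose();
    return measure;
}
```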

is there a better way?

Update: there's a HUGE flaw in this, by the way. As I get past the optimal focus point, my "image in focus" value continues growing. You'd expect a parabola-ish curve if you plotted distance against focus value, but in reality you get something more logarithmic.

Update 2: Okay, so I went back to this, and the current method we're exploring is this: given a few known edges (I know exactly what the objects in the picture are), I do a manual pixel-intensity comparison across them. As the resulting graph gets steeper, I get more in focus. I'll post code once the core algorithm gets ported from MATLAB into C# (yeah, MATLAB.. :S)

Update 3: Yay, final update. I came back to this again. The final algorithm looks like this (a code sketch follows the steps below):

Step 1: get an image from the list of images (I took a hundred photos through the focus point).

Step 2: find an edge of the object I'm focusing on (in my case it's a rectangular object that's always in the same place, so I crop a HIGH and NARROW rectangle over one edge).

Step 3: get the HorizontalIntensityStatistics (an AForge.NET class) for that cropped image.

Step 4: get the histogram (gray, in my case).

Step 5: take the derivative of the values of the histogram.

Step 6: where the slope is largest is where you're at the most focused point.
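A rough C# sketch of those steps, using AForge.NET's Crop and HorizontalIntensityStatistics classes (the crop rectangle and the grayscale conversion are my assumptions based on the description above):

```csharp
using System;
using System.Drawing;
using AForge.Imaging;
using AForge.Imaging.Filters;

// Steps 2-6 above. 'edgeStrip' is the high, narrow rectangle over the
// known edge; its coordinates depend on your setup.
static double EdgeSlopeMeasure(Bitmap image, Rectangle edgeStrip)
{
    // Step 2: crop the strip containing the known edge.
    Bitmap cropped = new Crop(edgeStrip).Apply(image);
    Bitmap gray = Grayscale.CommonAlgorithms.BT709.Apply(cropped);

    // Steps 3-4: horizontal intensity profile (gray histogram).
    HorizontalIntensityStatistics his = new HorizontalIntensityStatistics(gray);
    int[] profile = his.Gray.Values;

    // Steps 5-6: finite-difference derivative; the steepest slope is
    // the focus measure, so the image with the largest value wins.
    double maxSlope = 0;
    for (int i = 1; i < profile.Length; i++)
        maxSlope = Math.Max(maxSlope, Math.Abs(profile[i] - profile[i - 1]));

    gray.Dispose();
    cropped.Dispose();
    return maxSlope;
}
```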


Solution

It may be a bit simplistic for your needs, but I've had good results with a simple algorithm that looks at the difference between neighbouring pixels. The sum of the differences between pixels two apart seems to be a reasonable measure of image contrast. I couldn't find the original paper by Brenner from the 1970s, but it is mentioned in http://www2.die.upm.es/im/papers/Autofocus.pdf
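A minimal sketch of that measure (a Brenner-style gradient, here summing squared differences between pixels two apart, as the measure is usually defined; GetPixel keeps it short, though real code would use LockBits for speed):

```csharp
using System.Drawing;

// Brenner-style focus measure: sum of squared differences between
// pixels two apart in x. Larger values indicate a sharper image.
static double BrennerMeasure(Bitmap image)
{
    double sum = 0;
    for (int y = 0; y < image.Height; y++)
    {
        for (int x = 0; x < image.Width - 2; x++)
        {
            // GetBrightness() gives a 0..1 luminance-like value.
            double d = image.GetPixel(x + 2, y).GetBrightness()
                     - image.GetPixel(x, y).GetBrightness();
            sum += d * d;
        }
    }
    return sum;
}
```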

Another issue is that when the image is extremely out of focus, there is very little focus information, so it's hard to tell which way 'moving closer' is, or to avoid a local maximum.

OTHER TIPS

You can have a look at the technique used in the NASA Curiosity Mars Rover.

The technique is described in this article

Edgett, K. S., et al., "Curiosity's Mars Hand Lens Imager (MAHLI) Investigation," Space Science Reviews, 2012, 170(1–4): 259–317.

which is available as a PDF here.

Quoting from the article:

7.2.2 Autofocus

Autofocus is anticipated to be the primary method by which MAHLI is focused on Mars. The autofocus command instructs the camera to move to a specified starting motor count position and collect an image, move a specified number of steps and collect another image, and keep doing so until reaching a commanded total number of images, each separated by a specified motor count increment. Each of these images is JPEG compressed (Joint Photographic Experts Group; see CCITT (1993)) with the same compression quality factor applied. The file size of each compressed image is a measure of scene detail, which is in turn a function of focus (an in-focus image shows more detail than a blurry, out of focus view of the same scene). As illustrated in Fig. 23, the camera determines the relationship between JPEG file size and motor count and fits a parabola to the three neighboring maximum file sizes. The vertex of the parabola provides an estimate of the best focus motor count position. Having made this determination, MAHLI moves the lens focus group to the best motor position and acquires an image; this image is stored, the earlier images used to determine the autofocus position are not saved.

Autofocus can be performed over the entire MAHLI field of view, or it can be performed on a sub-frame that corresponds to the portion of the scene that includes the object(s) to be studied. Depending on the nature of the subject and knowledge of the uncertainties in robotic arm positioning of MAHLI, users might elect to acquire a centered autofocus sub-frame or they might select an off-center autofocus sub-frame if positioning knowledge is sufficient to determine where the sub-frame should be located. Use of sub-frames to perform autofocus is highly recommended because this usually results in the subject being in better focus than is the case when autofocus is applied to the full CCD; further, the resulting motor count position from autofocus using a sub-frame usually results in a more accurate determination of working distance from pixel scale.

[Figure 23: Autofocus in the NASA Curiosity Mars Rover — JPEG file size vs. motor count, with the fitted parabola]
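The parabola-vertex step from the quoted passage is a standard three-point interpolation. A sketch, assuming equally spaced motor counts (the function name and parameters are mine, not from the paper):

```csharp
// Three-point parabolic interpolation for the best-focus position.
// m2 is the motor count with the largest JPEG size, at spacing 'step'
// from its neighbours; s1, s2, s3 are the three file sizes.
static double BestFocusMotorCount(double m2, double step,
                                  double s1, double s2, double s3)
{
    // Vertex of the parabola through (m2-step,s1), (m2,s2), (m2+step,s3).
    return m2 + 0.5 * step * (s1 - s3) / (s1 - 2 * s2 + s3);
}
```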

This idea was also suggested in this answer: https://stackoverflow.com/a/2173259/15485

This might be useful. It's how a camera's AF system actually works: Passive Autofocus.

Contrast measurement

Contrast measurement is achieved by measuring contrast within a sensor field, through the lens. The intensity difference between adjacent pixels of the sensor naturally increases with correct image focus. The optical system can thereby be adjusted until the maximum contrast is detected. In this method, AF does not involve actual distance measurement at all and is generally slower than phase detection systems, especially when operating under dim light. As it does not use a separate sensor, however, contrast-detect autofocus can be more flexible (as it is implemented in software) and potentially more accurate. This is a common method in video cameras and consumer-level digital cameras that lack shutters and reflex mirrors. Some DSLRs (including Olympus E-420, Panasonic L10, Nikon D90, Nikon D5000, Nikon D300 in Tripod Mode, Canon EOS 5D Mark II, Canon EOS 50D) use this method when focusing in their live-view modes. A new interchangeable-lens system, Micro Four Thirds, exclusively uses contrast measurement autofocus, and is said to offer performance comparable to phase detect systems.
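A sketch of that hill-climbing loop, with a hypothetical ICamera interface standing in for whatever drives your hardware (nothing here is an AForge or real camera-SDK API), and any scalar contrast measure plugged in:

```csharp
using System;
using System.Drawing;

// Hypothetical hardware interface, for illustration only.
interface ICamera
{
    Bitmap Capture();
    void StepFocus();   // move the lens one focus increment
}

// Contrast-detect loop as described above: sweep the focus range and
// remember the step with the maximum contrast score.
static int FindBestFocusStep(ICamera camera, int totalSteps,
                             Func<Bitmap, double> contrastMeasure)
{
    int bestStep = 0;
    double bestScore = double.MinValue;
    for (int step = 0; step < totalSteps; step++)
    {
        using (Bitmap frame = camera.Capture())
        {
            double score = contrastMeasure(frame);
            if (score > bestScore) { bestScore = score; bestStep = step; }
        }
        camera.StepFocus();
    }
    return bestStep;   // caller then drives the lens back to this step
}
```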

I haven't built one myself, but my first thought would be to do a 2D DFT on a portion of the image. When the image is out of focus, the high frequencies disappear.
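If you want to prototype that with AForge.NET, its ComplexImage class can do the 2D FFT. A sketch (the crop must be 8bpp grayscale with power-of-two dimensions, e.g. 256×256; the cutoff radius is an arbitrary tuning parameter of mine):

```csharp
using System.Drawing;
using AForge.Imaging;

// Ratio of high-frequency energy to total spectral energy; it drops
// as the image goes out of focus.
static double HighFrequencyRatio(Bitmap grayCrop, int cutoffRadius)
{
    ComplexImage ci = ComplexImage.FromBitmap(grayCrop);
    ci.ForwardFourierTransform();   // AForge centres the spectrum

    double high = 0, total = 0;
    int cx = ci.Width / 2, cy = ci.Height / 2;
    for (int y = 0; y < ci.Height; y++)
    {
        for (int x = 0; x < ci.Width; x++)
        {
            double mag = ci.Data[y, x].Magnitude;
            total += mag;
            // Distance from the centre is spatial frequency.
            int dx = x - cx, dy = y - cy;
            if (dx * dx + dy * dy > cutoffRadius * cutoffRadius)
                high += mag;
        }
    }
    return high / total;   // larger => more high-frequency detail
}
```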

For a lazy prototype, you could try compressing a region of the image as JPEG (at high quality) and looking at the output stream size. A big file means a lot of detail, which in turn implies the image is in focus. Beware that the camera should not be too noisy, and of course you can't compare file sizes across different scenes.
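A sketch of that heuristic with the standard System.Drawing JPEG encoder (the quality factor of 90 is an arbitrary choice; the key is that it stays fixed across frames):

```csharp
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using System.Linq;

// Compress a region at a fixed quality and use the stream length as
// the focus measure: bigger file => more detail => better focus.
static long JpegFocusMeasure(Bitmap region)
{
    ImageCodecInfo jpegCodec = ImageCodecInfo.GetImageEncoders()
        .First(c => c.FormatID == ImageFormat.Jpeg.Guid);
    using (var ms = new MemoryStream())
    using (var ep = new EncoderParameters(1))
    {
        ep.Param[0] = new EncoderParameter(Encoder.Quality, 90L);
        region.Save(ms, jpegCodec, ep);
        return ms.Length;
    }
}
```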

While Sobel is a decent choice, I would probably do an edge-magnitude calculation on the projections in the x and y directions over several small representative regions. Another .NET-friendly choice, based on OpenCV, is Emgu CV: http://www.emgu.com/wiki/index.php/Main_Page.

I wonder if the standard deviation is the best choice: if the image gets sharper, the Sobel-filtered image will contain brighter pixels at the edges, but at the same time fewer bright pixels, because the edges are getting thinner. Maybe you could try using the average of the 1% highest pixel values in the Sobel image?
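A sketch of that idea using AForge's ImageStatistics histogram (the function and its 1% default are illustrative; it expects the 8bpp output of the Sobel filter):

```csharp
using System;
using System.Drawing;
using AForge.Imaging;

// Average of the brightest 'fraction' of pixels in the edge image,
// read off its gray histogram from the top down.
static double TopPercentileMeasure(Bitmap sobelImage, double fraction = 0.01)
{
    ImageStatistics stats = new ImageStatistics(sobelImage);
    int[] hist = stats.Gray.Values;
    long target = (long)(stats.PixelsCount * fraction);

    long count = 0, sum = 0;
    for (int level = 255; level >= 0 && count < target; level--)
    {
        long take = Math.Min(hist[level], target - count);
        count += take;
        sum += take * level;
    }
    return count > 0 ? (double)sum / count : 0;
}
```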

Another flavour of focus metric might be:

Grab several images and average them (noise reduction). Then FFT the averaged image and use the ratio of high-frequency to low-frequency energy: the higher this ratio, the better the focus. A MATLAB demo is available (excluding the averaging stage) within the demos of the toolbox :)
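A quick sketch of the averaging stage (per-pixel mean of N same-sized frames; GetPixel/SetPixel keep it short, LockBits would be faster), whose output can feed an FFT-based measure like the one sketched earlier:

```csharp
using System.Collections.Generic;
using System.Drawing;

// Average several frames of the same scene to reduce sensor noise
// before computing a frequency-domain focus measure.
static Bitmap AverageFrames(IList<Bitmap> frames)
{
    int w = frames[0].Width, h = frames[0].Height;
    var result = new Bitmap(w, h);
    for (int y = 0; y < h; y++)
    {
        for (int x = 0; x < w; x++)
        {
            double sum = 0;
            foreach (Bitmap f in frames)
                sum += f.GetPixel(x, y).GetBrightness();   // 0..1
            int v = (int)(255 * sum / frames.Count);
            result.SetPixel(x, y, Color.FromArgb(v, v, v));
        }
    }
    return result;
}
```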

Licensed under: CC-BY-SA with attribution