Question

I wrote my own Gaussian filter, but it is really slow.

OpenCV's Gaussian blur is much faster, about 20 times faster than my filter. I want to reimplement OpenCV's Gaussian algorithm in my project, but I don't want to include OpenCV itself.

However, OpenCV's source code seems too hard to understand.

Can anyone give me a description of the algorithm?


Solution

The Gaussian filter has a property that makes it very easy to speed up: it is separable, so it can be applied in each dimension independently. You define a one-dimensional filter that operates vertically, and another that operates horizontally, and apply them both; this produces the same effect as a single filter applied in two dimensions, at a fraction of the cost.
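As a rough sketch of that idea (not OpenCV's actual code), assuming a single-channel float image stored row-major and a precomputed, normalized 1-D kernel; the function name and the clamped border handling are my own choices:

#include <algorithm>
#include <vector>

// Apply a 1-D kernel horizontally, then vertically. Together this equals a
// full 2-D Gaussian convolution, but costs 2*K multiply-adds per pixel for a
// kernel of length K instead of K*K.
void separableGaussianBlur(const std::vector<float>& src, std::vector<float>& dst,
                           int width, int height, const std::vector<float>& kernel)
{
    const int radius = static_cast<int>(kernel.size()) / 2;
    std::vector<float> tmp(src.size());
    dst.resize(src.size());

    // Horizontal pass: convolve each row with the 1-D kernel (src -> tmp).
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float sum = 0.0f;
            for (int k = -radius; k <= radius; ++k) {
                int xx = std::min(std::max(x + k, 0), width - 1);  // clamp at borders
                sum += src[y * width + xx] * kernel[k + radius];
            }
            tmp[y * width + x] = sum;
        }
    }

    // Vertical pass: convolve each column with the same kernel (tmp -> dst).
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float sum = 0.0f;
            for (int k = -radius; k <= radius; ++k) {
                int yy = std::min(std::max(y + k, 0), height - 1);  // clamp at borders
                sum += tmp[yy * width + x] * kernel[k + radius];
            }
            dst[y * width + x] = sum;
        }
    }
}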

Beyond that, you'll probably need to look at the SIMD instructions (e.g. SSE3) available for your processor.
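For illustration only, here is a very rough sketch of how the inner loop of the horizontal pass could be vectorized with SSE intrinsics, computing four output pixels at a time. This uses plain SSE rather than SSE3, ignores the border columns, and is in no way OpenCV's actual implementation:

#include <immintrin.h>

// Process 4 adjacent output pixels per iteration using 128-bit SSE registers.
// Assumes x stays far enough from the row ends that the unaligned loads never
// read outside the row; border pixels would need a scalar fallback.
void blurRowSSE(const float* srcRow, float* dstRow, int width,
                const float* kernel, int radius)
{
    int x = radius;
    for (; x + 3 < width - radius; x += 4) {
        __m128 acc = _mm_setzero_ps();
        for (int k = -radius; k <= radius; ++k) {
            __m128 pixels = _mm_loadu_ps(srcRow + x + k);     // 4 neighbouring pixels
            __m128 weight = _mm_set1_ps(kernel[k + radius]);  // broadcast one tap
            acc = _mm_add_ps(acc, _mm_mul_ps(pixels, weight));
        }
        _mm_storeu_ps(dstRow + x, acc);
    }
    // leftover pixels (and the borders) would be handled with plain scalar code
}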

OTHER TIPS

To answer the second part of your question: a Gaussian blur is simply a two-dimensional Gaussian function (a bell-shaped surface) applied as a convolution kernel over the image. Wikipedia has a great reference on the algorithm itself, but basically you take the values of a Gaussian curve, convert them into a square matrix of weights, and convolve that matrix with every single pixel in your image, e.g.:

Kernel:
[ 0  1  2  1  0
  1  4  6  4  1      X   Iterate over every single pixel in the image
  2  6 10  6  2
  1  4  6  4  1
  0  1  2  1  0 ]

(Note that this is just a sample kernel; the actual weights come from the Gaussian equation, so different choices of standard deviation and kernel size give different values.)
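As a hedged sketch of how such a kernel can be generated from the Gaussian equation G(x, y) = exp(-(x*x + y*y) / (2*sigma*sigma)) (the function name and the choice of radius are mine, not part of the original answer):

#include <cmath>
#include <vector>

// Sample the 2-D Gaussian on a (2*radius + 1) x (2*radius + 1) grid and
// normalize so the weights sum to 1. In practice a radius of about 3*sigma
// captures nearly all of the curve.
std::vector<std::vector<float>> makeGaussianKernel(int radius, float sigma)
{
    const int size = 2 * radius + 1;
    std::vector<std::vector<float>> kernel(size, std::vector<float>(size));
    float sum = 0.0f;
    for (int y = -radius; y <= radius; ++y) {
        for (int x = -radius; x <= radius; ++x) {
            float value = std::exp(-(x * x + y * y) / (2.0f * sigma * sigma));
            kernel[y + radius][x + radius] = value;
            sum += value;
        }
    }
    for (auto& row : kernel)
        for (float& v : row)
            v /= sum;  // normalize so the blur does not change overall brightness
    return kernel;
}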

To answer the performance part of your question: the overall speed of this algorithm depends on a few things. Let's say the image is N x M pixels and the convolution kernel is P x P pixels. You're going to have to do P * P * N * M operations, so the greater P is, the more operations you have to do for a given image. You can get crafty with the algorithm you use here, doing very specific row- or column-based math; the separable approach described above cuts the cost to roughly 2 * P * N * M.
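To make that operation count concrete, here is a rough sketch of direct (non-separable) convolution; the four nested loops are exactly the P * P * N * M multiply-adds described above (the names and the clamped border handling are mine, for illustration only):

#include <algorithm>
#include <vector>

void convolve2D(const std::vector<float>& src, std::vector<float>& dst,
                int width, int height,
                const std::vector<std::vector<float>>& kernel)
{
    const int radius = static_cast<int>(kernel.size()) / 2;
    dst.assign(src.size(), 0.0f);
    for (int y = 0; y < height; ++y)                        // N rows
        for (int x = 0; x < width; ++x)                     // M columns
            for (int ky = -radius; ky <= radius; ++ky)      // P kernel rows
                for (int kx = -radius; kx <= radius; ++kx)  // P kernel columns
                {
                    int yy = std::min(std::max(y + ky, 0), height - 1);
                    int xx = std::min(std::max(x + kx, 0), width - 1);
                    dst[y * width + x] +=
                        src[yy * width + xx] * kernel[ky + radius][kx + radius];
                }
}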

Implementation is also very important. If you want to be extremely efficient, you'll probably want to use the most advanced instructions your architecture offers. If you're using an Intel x86 chip, you'll probably want to look at getting a license for the Intel Performance Primitives (IPP) library and calling its routines directly. IIRC, OpenCV does make use of IPP when it's available...

You could also do something very smart and work entirely with scaled integers if the floating-point performance on your architecture is poor. This would probably speed things up a bit, but I would look at the other options first before going down this road.
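A hedged sketch of what that could look like for a single row, assuming 8-bit pixels and weights pre-scaled to sum to 256 so the final division becomes a bit shift (the specific weights are made up for illustration):

#include <cstdint>

// One horizontal pass using only integer math: the weights carry 8 fractional
// bits (they sum to 256), so the normalization is a right shift by 8.
void blurRowFixedPoint(const std::uint8_t* srcRow, std::uint8_t* dstRow, int width)
{
    static const int weights[5] = { 16, 64, 96, 64, 16 };  // illustrative, sums to 256
    const int radius = 2;
    for (int x = radius; x < width - radius; ++x) {
        int sum = 0;
        for (int k = -radius; k <= radius; ++k)
            sum += srcRow[x + k] * weights[k + radius];
        dstRow[x] = static_cast<std::uint8_t>(sum >> 8);    // divide by 256
    }
    // border pixels omitted for brevity
}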

Try checking here. You want to compute the discrete Gaussian matrix ahead of time, then convolve it with the image.

If your convolution kernel is relatively large and you are doing direct convolution, the performance difference may be because OpenCV implements the convolution using a fast Fourier transform (FFT).

I hate to be pedantic, but you are asking for an algorithm, that is, a precise sequence of steps needed to accomplish a task. You already have the Gaussian algorithm. So the key point of your question is that you are asking for something faster, which is not the same as asking for an algorithm.

To answer the "faster" question: you want to know how OpenCV optimizes its code, which is a highly technical and broad subject. I would hazard a guess that it uses assembly language and GPU-specific functions. I'd start by learning assembly and researching the CUDA toolkit to take advantage of your GPU.
