I am trying to use MATLAB to convolve an image with a Gaussian filter using two methods: separable convolution using the 1D FFT and non-separable convolution using the 2D FFT. I'm expecting the separable convolution to be faster. However, it is not for small images, but for larger ones where the 2D is faster. I'm not sure if it's a problem with my implementation or if it's because I don't have the concept quite right.
Here is my code:
img1 = randi([1,256],128,128);
% Create a Gaassian filter
rf1 = fspecial('gaussian', [1 128], 1.0);
cf1 = transpose(rf1);
gf1 = cf1 * rf1;
rc1 = round(conv2(img1, gf1, 'same'));
rc1 = fft2dconv(img1, gf1);
rc2 = fft1dconv(img1, rf1, cf1);
function o = fft1dconv(img, rowf, colf)
% Zero-Pad
imgsize = size(img);
rsize = size(rowf);
csize = size(colf);
img = padarray(img, [imgsize(1)/2, imgsize(2)/2]);
rowf = padarray(rowf, [2*imgsize(1)-rsize(1), 2*imgsize(2)-rsize(2)], 'post');
colf = padarray(colf, [2*imgsize(1)-csize(1), 2*imgsize(2)-csize(2)], 'post');
% Seperable convolution using 1D FFT
tic;
result = fft(transpose(fft(img))) .* fft(transpose(fft(colf)));
result = result .* fft(transpose(fft(rowf)));
o = transpose(round(real(ifft2(result))));
toc;
% Remove Pad
o = o(imgsize(1)+1:2*imgsize(1),imgsize(2)+1:2*imgsize(2));
end
function o = fft2dconv(img, filter)
%zero-pad
imgsize = size(img);
fsize = size(filter);
img = padarray(img, [imgsize(1)/2, imgsize(2)/2]);
filter = padarray(filter, [2*imgsize(1)-fsize(1), 2*imgsize(2)-fsize(2)], 'post');
% Non-Seperable convolution using 2D FFT
tic;
o = round(real(ifft2(fft2(img) .* fft2(filter))));
toc;
% Remove Pad
o = o(imgsize(1)+1:2*imgsize(1),imgsize(2)+1:2*imgsize(2));
end
And the timing results which are:
Elapsed time is 0.003315 seconds.
Elapsed time is 0.004369 seconds.
For a 4 x 4 image, the separable method is much faster, but for larger images. This is not the case and I'm not sure why. Any help would be really appreciated.