Question

I'm trying to find an as efficient as possible way to store and call my matlab shape-functions. I have an interval x=linspace(0,20) and a position-vector

count = 10;
for i=1:count
    pos(i)=rand()*length(x);
end

And now, I want to put on every position pos(j) shape functions like Gauss-kernels with compact support or hat-functions or something similar (it should be possible to change the prototype function easily). The support of the function is controlled by a so-called smoothing length h. So I constructed a function in a .m-file like (e.g. a cubic spline)

function y = W_shape(x,pos,h)

l=length(x);
y=zeros(1,l);
if (h>0)
    for i=1:l
        if (-h <= x(i)-pos && x(i)-pos < -h/2)
            y(i) = (x(i)-pos+h)^2;
        elseif (-h/2 <= x(i)-pos && x(i)-pos <= h/2)
            y(i) = -(x(i)-pos)^2 + h^2/2;
        elseif (h/2 < x(i)-pos && x(i)-pos <=h)
            y(i) = (x(i)-pos-h)^2;
        end
    end
else
    error('h must be positive')
end

And then construct my functions on the interval x like

w = zeros(count,length(x));
for j=1:count
    w(j,:)=W_shape(x,pos(j),h);
end

So far so good, but when I make x=linspace(0,20,10000) and count=1000, it takes my computer (Intel Core-i7) several minutes to calculate the whole stuff. Since it should be some kind of PDE-Solver, this procedure has to be done in every time-step (under particular circumstances). I think my problem is, that I use x as an argument for my function-call and store every function instead of store just one and shift it, but my matlab-knowledge is not that good, so any advices? Fyi: I need the integral of the areas, where two or more function-supports intersect...and when I'm done with this in 1D, I wanna do it for 2D-functions, so it has to be efficient anyways

Was it helpful?

Solution

One initial vectorization would be to remove the for loop in the W_shape function:

for i=1:l
    if (-h <= x(i)-pos && x(i)-pos < -h/2)
        y(i) = (x(i)-pos+h)^2;
    elseif (-h/2 <= x(i)-pos && x(i)-pos <= h/2)
        y(i) = -(x(i)-pos)^2 + h^2/2;
    elseif (h/2 < x(i)-pos && x(i)-pos <=h)
        y(i) = (x(i)-pos-h)^2;
    end
end

Could become

xmpos=x-pos; % compute once and store instead of computing numerous times
inds1=(-h <= xmpos) & (xmpos < -h/2);
y(inds1) = (xmpos(inds1)+h).^2;
inds2=(-h/2 < xmpos) & (xmpos <= h/2);
y(inds2) = -(xmpos(inds2).^2 + h^2/2;
inds3=(h/2 < xmpos) & (xmpos <=h);
y(inds3) = (xmpos(inds3)-h).^2;

There is probably better optimisations than this.

EDIT: I forgot to mention, you should use the profiler to find out what is actually slow!

profile on
run_code
profile viewer
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top