Question

This is a follow-up to an earlier question of mine posted here. Based on Oleg Komarov's answer I wrote a little tool to get daily, hourly, etc. averages or sums of my data that uses accumarray() and datevec()'s output structure. Feel free to have a look at it here (it's probably not written very well, but it works for me).

What I would like to do now is add the functionality to calculate n-minute, n-hour, n-day, etc. statistics instead of 1-minute, 1-hour, 1-day, etc. like my function does. I have a rough idea that simply loops over my time-vector t (which would be pretty much what I would have done already if I hadn't learnt about the beautiful accumarray()), but that means I have to do a lot of error-checking for data gaps, uneven sampling times, etc.

I wonder if there is a more elegant/efficient approach that lets me re-use/extend my old function posted above, i.e. something that still makes use of accumarray() and datevec(), since this makes working with gaps very easy.

You can download some sample data taken from my last question here. These were sampled at 30 min intervals, so a possible example of what I want to do would be to calculate 6 hour averages without relying on the assumption that they are free of gaps and/or always sampled at exactly 30 min.


This is what I have come up with so far, which works reasonably well, apart from a small but easily fixed problem with the time stamps (e.g. 0:30 is representative for the interval from 0:30 to 0:45 -- my old function suffers from the same problem, though):

[ ... see my answer below ...]

Thanks to woodchips for inspiration.

Solution 2

I guess I figured it out using parts of @Bas Swinckels' answer and @woodchip's code linked above. It is not exactly what I would call good code, but it works and is reasonably fast.

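function [ t_acc, x_acc, subs ] = ts_aggregation( t, x, n, target_fmt, fct_handle )
% t is time in datenum format (i.e. days), assumed to be sorted ascending
% x is whatever variable you want to aggregate
% n is the number of minutes, hours, days
% target_fmt is 'minute', 'hour' or 'day'
% fct_handle can be an arbitrary function (e.g. @sum)
    t = t(:);
    x = x(:);
    switch target_fmt
        case 'day'
            t_factor = 1;
        case 'hour'
            t_factor = 1 / 24;
        case 'minute'
            t_factor = 1 / ( 24 * 60 );
        otherwise
            error( 'target_fmt must be ''minute'', ''hour'' or ''day''.' );
    end
    step = n * t_factor;
    t_acc = ( t(1) : step : t(end) )';
    % If t(end) lies beyond the last edge, add one more edge so that the
    % trailing samples get their own bin instead of staying in bin 1.
    if t_acc(end) < t(end)
        t_acc(end+1) = t_acc(end) + step;
    end
    % Bin i collects all samples in the interval ( t_acc(i-1), t_acc(i) ];
    % bin 1 holds only the samples with t equal to t(1).
    subs = ones(length(t), 1);
    for i = 2:length(t_acc)
       subs(t > t_acc(i-1) & t <= t_acc(i)) = i;
    end
    x_acc = accumarray( subs, x, [], fct_handle );
end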

/edit: Updated to a much shorter function that does use loops, but appears to be faster than my previous solution.
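
For comparison, the bin indices can also be computed arithmetically with floor() instead of a loop. The following is only a sketch and not the function above: it uses left-closed bins, so each time stamp marks the start of its interval, and it fills bins that contain no samples with NaN. The variable names (step, nbins, n_acc) and the synthetic data are made up for illustration.

```matlab
% Synthetic data (an assumption, not the sample data from the question):
% 30-minute samples over two days, built from integer minutes so that the
% 6-hour marks are represented exactly.
t = (0 : 30 : 2*24*60 - 30)' / (24*60);   % time in days, like datenum
x = 24 * t;                               % signal: hour since start

% Cut a gap: remove every sample between 06:00 and 12:00 on the first day.
gap = t >= 0.25 & t < 0.5;
t(gap) = [];
x(gap) = [];

step = 6 / 24;                            % bin width: 6 hours, in days

% Bin k covers the interval [ t(1)+(k-1)*step, t(1)+k*step )
subs  = floor((t - t(1)) / step) + 1;
nbins = max(subs);

% Start time of each bin and the aggregated values. Empty bins become NaN
% instead of accumarray's default fill value of 0.
t_acc = t(1) + (0 : nbins - 1)' * step;
x_acc = accumarray(subs, x, [nbins 1], @mean, NaN);
n_acc = accumarray(subs, 1, [nbins 1]);   % number of samples per bin
```

With this data, x_acc(2) is NaN and n_acc(2) is 0, so the gap is visible in the result. Samples that sit exactly on a bin edge can end up on either side of it because of floating-point rounding, so real time stamps may need a small tolerance.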

OTHER TIPS

The linked method of using accumarray seems overkill and too complex to me if you start with evenly spaced measurements without any gaps. I have the following function in my private toolbox for calculating an N-point average of vectors:

function y = blockaver(x, n)
% y = blockaver(x, n)
% input points are averaged over consecutive blocks of n points;
% trailing points that do not fill a complete block are dropped
% always returns column vector

if n == 1
    y = x(:);
else
    nblocks = floor(length(x) / n);
    y = mean(reshape(x(1:n * nblocks), n, nblocks), 1).';
end

Works pretty well for quick and dirty decimating by a factor N, but note that it does not apply proper anti-alias filtering. Use decimate if that is important.
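
To make the reshape step concrete, here is a minimal sketch of the same idea on a small vector whose length is not a multiple of the block size. The data are made up for illustration; the variable names follow the function above.

```matlab
x = (1:10)';        % ten samples; the last one does not fill a block
n = 3;              % block length

nblocks = floor(numel(x) / n);                       % 3 complete blocks
blocks  = reshape(x(1 : n * nblocks), n, nblocks);   % one block per column
y = mean(blocks, 1).';                               % column means: [2; 5; 8]
```

The tenth sample is silently dropped. This approach also assumes a regularly sampled series without gaps, because blocks are formed by sample count and not by time stamp.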

Licensed under: CC-BY-SA with attribution