Average on overlapping windows in Python

Question 1

One way to compute the average of a sliding window across a list in Python is to use a list comprehension. Assuming data holds the ten integers 1 through 10 (which the outputs below imply), you can use

>>> range(0, len(data), 2)
[0, 2, 4, 6, 8]

to get the starting indices of each window, and then numpy's mean function to take the average of each window. See the demo below:

>>> import numpy as np
>>> window_size = 4
>>> stride = 2
>>> window_avg = [ np.mean(data[i:i+window_size]) for i in range(0, len(data), stride)
                   if i+window_size <= len(data) ]
>>> window_avg
[2.5, 4.5, 6.5, 8.5]

Note that the list comprehension does have a condition to ensure that it only computes the average of "full windows", or sublists with exactly window_size elements.
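
For readers on Python 3, the same idea can be wrapped in a small function. This is only a minimal sketch: the name window_averages and the sample data (the integers 1 through 10, which the outputs above imply) are assumptions rather than part of the original answer. It bounds the range so that only full windows are generated, instead of filtering them with an if clause.

```python
import numpy as np

def window_averages(data, window_size, stride):
    """Mean of each full window of `window_size` items, starting every `stride` items."""
    # Stopping at len(data) - window_size + 1 produces only start indices
    # that leave room for a full window, so no `if` filter is needed.
    return [
        float(np.mean(data[i:i + window_size]))
        for i in range(0, len(data) - window_size + 1, stride)
    ]

data = list(range(1, 11))  # assumed sample data: 1..10
print(window_averages(data, window_size=4, stride=2))  # [2.5, 4.5, 6.5, 8.5]
```

If data is shorter than window_size, the range is empty and the function returns an empty list.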

When run on a dataset of the size discussed in the OP, this method takes a little over 200 ms on my MBA:

In [5]: window_size = 450
In [6]: data = range(70000)
In [7]: stride = 30
In [8]: timeit [ np.mean(data[i:i+window_size]) for i in range(0, len(data), stride)
                 if i+window_size <= len(data) ]
1 loops, best of 3: 220 ms per loop

On my machine it is about twice as fast as the itertools approach presented by @Abhijit:

In [9]: timeit map(np.mean, izip(*(islice(it, i, None, stride) for i, it in enumerate(tee(data, window_size)))))
1 loops, best of 3: 436 ms per loop

Question 2

The following approach uses itertools to the fullest to compute a moving average over windows of size 4 with a stride of 2. Because the entire inner expression is a generator that is evaluated only when the averages are calculated, it has a complexity of O(n) for a fixed window size.

>>> import numpy as np
>>> from itertools import tee, izip, islice
>>> map(np.mean, izip(*(islice(it,i,None,2)
                      for i, it in enumerate(tee(data, 4)))))
[2.5, 4.5, 6.5, 8.5]

It's interesting to note how the individual itertools functions work together:

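itertools.tee n-plicates an iterator, in this case into 4 independent copies.
enumerate yields (index, element) tuples, where each element is one of those iterators.
itertools.islice slices each iterator with a stride of 2, starting from its index position.
izip then groups the i-th items of the slices into windows, and np.mean averages each window.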
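
The steps above can be spelled out one stage at a time. The following is a sketch for Python 3, where izip no longer exists and the built-in zip is lazy; the function name window_averages_itertools and the sample data are assumptions made for illustration.

```python
import numpy as np
from itertools import islice, tee

def window_averages_itertools(data, window_size, stride):
    # tee makes `window_size` independent iterators over the same data.
    copies = tee(data, window_size)
    # Copy number i starts at offset i and then advances by `stride`,
    # so it supplies the i-th element of every window.
    columns = [islice(it, i, None, stride) for i, it in enumerate(copies)]
    # zip stops at the shortest column, which drops incomplete windows.
    return [float(np.mean(window)) for window in zip(*columns)]

data = range(1, 11)  # assumed sample data: 1..10
print(window_averages_itertools(data, 4, 2))  # [2.5, 4.5, 6.5, 8.5]
```

Because zip stops as soon as any of the sliced iterators is exhausted, no explicit check for full windows is needed here.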