Exponential Moving Average Sampled at Varying Times

https://stackoverflow.com/questions/1023860

06-07-2019
|

Question

I have a continuous value for which I'd like to calculate an exponential moving average. Normally I'd just use the standard formula for this:

S_n = αY + (1-α)S_n-1

where S_n is the new average, α is the alpha, Y is the sample, and S_n-1 is the previous average.

Unfortunately, due to various issues I don't have a consistent sample time. I may know I can sample at the most, say, once per millisecond, but due to factors out of my control, I may not be able to take a sample for several milliseconds at a time. A likely more common case, however, is that I simple sample a bit early or late: instead of sampling at 0, 1 and 2 ms. I sample at 0, 0.9 and 2.1 ms. I do anticipate that, regardless of delays, my sampling frequency will be far, far above the Nyquist limit, and thus I need not worry about aliasing.

I reckon that I can deal with this in a more-or-less reasonable way by varying the alpha appropriately, based on the length of time since the last sample.

Part of my reasoning that this will work is that the EMA "interpolates linearly" between the previous data point and the current one. If we consider calculating an EMA of the following list of samples at intervals t: [0,1,2,3,4]. We should get the same result if we use interval 2t, where the inputs become [0,2,4], right? If the EMA had assumed that, at t₂ the value had been 2 since t₀, that would be the same as the interval t calculation calculating on [0,2,2,4,4], which it's not doing. Or does that make sense at all?

Can someone tell me how to vary the alpha appropriately? "Please show your work." I.e., show me the math that proves that your method really is doing the right thing.

Solution

This answer based on my good understanding of low-pass filters ("exponential moving average" is really just a single-pole lowpass filter), but my hazy understanding of what you're looking for. I think the following is what you want:

First, you can simplify your equation a little bit (looks more complicated but it's easier in code). I'm going to use "Y" for output and "X" for input (instead of S for output and Y for input, as you have done).

Y_n = αX + (1-α)Y_n-1 → Y_n = Y_n-1 + α(X - Y_n-1)

which codes to:

 Y += alpha * (X-Y);

Second, the value of α here is "equal" to 1-e^-Δt/τ where Δt is the time between samples, and τ is the time constant of the low-pass filter. I say "equal" in quotes because this works well when Δt/τ is small compared to 1, and α = 1-e^-Δt/τ ≈ Δt/τ. (But not too small: you'll run into quantizing issues, and unless you resort to some exotic techniques you usually need an extra N bits of resolution in your state variable S, where N = -log₂(α). ) For larger values of Δt/τ the filtering effect starts to disappear, until you get to the point where α is close to 1 and you're basically just assigning the input to the output.

This should work properly with varying values of Δt (the variation of Δt is not very important as long as alpha is small, otherwise you will run into some rather weird Nyquist issues / aliasing / etc.), and if you are working on a processor where multiplication is cheaper than division, or fixed-point issues are important, precalculate ω = 1/τ, and consider trying to approximate the formula for α.

If you really want to know how to derive the formula

α = 1-e^-Δt/τ

then consider its differential equation source:

Y + τ dY/dt = X

which, when X is a unit step function, has the solution Y = 1 - e^-t/τ. For small values of Δt, the derivative can be approximated by ΔY/Δt, yielding

Y + τ ΔY/Δt = X

ΔY/Δt = (X-Y)/τ

ΔY = (X-Y)(Δt/τ) = α(X-Y)

and the "extrapolation" of α = 1-e^-Δt/τ comes from trying to match up the behavior with the unit step function case.

OTHER TIPS

Have a look here: http://www.eckner.com/research.html

Look at the second link: ""Algorithms for Unevenly-Spaced Time Series: Moving Averages and Other Rolling Operators"

The document describes exactly the programming algorithms you need, I think.

This is not a complete answer, but may be the start of one. It's as far as I got with this in an hour or so of playing; I'm posting it as an example of what I'm looking for, and perhaps an inspiration to others working on the problem.

I start with S₀, which is the average resulting from the previous average S_-1 and the sample Y₀ taken at t₀. (t₁ - t₀) is my sample interval and α is set to whatever is appropriate for that sample interval and the period over which I wish to average.

I considered what happens if I miss the sample at t₁ and instead have to make do with the sample Y₂ taken at t₂? Well, we can start by expanding the equation to see what would have happened if we had had Y₁:

S₂ = αY₂ + (1-α)S₁, where S₁ = αY₁ + (1-α)S₀

Substituting:

S₂ = αY₂ + (1-α)(αY₁ + (1-α)S₀)
S₂ = αY₂ + (1-α)αY₁ + (1-α)(1-α)S₀
S₂ = αY₂ + (1-α)αY₁ + (1-α)²S₀

I notice that the series seems to extend infinitely this way, because we can substitute the S_n in the right-hand side indefinitely:

S₂ = αY₂ + (1-α)αY₁ + (1-α)²(αY₀ + (1-α)S_-1)
S₂ = αY₂ + (1-α)αY₁ + (1-α)²αY₀ + (1-α)³S_-1
etc.

Ok, so it's not really a polynomial (silly me), but if we multiply the initial term by one, we then see a pattern:

S₂ = (1-α)⁰αY₂ + (1-α)αY₁ + (1-α)²αY₀ + (1-α)³S_-1

Hm: it's an exponential series. Quelle surprise! Imagine that coming out of the equation for an exponential moving average!

So anyway, I have this x⁰ + x¹ + x² + x³ + ... thing going, and I'm sure I'm smelling e or a natural logarithm kicking around here, but I can't remember where I was heading next before I ran out of time.

Any answer to this question, or any proof of correctness of such an answer, highly depends on the data you're measuring.

If your samples were taken at t₀=0ms , t₁=0.9ms and t₂=2.1ms , but your choice of α is based on 1-ms-intervals, and therefore you want a locally adjusted α_n , the proof of correctness of the choice would mean knowing the sample values at t=1ms and t=2ms .

This leads to the question: Can you interpolate your data resonably to have sane guesses of what in-between values might have been? Or can you even interpolate the average itself?

If neither of these is possible, then as far as I see it, the logical choice of an in-between value Y(t) is the most recently calculated average, i.e. Y(t) ≈ S_n where n is maxmial such that t_n<t.

This choice has a simple consequence: Leave α alone, no matter what the time difference was.

If, on the other hand, it is possible to interpolate your values, then this will give you averagable constant-interval samples. Lastly, if it's even possible to interpolate the average itself, that would render the question meaningless.

By using a slightly different α that is equal to (1-α_{the one from the question}), the basic formula to add a new value Y to an existing average of S₀ looks like this:

S(Y,S₀) =

(1-α)Y + αS₀ =

Y - αY + αS₀ =

Y + α(S₀-Y)

If we now add the length of the time interval t and assume that just α depends on that t, that formula looks like this:

S(Y,t,S₀) = Y + α_t(S₀-Y)

Now assume that t = t₁ + t₂. If the average is created by adding two values of Y for time time intervals t₁ and t₂, the resulting average looks like this:

S(Y,t₂, S(Y,t₁,S₀)) =

Y + α_t₂(S(Y,t₁,S₀) - Y) =

Y + α_t₂((Y + α_t₁(S₀-Y)) - Y) =

Y + α_t₂α_t₁(S₀-Y)

If this average should be the same as if the whole t interval would have been added at once, it follows that α_t = α_t₁α_t₂. A definition of α that fulfills this requirement would be:

α_x := A^x (for some constant A)

Because:

α_t = A^t = A^{t₁ + t₂} = A^t₁ A^t₂ = α_t₁α_t₂

This results in the following averaging function:

S(Y,t,S₀) = Y + A^t(S₀-Y)

I haven't really tested this, but if the assumptions I made fit your scenario this looks like an averaging function that can handle variations in the sampling intervals quite well.

Let's say we would like to make an exponential decaying average on a continuous function. However we don't have all the values of that function, only a few samples. This formula would make a weighted average of the samples that we do have with the weights they would have in the continuous average.

Multiplier_n = Alpha^{Time_n-Time_n-1}

Sum_n = Val_n + Sum_n-1*Multiplier_n

Count_n = 1 + Count_n-1*Multiplier_n

Avg_n = Sum_n/Count_n

I would leave the alpha value alone, and fill in the missing data.

Since you don't know what happens during the time when you can't sample, you can fill those samples with 0s, or hold the previous value stable and use those values for the EMA. Or some backward interpolation once you have a new sample, fill in the missing values, and recompute the EMA.

What I am trying to get at is you have an input x[n] which has holes. There is no way to get around the fact you are missing data. So you can use a zero order hold, or set it to zero, or some kind of interpolation between x[n] and x[n+M], where M is the number of missing samples and n the start of the gap. Possibly even using values before n.

This is similar to an open problem on my todo list. I have one scheme worked out to some extent but do not have mathematical working to back this suggestion yet.

Update & summary: Would like to keep the smoothing factor (alpha) independent of the compensation factor (which I refer as beta here). Jason's excellent answer already accepted here works great for me.

First step.

If you can also measure the time since the last sample was taken (in rounded multiples of your constant sampling time -- so 7.8 ms since last sample would be 8 units), that could be used to apply the smoothing multiple times. Apply the formula 8 times in this case. You have effectively made a smoothing biased more towards the current value.

Second step.

To get a better smoothing, we need to tweak the alpha while applying the formula 8 times in the previous case.

What will this smoothing approximation miss?

It has already missed 7 samples in the example above
This was approximated in step 1 with a flattened re-application of the current value an additional 7 times
If we define an approximation factor beta that will be applied along with alpha (as alpha*beta instead of just alpha), we will be assuming that the 7 missed samples were changing smoothly between the previous and current sample values.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow