Calculating an exponential moving average in Python

https://stackoverflow.com/questions/488670

20-08-2019

Question

I have a range of dates and a measurement on each of those dates. I'd like to calculate an exponential moving average for each of the dates. Does anybody know how to do this?

I'm new to Python. Averages don't appear to be built into the standard Python library, which strikes me as a little odd. Maybe I'm not looking in the right place.

So, given the following code, how could I calculate the moving weighted average of IQ points for calendar dates?

from datetime import date
days = [date(2008,1,1), date(2008,1,2), date(2008,1,7)]
IQ = [110, 105, 90]

(There's probably a better way to structure the data; any advice would be appreciated.)

Solution

EDIT: It seems that the mov_average_expw() function from the scikits.timeseries.lib.moving_funcs submodule of SciKits (add-on toolkits that complement SciPy) better suits the wording of your question.

To compute an exponential smoothing of your data with a smoothing factor alpha (it is (1 - alpha) in Wikipedia's terms):

>>> alpha = 0.5
>>> assert 0 < alpha <= 1.0
>>> today = max(days)
>>> # weight each measurement by alpha raised to its age in days
>>> av = sum(alpha**((today - day).days) * iq for day, iq in zip(days, IQ))
>>> av
95.0

The above is not pretty, so let's refactor it a bit:

from collections import namedtuple
from operator    import itemgetter

def smooth(iq_data, alpha=1, today=None):
    """Perform exponential smoothing with factor `alpha`.

    Time period is a day.
    Each time period the value of `iq` drops `alpha` times.
    The most recent data is the most valuable one.
    """
    assert 0 < alpha <= 1

    if alpha == 1: # no smoothing
        return sum(map(itemgetter(1), iq_data))

    if today is None:
        today = max(map(itemgetter(0), iq_data))

    return sum(alpha**((today - date).days) * iq for date, iq in iq_data)

IQData = namedtuple("IQData", "date iq")

if __name__ == "__main__":
    from datetime import date

    days = [date(2008,1,1), date(2008,1,2), date(2008,1,7)]
    IQ = [110, 105, 90]
    iqdata = list(map(IQData, days, IQ))
    print("\n".join(map(str, iqdata)))

    print(smooth(iqdata, alpha=0.5))

Example:

$ python26 smooth.py
IQData(date=datetime.date(2008, 1, 1), iq=110)
IQData(date=datetime.date(2008, 1, 2), iq=105)
IQData(date=datetime.date(2008, 1, 7), iq=90)
95.0
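
Note that the value above is a weighted sum: the weights (1, 0.5**5 and 0.5**6 here) are not normalized to add up to 1. If a weighted average is what you are after, a minimal sketch could divide by the total weight. `smooth_normalized` is a hypothetical name used for illustration and is not part of the answer above.

```python
from datetime import date

def smooth_normalized(iq_data, alpha=0.5, today=None):
    """Weighted average of (date, iq) pairs with weight alpha**age_in_days."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    if today is None:
        today = max(day for day, _ in iq_data)
    # One weight per measurement; the most recent day gets weight 1.
    weights = [alpha ** (today - day).days for day, _ in iq_data]
    total = sum(w * iq for w, (_, iq) in zip(weights, iq_data))
    return total / sum(weights)

days = [date(2008, 1, 1), date(2008, 1, 2), date(2008, 1, 7)]
IQ = [110, 105, 90]
print(smooth_normalized(list(zip(days, IQ)), alpha=0.5))
```

With alpha=1 every weight is 1, so this variant reduces to the plain arithmetic mean.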

Other tips

I did a bit of googling and found the following sample code (http://osdir.com/ml/python.matplotlib.general/2005-04/msg00044.html):

from numpy import array

def ema(s, n):
    """
    returns an n period exponential moving average for
    the time series s

    s is a list ordered from oldest (index 0) to most
    recent (index -1)
    n is an integer

    returns a numeric array of the exponential
    moving average
    """
    s = array(s)
    ema = []
    j = 1

    #get n sma first and calculate the next n period ema
    sma = sum(s[:n]) / n
    multiplier = 2 / float(1 + n)
    ema.append(sma)

    #EMA(current) = ( (Price(current) - EMA(prev) ) x Multiplier) + EMA(prev)
    ema.append(( (s[n] - sma) * multiplier) + sma)

    #now calculate the rest of the values
    for i in s[n+1:]:
        tmp = ( (i - ema[j]) * multiplier) + ema[j]
        j = j + 1
        ema.append(tmp)

    return ema

I always calculate EMAs with pandas.

Here is an example of how to do it:

import pandas as pd

def ema(values, period):
    # pd.ewma() was removed from pandas; Series.ewm(...).mean() replaces it
    return pd.Series(values).ewm(span=period).mean().iloc[-1]

values = [9, 5, 10, 16, 5]
period = 5

print(ema(values, period))

More information about pandas ewma:

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.ewma.html

My Python is a little bit rusty (anyone can feel free to edit this code to make corrections if I've messed up the syntax somehow), but here goes....

def movingAverageExponential(values, alpha, epsilon = 0):

   if not 0 < alpha < 1:
      raise ValueError("out of range, alpha='%s'" % alpha)

   if not 0 <= epsilon < alpha:
      raise ValueError("out of range, epsilon='%s'" % epsilon)

   result = [None] * len(values)

   for i in range(len(result)):
       currentWeight = 1.0

       numerator     = 0
       denominator   = 0
       for value in values[i::-1]:
           numerator     += value * currentWeight
           denominator   += currentWeight

           currentWeight *= alpha
           if currentWeight < epsilon: 
              break

       result[i] = numerator / denominator

   return result

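This function moves backward, from the end of the list to the beginning, calculating the exponential moving average for each value by working backward until the weight coefficient for an element is less than the given epsilon.

At the end of the function, it reverses the values before returning the list (so that they're in the correct order for the caller). Note that the code as shown preallocates `result` and fills it by index, so it performs no reversal.

(SIDE NOTE: if I were using a language other than Python, I'd create a full-size empty array first and then fill it in backward order, so that I wouldn't have to reverse it at the end. But I don't think you can declare a big empty array in Python. And in Python lists, appending is much less expensive than prepending, which is why I built the list in reverse order. Please correct me if I'm wrong.)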

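The 'alpha' argument is the decay factor on each iteration. For example, if you used an alpha of 0.5, then today's moving average value would be composed of the following weighted values: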

today:        1.0
yesterday:    0.5
2 days ago:   0.25
3 days ago:   0.125
...etc...

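Of course, if you've got a huge array of values, the values from ten or fifteen days ago won't contribute very much to today's weighted average. The 'epsilon' argument lets you set a cutoff point, below which you cease to care about old values (since their contribution to today's value will be insignificant).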

You'd invoke the function something like this:

result = movingAverageExponential(values, 0.75, 0.0001)
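
To get a feel for how alpha and epsilon interact, the following sketch lists the weights that survive the cutoff for a single output value. `surviving_weights` and its `max_terms` guard are hypothetical, written only for illustration; the loop mirrors the inner loop of the function above rather than replacing it.

```python
def surviving_weights(alpha, epsilon, max_terms=1000):
    """Weights 1, alpha, alpha**2, ... applied before the weight drops below epsilon."""
    weights = []
    w = 1.0
    while len(weights) < max_terms:   # guard so epsilon=0 cannot loop forever
        weights.append(w)
        w *= alpha
        if w < epsilon:
            break
    return weights

print(surviving_weights(0.5, 0.1))  # [1.0, 0.5, 0.25, 0.125]
print(len(surviving_weights(0.75, 0.0001)))
```

A larger alpha decays more slowly, so more past values take part in each average before the cutoff is reached.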

The matplotlib.org examples (http://matplotlib.org/examples/pylab_examples/finance_work2.html) include a good example of an exponential moving average (EMA) function using NumPy:

import numpy as np

def moving_average(x, n, type):
    x = np.asarray(x)
    if type=='simple':
        weights = np.ones(n)
    else:
        weights = np.exp(np.linspace(-1., 0., n))

    weights /= weights.sum()

    a =  np.convolve(x, weights, mode='full')[:len(x)]
    a[:n] = a[n]
    return a

I don't know Python, but for the averaging part, do you mean an exponentially decaying low-pass filter of the form

y_new = y_old + (input - y_old)*alpha

where alpha = dt/tau, dt = the timestep of the filter, and tau = the time constant of the filter? (The variable-timestep form of this is as follows; just clip dt/tau so that it does not exceed 1.0.)

y_new = y_old + (input - y_old)*dt/tau

If you want to filter something like a date, make sure you convert it to a floating-point quantity such as the number of seconds since January 1, 1970.
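
A minimal Python sketch of this variable-timestep filter, applied to the data from the question, might look as follows. The function name `lowpass_irregular`, the choice to seed the filter with the first sample, and the two-day time constant are all assumptions made for illustration.

```python
from datetime import date, datetime, time

def lowpass_irregular(samples, tau_seconds):
    """Exponentially decaying low-pass filter for irregularly spaced samples.

    samples: iterable of (date, value) pairs sorted by date.
    tau_seconds: filter time constant in seconds.
    """
    smoothed = []
    prev_t = None
    y = None
    for day, value in samples:
        # Convert the date to a float: seconds since 1970-01-01 (time zones ignored).
        t = (datetime.combine(day, time.min) - datetime(1970, 1, 1)).total_seconds()
        if y is None:
            y = float(value)                              # seed with the first sample
        else:
            alpha = min((t - prev_t) / tau_seconds, 1.0)  # clip dt/tau at 1.0
            y = y + (value - y) * alpha
        prev_t = t
        smoothed.append(y)
    return smoothed

days = [date(2008, 1, 1), date(2008, 1, 2), date(2008, 1, 7)]
IQ = [110, 105, 90]
print(lowpass_irregular(zip(days, IQ), tau_seconds=2 * 86400))  # [110.0, 107.5, 90.0]
```

The five-day gap before the last sample exceeds tau, so dt/tau is clipped to 1.0 and the filter output jumps straight to the new value.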

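Because the EMA is an IIR filter, you can also use the SciPy filter method. This has the benefit of being approximately 64 times faster than the enumerate() approach, as measured on my system using timeit on large data sets.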

import numpy as np
from scipy.signal import lfilter

x = np.random.normal(size=1234)
alpha = .1 # smoothing coefficient (here: the weight given to the previous output)
zi = [alpha * x[0]] # seed the filter state so that the first output equals x[0]
# filter can process blocks of continuous data if <zi> is maintained
y, zi = lfilter([1.-alpha], [1., -alpha], x, zi=zi)
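
For reference, the coefficients passed to lfilter above (b = [1 - alpha], a = [1, -alpha]) correspond to the recurrence y[n] = (1 - alpha) * x[n] + alpha * y[n-1], so in this parametrization alpha weights the previous output. A plain-Python sketch of the same recurrence, which can be handy for checking results on small lists, is shown here. `ema_recurrence` and its `y_prev` argument are hypothetical names.

```python
def ema_recurrence(x, alpha, y_prev=None):
    """y[n] = (1 - alpha) * x[n] + alpha * y[n-1] for a list x.

    The recurrence is seeded with y_prev, or with x[0] if y_prev is None.
    """
    if not x:
        return []
    y = x[0] if y_prev is None else y_prev
    out = []
    for value in x:
        y = (1.0 - alpha) * value + alpha * y
        out.append(y)
    return out

print(ema_recurrence([1.0, 1.0, 1.0], alpha=0.5))  # [1.0, 1.0, 1.0]
print(ema_recurrence([0.0, 4.0, 4.0], alpha=0.5))  # [0.0, 2.0, 3.0]
```

Passing the last output of one block as `y_prev` for the next block plays the same role as carrying the filter state between lfilter calls.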

I found the code snippet above by @earino very useful, but I needed something that could continuously smooth a stream of values, so I refactored it into this:

def exponential_moving_average(period=1000):
    """Exponential moving average. Smooths the values sent in over the period.

    At first it returns a simple average, but as soon as it has gathered
    'period' values it starts to use the exponential moving average.
    period: int - how many values to smooth over (default=1000).
    """
    multiplier = 2 / float(1 + period)
    cum_temp = yield None  # We are being primed

    # Start by just returning the simple average until we have enough data.
    for i in range(1, period + 1):
        cum_temp += yield cum_temp / float(i)

    # Grab the simple average; cum_temp now holds period + 1 values
    ema = cum_temp / (period + 1)

    # and start calculating the exponentially smoothed average
    while True:
        ema = (((yield ema) - ema) * multiplier) + ema

And I use it like this:

def temp_monitor(pin):
    """ Read from the temperature monitor - and smooth the value out. The sensor is noisy, so we use exponential smoothing. """
    ema = exponential_moving_average()
    next(ema)  # Prime the generator

    while True:
        yield ema.send(val_to_temp(pin.read()))

(where pin.read() produces the next value I'd like to consume).
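
To try the generator pattern without any hardware, the sketch below drives a simplified streaming EMA with a plain list. `streaming_ema` is a hypothetical, cut-down variant (it seeds with the first value instead of using a simple-average warm-up), and the readings are made up for illustration.

```python
def streaming_ema(period):
    """Generator-based EMA: send() a value, receive the updated average."""
    multiplier = 2 / float(1 + period)
    value = yield None          # primed by next(); the first send() lands here
    ema = float(value)          # seed the average with the first value
    while True:
        value = yield ema       # hand back the current average, wait for the next value
        ema = (value - ema) * multiplier + ema

smoother = streaming_ema(period=3)
next(smoother)                  # prime the generator
readings = [20.0, 22.0, 21.0]
print([smoother.send(r) for r in readings])  # [20.0, 21.0, 21.0]
```

Each send() both feeds in a new reading and returns the updated average, which is what lets the smoother sit inside a loop over a live data source.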

Here is a simple sample I worked up based on http://stockcharts.com/school/doku.php?id=chart_school:technical_indicators:moving_averages

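Note that, unlike their spreadsheet, I don't calculate the SMA, and I don't wait until after 10 samples to generate the EMA. This means my values differ slightly, but if you chart them, they follow exactly after 10 samples. During the first 10 samples, the EMA I calculate is appropriately smoothed.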

def emaWeight(numSamples):
    return 2 / float(numSamples + 1)

def ema(close, prevEma, numSamples):
    return ((close-prevEma) * emaWeight(numSamples) ) + prevEma

samples = [
22.27, 22.19, 22.08, 22.17, 22.18, 22.13, 22.23, 22.43, 22.24, 22.29,
22.15, 22.39, 22.38, 22.61, 23.36, 24.05, 23.75, 23.83, 23.95, 23.63,
23.82, 23.87, 23.65, 23.19, 23.10, 23.33, 22.68, 23.10, 22.40, 22.17,
]
emaCap = 10
e=samples[0]
for s in range(len(samples)):
    numSamples = emaCap if s > emaCap else s
    e =  ema(samples[s], e, numSamples)
    print(e)

A fast way (copy-pasted from here) is the following:

import numpy as np

def ExpMovingAverage(values, window):
    """ Numpy implementation of EMA
    """
    weights = np.exp(np.linspace(-1., 0., window))
    weights /= weights.sum()
    a =  np.convolve(values, weights, mode='full')[:len(values)]
    a[:window] = a[window]
    return a

I am using a list and a decay rate as inputs. Considering that deep recursion is not stable in Python, I hope this little two-line function helps:

def expma(aseries, ratio):
    return sum([ratio*aseries[-x-1]*((1-ratio)**x) for x in range(len(aseries))])
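
The closed-form sum in expma gives the same result as running the usual recursive update y = ratio * x + (1 - ratio) * y from a starting value of 0. A small sketch that checks this on the IQ values from the question follows. `expma_iterative` is a hypothetical name introduced for the comparison.

```python
def expma(aseries, ratio):
    return sum(ratio * aseries[-x - 1] * ((1 - ratio) ** x) for x in range(len(aseries)))

def expma_iterative(aseries, ratio):
    """Same result via the recursive form, starting from 0 (oldest value first)."""
    y = 0.0
    for value in aseries:
        y = ratio * value + (1 - ratio) * y
    return y

data = [110, 105, 90]
print(expma(data, 0.5), expma_iterative(data, 0.5))  # 85.0 85.0
```

Because the recursion starts from 0, the weights add up to 1 - (1 - ratio)**len(aseries) rather than 1, so short series are pulled toward zero.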

License: CC-BY-SA with attribution

Not affiliated with StackOverflow