Weighted standard deviation in NumPy

https://stackoverflow.com/questions/2413522

19-09-2019

Question

numpy.average() has a weights option, but numpy.std() does not. Does anyone have suggestions for a workaround?

Solution

How about the following short "manual calculation"?

import math

import numpy

def weighted_avg_and_std(values, weights):
    """
    Return the weighted average and standard deviation.

    values, weights -- Numpy ndarrays with the same shape.
    """
    average = numpy.average(values, weights=weights)
    # Fast and numerically precise:
    variance = numpy.average((values-average)**2, weights=weights)
    return (average, math.sqrt(variance))
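As a quick sanity check of the function above, here is a minimal sketch on made-up data (the `values` and `weights` arrays are hypothetical, not from the question). With integer weights, the weighted result should agree with repeating each value `weight` times and calling the unweighted `mean()` and `std()`:

```python
import math

import numpy as np


def weighted_avg_and_std(values, weights):
    """Return the weighted average and the weighted standard deviation."""
    average = np.average(values, weights=weights)
    variance = np.average((values - average) ** 2, weights=weights)
    return average, math.sqrt(variance)


# Hypothetical sample data: the value 2.0 counts three times.
values = np.array([1.0, 2.0, 4.0])
weights = np.array([1, 3, 1])

avg, std = weighted_avg_and_std(values, weights)

# Expand the data by repeating each value `weight` times, then use the
# ordinary unweighted statistics for comparison.
expanded = np.repeat(values, weights)
print(avg, expanded.mean())
print(std, expanded.std())
```

Note that this is the "population" form (normalised by the sum of the weights), which matches `numpy.std()` with its default `ddof=0`.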

Other tips

There is a class in statsmodels that makes it easy to compute weighted statistics: statsmodels.stats.weightstats.DescrStatsW.

Assume this dataset and these weights:

import numpy as np
from statsmodels.stats.weightstats import DescrStatsW

array = np.array([1,2,1,2,1,2,1,3])
weights = np.ones_like(array)
weights[3] = 100

Initialize the class (note that you have to pass in the correction factor, the delta degrees of freedom, at this point):

weighted_stats = DescrStatsW(array, weights=weights, ddof=0)

Then you can calculate:

.mean, the weighted mean:

>>> weighted_stats.mean      
1.97196261682243

.std, the weighted standard deviation:

>>> weighted_stats.std       
0.21434289609681711

.var, the weighted variance:

>>> weighted_stats.var       
0.045942877107170932

.std_mean, the standard error of the weighted mean:
```
>>> weighted_stats.std_mean  
0.020818822467555047
```
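In case you are interested in the relation between the standard error and the standard deviation: the standard error is (for ddof == 0) calculated as the weighted standard deviation divided by the square root of the sum of the weights minus 1 (corresponding source for statsmodels version 0.9 on GitHub):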
```
standard_error = standard_deviation / sqrt(sum(weights) - 1)
```
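The numbers above can be cross-checked without statsmodels. The following is a minimal sketch that assumes the relation quoted above (ddof == 0, standard error = standard deviation / sqrt(sum(weights) - 1)) and recomputes the four statistics with plain NumPy on the same dataset:

```python
import numpy as np

array = np.array([1, 2, 1, 2, 1, 2, 1, 3])
weights = np.ones_like(array)
weights[3] = 100

# Weighted mean and weighted variance, normalised by sum(weights) (ddof=0).
mean = np.average(array, weights=weights)
var = np.average((array - mean) ** 2, weights=weights)
std = np.sqrt(var)

# Standard error of the weighted mean, using the relation quoted above.
std_err = std / np.sqrt(weights.sum() - 1)

print(mean, std, var, std_err)
```

The printed values should agree with the .mean, .std, .var and .std_mean outputs shown earlier, up to floating-point rounding.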

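There doesn't seem to be such a function in NumPy/SciPy yet, but there is a ticket proposing this added functionality. Included there is Statistics.py, which implements weighted standard deviations.

Here is one more option: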

np.sqrt(np.cov(values, aweights=weights))
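One thing to be aware of: np.cov defaults to ddof=1, so the one-liner above does not return the same number as the "manual calculation" that normalises by the sum of the weights. The sketch below (reusing the dataset from the statsmodels answer as an assumption) compares the two, and shows that passing ddof=0 makes them agree:

```python
import numpy as np

values = np.array([1, 2, 1, 2, 1, 2, 1, 3])
weights = np.ones_like(values)
weights[3] = 100

# Manual weighted standard deviation, normalised by sum(weights).
average = np.average(values, weights=weights)
manual_std = np.sqrt(np.average((values - average) ** 2, weights=weights))

# np.cov with its default ddof=1 applies a bias correction for aweights.
cov_std_default = np.sqrt(np.cov(values, aweights=weights))

# With ddof=0 the normalisation is sum(weights), as in the manual version.
cov_std_ddof0 = np.sqrt(np.cov(values, aweights=weights, ddof=0))

print(manual_std, cov_std_ddof0, cov_std_default)
```

Which normalisation is appropriate depends on what the weights mean (frequencies versus reliability weights), so pick ddof deliberately rather than relying on the default.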

There is a very good example proposed by gabouph:

import pandas as pd
import numpy as np

# X is the dataset, as a pandas DataFrame (rows are observations).
# weights holds one weight per row of X (a NumPy array, or a pandas
# Series sharing X's index).

# Compute the weighted sample mean (fast, efficient and precise)
mean = np.ma.average(X, axis=0, weights=weights)

# Convert to a pandas Series (it's just aesthetic and more
# ergonomic; no difference in computed values)
mean = pd.Series(mean, index=list(X.keys()))
xm = X - mean  # xm = X diff to mean
# Fill NaN with 0 (a variance of 0 is void anyway, but this keeps the
# other covariance values computed correctly)
xm = xm.fillna(0)
# Compute the unbiased weighted sample covariance
sigma2 = 1. / (weights.sum() - 1) * xm.mul(weights, axis=0).T.dot(xm)

Correct equation for weighted unbiased sample covariance, URL (version: 2016-06-28)
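For readers without pandas, the same formula can be sketched in plain NumPy. The data below is made up for illustration, and the weights are assumed to be integer frequency weights, so the result can be compared against np.cov with fweights, which uses the same sum(w) - 1 normalisation:

```python
import numpy as np

# Hypothetical data: 4 observations (rows) of 2 variables (columns).
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 5.0],
              [4.0, 3.0]])
w = np.array([1, 2, 1, 3])  # integer frequency weights, one per row

# Weighted mean of each column, then deviations from that mean.
mean = np.average(X, axis=0, weights=w)
xm = X - mean

# Weighted covariance matrix, normalised by sum(w) - 1.
sigma2 = (xm * w[:, None]).T @ xm / (w.sum() - 1)

# Reference: NumPy's built-in frequency-weighted covariance.
reference = np.cov(X, rowvar=False, fweights=w)
print(np.allclose(sigma2, reference))
```

The square roots of the diagonal of sigma2 are then the weighted standard deviations of the individual columns under this normalisation.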

License: CC-BY-SA with attribution

Not affiliated with StackOverflow