Question

I have two arrays of floats and want to calculate their weighted correlation, meaning that some of my data points should carry less weight than others.

      X          Y        w
   2.02382   6.00298   0.43873
   3.94601   6.41983   0.36818
   3.76877   4.55656   0.49836
   3.68307   6.46925   0.95965
   3.09073   4.57723   0.88889
   2.56690   2.70020   0.72812
   3.35469   6.76874   0.26863
   3.88722   5.23205   0.77492
   3.29389   3.50355   0.79567
   3.80725   3.18414   0.82439

So I want the correlation between X and Y with respect to the weights w. My problem is mainly a theory problem, but in the end I want to implement it in C.

Solution

The main idea is that wherever you see an expectation E(...), you replace the uniform weight 1/n on each observation with its normalized weight w/sum(w).

Theory:

Corr(X,Y) = E( (X - E(X)) * (Y - E(Y)) ) / ( SD(X) * SD(Y) )
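
Applying the substitution to each expectation gives the weighted definitions below. This is only a restatement of the rule above; the index i, the subscript w, and writing the variances as Cov_w(X,X) and Cov_w(Y,Y) are notational assumptions, not part of the original answer.

```latex
\[
\mathrm{E}_w[X] = \frac{\sum_i w_i x_i}{\sum_i w_i}, \qquad
\mathrm{Cov}_w(X,Y) = \frac{\sum_i w_i \,(x_i - \mathrm{E}_w[X])\,(y_i - \mathrm{E}_w[Y])}{\sum_i w_i}
\]
\[
\mathrm{Corr}_w(X,Y) = \frac{\mathrm{Cov}_w(X,Y)}{\sqrt{\mathrm{Cov}_w(X,X)\,\mathrm{Cov}_w(Y,Y)}}
\]
```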

So first calculate E(X) and E(Y).

E(X) = (2.02382 * .43873 + ... + 3.80725*.82439) / (.43873+...+.82439) = 3.368

E(Y) = [same weighted average idea] = 4.705

sd(X) = sqrt( var(X) ) = sqrt( E( (X-E(X))^2 ) ) = sqrt( ( (.43873)(2.02382-3.368)^2 + ... + (.82439)(3.80725-3.368)^2 ) / (.43873+...+.82439) ) = sqrt(0.3054023) = 0.5526321

sd(Y) = [same weighted average idea] = sqrt(1.860124) = 1.363863

Corr(X,Y) = ( (.43873)(2.02382-3.368)(6.00298-4.705)+...+(.82439)(3.80725-3.368)(3.18414-4.705) ) / ( (.43873+...+.82439)(.5526)(1.3639) ) = 0.2085651

Licensed under: CC-BY-SA with attribution