I have two arrays of floats, and want to calculate the weighted correlation, meaning that I want some of my data to have lower weight than others.

      X          Y        w
   2.02382   6.00298   0.43873
   3.94601   6.41983   0.36818
   3.76877   4.55656   0.49836
   3.68307   6.46925   0.95965
   3.09073   4.57723   0.88889
   2.56690   2.70020   0.72812
   3.35469   6.76874   0.26863
   3.88722   5.23205   0.77492
   3.29389   3.50355   0.79567
   3.80725   3.18414   0.82439

So I want the correlation between X and Y, taking the weights w into account. My question is mainly about the theory, but in the end I want to implement it in C.


Solution

The main idea is that wherever an expectation E(...) appears, you replace the uniform weight 1/n on each observation with the normalized weight w/sum(w).

Theory:

Corr(X,Y) = E( (X - E(X))*(Y - E(Y)) ) / ( SD(X)*SD(Y) )

So first calculate E(X) and E(Y).

E(X) = (2.02382 * .43873 + ... + 3.80725*.82439) / (.43873+...+.82439) = 3.368

E(Y) = [same weighted average idea] = 4.705
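The weighted average used for E(X) and E(Y) can be sketched in C roughly as follows. The function name `weighted_mean` and its signature are my own choices for illustration, and the sketch assumes n > 0 and weights that do not sum to zero.

```c
#include <stddef.h>

/* Weighted mean: sum(w[i] * x[i]) / sum(w[i]).
   Assumption: n > 0 and the weights do not sum to zero
   (no error handling is done for those cases). */
double weighted_mean(const double *x, const double *w, size_t n)
{
    double sum_wx = 0.0;  /* running sum of w[i] * x[i] */
    double sum_w  = 0.0;  /* running sum of the weights */

    for (size_t i = 0; i < n; i++) {
        sum_wx += w[i] * x[i];
        sum_w  += w[i];
    }
    return sum_wx / sum_w;
}
```

Called on the X and w columns of the table, this should give a value close to the 3.368 above.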

sd(X) = sqrt( var(X) ) = sqrt( E( (X-E(X))^2 ) ) = sqrt( ( (.43873)(2.02382-3.368)^2 + ... + (.82439)(3.80725-3.368)^2 ) / (.43873+...+.82439) ) = sqrt(0.3054023) = 0.5526321

sd(Y) = [same weighted average idea] = sqrt(1.860124) = 1.363863

corr(x,y) = ( (.43873)(2.02382-3.368)(6.00298-4.705)+...+(.82439)(3.80725-3.368)(3.18414-4.705) ) / ( (.43873+...+.82439)(.5526)(1.3639) ) = 0.2085651

Licensed under: CC-BY-SA with attribution