This looks like a bug in scipy.stats.mstats.pearsonr. It appears that the values in x and y are expected to be paired by index, so if one is masked, the other should be ignored. That is, if x and y look like (using -- for a masked value):
x = [1, --, 3, 4, 5]
y = [9, 8, --, 6, 5]
then both (--, 8) and (3, --) are to be ignored, and the result should be the same as scipy.stats.pearsonr([1, 4, 5], [9, 6, 5]).
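As a rough sketch of what "paired by index" means here, the expected result can be computed by hand: combine the two masks, drop every pair in which either value is masked, and pass what is left to the regular scipy.stats.pearsonr. The underlying values 2 and 7 at the masked positions are placeholders I made up; the example above does not say what they are, and they should not affect the answer.

```python
import numpy as np
from scipy import stats

# The values at the masked positions (2 and 7) are arbitrary placeholders.
x = np.ma.masked_array([1, 2, 3, 4, 5], mask=[0, 1, 0, 0, 0])
y = np.ma.masked_array([9, 8, 7, 6, 5], mask=[0, 0, 1, 0, 0])

# A pair is dropped if either of its members is masked.
common_mask = np.ma.getmaskarray(x) | np.ma.getmaskarray(y)
keep = ~common_mask

x_kept = np.ma.getdata(x)[keep]  # the values 1, 4, 5
y_kept = np.ma.getdata(y)[keep]  # the values 9, 6, 5

# Reference result from the non-masked function on the surviving pairs.
r_expected, p_expected = stats.pearsonr(x_kept, y_kept)
print(r_expected)
```

For this particular data the three surviving points lie on a straight line with negative slope, so r_expected comes out as -1 (up to rounding).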
The bug in the mstats version is that the code to compute the means of x and y does not use the common mask.
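To see why that matters, here is a small illustration (not the actual SciPy source) comparing the means taken with each array's own mask against the means taken with the common mask. As before, the values at the masked positions are made-up placeholders.

```python
import numpy as np

# The values at the masked positions (2 and 7) are arbitrary placeholders.
x = np.ma.masked_array([1, 2, 3, 4, 5], mask=[0, 1, 0, 0, 0])
y = np.ma.masked_array([9, 8, 7, 6, 5], mask=[0, 0, 1, 0, 0])

# Means using each array's own mask: x still includes 3, y still includes 8.
mean_x_own = x.mean()  # (1 + 3 + 4 + 5) / 4
mean_y_own = y.mean()  # (9 + 8 + 6 + 5) / 4

# Means using the common mask: only the pairs (1, 9), (4, 6), (5, 5) remain.
common_mask = np.ma.getmaskarray(x) | np.ma.getmaskarray(y)
x_common = np.ma.masked_array(np.ma.getdata(x), mask=common_mask)
y_common = np.ma.masked_array(np.ma.getdata(y), mask=common_mask)
mean_x_common = x_common.mean()  # (1 + 4 + 5) / 3
mean_y_common = y_common.mean()  # (9 + 6 + 5) / 3

print(mean_x_own, mean_x_common)
print(mean_y_own, mean_y_common)
```

The two sets of means differ, so centering the data with the first set while summing products over only the common unmasked pairs gives a different correlation than the one computed from the three valid pairs alone.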
I created an issue for this on the SciPy GitHub site: https://github.com/scipy/scipy/issues/3645