Question

I have a pandas data frame of the form:

r1    r2    r3    r4    r5
0     1     12    0     4
1     1     2     9     2
32    5     0     0     0
12    14    3     1     23
0     2     43    5     2
9     3     5     1     1
0     0     0     0     1
1     0     0     0     0

I want to check whether any of the columns r1, r2, r3, r4, r5 differs significantly from any of the others. Should I use a t-test or an ANOVA? And how would I set up the computation?


Solution

This is a typical statistics problem. When you have multiple groups ('classes') that you assume are normally distributed, first run a one-way ANOVA across all of them. Then, if and only if the ANOVA is significant, run post-hoc pairwise t-tests with an appropriate multiple-comparison correction (e.g. Bonferroni).
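
A minimal sketch of that procedure in Python, assuming SciPy is installed and that the five columns can be treated as independent groups (the question does not say whether rows are paired). The 0.05 significance level and the variable names are illustrative choices, not part of the question. The Bonferroni correction here simply multiplies each pairwise p-value by the number of comparisons and caps it at 1.

```python
from itertools import combinations

import pandas as pd
from scipy import stats

# Data frame from the question: each column is one group.
df = pd.DataFrame({
    "r1": [0, 1, 32, 12, 0, 9, 0, 1],
    "r2": [1, 1, 5, 14, 2, 3, 0, 0],
    "r3": [12, 2, 0, 3, 43, 5, 0, 0],
    "r4": [0, 9, 0, 1, 5, 1, 0, 0],
    "r5": [4, 2, 0, 23, 2, 1, 1, 0],
})

alpha = 0.05  # assumed significance level

# Step 1: one-way ANOVA across all columns.
f_stat, p_value = stats.f_oneway(*(df[col] for col in df.columns))
print(f"ANOVA: F = {f_stat:.3f}, p = {p_value:.3f}")

# Step 2: pairwise t-tests, only if the ANOVA is significant.
pairs = list(combinations(df.columns, 2))
if p_value < alpha:
    for a, b in pairs:
        t_stat, p = stats.ttest_ind(df[a], df[b])
        # Bonferroni: scale by the number of comparisons, cap at 1.
        p_adj = min(p * len(pairs), 1.0)
        flag = "significant" if p_adj < alpha else "not significant"
        print(f"{a} vs {b}: t = {t_stat:.3f}, adjusted p = {p_adj:.3f} ({flag})")
else:
    print("ANOVA not significant; skipping post-hoc tests.")
```

If the rows are actually paired measurements, `stats.ttest_rel` would be the pairwise test to consider instead of `stats.ttest_ind`, and a repeated-measures design would replace the one-way ANOVA.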

Licensed under: CC-BY-SA with attribution