Question

I would like to calculate the Gini coefficient of several plots in R using the gini() function from the reldist package. I have a data frame from which I need to pass two columns as input to gini().

>  head(merged[,c(1,17,29)])
  idp c13     w
1  19 126 14.14
2  19 146 14.14
3  19  76 39.29
4  19  74 39.29
5  19  86 39.29
6  19  93 39.29

The gini() function takes the values to summarize as its first argument (c13 here) and the weights as its second argument (w here), one weight per element of c13.

So I need to use the columns c13 and w like this:

gini(merged$c13,merged$w)
[1] 0.2959369
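
To make the roles of the two arguments concrete, here is a minimal base-R sketch of a weighted Gini computed from the Lorenz curve. The name weighted_gini is hypothetical, and the formula is an assumption about how a weighted Gini is commonly defined rather than a copy of the reldist source; for real work, gini() from reldist is the function to call.

```r
# Hypothetical helper (not part of reldist): weighted Gini from the Lorenz curve.
weighted_gini <- function(x, w = rep(1, length(x))) {
  stopifnot(length(x) == length(w))
  o <- order(x)          # sort the values in ascending order
  x <- x[o]
  w <- w[o] / sum(w)     # normalise the weights so they sum to 1
  p <- cumsum(w)         # cumulative share of the (weighted) population
  L <- cumsum(w * x)     # cumulative weighted total of x
  n <- length(x)
  L <- L / L[n]          # cumulative share of the total (Lorenz ordinates)
  # Gini = twice the area between the equality line and the Lorenz curve
  sum(L[-1] * p[-n]) - sum(L[-n] * p[-1])
}

# Values and weights copied from the first three rows shown above
x <- c(126, 146, 76)
w <- c(14.14, 14.14, 39.29)
weighted_gini(x, w)
```

Each value in x is paired with the weight at the same position in w, which is why the two columns have to be passed together and stay aligned within each plot.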

The thing is, I want to do this for each plot (idp). I have 4 thousand different values of idp, each with dozens of values in the other two columns.

I thought I could do this with tapply(), but I can't pass two columns to the function through tapply():

tapply(list(merged$c13,merged$w), merged$idp, gini)

As you know, this does not work. What I would love to get as a result is a data frame like this:

 idp  Gini 
1  19 0.12 
2  21 0.45
3  35 0.65
4  65 0.23

Do you have any idea how to do this? Maybe with the plyr package? Thank you for your help!

Solution

You can use the ddply() function from the plyr package to calculate the coefficient for each level of idp. (In the example data frame below, I changed some idp values to 21.)

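library(plyr)     # ddply()
library(reldist)  # gini()
# One output row per idp; gini() receives that plot's c13 values and w weights
ddply(merged, .(idp), summarize, Gini = gini(c13, w))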

  idp       Gini
1  19 0.15307402
2  21 0.05006588
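
If you would rather not depend on plyr, the same per-plot calculation can be sketched in base R with split() and vapply(). This is only a sketch under assumptions: the small merged data frame is a reconstruction of the six sample rows from the question, with the last three rows reassigned to idp 21 for illustration, and the names by_plot and res are arbitrary.

```r
library(reldist)  # provides gini(x, weights)

# Assumed toy data: the six rows from the question, last three moved to plot 21
merged <- data.frame(
  idp = c(19, 19, 19, 21, 21, 21),
  c13 = c(126, 146, 76, 74, 86, 93),
  w   = c(14.14, 14.14, 39.29, 39.29, 39.29, 39.29)
)

# One sub-data-frame per plot
by_plot <- split(merged, merged$idp)

# Weighted Gini per plot; vapply() enforces exactly one numeric value per group
res <- data.frame(
  idp  = as.numeric(names(by_plot)),
  Gini = unname(vapply(by_plot, function(d) gini(d$c13, d$w), numeric(1)))
)
res
```

Because each group is handed to the function as a whole data frame, both c13 and w are available together, which is the step tapply() could not provide.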
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow