Question

I have the data frame named OP1DadosCelDez12 in R as follows:

State   City    QuedasConx  ConxEstab   Sol.ConxDadosAtend  Sol.ConxDados
SC  ABDON BATISTA   25071   2176654 2105688 2180192
SC  ABDON BATISTA   10319   1594057 1562627 1740117
SC  ADAMANTINA  79210   7723455 7468357 7772426
MG  ADOLFO  43230   2820074 2655908 2867880
MG  AGROLANDIA  120016  10633996    10273314    10836846
SP  AGROLANDIA  106545  9184706 8611340 9377591

I was able to consolidate each column of QuedasConx, ConxEstab, Sol.ConxDadosAtend and Sol.ConxDados (numeric values) for each city using the aggregate function. For instance:

agg1 <- aggregate(OP1DadosCelDez12$Sol.ConxDadosAtend,
        data.frame(OP1DadosCelDez12$Município), FUN=sum, na.rm=FALSE)
                           (...)

Then I created a new data frame which should have the same variables, but only one row for each city.

ClaroDadosMunDez12 <- data.frame(agg0,agg1$Município,agg1$QuedasConx,
agg2$ConxEstab,agg3$Sol.ConxDadosAtend,agg4$Sol.ConxDados)
colnames(ClaroDadosMunDez12)[1] <- "UF"
colnames(ClaroDadosMunDez12)[2] <- "Município"
colnames(ClaroDadosMunDez12)[3] <- "QuedasConx"
colnames(ClaroDadosMunDez12)[4] <- "ConxEstab"
colnames(ClaroDadosMunDez12)[5] <- "Sol.ConxDadosAtend"
colnames(ClaroDadosMunDez12)[6] <- "Sol.ConxDados"

The desired result is this:

State   City    QuedasConx  ConxEstab   Sol.ConxDadosAtend  Sol.ConxDados
SC  ABDON BATISTA   35390   3770711 3668315 3920309
SC  ADAMANTINA  79210   7723455 7468357 7772426
MG  ADOLFO  43230   2820074 2655908 2867880
MG  AGROLANDIA  226561  19818702    18884654    20214437

My problem is I cannot read the state value for the unique rows for each city and write this into the variable OP1DadosMunDez12$UF (of the new data frame).

I could easily do this with the command procv in MS EXCEL, but I am having a hard time in R. Appreciate your time.

Was it helpful?

Solution

Here's what you could do to get what you describe as your 'desired result':

# input data
dat <- read.table(header= TRUE, text = "State    City    QuedasConx  ConxEstab   SolConxDadosAtend  SolConxDados
SC  ABDONBATISTA   25071   2176654 2105688 2180192
SC  ABDONBATISTA   10319   1594057 1562627 1740117
SC  ADAMANTINA  79210   7723455 7468357 7772426
MG  ADOLFO  43230   2820074 2655908 2867880
MG  AGROLANDIA  120016  10633996    10273314    10836846
SP  AGROLANDIA  106545  9184706 8611340 9377591")

# summarise by state
aggregate(dat[,3:6],by=list(dat[,2]),FUN=sum)

       Group.1 QuedasConx ConxEstab SolConxDadosAtend SolConxDados
1 ABDONBATISTA      35390   3770711           3668315      3920309
2   ADAMANTINA      79210   7723455           7468357      7772426
3       ADOLFO      43230   2820074           2655908      2867880
4   AGROLANDIA     226561  19818702          18884654     20214437

# this will give the same result
library(plyr)
ddply(dat, .(City), function(x) ( colSums(x[,3:6]) ))

# and this will also...
library(sqldf)
sqldf("SELECT City, SUM(QuedasConx), SUM(ConxEstab), SUM(SolConxDadosAtend), SUM(SolConxDados) FROM dat GROUP BY City")

The aggregate code in your question doesn't work for me and I cannot reproduce your ClaroDadosMunDez12 dataframe, what is in agg0, agg4, etc.

Also, it seems like you want to do something further than what I've shown here, can you edit your question to explain more clearly what you want to do?

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top