Question

I have a dataframe (df) of goals scored against various teams by date

gamedate teamID Gls
 1992-08-22  CHL  3
 1992-08-22  MNU  1
 1992-08-23  ARS  0
 1992-08-23  LIV  2
 1992-08-24  MNU  0
 1992-08-25  LIV  2
 1992-08-26  ARS  0
 1992-08-26  CHL  0

I wish to produce a summary table showing, for each date, the number of games played and the number of games in which these teams blanked the opposition

gamedate   games blanks
 1992-08-22   2     0
 1992-08-23   2     1
 1992-08-24   1     1
 1992-08-25   1     0
 1992-08-26   2     2

I can get the games and blanks separately using ddply

df.a <- ddply(df,"gamedate",function(x) c(count=nrow(x)))
df.b <- ddply(subset(df,Gls==0),"gamedate",function(x) c(count=nrow(x)))

and then merge df.a and df.b to get my answer. However, I am sure there must be a simpler and more elegant solution.


Solution

You just need to use summarise:

Read the data in:

   dat <- read.table(textConnection("gamedate teamID Gls
  1992-08-22  CHL  3
  1992-08-22  MNU  1
  1992-08-23  ARS  0
  1992-08-23  LIV  2
  1992-08-24  MNU  0
  1992-08-25  LIV  2
  1992-08-26  ARS  0
  1992-08-26  CHL  0"),sep = "",header = TRUE)

and then call ddply:

library(plyr)
ddply(dat, .(gamedate), summarise, tot = length(teamID), blanks = length(which(Gls == 0)))
    gamedate tot blanks
1 1992-08-22   2      0
2 1992-08-23   2      1
3 1992-08-24   1      1
4 1992-08-25   1      0
5 1992-08-26   2      2
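
As a side note on the blanks column: `length(which(Gls == 0))` counts the positions where the comparison is TRUE, and `sum(Gls == 0)` gives the same count because TRUE is treated as 1 and FALSE as 0. A minimal sketch, assuming the Gls values from the example and no missing values (the name `is_blank` is only for illustration):

```r
# Goals column from the example data (assumed to contain no NA values)
Gls <- c(3, 1, 0, 2, 0, 2, 0, 0)

# Logical vector: TRUE where the opposition was blanked
is_blank <- Gls == 0

# Count via the indices of the TRUE entries
n_which <- length(which(is_blank))

# Count via summing the logical vector (TRUE = 1, FALSE = 0)
n_sum <- sum(is_blank)

n_which == n_sum
```

If Gls could contain NA, the two differ: `which()` silently drops NA, while `sum()` returns NA unless `na.rm = TRUE` is passed.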

Other tips

The only thing you are missing is wrapping your function results in a data.frame() call and giving them column names (and the column names are optional) :)

I'm using @joran's dat data.frame as it allowed me to test my answer.

ddply( dat, "gamedate", function(x) data.frame( 
                                      tot = nrow( x ), 
                                      blanks = nrow( subset(x, Gls == 0 ) ) 
                                              ) 
     )

BTW, my funny formatting above is just to prevent it from scrolling on the screen and to help illustrate how I'm really just bringing together the functions you already created.

Another solution, using plain aggregate from base R. I am using joran's dat.

agg <- aggregate(cbind(1, dat$Gls==0), list(dat$gamedate), sum)
names(agg) <- c("gamedate", "games", "blanks")
agg
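
The aggregate call works because summing a constant 1 per row counts the games, and summing the logical `Gls == 0` counts the blanks. The same idea can be sketched with `tapply`, which may be easier to read step by step. This is only an illustration, assuming the example data from the question; the names `games`, `blanks` and `res` are mine, not from the answers above.

```r
# Example data from the question, read from an inline string
dat <- read.table(text = "gamedate teamID Gls
1992-08-22  CHL  3
1992-08-22  MNU  1
1992-08-23  ARS  0
1992-08-23  LIV  2
1992-08-24  MNU  0
1992-08-25  LIV  2
1992-08-26  ARS  0
1992-08-26  CHL  0", header = TRUE)

# Number of rows (games) per date
games <- tapply(dat$Gls, dat$gamedate, length)

# Number of rows with zero goals per date (sum of a logical vector)
blanks <- tapply(dat$Gls == 0, dat$gamedate, sum)

# Combine into one summary table; both results are ordered by date
res <- data.frame(gamedate = names(games),
                  games    = as.vector(games),
                  blanks   = as.vector(blanks))
res
```

Unlike merging two separately filtered tables, this keeps dates that have no blanks (they get a count of 0 rather than dropping out).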
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow