Question

I have a rather large data frame. Here is a simplified example:

Group Element Value Note
1     AAA     11    Good
1     ABA     12    Good
1     AVA     13    Good
2     CBA     14    Good
2     FDA     14    Good
3     JHA     16    Good
3     AHF     16    Good
3     AKF     17    Good

Here it is as a dput:

dat <- structure(list(Group = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L), Element = structure(c(1L, 
2L, 5L, 6L, 7L, 8L, 3L, 4L), .Label = c("AAA", "ABA", "AHF", 
"AKF", "AVA", "CBA", "FDA", "JHA"), class = "factor"), Value = c(11L, 
12L, 13L, 14L, 14L, 16L, 16L, 17L), Note = structure(c(1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L), .Label = "Good", class = "factor")), .Names = c("Group", 
"Element", "Value", "Note"), class = "data.frame", row.names = c(NA, 
-8L))

I'm trying to split it into separate data frames based on the group. For example:

Group 1 will be a data frame:

Group Element Value Note
1     AAA     11    Good
1     ABA     12    Good
1     AVA     13    Good

Group 2:

2     CBA     14    Good
2     FDA     14    Good

and so on.


Solution

You can use split for this.

> dat
##   Group Element Value Note
## 1     1     AAA    11 Good
## 2     1     ABA    12 Good
## 3     1     AVA    13 Good
## 4     2     CBA    14 Good
## 5     2     FDA    14 Good
## 6     3     JHA    16 Good
## 7     3     AHF    16 Good
## 8     3     AKF    17 Good

> x <- split(dat, dat$Group)

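Then you can access each individual data frame with x[[1]], x[[2]], etc. These are positions in the list; they match the group numbers here only because the groups are 1, 2, 3. The list elements are also named after the groups, so x[["2"]] looks a group up by its label.
For example, here is group 2:

> x[[2]]  ## or x[["2"]]; note that x[2] returns a one-element list, not the data frame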
##   Group Element Value Note
## 4     2     CBA    14 Good
## 5     2     FDA    14 Good
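
To illustrate the difference between position and label, here is a minimal sketch. The data frame `toy` and its group labels are made up for the example and are not the asker's data. It shows that split names each element after its group, and that double brackets return the data frame while single brackets return a list.

```r
# Hypothetical toy data; the group labels are deliberately not 1, 2, 3
toy <- data.frame(Group = c(10L, 10L, 25L, 40L, 40L),
                  Value = c(1L, 2L, 3L, 4L, 5L))

parts <- split(toy, toy$Group)

# split() names each list element after its group label
group_labels <- names(parts)

# Look a group up by label (a character string), not by position
g25 <- parts[["25"]]

# [[ ]] returns the data frame itself; [ ] returns a one-element list
is_df_double <- is.data.frame(parts[["25"]])
is_df_single <- is.data.frame(parts["25"])
```

With labels like these, parts[[25]] would fail with a subscript error, because the list has only three elements.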

ADD: Since you asked about it in the comments, you can write each individual data frame to a file with write.csv and lapply. The invisible wrapper is simply there to suppress the output of lapply. seq_along(x) is used instead of seq(x) because it behaves correctly even when the list has a single element.

> invisible(lapply(seq_along(x), function(i){
      write.csv(x[[i]], file = paste0(i, ".csv"), row.names = FALSE)
  }))

We can see that the files were created by looking at list.files:

> list.files(pattern = "^[0-9]+\\.csv$")
## [1] "1.csv" "2.csv" "3.csv"

And we can read the data frame of the third group back in with read.csv:

> read.csv("3.csv")
##   Group Element Value Note
## 1     3     JHA    16 Good
## 2     3     AHF    16 Good
## 3     3     AKF    17 Good
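
The same round trip can be sketched with file names taken from the group labels rather than from list positions, which matters when the labels are not 1, 2, 3. This is only an illustration. The `toy` data frame, the `out_dir` temporary directory and the `group_` file prefix are assumptions made for the example, not part of the answer above.

```r
# Hypothetical toy data with non-consecutive group labels
toy <- data.frame(Group = c(10L, 10L, 25L),
                  Value = c(1L, 2L, 3L))
parts <- split(toy, toy$Group)

# Write into a temporary directory so the working directory is left untouched
out_dir <- file.path(tempdir(), "split_groups_demo")
dir.create(out_dir, showWarnings = FALSE)

# One file per group, named after the group label
paths <- file.path(out_dir, paste0("group_", names(parts), ".csv"))
invisible(Map(function(df, path) write.csv(df, file = path, row.names = FALSE),
              parts, paths))

# Read everything back into a list named by group label
restored <- lapply(setNames(paths, names(parts)), read.csv)
```

Here restored[["10"]] holds the two rows of group 10, read back from group_10.csv.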

OTHER TIPS

Obligatory plyr version (pretty much equivalent to Richard's, but I'll bet it's slower, too):

library(plyr)

groups <- dlply(dat, .(Group), function(x) { return(x) })

length(groups)
## [1] 3

groups$`1` # can also do groups[[1]]
##   Group Element Value Note
## 1     1     AAA    11 Good
## 2     1     ABA    12 Good
## 3     1     AVA    13 Good

groups[[2]]
##   Group Element Value Note
## 1     2     CBA    14 Good
## 2     2     FDA    14 Good
Licensed under: CC-BY-SA with attribution