Question

I've got myself in a little jam, and there is probably a better way to describe what I want to do (will edit if needed).

What I have is a data frame, x, representing some observations. I would like to create a different data frame, y, that holds all distinct combinations of some variables from x and in which one of the columns is a list column: each row contains the values of another variable from x that belong to that combination.

I've simplified this into an example, here is x:

x <- data.frame(a = c(1,1,1,1,1,1,1,2,2,2), b = c(rep(11:12, 4), 16, 17), c = 101:110)

   a  b   c
1  1 11 101
2  1 12 102
3  1 11 103
4  1 12 104
5  1 11 105
6  1 12 106
7  1 11 107
8  2 12 108
9  2 16 109
10 2 17 110

And here is y (distinct combos of a,b in x):

y <- unique(x[c("a", "b")])
row.names(y) <- NULL

  a  b
1 1 11
2 1 12
3 2 12
4 2 16
5 2 17

What I want to do is to transform y into this:

  a  b                  c
1 1 11 101, 103, 105, 107
2 1 12      102, 104, 106
3 2 12                108
4 2 16                109
5 2 17                110

Where "c" in each row contains values of c from x collected into a list.

I'd like to find a nice succinct and idiomatic way of doing this, but will settle for anything that does the job.


Solution

This is compact, but it looks a bit cryptic:

aggregate(c ~ a + b, x, I)
#   a  b                  c
# 1 1 11 101, 103, 105, 107
# 2 1 12      102, 104, 106
# 3 2 12                108
# 4 2 16                109
# 5 2 17                110

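The I function (you can also use c) returns each group's values unchanged. Because the groups here have different lengths, aggregate cannot simplify the result to a vector or matrix, so the third column ends up as a list. You don't need to create a separate data.frame for the unique combinations of "a" and "b"; just use them as the grouping variables in aggregate.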
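To check that the third column really is a list, and to pull individual groups back out, something like the following sketch may help. It rebuilds x from the question, uses c as the collecting function, and passes simplify = FALSE, which is an addition of mine rather than part of the call above: it should keep the column a list even in the case where every group happens to have the same number of values.

```r
# Same example data as in the question
x <- data.frame(a = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2),
                b = c(rep(11:12, 4), 16, 17),
                c = 101:110)

# simplify = FALSE asks aggregate not to collapse the per-group results
# into a vector or matrix, so c stays a list column
res <- aggregate(c ~ a + b, x, c, simplify = FALSE)

is.list(res$c)    # is the column a list?
res$c[[1]]        # values of c for the first row (a = 1, b = 11)
lengths(res$c)    # how many values were collected for each (a, b) pair
```

Use [[ ]] rather than [ ] to get at the values themselves; res$c[1] returns a one-element list.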


Of course, there are many other ways to do this.

Here's data.table:

library(data.table)
X <- as.data.table(x)
X[, list(c = list(c)), by = list(a, b)]
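If you would rather start from the y built in the question, a plain base R sketch is to look up the matching values of c row by row. This is just one more way to do it, not necessarily the fastest, and it assumes x is small enough that a row-wise lookup is acceptable.

```r
# Same example data as in the question
x <- data.frame(a = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2),
                b = c(rep(11:12, 4), 16, 17),
                c = 101:110)

# Distinct (a, b) combinations, as in the question
y <- unique(x[c("a", "b")])
row.names(y) <- NULL

# For each row of y, collect the values of c from the matching rows of x;
# assigning a list with one element per row creates a list column
y$c <- lapply(seq_len(nrow(y)), function(i) {
  x$c[x$a == y$a[i] & x$b == y$b[i]]
})

y
```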
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow