Normalize data frame with list column [duplicate]

https://stackoverflow.com/questions/22128300

19-10-2022

Question

I have a data frame that maps some ids to a list of versions:

id versions
 1  1, 2, 4
 2        1
 3     3, 4

It can be created with the following code:

df <- data.frame(id = c(1, 2, 3),
                 versions = c("1 2 4", "1", "3 4"),
                 stringsAsFactors = FALSE)
# Split each string on spaces; the column becomes a list of character vectors
df$versions <- strsplit(df$versions, " ")

Notice that the versions column is a list: each of its elements is a character vector holding the versions for one id.
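
A quick way to confirm that structure, as a minimal sketch in base R (it rebuilds the same df so it can be run on its own):

```r
df <- data.frame(id = c(1, 2, 3),
                 versions = c("1 2 4", "1", "3 4"),
                 stringsAsFactors = FALSE)
df$versions <- strsplit(df$versions, " ")

col_is_list <- is.list(df$versions)       # the column itself is a list
elem_class  <- class(df$versions[[1]])    # each element is a character vector
n_versions  <- lengths(df$versions)       # number of versions per id
```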

How can I normalize this data frame? I need a data frame with one row per id and version, like this:

id version
 1       1
 1       2
 1       4
 2       1
 3       3
 3       4

Solution

stack is a good fit for this, once the list elements are named by id:

stack(setNames(df$versions, df$id))
#   values ind
# 1      1   1
# 2      2   1
# 3      4   1
# 4      1   2
# 5      3   3
# 6      4   3
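
stack concatenates the elements of a named list into the values column and records the name of the originating element in ind. If you want the id/version layout with numeric columns, something along these lines should work. This is only a sketch: the names long and result, and the choice to convert both columns to numeric, are my own assumptions rather than part of the answer.

```r
df <- data.frame(id = c(1, 2, 3),
                 versions = c("1 2 4", "1", "3 4"),
                 stringsAsFactors = FALSE)
df$versions <- strsplit(df$versions, " ")

long <- stack(setNames(df$versions, df$id))

# ind is a factor built from the list names, so go through as.character()
# before as.numeric(); values holds the strings produced by strsplit()
result <- data.frame(id      = as.numeric(as.character(long$ind)),
                     version = as.numeric(as.character(long$values)))
```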

Other tips

I adapted and simplified the solution from another SO question for future reference:

data.frame(id = rep(df$id, sapply(df$versions, length)),
           version = unlist(df$versions))

The new id column is computed by repeating each id once per version it has (i.e., the length of the corresponding element of the versions list). The new version column is computed with unlist, which simply concatenates all elements of the list into a single vector.
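
The same computation can be broken into its intermediate steps. This is a sketch under a couple of assumptions of mine: the variable names are illustrative, lengths() stands in for sapply(..., length), and the final as.numeric() conversion is optional (without it, version stays character because strsplit returns strings).

```r
df <- data.frame(id = c(1, 2, 3),
                 versions = c("1 2 4", "1", "3 4"),
                 stringsAsFactors = FALSE)
df$versions <- strsplit(df$versions, " ")

n    <- lengths(df$versions)        # how many versions each id has
ids  <- rep(df$id, times = n)       # each id repeated once per version
vers <- unlist(df$versions)         # all versions concatenated, in order

normalized <- data.frame(id = ids, version = as.numeric(vers))
```

Note that an id whose list element is empty is repeated zero times, so it would not appear in the result at all.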

License: CC-BY-SA with attribution
