Question
I have a data frame that maps some ids to a list of versions:
id  versions
1   1, 2, 4
2   1
3   3, 4
It can be created with the following code:
df <- data.frame(id = c(1, 2, 3),
                 versions = c("1 2 4", "1", "3 4"),
                 stringsAsFactors = FALSE)
df$versions <- strsplit(df$versions, " ")
Notice that the versions column is a list: each element is a character vector holding the versions for that id.
How can I normalize this data frame? I need a data frame with one row per id/version pair, like this:
id  version
1   1
1   2
1   4
2   1
3   3
3   4
Solution
stack would be perfect for this:
stack(setNames(df$versions, df$id))
#   values ind
# 1      1   1
# 2      2   1
# 3      4   1
# 4      1   2
# 5      3   3
# 6      4   3
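Note that stack names its columns values and ind, returns ind as a factor, and puts the values first. If you need the exact id/version layout from the question, something like the following sketch should work. The names stacked and result are just illustrative, and the conversion to numeric assumes every version string is a plain number:

```r
# Recreate the example data from the question
df <- data.frame(id = c(1, 2, 3),
                 versions = c("1 2 4", "1", "3 4"),
                 stringsAsFactors = FALSE)
df$versions <- strsplit(df$versions, " ")

# stack() returns the columns `values` (character) and `ind` (factor)
stacked <- stack(setNames(df$versions, df$id))

# Rename, reorder, and convert both columns to numeric.
# Go through as.character() so the factor labels are converted,
# not the underlying integer codes.
result <- data.frame(id = as.numeric(as.character(stacked$ind)),
                     version = as.numeric(stacked$values))
```

The as.character() step matters whenever the ids are not simply 1, 2, 3, ...: calling as.numeric() directly on a factor returns its internal level codes rather than the original ids.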
Other answers
I adapted and simplified the solution from another SO question for future reference:
data.frame(id = rep(df$id, sapply(df$versions, length)),
           version = unlist(df$versions))
The new id column is computed by repeating each id once per version it has (i.e., by the length of the corresponding element of the versions list). The new version column is computed with unlist, which concatenates all elements of the list into a single vector.
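As a possible shorthand, base R (3.2.0 and later) provides lengths(), which returns the length of each list element and can stand in for sapply(df$versions, length). A minimal sketch, assuming the df from the question; the name normalized is just illustrative:

```r
# Recreate the example data from the question
df <- data.frame(id = c(1, 2, 3),
                 versions = c("1 2 4", "1", "3 4"),
                 stringsAsFactors = FALSE)
df$versions <- strsplit(df$versions, " ")

# lengths() gives the number of versions per id: 3, 1, 2
n_versions <- lengths(df$versions)

# Repeat each id once per version and flatten the list column
normalized <- data.frame(id = rep(df$id, n_versions),
                         version = unlist(df$versions))
```

The version column holds the strings produced by strsplit, so wrap unlist(df$versions) in as.numeric() if you need numbers.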