Question

I have a data frame that maps some ids to a list of versions:

id versions
 1  1, 2, 4
 2        1
 3     3, 4

It can be created with the following code:

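# Build the example with versions as space-separated strings
df <- data.frame(id = c(1, 2, 3),
  versions = c("1 2 4", "1", "3 4"),
  stringsAsFactors = FALSE)
# Split each string, turning versions into a list column
df$versions <- strsplit(df$versions, " ")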

Notice that the versions column is a list: each element is a character vector holding the versions for that id.
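To make that structure concrete, here is a small sketch that inspects the column using base R only. The variable name `n_versions` is illustrative and not part of the original question.

```r
# Rebuild the example data from the question.
df <- data.frame(id = c(1, 2, 3),
                 versions = c("1 2 4", "1", "3 4"),
                 stringsAsFactors = FALSE)
df$versions <- strsplit(df$versions, " ")

# The column itself is a list ...
is.list(df$versions)

# ... and each element is a character vector, as returned by strsplit().
class(df$versions[[1]])

# Number of versions per id (illustrative name).
n_versions <- lengths(df$versions)
```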

How can I normalize this data frame? I need to get a data frame like this:

id version
 1       1
 1       2
 1       4
 2       1
 3       3
 3       4

Solution

stack would be perfect for this:

stack(setNames(df$versions, df$id))
#   values ind
# 1      1   1
# 2      2   1
# 3      4   1
# 4      1   2
# 5      3   3
# 6      4   3
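Note that stack names its columns values and ind, returns ind as a factor, and keeps values as character here. If the exact id/version layout from the question is needed, a minimal sketch could look like the following. The name `long` is illustrative, and converting both columns to numeric is an assumption, since the question does not state the desired types.

```r
# Rebuild the example data from the question.
df <- data.frame(id = c(1, 2, 3),
                 versions = c("1 2 4", "1", "3 4"),
                 stringsAsFactors = FALSE)
df$versions <- strsplit(df$versions, " ")

# stack() returns `values` (character here) and `ind` (a factor of the names).
long <- stack(setNames(df$versions, df$id))

# Reorder and rename the columns; the numeric conversion is an assumption.
# as.character() first, so the factor labels are converted rather than
# the underlying integer codes.
long <- data.frame(id = as.numeric(as.character(long$ind)),
                   version = as.numeric(long$values))
```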

Other tips

I adapted and simplified the solution from another SO question for future reference:

data.frame(id = rep(df$id, sapply(df$versions, length)),
      version = unlist(df$versions))

The new id column is computed by repeating each id once per version it has (i.e., the length of the corresponding element of the versions list). The new version column is computed with unlist, which simply concatenates all elements of the list into a single vector.
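The same idea can be sketched as a small reusable function. Here `unnest_versions` is a hypothetical helper name, and `lengths()` (base R 3.2.0 and later) stands in for `sapply(df$versions, length)`. The column names id and versions are hard-coded to match the question.

```r
# Hypothetical helper: one output row per element of the versions list column.
unnest_versions <- function(df) {
  data.frame(
    id = rep(df$id, lengths(df$versions)),  # repeat each id once per version
    version = unlist(df$versions),          # flatten the list into one vector
    stringsAsFactors = FALSE
  )
}

# Rebuild the example data from the question and apply the helper.
df <- data.frame(id = c(1, 2, 3),
                 versions = c("1 2 4", "1", "3 4"),
                 stringsAsFactors = FALSE)
df$versions <- strsplit(df$versions, " ")

result <- unnest_versions(df)
```

Because strsplit produces character vectors, the version column stays character; wrap it in as.numeric if numbers are needed.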

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow