One of your actual problems is using the incorrect arguments to paste. You are looking for collapse, not sep. Another problem is using "data.table" syntax incorrectly.
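To make the difference between the two arguments concrete, here is a minimal sketch on a made-up vector (the name x and its values are only for illustration and are not taken from the question's data):

```r
x <- c("b", "a", "c")  # hypothetical values, not from the question

# sep is placed between the *arguments* of paste, so a single
# vector comes back element by element, unchanged
paste(x, sep = " ")
# [1] "b" "a" "c"

# collapse joins the *elements* of the result into one string
paste(x, collapse = " ")
# [1] "b a c"

# Combined with sort, as in the code further down
paste(sort(x), collapse = " ")
# [1] "a b c"
```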
Update
Considering the comments to this answer, I would suggest something like this instead:
library(data.table)
library(reshape2)
DT <- as.data.table(repex)
setkey(DT, cat, year, org) ## Sorts everything
## Creates a column "var" with the sequence of values ("V1", "V2", and so on)
DT[, var := paste("V", sequence(.N), sep = ""), by = list(cat, year)]
head(DT)
#    cat year org var
# 1:   x 1980   a  V1
# 2:   x 1980   b  V2
# 3:   x 1982   a  V1
# 4:   x 1982   c  V2
# 5:   x 1990   d  V1
# 6:   x 1991   f  V1
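If it helps to see what the per-group counter is doing, the same idea can be sketched in base R with ave. This is not the "data.table" code above, just a rough equivalent under the assumption of a small made-up data frame (demo and its values are hypothetical):

```r
# Hypothetical toy data, already sorted by cat, year, org
demo <- data.frame(cat  = c("x", "x", "x", "y"),
                   year = c(1980, 1980, 1982, 1981),
                   org  = c("a", "b", "a", "a"),
                   stringsAsFactors = FALSE)

# ave() applies seq_along within each cat/year group, which mirrors
# sequence(.N) evaluated with by = list(cat, year)
demo$var <- paste0("V", ave(seq_along(demo$org), demo$cat, demo$year,
                            FUN = seq_along))
demo$var
# [1] "V1" "V2" "V1" "V1"
```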
Then convert that to a "wide" format:
dcast.data.table(DT, cat + year ~ var, value.var="org")
#     cat year V1 V2 V3 V4
#  1:   x 1980  a  b NA NA
#  2:   x 1982  a  c NA NA
#  3:   x 1990  d NA NA NA
#  4:   x 1991  f  j  k NA
#  5:   x 1993  e NA NA NA
#  6:   y 1981  a  b NA NA
#  7:   y 1983  d NA NA NA
#  8:   y 1990  b NA NA NA
#  9:   y 1996  c  e  h  m
# 10:   z 1994  e NA NA NA
# 11:   z 1999  h NA NA NA
# 12:   z 2002  b NA NA NA
Original answer
This is a pretty straightforward aggregate problem:
aggregate(org ~ cat + year, repex, function(x) paste(sort(x), collapse = " "))
#    cat year     org
# 1    x 1980     a b
# 2    y 1981     a b
# 3    x 1982     a c
# 4    y 1983       d
# 5    x 1990       d
# 6    y 1990       b
# 7    x 1991   f j k
# 8    x 1993       e
# 9    z 1994       e
# 10   y 1996 c e h m
# 11   z 1999       h
# 12   z 2002       b
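Since repex itself is not shown here, the following is a small self-contained sketch of the same call on made-up input (repex_demo is a hypothetical stand-in, not the question's data), in case you want something to run as-is:

```r
# Hypothetical stand-in for repex: one row per cat/year/org
repex_demo <- data.frame(cat  = c("x", "x", "x", "x", "y"),
                         year = c(1980, 1980, 1982, 1982, 1983),
                         org  = c("b", "a", "c", "a", "d"),
                         stringsAsFactors = FALSE)

# For each cat/year combination, sort the org values and
# collapse them into a single space-separated string
res <- aggregate(org ~ cat + year, repex_demo,
                 function(x) paste(sort(x), collapse = " "))
res
#   cat year org
# 1   x 1980 a b
# 2   x 1982 a c
# 3   y 1983   d
```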
A "data.table" approach:
library(data.table)
DT <- as.data.table(repex)
DT[, list(org = paste(sort(org), collapse = " ")), by = list(cat, year)]
And, to round things out, a "dplyr" approach:
library(dplyr)
## Note: the old %.% operator is no longer available in current "dplyr"; use %>% instead
repex %>% group_by(cat, year) %>% summarise(org = paste(sort(org), collapse = " "))