EDIT: a cleaner version, courtesy of Arun (note the key argument added to the data.table creation):
library(data.table)

dt1 <- data.table(
  id = c(123, 456, 456, 456, 123, 789),
  indicator = c("abc", NA, NA, NA, "abcd", "abc"),
  key = c("id", "indicator")
)
dt1[,
  list(indicator =
    # is.na() guard: nchar(NA_character_) is NA (not 2) in R >= 3.3.0
    if (!is.na(indicator) && nchar(indicator) > 2)
      paste0(indicator, c("", 2:(max(2, .N))))
    else
      rep(indicator, .N)
  ),
  by = list(indicator, id)
][, -1]
#     id indicator
# 1: 123       abc
# 2: 123      abc2
# 3: 123      abcd
# 4: 123     abcd2
# 5: 456        NA
# 6: 456        NA
# 7: 456        NA
# 8: 789       abc
# 9: 789      abc2
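As a rough illustration of what the key argument does, the same keyed table can be built in two steps with setkey(). This is only a sketch: `dt2` is a name made up here so that `dt1` is left untouched.

```r
library(data.table)

# Same data as above, created without the key argument
# (dt2 is a hypothetical name used only for this illustration)
dt2 <- data.table(
  id = c(123, 456, 456, 456, 123, 789),
  indicator = c("abc", NA, NA, NA, "abcd", "abc")
)

# setkey() sorts the table by reference and marks it as keyed
setkey(dt2, id, indicator)

key(dt2)  # "id" "indicator"
dt2$id    # 123 123 456 456 456 789
```

Because the rows are already sorted by id and indicator, the grouped result comes out in that order without a separate order() call.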
Old version
There is probably a more elegant way, but this will do it. Basically, you rbind the rows that don't meet your condition with those that do, the latter modified by appending a numeric suffix (or "" for the first one). Note that non-unique id/indicator pairs only add further numeric suffixes from the third occurrence on: a pair that appears once or twice yields two rows (123-abc, 123-abc2), while a pair that appears three times yields 123-abc, 123-abc2, 123-abc3.
dt1 <- data.table(
  id = c(123, 456, 456, 456, 123, 789),
  indicator = c("abc", NA, NA, NA, "abcd", "abc")
)
rbind(
  # rows that do not meet the condition are kept unchanged
  dt1[nchar(indicator) <= 2 | is.na(indicator)],
  # rows that do: one output row per suffix ("", 2, ...)
  dt1[
    nchar(indicator) > 2,
    list(indicator = paste0(indicator, c("", 2:(max(2, .N))))),
    by = list(indicator, id)
  ][, -1]
)[order(id, indicator)]
#     id indicator
# 1: 123       abc
# 2: 123      abc2
# 3: 123      abcd
# 4: 123     abcd2
# 5: 456        NA
# 6: 456        NA
# 7: 456        NA
# 8: 789       abc
# 9: 789      abc2
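To see how many rows a group of a given size produces, the suffix expression can be evaluated on its own in base R. A minimal sketch, where `suffixes` is a hypothetical helper (not part of the code above) and `n` stands in for `.N`:

```r
# Hypothetical helper mirroring c("", 2:(max(2, .N))) for a group of size n
suffixes <- function(n) c("", 2:max(2, n))

suffixes(1)  # "" "2"
suffixes(2)  # "" "2"
suffixes(3)  # "" "2" "3"

# Pasting onto a single indicator value gives one output row per suffix
paste0("abc", suffixes(3))  # "abc" "abc2" "abc3"
```

The max(2, n) is what guarantees at least two rows per qualifying group, even when the id/indicator pair occurs only once.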