Question

I'd like to add a column to a data.table object which lists the column names that are NA for that row. For example let's say I have the following data.table:

dt <- data.table(a = c(1, 2, 3, NA), 
                 b = c(1, 2, NA, NA), 
                 c = c(NA, 2, NA, 4))
    a  b  c        
1:  1  1 NA        
2:  2  2  2        
3:  3 NA NA        
4: NA NA  4

I'd like to add a column with these values, resulting in the below data.table:

dt[, na.cols := c("c", "", "b,c", "a,b")]
    a  b  c na.cols        
1:  1  1 NA       c
2:  2  2  2        
3:  3 NA NA     b,c
4: NA NA  4     a,b

How can I add this column dynamically?

Solution

Here is an approach that avoids using apply on a data.table (apply coerces it to a matrix internally):

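dt[, na.cols := {
  # per column: the column name where the value is NA, '' elsewhere
  flags <- lapply(seq_along(.SD), function(x) ifelse(is.na(.SD[[x]]), names(.SD)[x], ''))
  # paste row-wise, collapse comma runs left by non-NA columns, then trim the ends
  gsub('^,|,$', '', gsub(',+', ',', do.call(paste, c(flags, sep = ','))))
}]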
#     a  b  c na.cols
# 1:  1  1 NA       c
# 2:  2  2  2        
# 3:  3 NA NA     b,c
# 4: NA NA  4     a,b
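
As a rough sketch of what this paste-and-clean idea does step by step, the snippet below runs it outside of the data.table call. The fifth row and the names flags, joined and na_cols are not from the question; they are assumptions added for illustration, to show a row where a non-NA column sits between two NA columns.

```r
library(data.table)

# The question's data plus one extra, hypothetical row (a = NA, b = 5, c = NA)
dt <- data.table(a = c(1, 2, 3, NA, NA),
                 b = c(1, 2, NA, NA, 5),
                 c = c(NA, 2, NA, 4, NA))

# Step 1: per column, the column name where the value is NA, "" elsewhere
flags <- lapply(names(dt), function(nm) ifelse(is.na(dt[[nm]]), nm, ""))

# Step 2: paste the per-column vectors together element-wise, i.e. row by row
joined <- do.call(paste, c(flags, sep = ","))
# joined: ",,c"  ",,"  ",b,c"  "a,b,"  "a,,c"

# Step 3: collapse runs of commas, then strip a leading or trailing comma
na_cols <- gsub("^,|,$", "", gsub(",+", ",", joined))
# na_cols: "c"  ""  "b,c"  "a,b"  "a,c"
```

Stripping only the leading and trailing commas would leave the last row as "a,,c", which is why the runs of commas are collapsed first.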

OTHER TIPS

You could do it this way:

dt[, na.cols := 
   apply(dt, 1, function(row) paste(names(row)[which(is.na(row))],
                                    collapse=","))]  

Details: apply runs along margin 1 (i.e. over the rows) and, for each row, pastes together the names of the columns whose value is NA. Note that apply converts the data.table to a matrix first.
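
A variant of the same row-wise idea, offered as a sketch rather than as the answerer's code: calling is.na() on the columns first gives a logical matrix, so apply only has to pick out names. The cols vector is an assumption introduced here; passing it through .SDcols keeps a previously added na.cols column out of the calculation if the line is run again.

```r
library(data.table)

dt <- data.table(a = c(1, 2, 3, NA),
                 b = c(1, 2, NA, NA),
                 c = c(NA, 2, NA, 4))

# Columns to inspect (assumed here; adjust to your own table)
cols <- c("a", "b", "c")

# is.na(.SD) is a logical matrix with the column names as colnames;
# each row r is a named logical vector, so names(r)[r] are the NA columns
dt[, na.cols := apply(is.na(.SD), 1,
                      function(r) paste(names(r)[r], collapse = ",")),
   .SDcols = cols]

# Running the same line again gives the same result,
# because na.cols itself is not part of .SD
dt[, na.cols := apply(is.na(.SD), 1,
                      function(r) paste(names(r)[r], collapse = ",")),
   .SDcols = cols]
```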

Licensed under: CC-BY-SA with attribution