Here is another ddply alternative in which you don't have to specify the names of the variables the function should be applied to. With numcolwise, the function operates on all numeric columns.
library(plyr)
myfun <- function(x){
  # Replace NAs with 0, but only if the vector contains at least one non-missing zero
  x[is.na(x) & (sum(!is.na(x) & x == 0) > 0)] <- 0
  x
}
ddply(df, .(FIRMID), numcolwise(myfun))
#   FIRMID VAR1 VAR2
# 1  FIRM1    0    1
# 2  FIRM1    0   NA
# 3  FIRM2    1    0
# 4  FIRM2   NA    0
Or in base R, where I assume that the first column contains the grouping variable, which is why it is dropped with dat[ , -1]. You could of course refer to it by name instead.
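# Note: this assumes df is already sorted by FIRMID, so the rows of df2 line up with df$FIRMID
df2 <- do.call(rbind, by(df, df[ , "FIRMID"], function(dat){
  # drop the grouping column (first column) and apply myfun to every remaining column
  sapply(dat[ , -1], myfun)
}))
data.frame(FIRMID = df$FIRMID, df2)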
#   FIRMID VAR1 VAR2
# 1  FIRM1    0    1
# 2  FIRM1    0   NA
# 3  FIRM2    1    0
# 4  FIRM2   NA    0
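For completeness, here is a rough sketch of the "refer to it by name" variant mentioned above. The sample df is only my guess at data that would give the printed result, and grp, df3 and res are names I made up for illustration.

```r
# Hypothetical sample data (an assumption, chosen to be consistent with the output above)
df <- data.frame(FIRMID = c("FIRM1", "FIRM1", "FIRM2", "FIRM2"),
                 VAR1 = c(0, NA, 1, NA),
                 VAR2 = c(1, NA, NA, 0))

myfun <- function(x){
  x[is.na(x) & (sum(!is.na(x) & x == 0) > 0)] <- 0
  x
}

grp <- "FIRMID"  # name of the grouping column (illustrative variable)

df3 <- do.call(rbind, by(df, df[[grp]], function(dat){
  # drop the grouping column by name instead of by position
  sapply(dat[ , setdiff(names(dat), grp), drop = FALSE], myfun)
}))

# as before, this relies on df already being sorted by the grouping column
res <- data.frame(FIRMID = df[[grp]], df3)
res
```

With this approach the grouping column can sit anywhere in the data frame, not just in the first position.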
Update: myfun can be written much more simply. Thanks to @Arun for the suggestion!
myfun <- function(x){
  x[is.na(x) & any(x == 0)] <- 0
  x
}
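A quick sanity check that the two versions behave the same, as far as I can tell. The names myfun_long and myfun_short and the two test vectors are made up for this sketch. The short version works because any() returns TRUE as soon as one comparison is TRUE, even when NAs are present. When there is no zero, any() returns NA, and an NA in a logical index is skipped when a single value is assigned. If you prefer to make that explicit, any(x == 0, na.rm = TRUE) should give the same result.

```r
# original, longer version
myfun_long <- function(x){
  x[is.na(x) & (sum(!is.na(x) & x == 0) > 0)] <- 0
  x
}

# simplified version
myfun_short <- function(x){
  x[is.na(x) & any(x == 0)] <- 0
  x
}

with_zero    <- c(0, NA, 2)  # contains a zero: the NA should become 0
without_zero <- c(1, NA, 2)  # no zero: the NA should be left alone

same_with    <- identical(myfun_long(with_zero), myfun_short(with_zero))
same_without <- identical(myfun_long(without_zero), myfun_short(without_zero))
c(same_with, same_without)
```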