Question

I am using R to generate examples of how to deal with missing data for the statistics class I am teaching. One method requires generating a "missing values binary variable", coded 0 for cases that contain missing values and 1 for cases with none. For example:

n  X   Y     Z
1  4   300   2
2  8   400   4
3  10  500   7
4  18  NA    10
5  20  50    NA
6  NA  1000  5

I would like to generate a variable m, such that:

n m  
1 1  
2 1   
3 1  
4 0  
5 0  
6 0  

It seems this should be simple, given R's ability to handle missing values. The closest I have found is m <- ifelse(is.na(missguns), 0, 1), but all this does is generate an entire new matrix with 0 or 1 indicating missingness in each cell. However, I just want one variable indicating whether a row contains missing values.


Solution

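complete.cases() does exactly what you want: it returns a logical vector that is TRUE for each row with no missing values. With x as your data frame: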

complete.cases(x)
## [1]  TRUE  TRUE  TRUE FALSE FALSE FALSE

You can coerce to numeric or integer:

as.integer(complete.cases(x))
## [1] 1 1 1 0 0 0
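
For a version you can run from start to finish, here is a minimal sketch that rebuilds the example data from the question. The data frame name missguns is borrowed from the asker's attempt, and m_alt is a hypothetical name used only for illustration. It also shows how the is.na() approach from the question could be collapsed to one value per row with rowSums(), which should give the same result.

```r
# Rebuild the example data from the question (column n is just the row number,
# so it is left out here).
missguns <- data.frame(
  X = c(4, 8, 10, 18, 20, NA),
  Y = c(300, 400, 500, NA, 50, 1000),
  Z = c(2, 4, 7, 10, NA, 5)
)

# 1 = row has no missing values, 0 = row has at least one NA
missguns$m <- as.integer(complete.cases(missguns))

# Alternative building on the is.na() matrix from the question:
# count the NAs in each row and flag rows where the count is zero.
m_alt <- as.integer(rowSums(is.na(missguns[c("X", "Y", "Z")])) == 0)

print(missguns)
```

Note that complete.cases() is called before m is added, so only X, Y and Z are checked. If you prefer not to modify the data frame, assign the result to a separate vector instead.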
Licensed under: CC-BY-SA with attribution