For practically any data structure X containing numerics, use
X[is.na(X)] <- 0
Your question is slightly unclear though: you have indicated that you mean <NA> not NA, without explaining what type <NA> is.
If it is the string "<NA>" you mean, then
X[X=="<NA>"] <- "0"
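If you have mixed data types in your data frame, check the type too. Note that is.character(X) on a whole data frame returns a single FALSE, so the check has to be made column by column:
X[] <- lapply(X, function(col) { if (is.character(col)) col[col %in% "<NA>"] <- "0"; col })
The same check is more useful in the numeric case:
X[] <- lapply(X, function(col) { if (is.numeric(col)) col[is.na(col)] <- 0; col })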
This is a very common idiom for dealing with missing data in R, although you should also look at the parameter na.rm = TRUE, which many functions such as mean, sum, etc. will accept.
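As a small illustration of the difference between na.rm = TRUE and replacing NA with 0, here is a sketch on a made-up vector (the names v and v0 are hypothetical):

```r
# Hypothetical numeric vector with one missing value
v <- c(1, NA, 3)

mean(v)                 # NA: the missing value propagates
mean(v, na.rm = TRUE)   # 2: NA is dropped before averaging
sum(v, na.rm = TRUE)    # 4

# Replacing NA with 0 is not equivalent: the 0 is counted in the mean
v0 <- v
v0[is.na(v0)] <- 0
mean(v0)                # 4/3
```

Which of the two you want depends on whether a missing value really means zero in your data.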
This strategy will fail for a factor, because you cannot add new factor levels by assigning to the value of a factor. I haven't used read.spss, but looking at the documentation, I suggest you add the use.value.labels = FALSE argument to your call, to avoid creating factors in the first place.
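To see the factor problem in isolation, here is a minimal sketch on a made-up factor (f, g and h are hypothetical names). The second half shows one possible workaround, adding the level before assigning, which is not something the question requires:

```r
# Hypothetical factor with one missing value
f <- factor(c("a", NA, "b"))

g <- f
g[is.na(g)] <- "0"    # warning: "0" is not an existing level, so NA is generated
anyNA(g)              # TRUE: nothing was actually replaced

# Possible workaround: add the level first, then assign
h <- f
levels(h) <- c(levels(h), "0")
h[is.na(h)] <- "0"
as.character(h)       # "a" "0" "b"
```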
In your specific case, your entire data frame is of the same type (factor). This means it's safe to convert to a character matrix:
> class(mydata[[1]])
[1] "factor"
> mydataM <- as.matrix(mydata)
> mode(mydataM)
[1] "character"
Now you can replace the NA values (as.matrix keeps them as real NA, not as the string "<NA>"):
mydataM[is.na(mydataM)] <- "0"
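A self-contained sketch of the all-factor case, using a small made-up stand-in for mydata and assuming the missing entries are real NA values rather than a level literally named "<NA>":

```r
# Hypothetical stand-in for a data frame made up entirely of factors
mydata <- data.frame(a = factor(c("x", NA)), b = factor(c(NA, "y")))

m <- as.matrix(mydata)   # factors become character, NA stays NA
mode(m)                  # "character"

m[is.na(m)] <- "0"       # plain matrix assignment, no factor levels involved
m
```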
In the more general case where you have unwanted factor columns mixed in with other types, you need to do something a little more complex:
myDataM <- as.data.frame(lapply(mydata, function(col) if (is.factor(col)) as.character(col) else col),
                         stringsAsFactors = FALSE)  # keep as.data.frame from re-creating factors on R < 4.0
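Putting the general case together, here is a sketch on a made-up mixed data frame (df and its columns are hypothetical). It converts factor columns to character and then fills NA with a replacement matching each column's type:

```r
# Hypothetical data frame mixing factor, numeric and character columns
df <- data.frame(
  f = factor(c("a", NA, "b")),
  n = c(1, NA, 3),
  s = c("x", "y", NA),
  stringsAsFactors = FALSE
)

# Convert factor columns to character; df[] <- keeps the data frame structure
df[] <- lapply(df, function(col) if (is.factor(col)) as.character(col) else col)

# Replace NA column by column, using 0 for numeric and "0" for character columns
df[] <- lapply(df, function(col) {
  if (is.numeric(col)) {
    col[is.na(col)] <- 0
  } else if (is.character(col)) {
    col[is.na(col)] <- "0"
  }
  col
})
```

Assigning into df[] rather than calling as.data.frame again avoids any question of character columns being turned back into factors.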