I have a data frame in which several kinds of values need to be replaced with NA in some columns, while in other columns the same values are valid data and must be kept. For example:

df <- data.frame(
  x1 = c("1999-09-09", "2013-01-02", "2013-06-08", "1999-09-09", "2013-06-08", "2013-06-08"),
  x2 = c(1, 2, 3, 4, 5, 9),
  x3 = c(7, 8, 9, 9, 12, 9),
  x4 = c(78, 88, 99, 9, 12, 999)
)
df
          x1 x2 x3  x4
1 1999-09-09  1  7  78
2 2013-01-02  2  8  88
3 2013-06-08  3  9  99
4 1999-09-09  4  9   9
5 2013-06-08  5 12  12
6 2013-06-08  9  9 999

Here "1999-09-09", 9, and 99 are the missing-value codes for x1, x2, and x4 respectively, while 9 is a valid observation in x3. What is the best way to handle this? I have nearly 100 data frames and want to write a simple function for this purpose. If I have miss <- c("1999-09-09", 9, "", 99), how can I apply it to df in a simple way so that those values are replaced with NA?

Furthermore, suppose there is another, similar data frame in which all of these values are valid. How can I tell the two cases apart when working with multiple data frames?


Solution

Try this:

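# One missing-value code per column, in column order; NA means the column has none (x3).
# c() coerces the mixed values to character, so 9 and 99 are compared as "9" and "99".
miss <- c("1999-09-09", 9, NA, 99)
data.frame(Map(
  function(x, y) { x[x == y] <- NA; x },
  df,
  miss
))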

Result:

          x1 x2 x3  x4
1       <NA>  1  7  78
2 2013-01-02  2  8  88
3 2013-06-08  3  9  NA
4       <NA>  4  9   9
5 2013-06-08  5 12  12
6 2013-06-08 NA  9 999
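
The approach above allows only one code per column and relies on column order. Since the question mentions nearly 100 data frames and more than one code (for example "" as well as "1999-09-09"), here is a rough sketch of a reusable variant. The names replace_missing, codes, dfs, survey_a, and survey_b are made up for illustration and do not come from the original answer; it also assumes the codes are matched by column name rather than by position.

```r
# Hypothetical helper: replace per-column missing-value codes with NA.
# `codes` is a named list: each name is a column, each element is the vector
# of values that mean "missing" in that column. Unnamed columns are untouched.
replace_missing <- function(df, codes) {
  for (col in intersect(names(codes), names(df))) {
    # %in% never returns NA, so NAs already present in the column are safe
    df[[col]][df[[col]] %in% codes[[col]]] <- NA
  }
  df
}

df <- data.frame(
  x1 = c("1999-09-09", "2013-01-02", "2013-06-08", "1999-09-09", "2013-06-08", "2013-06-08"),
  x2 = c(1, 2, 3, 4, 5, 9),
  x3 = c(7, 8, 9, 9, 12, 9),
  x4 = c(78, 88, 99, 9, 12, 999),
  stringsAsFactors = FALSE
)

# x3 is not listed, so its 9s are kept as valid data
codes <- list(x1 = c("1999-09-09", ""), x2 = 9, x4 = 99)
cleaned <- replace_missing(df, codes)

# For many data frames, keep them in a named list and pair each one with its
# own codes; an empty list means "every value is valid in this data frame".
dfs <- list(survey_a = df, survey_b = df)
codes_by_df <- list(survey_a = codes, survey_b = list())
cleaned_all <- Map(replace_missing, dfs, codes_by_df)
```

Keeping the codes in a separate named list per data frame is one way to handle the follow-up question: a data frame in which all of these values are valid simply gets an empty list and passes through unchanged.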
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow