Question

I have the following data frame, in which I want to replace all occurrences of "Blank(s)" and all NA values.

dat <- data.frame(
    "a"=c("Blank(s)", "1", "2", "Blank(s)", NA),
    "b"=c("Blank(s)", "1", "2", "Blank(s)", NA),
    "c"=c("Blank(s)", "1", "2", "Blank(s)", NA),
    "d"=c("Blank(s)", "1", "2", "Blank(s)", NA),
    "e"=c("Blank(s)", "1", "2", "Blank(s)", NA),
    "f"=c("Blank(s)", "1", "2", "Blank(s)", NA)
)

For the NA's I have successfully found a wonderful solution by Muhammad Ariz:

x <- c(rnorm(5),rep(NA,3),rnorm(5))    # sample data 
dat <- data.frame(x,x)                 # make sample dataframe 
dat2 <- as.matrix(dat)                 # convert to matrix 
y <- which(is.na(dat)==TRUE)           # get index of NA values 
dat2[y] <- "your string"               # replace all NA values 

and just use as.data.frame(dat2) to convert the matrix to a dataframe again.

To add a condition for "Blank(s)" I tried doing y <- which(is.na(dat3)==TRUE || dat3=="Blank(s)") but nothing happened.

I want to know whether I can combine these conditions so that next time I can just add a string, a vector, or an is.* function, like a good old find-and-replace mechanism, i.e.:

y <- which(is.na(dat3)==TRUE || is.character(dat3)==TRUE || 
    is.equal(dat3)=="Blank(s)" || is.equal(dat3)==-1 || ...)

Note: I would love to have a function that performs fast because my original dataframe has 500,000 observations and 55 variables.


Solution

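Try using a single | instead of || in y <- which(is.na(dat3)==TRUE || dat3=="Blank(s)"). The || operator is meant for a single TRUE/FALSE condition (as in an if statement) and does not compare element by element, whereas | is vectorized and tests every cell.

So:

y <- which(is.na(dat3)==TRUE | dat3=="Blank(s)")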
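As a minimal sketch of how this could be used end to end, the example below first applies the single | fix with the matrix approach from the question, then shows one possible way to generalize it into a find-and-replace helper. The helper name replace_values and its arguments targets and replacement are hypothetical, not part of the original answer, and the sample data is a shortened two-column version of the question's data frame with a "-1" added for illustration. Note the assumption that every column is converted to character, as also happens with as.matrix() on character data.

```r
# Small sample in the shape of the question's data (shortened to two columns).
dat3 <- data.frame(
  a = c("Blank(s)", "1", "2", "Blank(s)", NA),
  b = c("Blank(s)", "-1", "2", "Blank(s)", NA),
  stringsAsFactors = FALSE
)

# Approach from the answer: element-wise OR (|) on the matrix form.
m <- as.matrix(dat3)
y <- which(is.na(m) | m == "Blank(s)")   # linear indices of cells to replace
m[y] <- "your string"
fixed <- as.data.frame(m, stringsAsFactors = FALSE)

# Hypothetical helper: replace NA plus any value listed in `targets`.
# Assumption: all columns are coerced to character.
replace_values <- function(df, targets, replacement) {
  df[] <- lapply(df, function(col) {
    col <- as.character(col)             # also handles factor columns
    col[is.na(col) | col %in% targets] <- replacement
    col
  })
  df
}

cleaned <- replace_values(dat3,
                          targets = c("Blank(s)", "-1"),
                          replacement = "your string")
```

Adding another value to find is then just a matter of extending the targets vector. Working column by column with lapply() avoids building one large matrix, which may help with a data frame of 500,000 rows and 55 columns, though that is untested here.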
License: CC-BY-SA with attribution
Not affiliated with StackOverflow