Problem

I have a data frame of mostly numeric columns, each with relatively few unique values. I'd like to convert the columns with 20 or fewer unique values to factors as is, and the columns with more than 20 to factors using gtools::quantcut.

What am I not understanding about the behavior of ifelse within lapply?

library(gtools)  # provides quantcut()

d <- data.frame(a = sample(1:10, 100, replace = TRUE),
                b = sample(1:10, 100, replace = TRUE),
                c = sample(1:30, 100, replace = TRUE),
                d = sample(1:30, 100, replace = TRUE),
                e = sample(1:30, 100, replace = TRUE))

# Attempt with ifelse(): every column collapses to a single row
wrong <- as.data.frame(lapply(d[, sapply(d, is.numeric)],
                              function(x) ifelse(length(unique(x)) <= 20,
                                                 as.factor(x),
                                                 quantcut(x))))
dim(wrong)
# [1]  1 5

# Version with a plain if: all 100 rows are kept
right <- as.data.frame(lapply(d[, sapply(d, is.numeric)],
                              function(x) {
                                if (length(unique(x)) <= 20) {
                                  return(as.factor(x))
                                }
                                quantcut(x)
                              }))
dim(right)
# [1] 100    5

Solution

The problem is that you are asking ifelse to return a whole vector when its test argument is a scalar. ifelse is vectorized over test, so with a length-one test the wrong version above returns only the first element of the desired vector. From the help file: ifelse "returns a value with the same shape as test". When the condition is a single TRUE or FALSE, use an ordinary if statement, as your right version does.
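
The difference can be seen on a small vector without gtools. This is only a minimal sketch: the vector x and the names scalar_result and full_result are made up for illustration, and the no branch simply returns x unchanged instead of calling quantcut.

```r
x <- c(3, 1, 2, 3, 1)

# ifelse() is vectorized over its test. The test here has length 1,
# so the result has length 1 as well, whatever the length of yes/no.
scalar_result <- ifelse(length(unique(x)) <= 20, as.factor(x), x)
length(scalar_result)
# [1] 1

# if/else is ordinary control flow: it evaluates one branch and
# returns that branch's value unchanged, here the full factor.
full_result <- if (length(unique(x)) <= 20) as.factor(x) else x
length(full_result)
# [1] 5
is.factor(full_result)
# [1] TRUE
```

ifelse is the better fit when the test itself is a vector and each element should be chosen separately, for example ifelse(x > 1, "high", "low").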

License: CC-BY-SA with attribution
Not affiliated with StackOverflow