Question
Say I have a data frame where some entire columns are NA, like so:
set.seed(0)
data <- data.frame(A = rnorm(10, 10, 1),
                   B = rnorm(10, 12, 2),
                   C = rep(NA, 10))
If I apply min() across the columns, I get the output I would hope for:
apply(data, 2, min)
# A B C
# 8.460050 9.524923 NA
However, when I apply which.min(), my output is a list, and column C gives integer(0):
apply(data, 2, which.min)
# $A
# [1] 6
# $B
# [1] 10
# $C
# integer(0)
I can make it look the way I want with this rather ugly workaround:
which.mins <- unlist(apply(data, 2, which.min))
which.mins[names(data)[!(names(data) %in% names(which.mins))]] <- NA
which.mins
# A B C
# 6 10 NA
Is there a better way to do this that would mimic the output I get when using apply() with min()?
Solution
You're right, which.min returns integer(0) (a zero-length vector, not 0) if x has no non-NA values. You can still use apply and which.min like this:
apply(data, 2, function(x) {if (all(is.na(x))) {NA} else {which.min(x)} })
OTHER TIPS
Note that calling apply on a data.frame causes the data.frame to be coerced to a matrix before the function is applied. You should use sapply (or vapply) instead; otherwise you may get strange errors because all the columns of your data.frame are coerced to a common type (often character).
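To illustrate the coercion point, here is a minimal sketch. The data frame mixed and its column names are made up for this example and are not from the question; the idea is only to compare what apply and sapply hand to the function.

```r
# Hypothetical mixed-type data frame (not from the original post).
mixed <- data.frame(num = c(3, 1, 2),
                    chr = c("b", "a", "c"),
                    stringsAsFactors = FALSE)

# apply() converts the data frame to a matrix first, so here every
# column reaches the function as character.
coerced_classes <- apply(mixed, 2, class)

# sapply() iterates over the columns as they are, so each keeps its type.
kept_classes <- sapply(mixed, class)

print(coerced_classes)
print(kept_classes)
```

In the question's data the columns are numeric and logical, so the matrix ends up numeric and apply happens to work; the problem only shows up once a character or factor column is involved.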
Just test if the length of the result of which.min is zero and return NA in that case.
> # if() evaluates to FALSE if length(wm) is 0 because as.logical(0) is FALSE
> sapply(data, function(x) if(length(wm <- which.min(x))) wm else NA)
A B C
6 10 NA
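If you want the result type checked as well, the same length test can be used with vapply. This is only a sketch: the helper name which_min_or_na is invented here, and NA_integer_ is used so that every result is a single integer.

```r
set.seed(0)
data <- data.frame(A = rnorm(10, 10, 1),
                   B = rnorm(10, 12, 2),
                   C = rep(NA, 10))

# Hypothetical helper: position of the minimum, or NA_integer_ when the
# column has no non-NA values (which.min() then returns integer(0)).
which_min_or_na <- function(x) {
  wm <- which.min(x)
  if (length(wm)) wm else NA_integer_
}

# vapply() stops with an error unless each result is exactly one integer.
result <- vapply(data, which_min_or_na, integer(1))
print(result)
```

The benefit over sapply is that the output is always a named integer vector, never a list, even if every column is all NA.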
In the first example, min() returns NA for column C because, with the default na.rm = FALSE, any NA in the input makes the result NA. which.min() behaves differently: it discards NAs and returns the position of the first minimum as an integer vector of length one. Because the per-column results do not all have the same length, apply() cannot simplify them to a vector and returns a list instead:
str(apply(data, 2, which.min)[1])
List of 1
$ A: int 6
And since there is no minimum value in column C, which.min() returns an integer vector of length 0 for it, giving you the integer(0) result.
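A quick way to see the difference in how the two functions treat missing values is to try them on small vectors. The vectors x and all_na below are made up for illustration.

```r
x <- c(3, NA, 1)      # hypothetical vector with one missing value
all_na <- c(NA, NA)   # hypothetical vector with no non-NA values

min_default <- min(x)                # NA, because na.rm defaults to FALSE
min_narm    <- min(x, na.rm = TRUE)  # 1
pos         <- which.min(x)          # 3: the NA is skipped
pos_empty   <- which.min(all_na)     # integer(0): no value to point at

print(min_default)
print(min_narm)
print(pos)
print(length(pos_empty))             # 0
```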
Your workaround is fine if that's what you're trying to do. Alternatively, you could just wrap the whole thing in a function:
whichMinNAs <- function(x) {
  # Return the position of the minimum if at least one value is not NA
  if (!all(is.na(x))) {
    return(which.min(x))
  } else {
    return(NA)
  }
}
apply(data, 2, whichMinNAs)
A B C
6 10 NA
Here is an example of a work-around:
> apply(data, 2, FUN=function(x) ifelse(length(test<-which.min(x))>0, test, NA))
A B C
6 10 NA