Question

Say I have a data frame where some entire columns are NA, like so:

set.seed(0)
data <- data.frame(A = rnorm(10, 10, 1),
                   B = rnorm(10, 12, 2),
                   C = rep(NA, 10))

If I apply min() across the columns, I get the output I would hope for:

apply(data, 2, min)
#        A        B        C 
# 8.460050 9.524923       NA 

However, when I apply which.min(), my output is a list and column C gives integer(0):

apply(data, 2, which.min)
# $A
# [1] 6
# $B
# [1] 10
# $C
# integer(0)

I can make it look the way I want with this rather ugly workaround:

which.mins <- unlist(apply(data, 2, which.min))
which.mins[names(data)[!(names(data) %in% names(which.mins))]] <- NA
which.mins
#  A  B  C 
#  6 10 NA 

Is there a better way to do this, that would mimic the output that I get when using apply() with min()?

Solution

You're right, which.min returns integer(0) (a zero-length vector, not 0) if x has no non-NA values. You can still use apply and which.min like this:

apply(data, 2, function(x) {if (all(is.na(x))) {NA}  else {which.min(x)} }) 

OTHER TIPS

Note that calling apply on a data.frame causes the data.frame to be coerced to a matrix before the function is applied. You should use sapply (or vapply) instead, else you may get strange errors because all the columns of your data.frame get coerced to a common type (often character).
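
A small sketch of the coercion described above. The data frame `df` and its column names are made up for illustration; the point is only that apply() sees a character matrix while sapply() sees the original columns.

```r
# Hypothetical data frame mixing a numeric and a character column
df <- data.frame(num = c(3, 1, 2),
                 chr = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

# apply() first converts df with as.matrix(), so every column becomes character
via_apply <- apply(df, 2, class)

# sapply() iterates over the columns as they are, so each keeps its own type
via_sapply <- sapply(df, class)

print(via_apply)
print(via_sapply)
```

With the all-numeric data from the question the difference does not bite, but on mixed data which.min() would be handed character vectors under apply().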

Just test if the length of the result of which.min is zero and return NA in that case.

> # if() evaluates to FALSE if length(wm) is 0 because as.logical(0) is FALSE
> sapply(data, function(x) if(length(wm <- which.min(x))) wm else NA)
 A  B  C 
 6 10 NA
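
If you want the result type to be guaranteed, the same length test can be sketched with vapply(). The helper name `which_min_or_na` is an assumption for illustration, not something from the original answers.

```r
set.seed(0)
data <- data.frame(A = rnorm(10, 10, 1),
                   B = rnorm(10, 12, 2),
                   C = rep(NA, 10))

# Hypothetical helper: position of the minimum, or NA_integer_ if there is none
which_min_or_na <- function(x) {
  wm <- which.min(x)
  if (length(wm) == 0L) NA_integer_ else unname(wm)
}

# vapply() stops with an error unless every result is a single integer
result <- vapply(data, which_min_or_na, FUN.VALUE = integer(1))
print(result)
```

Using NA_integer_ rather than plain NA keeps every return value an integer, which is what FUN.VALUE = integer(1) checks for.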

In the first example, min() returns NA for column C because the column contains NA values and na.rm is FALSE by default. which.min() behaves differently: it discards NAs and returns an integer vector holding the position of the first minimum, or a vector of length 0 if nothing is left. Because the per-column results then differ in length, apply() cannot simplify them to a vector and returns a list:

str(apply(data, 2, which.min)[1])
List of 1
 $ A: int 6

And since column C has no non-NA values, which.min() returns an integer vector of length 0 for it, giving you the integer(0) result.
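
The behaviour on an all-NA vector can be checked directly at the console. This is a minimal sketch; the names `x` and `res` are only for illustration.

```r
x <- rep(NA_real_, 3)

min_x <- min(x)          # NA, because na.rm is FALSE by default
wm_x  <- which.min(x)    # zero-length integer vector: all values were discarded

print(min_x)
print(wm_x)
print(length(wm_x))

# Results of unequal length cannot be simplified to a vector,
# so sapply() (like apply()) falls back to returning a list
res <- sapply(list(a = c(2, 1), b = x), which.min)
print(is.list(res))
```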

Your workaround is fine if that's what you're trying to do. Alternatively, you could just wrap the whole thing in a function:

whichMinNAs <- function(x){
  if(FALSE %in% is.na(x)){
    return(which.min(x))
  } else {
    return(NA)
  }
}

apply(data, 2, whichMinNAs)

 A  B  C 
 6 10 NA

Here is an example of a work-around:

apply(data, 2, FUN=function(x) ifelse(length(test<-which.min(x))>0, test, NA))

 A  B  C 
 6 10 NA
Licensed under: CC-BY-SA with attribution