Question

I have a data.frame that contains 4 columns (given below). I want to find the index of the minimum column (NOT THE VALUE) for each row. Any idea how to achieve that?

> d
            V1         V2         V3         V4
1  0.388116155 0.98999967 0.41548536 0.76093748
2  0.495971331 0.47173142 0.51582728 0.06789924
3  0.436495321 0.48699268 0.21187838 0.54139290
4  0.313514389 0.50265539 0.08054103 0.46019601
5  0.277275961 0.39055360 0.29594162 0.70622532
6  0.264804739 0.86996266 0.85708635 0.61136741
7  0.627344463 0.54277873 0.96769568 0.80399490
8  0.814420492 0.35362949 0.39023446 0.39246250
9  0.517459983 0.65895805 0.93662382 0.06762166
10 0.498319937 0.67081260 0.43225997 0.42139151
11 0.046862110 0.97304915 0.06542971 0.09779383
12 0.619009734 0.82363618 0.14514799 0.52858058
13 0.007262782 0.82203403 0.08573499 0.61094206
14 0.001602586 0.33241230 0.57762669 0.45285004
15 0.698388370 0.83541257 0.21051568 0.84431347
16 0.296088411 0.34363164 0.02179999 0.70551493
17 0.897869571 0.50625928 0.92861583 0.61249019
18 0.372497428 0.29025182 0.23201891 0.55737699
19 0.172931860 0.03604668 0.50291560 0.10850847
20 0.988827604 0.15800337 0.87999839 0.09899663

So I want the following output:

1    1
2    4
3    3
4    3

which continues for all the rows. Thanks


Solution

Your English description suggests you want:

 apply(d, 1, which.min)

That call returns a vector, whereas your expected output shows the row numbers alongside the column indices. To get that layout, wrap the result in as.matrix():

 as.matrix(apply( d, 1, which.min))

   [,1]
1     1
2     4
3     3
4     3
5     1
6     1
7     2
8     2
9     4
10    4
11    1
12    3
13    1
14    1
15    3
16    3
17    2
18    3
19    2
20    4
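
As a small, self-contained sketch of the same idea, the snippet below retypes the first four rows of d from the question (the name d4 is only for illustration) and applies which.min row by row. It also shows one way to turn the indices into column names, which the question did not ask for but which may be useful.

```r
# First four rows of the question's data, retyped here for illustration
d4 <- data.frame(
  V1 = c(0.388116155, 0.495971331, 0.436495321, 0.313514389),
  V2 = c(0.98999967, 0.47173142, 0.48699268, 0.50265539),
  V3 = c(0.41548536, 0.51582728, 0.21187838, 0.08054103),
  V4 = c(0.76093748, 0.06789924, 0.54139290, 0.46019601)
)

# which.min gives the position of the first minimum in each row
idx <- unname(apply(d4, 1, which.min))
idx
# [1] 1 4 3 3

# Optional: translate the positions into column names
min_col_name <- names(d4)[idx]
min_col_name
# [1] "V1" "V4" "V3" "V3"
```

Note that apply() converts the data.frame to a matrix first, so this assumes all columns are numeric.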

Other answers

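Another option is to call max.col on -d. Negating the values turns each row's minimum into its maximum, so the position of the maximum of -d is the position of the minimum of d: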

max.col(-d)
# [1] 1 4 3 3 1 1 2 2 4 4 1 3 1 1 3 3 2 3 2 4

If you need a matrix as output, use

cbind(1:nrow(d),    # row
      max.col(-d))  # column position of minimum
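
One caveat worth knowing: by default max.col breaks ties at random (ties.method = "random"), while which.min always returns the first minimum. With continuous data like d this rarely matters, but if rows can contain repeated values, passing ties.method = "first" should make the two approaches agree. A minimal sketch on made-up data (the data.frame m below is hypothetical, not from the question):

```r
# Hypothetical data: the first row has a tie between columns a and b
m <- data.frame(a = c(1, 5), b = c(1, 2), c = c(3, 9))

# which.min always picks the first minimum in each row
via_apply <- unname(apply(m, 1, which.min))

# Ask max.col to break ties the same way instead of at random
via_max_col <- max.col(-m, ties.method = "first")

via_apply
# [1] 1 2
via_max_col
# [1] 1 2
```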

Here is a benchmark of the two approaches

set.seed(42)
dd <- as.data.frame(matrix(runif(1e5 * 100), nrow = 1e5, ncol = 100))

library(microbenchmark)
library(ggplot2)

b <- microbenchmark(
  apply = apply(dd, 1, which.min),
  max_col = max.col(-dd),
  times = 25
)

autoplot(b)


b
#Unit: milliseconds
#    expr      min       lq     mean   median       uq       max neval cld
#   apply 705.7478 855.7112 906.2340 892.3214 933.4655 1211.5016    25   b
# max_col 162.8273 175.6363 227.1156 206.0213 225.2973  406.9124    25  a 
Licensed under: CC-BY-SA with attribution