Question

I have a number of columns in a data frame that represent replicates of an experimental result.

Example:

        1a      2a      3a      4a      5a
1      154     152     154     156      NA
2      154     154     154      NA      NA
3      154     154     154     154      NA
4      154     154     154     154      NA
5      154      NA     154     154      NA
6       NA      NA      NA     154      NA
7      154     154      NA     154      NA
8      154     154      NA     154      NA
9      154      NA     154     150      NA
10     149     149      NA     149     149

What I would like is to create another column containing the value that occurs at least twice (>= 2) across the other columns in each row.

        1a      2a      3a      4a      5a    score 
1      154     152     154     156      NA    154
2      154     154     154      NA      NA    154
3      154     154     154     154      NA    154
4      154     154     154     154      NA    154
5      154      NA     154     154      NA    154
6       NA      NA      NA     154      NA     NA
7      154     154      NA     154      NA    154
8      154     154      NA     154      NA    154
9      154      NA     154     150      NA    154
10     149     149      NA     149     149    149

EDIT: I modified the example above to demonstrate. flodel's answer of using the mode initially worked, but it returns a value even if that value occurs only once. I would like the result to be either NA or a character string (whichever is easier) if no value occurs at least twice within a row.


Solution

You are not looking for the median but the mode, which is easy enough to define yourself:

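Mode <- function(x, min.freq = 1L) {
  f <- table(x)          # frequency of each value; table() drops NA by default
  k <- f[f >= min.freq]  # keep only values seen at least min.freq times
  # most frequent qualifying value, or NA if no value qualifies
  if (length(k) > 0L) as.numeric(names(k)[which.max(k)]) else NA_real_
}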

# test2 is assumed to hold only the replicate columns of test
test$score <- apply(test2, 1, Mode, min.freq = 2L)
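
To check the approach against the data in the question, here is a minimal self-contained sketch. The data frame name `test` and the column names `r1a` to `r5a` are assumptions (R would not accept `1a` as a syntactic name without quoting), and here `test` contains only the replicate columns, so it is passed to `apply` directly.

```r
# Replicate columns copied from the question; names are illustrative assumptions.
test <- data.frame(
  r1a = c(154, 154, 154, 154, 154, NA, 154, 154, 154, 149),
  r2a = c(152, 154, 154, 154, NA, NA, 154, 154, NA, 149),
  r3a = c(154, 154, 154, 154, 154, NA, NA, NA, 154, NA),
  r4a = c(156, NA, 154, 154, 154, 154, 154, 154, 150, 149),
  r5a = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 149)
)

Mode <- function(x, min.freq = 1L) {
  f <- table(x)          # counts per value; NA is dropped by default
  k <- f[f >= min.freq]  # values seen at least min.freq times
  if (length(k) > 0L) as.numeric(names(k)[which.max(k)]) else NA_real_
}

# Compute the row-wise mode before adding the new column,
# so the score itself is never counted as a replicate.
score <- apply(test, 1, Mode, min.freq = 2L)
test$score <- score
print(test)
```

Row 6 has a single non-missing value, so it gets NA, which is what the question asks for. Note that `which.max` returns the first maximum, so if two values tie for the highest count, the smaller one is returned because `table` sorts its entries.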
Licensed under: CC-BY-SA with attribution