Question

Return, for each row, the percentage of columns that reach a certain value in the following table

      V1    V2    V3    V4    V5    V6    V7    V8    V9   V10
1   3.93  3.92  3.74  4.84  4.55  4.67  3.99  4.10  4.86  4.06
2   4.00  3.99  3.81  4.90  4.61  4.74  4.04  4.15  4.92  4.11
3   4.67  4.06  3.88  5.01  4.66  4.80  4.09  4.20  4.98  4.16
4   4.73  4.12  3.96  5.03  4.72  4.85  4.14  4.25  5.04  4.21
5   4.79  4.21  4.04  5.09  4.77  4.91  4.18  4.30  5.10  4.26
6   4.86  4.29  4.12  5.15  4.82  4.96  4.23  4.35  5.15  4.30
7   4.92  4.37  4.19  5.21  4.87  5.01  4.27  4.39  5.20  4.35
8   4.98  4.43  4.25  5.26  4.91  5.12  4.31  4.43  5.25  4.38
9   5.04  4.49  4.31  5.30  4.95  5.15  4.34  4.46  5.29  4.41
10  5.04  4.50  4.49  5.31  5.01  5.17  4.50  4.60  5.30  4.45
11   ...
12   ...

As output, I need a data frame containing, for each row, the percentage of columns V1-V10 that have reached the value of interest (5 in this example):

Rownum   Percent
1   0
2   0
3   10
4   20
5   20
6   20
7   30
8   30
9   40
10  50

Many thanks!


Solution

If your matrix is mat, count the values above 5 in each row and divide by the number of columns:

cbind(seq_len(nrow(mat)), rowSums(mat > 5) / ncol(mat) * 100)
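
Since the question asks for a data frame rather than a matrix, here is a minimal sketch of the same idea that returns one. It assumes the ten rows shown in the question are typed in by hand as `mat`; the column names `Rownum` and `Percent` are taken from the requested output, and the threshold of 5 is from the question.

```r
# The ten rows from the question, entered row by row
vals <- c(
  3.93, 3.92, 3.74, 4.84, 4.55, 4.67, 3.99, 4.10, 4.86, 4.06,
  4.00, 3.99, 3.81, 4.90, 4.61, 4.74, 4.04, 4.15, 4.92, 4.11,
  4.67, 4.06, 3.88, 5.01, 4.66, 4.80, 4.09, 4.20, 4.98, 4.16,
  4.73, 4.12, 3.96, 5.03, 4.72, 4.85, 4.14, 4.25, 5.04, 4.21,
  4.79, 4.21, 4.04, 5.09, 4.77, 4.91, 4.18, 4.30, 5.10, 4.26,
  4.86, 4.29, 4.12, 5.15, 4.82, 4.96, 4.23, 4.35, 5.15, 4.30,
  4.92, 4.37, 4.19, 5.21, 4.87, 5.01, 4.27, 4.39, 5.20, 4.35,
  4.98, 4.43, 4.25, 5.26, 4.91, 5.12, 4.31, 4.43, 5.25, 4.38,
  5.04, 4.49, 4.31, 5.30, 4.95, 5.15, 4.34, 4.46, 5.29, 4.41,
  5.04, 4.50, 4.49, 5.31, 5.01, 5.17, 4.50, 4.60, 5.30, 4.45
)
mat <- matrix(vals, nrow = 10, byrow = TRUE,
              dimnames = list(NULL, paste0("V", 1:10)))

# mat > 5 is a logical matrix; rowSums() counts the TRUEs in each row
result <- data.frame(
  Rownum  = seq_len(nrow(mat)),
  Percent = 100 * rowSums(mat > 5) / ncol(mat)
)
print(result)
```

Multiplying by 100 before dividing keeps the arithmetic exact for whole-number counts. Note that `> 5` is a strict comparison; if a value of exactly 5 should count as "reached", `>=` would be the operator to use.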

OTHER TIPS

As long as the data contain only 0s and 1s across ten columns, each 1 contributes exactly 10 percentage points, so I would multiply the whole dataset by 10 and sum each row. Just use the following code:

# Sample data
set.seed(10)

data <- as.data.frame(do.call("rbind", lapply(seq(9), function(...) {
  sample(c(0, 1), 10, replace = TRUE)
})))
rownames(data) <- c("abc", "def", "ghi", "jkl", "mno", "pqr", "stu", "vwx", "yza")

# Percentages
rowSums(data * 10)

# abc def ghi jkl mno pqr stu vwx yza 
#  80  40  80  60  60  10  30  50  50
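
The multiply-by-10 trick only works with exactly ten columns. A hedged sketch of a variant that should work for any number of columns is to take the row mean of the 0/1 values instead. The small matrix `flags` below is made-up illustration data, not from the post.

```r
# Hypothetical 0/1 data: 3 rows, 4 columns (illustrative only)
flags <- matrix(c(1, 0, 0, 0,
                  1, 1, 0, 0,
                  1, 1, 1, 1),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("a", "b", "c"), NULL))

# The mean of a 0/1 row is the share of ones; scale it to a percentage
pct <- rowMeans(flags) * 100
print(pct)
```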

OK, so now I believe you want the percentage of values in each row that meet some threshold criterion. You give the example > 5. One solution of many is to use apply:

apply(df, 1, function(x) sum(x > 5) / length(x) * 100)
# 1  2  3  4  5  6  7  8  9 10 
# 0  0 10 20 20 20 30 30 40 50 

@Thomas' rowSums solution will be faster for large data.frames because it is vectorized: the comparison and the row sums each run once over the whole matrix, whereas apply calls an R function separately for every row.
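
To see that the two approaches agree, the sketch below wraps the vectorized version in a small helper and compares it with the apply version. The helper name `pct_above`, its arguments, and the sample data frame are all assumptions made for illustration.

```r
# Hypothetical helper: percentage of values per row above a threshold
pct_above <- function(x, threshold) {
  m <- as.matrix(x)              # accepts a data.frame or a matrix
  rowMeans(m > threshold) * 100  # mean of TRUE/FALSE is the share of TRUEs
}

# Small made-up data frame for the comparison
df <- data.frame(V1 = c(4.9, 5.1, 5.2),
                 V2 = c(4.8, 4.9, 5.3),
                 V3 = c(5.5, 4.7, 5.4),
                 V4 = c(4.1, 4.2, 5.6))

vectorized <- pct_above(df, 5)
row_by_row <- apply(df, 1, function(x) sum(x > 5) / length(x) * 100)

# Both approaches should agree up to floating-point tolerance
stopifnot(isTRUE(all.equal(unname(vectorized), unname(row_by_row))))
print(vectorized)
```

If the data may contain missing values, both versions return NA for the affected rows unless NA handling (for example `na.rm = TRUE` in `rowMeans`) is added explicitly.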

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow