Question

I need help solving a data subsetting problem using R. Here is part of a data frame:

df <- read.table(text="
Name    C1      C2      C3      C4      C5
Bill    0.006   0.003   0       0.002   0
Frank   0       0.333   0.23    0.12    0
Ted     0.567   0.011   0.001   0.002   0
Jimmy   0.001   0.003   0.001   0.002   0
Sam     0.002   0.002   0.32    0.45    0.002", header=TRUE)

What I want to do is make a new data frame containing the subset of those rows where the values in columns 2 to 6 are less than .05.

The trick is that I want to set a flexible criterion, such that in any particular row only 4 of the 5 values need to be < .05. Any 4 of the 5 values may qualify, and which 4 they are must be able to differ between rows.

So, for example, Bill and Ted would meet this criterion, but Sam would not.

I have tried various apply functions but these only work on the complete row data. I need some sort of conditional statement to evaluate each row individually.

I am stuck on how to do this.

Solution

Is this what you are after?

> df[rowSums(df[,2:6]<0.05)>=4,]
   Name    C1    C2    C3    C4 C5
1  Bill 0.006 0.003 0.000 0.002  0
3   Ted 0.567 0.011 0.001 0.002  0
4 Jimmy 0.001 0.003 0.001 0.002  0
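
To unpack why that one-liner works, here is a step-by-step sketch of the same idea: the comparison gives a logical matrix, rowSums counts the TRUE values in each row, and the final test keeps rows with at least 4 of them. The names below, n_below, result and subset_by_count are my own and purely illustrative, and the na.rm = TRUE choice in the helper is an assumption (the example data has no missing values).

```r
# Rebuild the example data from the question.
df <- read.table(text = "
Name    C1      C2      C3      C4      C5
Bill    0.006   0.003   0       0.002   0
Frank   0       0.333   0.23    0.12    0
Ted     0.567   0.011   0.001   0.002   0
Jimmy   0.001   0.003   0.001   0.002   0
Sam     0.002   0.002   0.32    0.45    0.002", header = TRUE)

# Step 1: logical matrix, TRUE wherever a value is below the threshold.
below <- df[, 2:6] < 0.05

# Step 2: TRUE counts as 1, so rowSums gives the number of
# below-threshold values in each row.
n_below <- rowSums(below)

# Step 3: keep the rows where at least 4 of the 5 values qualify.
result <- df[n_below >= 4, ]

# Hypothetical helper (not from the original answer) that generalises
# the column range, the threshold and the required count.
# Assumption: NA values are treated as not meeting the threshold.
subset_by_count <- function(data, cols, threshold, min_count) {
  hits <- rowSums(data[, cols, drop = FALSE] < threshold, na.rm = TRUE)
  data[hits >= min_count, , drop = FALSE]
}

same_result <- subset_by_count(df, 2:6, 0.05, 4)
```

Because the count is taken per row, it does not matter which 4 columns fall below the threshold, which is the flexibility asked for. Raising min_count to 5 would require every value in the row to qualify.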
Licensed under: CC-BY-SA with attribution