Assuming your data looks something like this:
set.seed(104)
dd <- data.frame(
  a = sample(c(TRUE, FALSE), 25, replace = TRUE),
  b = sample(c(TRUE, FALSE), 25, replace = TRUE),
  c = sample(c(TRUE, FALSE), 25, replace = TRUE),
  d = sample(c(TRUE, FALSE), 25, replace = TRUE),
  prob = runif(25)
)
collist <- list("a", "c", "b")
then a function that would do what you want in part one is
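myfun <- function(N) {
  # take the first N column names from collist
  cols <- unlist(collist)[seq_len(N)]
  # drop = FALSE keeps a data.frame even when only one column is selected
  rowmatches <- apply(as.matrix(dd[, cols, drop = FALSE]), 1, any)
  dd[rowmatches, ]
}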
There is no need to dynamically build a predicate list. Here we just extract the columns you are asking for from the data.frame and turn them into a matrix. Then we use apply() to scan across the values in each row to see if any are TRUE. Then we return the rows that match. So
myfun(1)
# nrow(myfun(1)) == sum(dd$a==T)
# TRUE
returns all the rows where column a is true. And
myfun(2)
# nrow(myfun(2)) == sum(dd$a==T | dd$c==T)
# TRUE
returns all rows where column "a" or "c" is true.
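The same row filter can also be written without apply(). The following is only a sketch of an alternative, not a replacement for the function above: filter_any, ex, and ex_cols are hypothetical names made up for this illustration, and the data frame and column list are passed in as arguments rather than read from the global environment. It relies on rowSums() counting TRUE as 1, and it assumes the logical columns contain no NA values.

```r
# Hypothetical helper: keep rows where any of the first n listed columns is TRUE.
filter_any <- function(data, cols, n) {
  wanted <- unlist(cols)[seq_len(n)]
  # drop = FALSE keeps a data.frame even when only one column is selected;
  # rowSums() treats TRUE as 1, so a positive sum means at least one TRUE
  keep <- rowSums(data[, wanted, drop = FALSE]) > 0
  data[keep, , drop = FALSE]
}

# Small made-up data set, for illustration only
ex <- data.frame(
  a = c(TRUE, FALSE, FALSE, TRUE),
  b = c(FALSE, FALSE, TRUE, TRUE),
  c = c(FALSE, TRUE, FALSE, FALSE),
  prob = c(0.4, 0.1, 0.9, 0.3)
)
ex_cols <- list("a", "c", "b")

filter_any(ex, ex_cols, 1)  # rows 1 and 4 (a is TRUE)
filter_any(ex, ex_cols, 2)  # rows 1, 2 and 4 (a or c is TRUE)
```

Passing the data in as an argument makes the helper easier to reuse and test, at the cost of a slightly longer call.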
Then, if you want to grab the top values in the list, you can do something like
result<-myfun(2)
head(result[order(result$prob),], 3)
# a b c d prob
#15 FALSE TRUE TRUE FALSE 0.08670653
#14 TRUE TRUE FALSE FALSE 0.12188057
#16 TRUE TRUE TRUE TRUE 0.13206675
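where order() sorts the data.frame by prob and head() extracts a set number of rows (in this case 3). Note that order() sorts in ascending order by default, so this returns the three rows with the smallest prob values.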
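If the rows with the largest prob values are wanted instead, one option is to sort in decreasing order before calling head(). This is a minimal sketch under that assumption; top_rows and scores are hypothetical names used only for this example.

```r
# Hypothetical helper: the n rows with the largest values in column `col`.
top_rows <- function(data, col, n) {
  # decreasing = TRUE puts the largest values first
  ord <- order(data[[col]], decreasing = TRUE)
  head(data[ord, , drop = FALSE], n)
}

# Small made-up data set, for illustration only
scores <- data.frame(id = 1:5, prob = c(0.2, 0.9, 0.5, 0.1, 0.7))

top_rows(scores, "prob", 3)  # ids 2, 5, 3 (prob 0.9, 0.7, 0.5)
```

With the data from this answer, the equivalent call would be head(result[order(result$prob, decreasing = TRUE), ], 3).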