subset dataframe on similar columns

https://stackoverflow.com/questions/21635554

08-10-2022

Question

I have a data frame with multiple columns, and I would like to subset it based on the values in similarly named columns. For example, my data frame looks like this:

df1<-data.frame(a=c("a","b","c"),px1=c(123,456,789),px2=c(111,222,333),px3=c(278,908,456),b=c(456,123,333))
> df1
  a px1 px2 px3   b
1 a 123 111 278 456
2 b 456 222 908 123
3 c 789 333 456 333

Now I want to create a subset of df1 where px1, px2, or px3 has the value 456. (In the actual scenario there are a lot more variables.) I tried the following, but it didn't work:

 > subset(df1,grep("px",names(df1)) %in% c(456))
 [1] a   px1 px2 px3 b  
 <0 rows> (or 0-length row.names)

I couldn't figure out the missing part - can anyone help?

Solution

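Here's an easy approach: select the px columns with grepl(), compare them to 456, and keep the rows with at least one match: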

df1[as.logical(rowSums(df1[grepl("px", names(df1))] == 456)), ]

  a px1 px2 px3   b
2 b 456 222 908 123
3 c 789 333 456 333
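
The one-liner above can be unpacked into steps, which also shows why the original attempt returned no rows: grep() gives column positions (2, 3, 4), and those positions, not the cell values, were being compared with 456. The following is only a sketch of the same logic; the intermediate names px_cols, hits, keep, and result are illustrative and not part of the original answer.

```r
df1 <- data.frame(a   = c("a", "b", "c"),
                  px1 = c(123, 456, 789),
                  px2 = c(111, 222, 333),
                  px3 = c(278, 908, 456),
                  b   = c(456, 123, 333))

# 1. Logical vector marking the columns whose names contain "px"
px_cols <- grepl("px", names(df1))

# 2. Logical matrix: TRUE wherever a cell in a px column equals 456
hits <- df1[px_cols] == 456

# 3. Count the matches per row; as.logical() maps 0 to FALSE
#    and any non-zero count to TRUE
keep <- as.logical(rowSums(hits))

# 4. Keep only the rows with at least one match
result <- df1[keep, ]
```

Here keep marks rows 2 and 3, so result holds the same two rows as the output above. Column b also contains 456 (row 1), but it is ignored because its name does not match "px".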

If you want to match any of several values, e.g., 456 and 333, apply %in% to each px column instead:

df1[as.logical(rowSums(sapply(df1[grepl("px", names(df1))], 
                              "%in%", c(456, 333)))), ]

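The multi-value version can be broken down in the same way. This is a minimal sketch assuming the same df1 as in the question; the names px, wanted, hits, and result are illustrative, and rowSums(hits) > 0 is used here as an equivalent of wrapping the row counts in as.logical().

```r
df1 <- data.frame(a   = c("a", "b", "c"),
                  px1 = c(123, 456, 789),
                  px2 = c(111, 222, 333),
                  px3 = c(278, 908, 456),
                  b   = c(456, 123, 333))

wanted <- c(456, 333)

# Keep only the columns whose names contain "px"
px <- df1[grepl("px", names(df1))]

# sapply() applies %in% to each column and binds the results
# into a logical matrix with one column per px variable
hits <- sapply(px, "%in%", wanted)

# A row is kept when at least one of its px cells is in `wanted`
result <- df1[rowSums(hits) > 0, ]
```

With this data, row 2 matches through px1 (456) and row 3 through px2 (333) and px3 (456), so result contains rows 2 and 3.
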
Licensed under: CC-BY-SA with attribution
