Question

I have a dataframe, and I'd like to subset by picking out all the rows that conform to a condition on the factor value for year:

subset_df <- df[ (which(df$year < '1972') || (df$year > '1982')),]

My problem is that the line above returns the whole dataframe, df.

Forgive me if this is too basic or simple, but I cannot figure out where the flaw lies.

I'm suspecting there is something regarding || which I don't understand, or my other theory is that arr.ind=T somehow plays a role. Either that, or the nature of the which() function is a little more complicated than I think it is.

If anyone has any insight, I'd greatly appreciate it. Thanks for your time.

PS: yes, this works as expected and returns the correct subset; ie, there isn't a flaw in my dataframe:

test_df <- df[ (which(df$year < '1972')), ] 

as does its counterpart for 1982.


Solution

The help page for the logical operators (see ?"|") states:

For |, & and xor a logical or raw vector... and...For ||, && and isTRUE, a length-one logical vector.

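Therefore, change your || to |, which compares element by element and returns one logical value per row. In your line, || looked only at the first element of which(...), a non-zero row index that counts as TRUE, so the whole condition collapsed to a single TRUE, and indexing rows with a single TRUE selects every row. That is why you got the whole dataframe back. which() is not required here either, because a logical vector can index the rows directly.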

subset_df <- df[ df$year < '1972' | df$year > '1982',]
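
To see the difference on something small, here is a minimal sketch with a made-up dataframe. The column names and values are hypothetical, and year is assumed to be stored as character so that the string comparisons from the question order the years correctly.

```r
# Hypothetical toy data standing in for the asker's df; year is stored as
# character here (an assumption), so string comparison orders the years.
df <- data.frame(
  year  = as.character(c(1970, 1971, 1975, 1980, 1983, 1985)),
  value = c(10, 20, 30, 40, 50, 60),
  stringsAsFactors = FALSE
)

# Element-wise OR: one TRUE/FALSE per row.
keep <- df$year < "1972" | df$year > "1982"

# Index rows directly with the logical mask.
subset_df <- df[keep, ]

# which() is optional; it turns the logical mask into row positions.
same_df <- df[which(keep), ]

# A bare TRUE as the row index is recycled and selects every row,
# which is what a length-one result from || amounts to.
all_rows <- df[TRUE, ]
```

Here subset_df and same_df hold the same rows, while all_rows is the full dataframe. Note that recent R versions (4.3.0 and later) raise an error when || is given an operand longer than one, rather than silently using the first element.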
Licensed under: CC-BY-SA with attribution