How to select rows of a data frame that are not in another data frame in R?
10-09-2020
Question
How to select rows of a data frame that are not in another data frame in R?
Instead of finding the common rows, sometimes we need to find the rows of one data frame that do not appear in another. This is most useful when we expect many rows to differ rather than only a few. We can do this with the subset function by applying the negation operator, written as an exclamation mark (!), to a %in% membership test.
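As a minimal sketch of the idea, the snippet below uses two small hypothetical data frames, a and b, which are not part of the original example. The %in% test marks the values of a$id that occur in b$id, and ! flips that result so that subset keeps only the rows with no match.

```r
# Hypothetical toy data, chosen only for illustration
a <- data.frame(id = c(1, 2, 3, 4))
b <- data.frame(id = c(2, 4, 6))

# id %in% b$id is TRUE where a's id occurs in b; ! negates it,
# so subset() keeps the rows of a whose id is absent from b
only_in_a <- subset(a, !(id %in% b$id))
only_in_a
```

Here only_in_a holds the rows with id 1 and 3, and they keep their original row names.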
Example
Consider the following data frames −
> x1 <- sample(1:10, 20, replace = TRUE)
> y1 <- sample(1:10, 20, replace = TRUE)
> df1 <- data.frame(x1, y1)
> df1
Output
   x1 y1
1  10  6
2   5  9
3  10 10
4   4 10
5   1  6
6   1  4
7   9  3
8   5 10
9  10  3
10  8  2
11  6 10
12  6  3
13  9  3
14  3  6
15  6  9
16  9  1
17  7  9
18  3  8
19  2  5
20  4  9
Example
> x2 <- sample(1:10, 20, replace = TRUE)
> y2 <- sample(1:10, 20, replace = TRUE)
> df2 <- data.frame(x2, y2)
> df2
Output
   x2 y2
1   6 10
2   3  6
3   9  6
4   9 10
5  10 10
6   3  2
7   3  3
8   2  9
9   7  5
10  1  1
11 10 10
12  1  6
13  3  4
14  4  2
15  6  3
16  1  7
17  2  2
18  4  6
19  4  1
20  1  8
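Because sample() draws random values, running the code above will generally produce tables that differ from the ones printed here. One way to make the draws repeatable is to call set.seed() first. The sketch below assumes an arbitrary seed of 123, which is not taken from the original, so its values will not match the tables above either.

```r
# set.seed() fixes the random number stream; 123 is an arbitrary choice
set.seed(123)
x1 <- sample(1:10, 20, replace = TRUE)
y1 <- sample(1:10, 20, replace = TRUE)
df1 <- data.frame(x1, y1)

x2 <- sample(1:10, 20, replace = TRUE)
y2 <- sample(1:10, 20, replace = TRUE)
df2 <- data.frame(x2, y2)

# Resetting the same seed reproduces the first draw exactly
set.seed(123)
x1_again <- sample(1:10, 20, replace = TRUE)
identical(x1, x1_again)
```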
Now suppose we want the rows of df2 whose y2 values are not present in y1 of df1. This can be done as follows −
> subset(df2, !(y2 %in% df1$y1))
   x2 y2
16  1  7
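The same selection can also be written with bracket indexing instead of subset, which some readers may prefer inside functions. The sketch below rebuilds only the columns it needs (y1, x2 and y2) from the tables printed above, so that it runs on its own.

```r
# Column values copied from the printed df1 and df2 tables
y1 <- c(6, 9, 10, 10, 6, 4, 3, 10, 3, 2, 10, 3, 3, 6, 9, 1, 9, 8, 5, 9)
x2 <- c(6, 3, 9, 9, 10, 3, 3, 2, 7, 1, 10, 1, 3, 4, 6, 1, 2, 4, 4, 1)
y2 <- c(10, 6, 6, 10, 10, 2, 3, 9, 5, 1, 10, 6, 4, 2, 3, 7, 2, 6, 1, 8)
df1 <- data.frame(y1)
df2 <- data.frame(x2, y2)

# Logical row index: TRUE where y2 has no match in df1$y1
keep <- !(df2$y2 %in% df1$y1)
not_in_y1 <- df2[keep, ]
not_in_y1

# setdiff() lists the distinct unmatched values themselves
setdiff(df2$y2, df1$y1)
```

Only row 16 is returned, because 7 is the only y2 value that never occurs in y1.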
Similarly, the rows of df2 whose y2 values are not present in x1 of df1 can be selected as follows (here every y2 value also occurs in x1, so no rows are returned) −
> subset(df2, !(y2 %in% df1$x1))
[1] x2 y2
<0 rows> (or 0-length row.names)
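Both calls above compare a single column. If a row should count as missing only when all of its columns together are absent from the other data frame, one possible base R approach is to paste each row into a single key and compare the keys. The sketch below uses hypothetical data frames a and b with the same columns in the same order. The "\r" separator is an assumption that this character never occurs in the data.

```r
# Hypothetical toy data with identical column layout
a <- data.frame(x = c(1, 2, 3, 3), y = c("p", "q", "r", "s"))
b <- data.frame(x = c(2, 3), y = c("q", "s"))

# Collapse every row into one string key, column by column
key_a <- do.call(paste, c(a, sep = "\r"))
key_b <- do.call(paste, c(b, sep = "\r"))

# Keep the rows of a whose full-row key does not appear in b
a_only <- a[!(key_a %in% key_b), ]
a_only
```

Here the rows (1, "p") and (3, "r") are kept. The row (3, "s") is dropped because it matches a row of b on both columns, even though (3, "r") shares the same x value.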
Let’s have a look at one more example −
Example
> x1 <- rep(1:10, 2)
> df1 <- data.frame(x1)
> df1
Output
   x1
1   1
2   2
3   3
4   4
5   5
6   6
7   7
8   8
9   9
10 10
11  1
12  2
13  3
14  4
15  5
16  6
17  7
18  8
19  9
20 10
> x2 <- rep(1:5, 4)
> df2 <- data.frame(x2)
> df2
Output
   x2
1   1
2   2
3   3
4   4
5   5
6   1
7   2
8   3
9   4
10  5
11  1
12  2
13  3
14  4
15  5
16  1
17  2
18  3
19  4
20  5
> subset(df1, !(x1 %in% df2$x2))
Output
   x1
6   6
7   7
8   8
9   9
10 10
16  6
17  7
18  8
19  9
20 10
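Note that the result keeps every unmatched row, so the values 6 to 10 each appear twice. If only the distinct rows or values are wanted, a sketch along the following lines could be used. It recreates df1 and df2 from this example so that it runs on its own.

```r
# Same data as the second example
df1 <- data.frame(x1 = rep(1:10, 2))
df2 <- data.frame(x2 = rep(1:5, 4))

# All rows of df1 whose x1 value is absent from df2$x2 (10 rows)
missing_rows <- subset(df1, !(x1 %in% df2$x2))

# unique() drops the repeated rows, leaving one row per value
distinct_rows <- unique(missing_rows)
distinct_rows

# setdiff() returns the distinct unmatched values as a plain vector
missing_values <- setdiff(df1$x1, df2$x2)
missing_values
```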