Question
I have a data frame like this:
head(data)
  V1   V2 V3 V4     V5    V6       V7
1  a 1941  2 14 -73.90 38.60 US009239
2  b 1941  2 14 -74.00 36.90 US009239
3  c 1941  2 14 -74.00 35.40 US009239
4  a 1941  2 15 -74.00 34.00 US009239
5  d 1941  2 15 -74.00 32.60 US009239
6  f 1941  2 15 -73.80 31.70 US009239
I would like to remove the rows whose data$V1 value is duplicated (each value of data$V1 occurs at most twice). The problem is that if I do
newdata <- data[!duplicated(data$V1),]
it keeps the first occurrence:
head(newdata)
  V1   V2 V3 V4     V5    V6       V7
1  a 1941  2 14 -73.90 38.60 US009239
2  b 1941  2 14 -74.00 36.90 US009239
3  c 1941  2 14 -74.00 35.40 US009239
5  d 1941  2 15 -74.00 32.60 US009239
6  f 1941  2 15 -73.80 31.70 US009239
whereas I want to keep the second occurrence:
head(newdata)
  V1   V2 V3 V4     V5    V6       V7
2  b 1941  2 14 -74.00 36.90 US009239
3  c 1941  2 14 -74.00 35.40 US009239
4  a 1941  2 15 -74.00 34.00 US009239
5  d 1941  2 15 -74.00 32.60 US009239
6  f 1941  2 15 -73.80 31.70 US009239
Any help?
Solution
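duplicated has a fromLast argument that should suit your needs. With fromLast=TRUE the vector is scanned from the end, so the earlier occurrence of a repeated value is the one flagged as a duplicate: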
duplicated(c('a', 'b', 'c', 'a', 'd', 'f'), fromLast=TRUE)
## [1] TRUE FALSE FALSE FALSE FALSE FALSE
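Applied to the data frame from the question, this could look roughly like the sketch below. It rebuilds only a few of the columns (V1, V4, V6) by hand for illustration, so the data frame here is an abbreviated stand-in for the real one, and the helper name keep is my own.

```r
# Abbreviated stand-in for the data frame in the question (only V1, V4, V6)
data <- data.frame(
  V1 = c("a", "b", "c", "a", "d", "f"),
  V4 = c(14, 14, 14, 15, 15, 15),
  V6 = c(38.6, 36.9, 35.4, 34.0, 32.6, 31.7),
  stringsAsFactors = FALSE
)

# fromLast = TRUE flags the earlier occurrence of each repeated V1 value,
# so negating the result keeps the last occurrence instead of the first
keep <- !duplicated(data$V1, fromLast = TRUE)
newdata <- data[keep, ]

print(newdata)
```

Row 1 (the first "a") is dropped and row 4 (the second "a") is kept, which matches the desired output in the question. Note that if a value occurred more than twice, only its last occurrence would survive.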