Question

What am I doing wrong here? Date 5 is not in the final data.frame. Why is that?

date1 <- c(1,2,3,4,5,6,7,8,9)
ret <- c(1.2,2.2,-0.5,0.98,0.73,-1.3,-0.02,0.3,1.1)
df <- data.frame(date1,ret)

date2 <- c(1,2,3,5,6,8)
q <- c(3,2,1,4,5,7)
ev <- data.frame(date2,q)

matched <- ev[which(is.na(match(df[["date1"]], ev[["date2"]])) == F),]
matched

#    date2  q
# 1      1  3
# 2      2  2
# 3      3  1
# 5      6  5
# 6      8  7
# NA    NA NA

Solution

For your example above, I think you want ev[ev$date2 %in% df$date1, ].
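
As a rough sketch of why the original line drops date 5 (using the question's data, with date1 written as 1:9 for brevity; the names pos, idx and fixed are only for illustration): match(df$date1, ev$date2) has one element per row of df, so which() returns positions in df$date1, and those positions are then used as row numbers of ev.

```r
# Data from the question
df <- data.frame(date1 = 1:9,
                 ret = c(1.2, 2.2, -0.5, 0.98, 0.73, -1.3, -0.02, 0.3, 1.1))
ev <- data.frame(date2 = c(1, 2, 3, 5, 6, 8), q = c(3, 2, 1, 4, 5, 7))

# One entry per element of df$date1: its position in ev$date2, or NA
pos <- match(df$date1, ev$date2)
pos
# [1]  1  2  3 NA  4  5 NA  6 NA

# These are positions in df$date1, not row numbers of ev
idx <- which(!is.na(pos))
idx
# [1] 1 2 3 5 6 8

# ev has only 6 rows: row 4 (date 5) is never requested, and row 8 does not
# exist, which is where the NA row comes from
ev[idx, ]

# Testing membership from ev's side keeps every matching row of ev
fixed <- ev[ev$date2 %in% df$date1, ]
fixed
```

With this data every value of ev$date2 occurs in df$date1, so fixed contains all six rows of ev, including date 5.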


I have created another example with new data so that the dates are quite different from the row numbers.

date1 <- 10:18
ret <- c(1.2,2.2,-0.5,0.98,0.73,-1.3,-0.02,0.3,1.1)
df <- data.frame(date1,ret)

date2 <- c(10:13,20,17)
q <- c(3,2,1,4,5,7)
ev <- data.frame(date2,q)


Look at the vectors that you want to match

df$date1
# [1] 10 11 12 13 14 15 16 17 18
ev$date2
# [1] 10 11 12 13 20 17

# So all but one of the values in ev$date2 are in df$date1 (the exception is ev$date2 = 20)


Match the date values

First, look at the %in% operator.

This produces a "logical vector indicating if there is a match or not for its left operand" (from ?match). That is, for the example below: does each value of ev$date2 appear in df$date1, TRUE or FALSE?

ev$date2 %in% df$date1
# [1]  TRUE  TRUE  TRUE  TRUE FALSE  TRUE

I would use this method to subset data: if we only want to keep the rows of ev where ev$date2 appears in df$date1, use

ev[ev$date2 %in% df$date1 , ]
#   date2 q
# 1    10 3
# 2    11 2
# 3    12 1
# 4    13 4
# 6    17 7
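
For what it's worth, ?match describes %in% as a thin wrapper around match: match(x, table, nomatch = 0) > 0. A small sketch on the same date vectors (the names via_in and via_match are just for illustration):

```r
date1 <- 10:18
date2 <- c(10:13, 20, 17)

# The two forms should give the same logical vector
via_in    <- date2 %in% date1
via_match <- match(date2, date1, nomatch = 0) > 0
identical(via_in, via_match)
# [1] TRUE

# Negating the test gives the dates in date2 that are missing from date1
date2[!via_in]
# [1] 20
```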


match "returns a vector of the positions of (first) matches of its first argument in its second" (again from ?match). That is, for this example: at which positions in df$date1 do the values of ev$date2 appear, if at all?

match(ev$date2 , df$date1)
# [1]  1  2  3  4 NA  8

# So that makes sense: all values of ev$date2 are found in df$date1
# (with their position in df$date1 returned) except ev$date2 = 20, which
# returns NA because it is not found in df$date1

I would use this method to pull values out of df, matching on date, i.e.

# Look up ret for each ev$date2; dates not found in df$date1 get NA
ev$ret <- df$ret[match(ev$date2, df$date1)]
ev
#   date2 q   ret
# 1    10 3  1.20
# 2    11 2  2.20
# 3    12 1 -0.50
# 4    13 4  0.98
# 5    20 5    NA
# 6    17 7  0.30
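
If you want every column of df rather than just ret, a left join with base R's merge() is another option. This is a different technique from the match() lookup above, shown only as a sketch; the name merged is made up for illustration.

```r
df <- data.frame(date1 = 10:18,
                 ret = c(1.2, 2.2, -0.5, 0.98, 0.73, -1.3, -0.02, 0.3, 1.1))
ev <- data.frame(date2 = c(10:13, 20, 17), q = c(3, 2, 1, 4, 5, 7))

# all.x = TRUE keeps every row of ev; dates missing from df get NA in ret.
# By default merge() sorts the result on the key column, so the row order
# can differ from the order in ev.
merged <- merge(ev, df, by.x = "date2", by.y = "date1", all.x = TRUE)
merged
```

The match() version keeps ev in its original row order, which is often what you want when adding a single column.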
Licensed under: CC-BY-SA with attribution