Question
I am trying to check whether the elements of one vector match the first four digits of the corresponding elements of a second vector (they are nested identifiers), and I'm not quite sure how to run the match. For example:
X Y
1111 111120
1111 890933
2222 780777
2222 222247
I would like to write code that tells me whether the first four digits of element i in vector Y match the digits of element i in vector X. Extending the example, I hope to see:
True
False
False
True
Thanks for any thoughts.
Solution
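Using apply to loop over the rows and grepl to test each pair will work. Anchoring the pattern with ^ restricts the match to the start of Y; without the anchor, grepl also returns TRUE when the digits of X appear anywhere inside Y.
apply( df , 1 , function(x) grepl( paste0( "^" , x[1] ) , x[2] ) )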
#[1] TRUE FALSE FALSE TRUE
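For anyone who wants to run this, here is a minimal sketch. It assumes the question's table is held in a data.frame named df with numeric columns X and Y; the construction of df is a reconstruction, not part of the original post. The pattern is anchored with ^ so that only the start of Y is tested. It also shows startsWith (in base R since 3.3.0) as a vectorised alternative that avoids the row-wise loop.

```r
# Hypothetical reconstruction of the question's data (an assumption)
df <- data.frame(X = c(1111, 1111, 2222, 2222),
                 Y = c(111120, 890933, 780777, 222247))

# Row-wise check: does Y start with the digits of X?
row_wise <- apply(df, 1, function(x) grepl(paste0("^", x[1]), x[2]))

# Vectorised equivalent; startsWith() requires character input
vectorised <- startsWith(as.character(df$Y), as.character(df$X))

print(row_wise)
print(vectorised)
```

Both approaches rely on the numbers converting cleanly to strings. Very large values or round values such as 100000 may be rendered in scientific notation by as.character, so storing identifiers as character in the first place is likely the safer choice.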
OTHER TIPS
Suppose your data.frame is df; then substr will do the trick.
> df$X==as.numeric(substr(df$Y, start=1, stop=4))
[1] TRUE FALSE FALSE TRUE
Putting it all together in a new data.frame:
> transform(df, Z=df$X==as.numeric(substr(df$Y, start=1, stop=4)))
X Y Z
1 1111 111120 TRUE
2 1111 890933 FALSE
3 2222 780777 FALSE
4 2222 222247 TRUE
Take a look at ?substr for further details on how it works.
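If the identifiers are stored as character (for example, because they can have leading zeros), converting with as.numeric would drop those zeros, so comparing the substr result as a string may be safer. A minimal sketch, assuming a hypothetical data.frame named ids whose values are invented purely for illustration:

```r
# Hypothetical identifiers stored as character so leading zeros survive
ids <- data.frame(X = c("0111", "0111", "2222"),
                  Y = c("011120", "890933", "222247"),
                  stringsAsFactors = FALSE)

# substr() extracts characters 1 through 4 of each element of Y
prefix <- substr(ids$Y, start = 1, stop = 4)

# Compare as strings rather than as numbers
ids$Z <- prefix == ids$X
print(ids)
```

Here Z is TRUE for the first and third rows and FALSE for the second.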