Try this using strapply from the gsubfn package. We define a function that accepts the matches and returns the first two non-empty ones, then use it with the regular expression paste(rx1, rx2, sep = "|") for each component of my_strs:
library(gsubfn)
# test data
# An addition to the question in the comments asked how to handle a
# regular expression that has only a single capture. Make sure it's at the end.
rx3 <- "^([[:digit:]]{2})$"
my_strs2 <- c(my_strs, "99")
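# code
# first2 gets the captures as arguments, drops the empty ones and keeps the first two;
# the appended NA pads the result to length 2 when only one capture is non-empty
first2 <- function(...) { x <- c(..., NA); head(x[x != ""], 2) }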
strapply(my_strs2, paste(rx1, rx2, rx3, sep = "|"), first2, simplify = TRUE)
The last line returns:
     [,1] [,2] [,3] [,4]
[1,] "A " "G " "A"  "99"
[2,] "01" "00" "2"  NA
(If there are components of my_strs that do not match at all then a list will be returned in which those components are NULL. In that case you may prefer to drop the simplify = TRUE and always have it return a list.)
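To see what first2 does on its own, it can be called directly with hand-written arguments in place of real captures. This is only a sketch: the strings below are made up for illustration, and it assumes (as the x != "" filter implies) that a capture group which did not take part in the match is passed as an empty string.

```r
first2 <- function(...) { x <- c(..., NA); head(x[x != ""], 2) }

# two non-empty captures followed by empty ones (hypothetical values)
two <- first2("A ", "01", "", "", "")

# a single non-empty capture: the appended NA pads the result to length 2
one <- first2("", "", "", "", "99")

print(two)
print(one)
```

Because every result has length 2, the per-string results can be bound into a matrix, which is presumably what lets simplify = TRUE return one column per input string.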
Note: strapplyc in the same package is much faster than strapply since the guts of it are written in Tcl (a string processing language) whereas strapply is written in R. Thus you might want to break it up this way to take advantage of the faster routine:
L <- strapplyc(my_strs2, paste(rx1, rx2, rx3, sep = "|"))
sapply(L, first2)
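The reason first2 can be reused unchanged here is that c(..., NA) gives the same vector whether the captures arrive as separate arguments or as one character vector. A minimal sketch without gsubfn, using a hand-written list as a hypothetical stand-in for what strapplyc returns (assumed to be one character vector of captures per input string, with empty strings for the groups that did not match):

```r
first2 <- function(...) { x <- c(..., NA); head(x[x != ""], 2) }

# hypothetical stand-in for strapplyc output: one character vector per input string
L <- list(c("A ", "01", "", "", ""), c("", "", "", "", "99"))

# each vector is passed to first2 as a single argument; c(..., NA) appends NA to it
m <- sapply(L, first2)
print(m)
```

Here m is a 2 x 2 character matrix with one column per element of L.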