Question

I'm making a simple mistake while using stringr to manipulate character vectors. I have a data frame like this:

library(stringr)
d1 <- data.frame(x = str_c(rpois(10, lambda=5), 
                           rpois(10, lambda=10),
                           sep = "_"))

and I want everything after the underscore as a separate variable. This use of str_sub results in a vector of length 20, and I'm at a loss to explain why.

d1$y <- str_sub(d1$x, str_locate(d1$x, fixed("_"))+1)

Error in `$<-.data.frame`(`*tmp*`, "y", value = c("_12", "_7", "_15", : replacement has 20 rows, data has 10

Could someone show me how to write the str_sub call correctly?


Solution

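This is what you want. Look at the output of str_locate to see why your call failed: it returns a two-column matrix of start and end positions, so passing the whole matrix hands str_sub 20 positions instead of 10, and because str_sub recycles its arguments you get 20 results back. Select the start column only: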

d1$y = str_sub(d1$x, str_locate(d1$x, fixed("_"))[,1] + 1, -1)
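
To make the shape of the str_locate result concrete, here is a minimal sketch on a small fixed vector. The values in x are made up for illustration (the question uses random rpois draws), and loc and y are just names chosen for this example.

```r
library(stringr)

# Hypothetical fixed inputs, so the result is reproducible
x <- c("5_12", "3_7", "6_15")

# str_locate returns one row per string, with columns "start" and "end"
loc <- str_locate(x, fixed("_"))
dim(loc)        # 3 rows, 2 columns
colnames(loc)   # "start" "end"

# Keep only the start column, step one character past the underscore,
# and read through to the end of the string (-1)
y <- str_sub(x, loc[, "start"] + 1, -1)
y               # "12" "7" "15"
```

Indexing by column name (loc[, "start"]) is equivalent to the [, 1] used above and is only a readability choice.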

Or in base R:

d1$y = sub("^[^_]*_", "", d1$x)
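
The two approaches agree whenever an underscore is present, but they appear to differ when it is missing. The sketch below uses made-up inputs (not from the question) to show this: sub leaves an unmatched string unchanged, while the stringr version propagates the NA that str_locate returns.

```r
library(stringr)

# Hypothetical inputs; "abc" deliberately has no underscore
x <- c("5_12", "a_b_c", "abc")

# Base R: drop everything up to and including the first underscore
base_y <- sub("^[^_]*_", "", x)
base_y   # "12" "b_c" "abc"

# stringr: no match means an NA position, which yields an NA substring
str_y <- str_sub(x, str_locate(x, fixed("_"))[, 1] + 1, -1)
str_y    # "12" "b_c" NA
```

Which behaviour is preferable depends on whether a missing underscore should be treated as missing data or passed through untouched.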
Licensed under: CC-BY-SA with attribution