Question
I'm trying to split a string on "." and create additional columns with the two strings before and after the ".".
tes<-c("1.abc","2.di","3.lik")
dat<-c(5,3,2)
h<-data.frame(tes,dat)
h$num<-substr(h$tes,1,1)
h$prim<-unlist(strsplit(as.character(h$tes),"\\."))[2]
h$prim<-sapply(h$tes,unlist(strsplit(as.character(h$tes),"\\."))[2])
I'd like h$prim to contain "abc", "di", "lik". However, I'm not able to figure it out. I guess strsplit is not vectorized, but then I thought the sapply version should have worked. I assume it should be easy, though :-)
Regards, //M
Solution
This should do the trick:
R> sapply(strsplit(as.character(h$tes), "\\."), "[[", 2)
[1] "abc" "di" "lik"
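For what it's worth, strsplit is vectorized: it returns a list with one character vector per input string. The first attempt in the question flattens that list with unlist and takes element 2 of the whole thing (a single value, which is then recycled down the column). The sapply attempt fails because its second argument is a value rather than a function. A small sketch using the data from the question (the names parts and flat_second are only for illustration):

```r
tes <- c("1.abc", "2.di", "3.lik")
dat <- c(5, 3, 2)
h <- data.frame(tes, dat)

# strsplit returns a list: one character vector per input string
parts <- strsplit(as.character(h$tes), "\\.")
str(parts)

# Flattening first loses the per-row grouping; element 2 is a single string
flat_second <- unlist(parts)[2]

# Extracting element 2 from each list entry keeps one value per row
h$prim <- sapply(parts, "[[", 2)
h
```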
OTHER TIPS
With the stringr package it's even easier:
library(stringr)
str_split_fixed(h$tes, fixed("."), 2)[, 2]
This is the same as rcs' answer, but may be easier to understand:
> sapply(strsplit(as.character(h$tes), "\\."), function(x) x[[2]])
[1] "abc" "di" "lik"
This question appears several times on StackOverflow.
In exactly the same form as yours:
- Selecting first element of strsplit
- Selecting second element separated by space
- Selecting second element separated by dot
I recommend this question to see in how many ways it can be achieved.
And if you care about speed, you should consider the tip from John's answer about the fixed parameter to strsplit.
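To illustrate the fixed parameter: with fixed = TRUE the separator is taken as a literal string, so the dot needs no escaping and no regular expression matching is involved. A minimal sketch (the vector x and the names by_regex and by_fixed are just for illustration; no timings are claimed here):

```r
x <- c("1.abc", "2.di", "3.lik")

# Regex version: the dot must be escaped, otherwise it matches any character
by_regex <- strsplit(x, "\\.")

# Literal version: fixed = TRUE treats "." as a plain string
by_fixed <- strsplit(x, ".", fixed = TRUE)

# Both give the same pieces for this input
identical(by_regex, by_fixed)

# Second piece of each string
second <- sapply(by_fixed, "[[", 2)
second
```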
Alternatively, you can save yourself the work of pulling out the 2nd element if you add both columns at the same time:
tes <- c("1.abc","2.di","3.lik")
dat <- c(5,3,2)
h <- data.frame(tes, dat, stringsAsFactors=FALSE)
values <- unlist(strsplit(h$tes, ".", fixed=TRUE))
h <- cbind(h, matrix(values, byrow=TRUE, ncol=2,
                     dimnames=list(NULL, c("num", "prim"))))
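Note that filling a two-column matrix from the unlisted values assumes every string contains exactly one dot; otherwise the values no longer line up in pairs. A possible variant that checks this before adding the columns (a sketch only; the names parts and m are hypothetical):

```r
tes <- c("1.abc", "2.di", "3.lik")
dat <- c(5, 3, 2)
h <- data.frame(tes, dat, stringsAsFactors = FALSE)

parts <- strsplit(h$tes, ".", fixed = TRUE)

# Stop early if any string did not split into exactly two pieces
stopifnot(all(lengths(parts) == 2))

# One row per input string, one column per piece
m <- do.call(rbind, parts)
h$num  <- m[, 1]
h$prim <- m[, 2]
h
```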