Question

I'm trying to split a string on "." and create additional columns holding the two substrings before and after the ".".

tes<-c("1.abc","2.di","3.lik")
dat<-c(5,3,2)
h<-data.frame(tes,dat)
h$num<-substr(h$tes,1,1)

h$prim<-unlist(strsplit(as.character(h$tes),"\\."))[2]
h$prim<-sapply(h$tes,unlist(strsplit(as.character(h$tes),"\\."))[2])

I'd like h$prim to contain "abc", "di", "lik". However, I'm not able to figure it out. I guess strsplit is not vectorized, but then I thought the sapply version should have worked. I assume it should be easy, though :-)

Regards, //M

Solution

This should do the trick:

R> sapply(strsplit(as.character(h$tes), "\\."), "[[", 2)
[1] "abc" "di"  "lik"
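
To see why the attempts in the question behave differently, it can help to look at what strsplit actually returns. The following is only a minimal sketch on the question's data; the names parts, flat and prim are introduced here for illustration. Note also that the second argument of sapply has to be a function (or the name of one), whereas the question passes an already-computed string.

```r
tes <- c("1.abc", "2.di", "3.lik")

# strsplit is vectorized: it returns a list with one character vector per input
parts <- strsplit(tes, "\\.")
str(parts)

# unlist() flattens that list into a single vector, so [2] picks one value only
flat <- unlist(parts)
flat[2]   # "abc", which would then be recycled down the whole column

# Taking the 2nd element of each list entry keeps one value per row
prim <- sapply(parts, "[[", 2)
prim      # "abc" "di" "lik"
```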

OTHER TIPS

With the stringr package it's even easier:

library(stringr)
str_split_fixed(h$tes, fixed("."), 2)[, 2]
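
Since str_split_fixed returns a character matrix with one row per input string, both new columns can be filled from a single call. A small sketch, assuming the stringr package is installed; the name pieces is made up for this example:

```r
library(stringr)  # assumes the stringr package is installed

tes <- c("1.abc", "2.di", "3.lik")
h <- data.frame(tes, dat = c(5, 3, 2), stringsAsFactors = FALSE)

# One row per input string, two columns: before and after the first "."
pieces <- str_split_fixed(h$tes, fixed("."), 2)

h$num  <- pieces[, 1]
h$prim <- pieces[, 2]
h
```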

This is the same as rcs' answer, but may be easier to understand:

> sapply(strsplit(as.character(h$tes), "\\."), function(x) x[[2]])
[1] "abc" "di"  "lik"

This question appears several times on StackOverflow, both in exactly the same form as yours and as similar questions on the same topic.

If you care about speed, you should consider the tip from John's answer about the fixed parameter to strsplit.
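
As a rough illustration of the fixed parameter: with fixed = TRUE the "." is taken literally, so it needs no escaping and no regular expression matching is involved. The sketch below only checks that both variants give the same result on the question's data; it does not measure speed, and the names by_regex and by_fixed are hypothetical.

```r
tes <- c("1.abc", "2.di", "3.lik")

# Regex version: the dot must be escaped, because "." matches any character
by_regex <- sapply(strsplit(tes, "\\."), "[[", 2)

# Fixed version: "." is matched literally
by_fixed <- sapply(strsplit(tes, ".", fixed = TRUE), "[[", 2)

identical(by_regex, by_fixed)   # TRUE
```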

Alternatively, you can save yourself the work of pulling out the 2nd element if you add both columns at the same time:

tes <- c("1.abc","2.di","3.lik")
dat <- c(5,3,2)
h <- data.frame(tes, dat, stringsAsFactors=FALSE)
values <- unlist(strsplit(h$tes, ".", fixed=TRUE))
h <- cbind(h, matrix(values, byrow=TRUE, ncol=2,
                     dimnames=list(NULL, c("num", "prim"))))
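
Two caveats on this approach, shown as a hedged follow-up sketch: filling a two-column matrix by row relies on every string containing exactly one ".", and the new columns come out as text (or as factors in older R versions), so num may need converting if it should be numeric. Going through as.character() first is safe in either case.

```r
tes <- c("1.abc", "2.di", "3.lik")
dat <- c(5, 3, 2)
h <- data.frame(tes, dat, stringsAsFactors = FALSE)
values <- unlist(strsplit(h$tes, ".", fixed = TRUE))
h <- cbind(h, matrix(values, byrow = TRUE, ncol = 2,
                     dimnames = list(NULL, c("num", "prim"))))

# as.character() first, so the conversion also works if the column is a factor
h$num  <- as.integer(as.character(h$num))
h$prim <- as.character(h$prim)
str(h)
```
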
Licensed under: CC-BY-SA with attribution