Pergunta

Im trying to split a string on "." and create additional columns with the two strings before and after ".".

tes<-c("1.abc","2.di","3.lik")
dat<-c(5,3,2)
h<-data.frame(tes,dat)
h$num<-substr(h$tes,1,1)

h$prim<-unlist(strsplit(as.character(h$tes),"\\."))[2]
h$prim<-sapply(h$tes,unlist(strsplit(as.character(h$tes),"\\."))[2])

I´d like h$prim to contain "abc","di","lik"..However I´m not able to figure it out. I guess strsplit is not vectorized, but then I thought the sapply version should have worked. However I assume it should be easy:-)

Regards, //M

Foi útil?

Solução

This should do the trick

R> sapply(strsplit(as.character(h$tes), "\\."), "[[", 2)
[1] "abc" "di"  "lik"

Outras dicas

With the stringr package it's even easier:

library(stringr)
str_split_fixed(h$tes, fixed("."), 2)[, 2]

This is the same as rcs' answer, but may be easier to understand:

> sapply(strsplit(as.character(h$tes), "\\."), function(x) x[[2]])
[1] "abc" "di"  "lik"

This question appears several time on StackOverflow.

In exact form as yours:

Some similar question in this topic:

And if you care about speed then you should consider tip from John answer about fixed parameter to strsplit.

Alternatively, you can save yourself the work of pulling out the 2nd element if you add both columns at the same time:

tes <- c("1.abc","2.di","3.lik")
dat <- c(5,3,2)
h <- data.frame(tes, dat, stringsAsFactors=FALSE)
values <- unlist(strsplit(h$tes, ".", fixed=TRUE))
h <- cbind(h, matrix(values, byrow=TRUE, ncol=2,
                     dimnames=list(NULL, c("num", "prim"))))
Licenciado em: CC-BY-SA com atribuição
Não afiliado a StackOverflow
scroll top