Question

I would like to remove specific characters from strings within a vector, similar to the Find and Replace feature in Excel.

Here are the data I start with:

group <- data.frame(c("12357e", "12575e", "197e18", "e18947")

I start with just the first column; I want to produce the second column by removing the e's:

group       group.no.e
12357e      12357
12575e      12575
197e18      19718
e18947      18947
Was it helpful?

Solution

With a regular expression and the function gsub():

group <- c("12357e", "12575e", "197e18", "e18947")
group
[1] "12357e" "12575e" "197e18" "e18947"

gsub("e", "", group)
[1] "12357" "12575" "19718" "18947"

What gsub does here is to replace each occurrence of "e" with an empty string "".


See ?regexp or gsub for more help.

OTHER TIPS

Regular expressions are your friends:

R> ## also adds missing ')' and sets column name
R> group<-data.frame(group=c("12357e", "12575e", "197e18", "e18947"))  )
R> group
   group
1 12357e
2 12575e
3 197e18
4 e18947

Now use gsub() with the simplest possible replacement pattern: empty string:

R> group$groupNoE <- gsub("e", "", group$group)
R> group
   group groupNoE
1 12357e    12357
2 12575e    12575
3 197e18    19718
4 e18947    18947
R> 

Summarizing 2 ways to replace strings:

group<-data.frame(group=c("12357e", "12575e", "197e18", "e18947"))

1) Use gsub

group$group.no.e <- gsub("e", "", group$group)

2) Use the stringr package

group$group.no.e <- str_replace_all(group$group, "e", "")

Both will produce the desire output:

   group group.no.e
1 12357e      12357
2 12575e      12575
3 197e18      19718
4 e18947      18947

You do not need to create data frame from vector of strings, if you want to replace some characters in it. Regular expressions is good choice for it as it has been already mentioned by @Andrie and @Dirk Eddelbuettel.

Pay attention, if you want to replace special characters, like dots, you should employ full regular expression syntax, as shown in example below:

ctr_names <- c("Czech.Republic","New.Zealand","Great.Britain")
gsub("[.]", " ", ctr_names)

this will produce

[1] "Czech Republic" "New Zealand"    "Great Britain" 

Use the stringi package:

require(stringi)

group<-data.frame(c("12357e", "12575e", "197e18", "e18947"))
stri_replace_all(group[,1], "", fixed="e")
[1] "12357" "12575" "19718" "18947"
> library(stringi)                
> group <- c('12357e', '12575e', '12575e', ' 197e18',  'e18947')              
> pattern <- "e"  
> replacement <-  ""  
> group <- str_replace(group, pattern, replacement)      
> group 
[1] "12357"  "12575"  "12575"  " 19718" "18947" 
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top