For the first problem:
name <- c("2,6-Octadien-1-ol, 3,7-dimethyl-, (E)-", "2,6-Octadien-1-ol,3,7-dimethyl-,(E)-")
sapply(strsplit(name, "(?<!\\d), ?", perl = TRUE), function(x) paste(rev(x), collapse = ""))
# [1] "(E)-3,7-dimethyl-2,6-Octadien-1-ol" "(E)-3,7-dimethyl-2,6-Octadien-1-ol"
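Since the question asks for a function and mentions names with no ", " at all, here is a minimal sketch that wraps the approach above. The function name move_suffixes_to_front and the input "Linalool" are made up for illustration. The lookbehind (?<!\\d) only splits on commas that do not directly follow a digit, so locants such as "2,6" stay intact. This assumes that a separating comma never comes right after a digit.

```r
# Hypothetical helper name (not from the original answer); vectorised over `name`.
move_suffixes_to_front <- function(name) {
  # Split on a comma (plus an optional space) that is NOT preceded by a digit,
  # so locants such as "2,6" or "3,7" are left alone.
  parts <- strsplit(name, "(?<!\\d), ?", perl = TRUE)
  # Reverse the pieces of each name and paste them back together.
  vapply(parts, function(x) paste(rev(x), collapse = ""), character(1))
}

# "Linalool" is an illustrative name without any ", " and comes back unchanged.
move_suffixes_to_front(c("2,6-Octadien-1-ol, 3,7-dimethyl-, (E)-", "Linalool"))
# [1] "(E)-3,7-dimethyl-2,6-Octadien-1-ol" "Linalool"
```

vapply is used instead of sapply only so that the result is always a character vector. The splitting and reversing logic is the same as above.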
For the second problem:
name <- c("Pyrazine <2-acetyl-, 3-ethyl->", "Cyclohexanol <4-tertbutyl-> acetate")
inside <- gsub(", ", "", sub("^.*<(.+)>.*$", "\\1", name))
outside <- sub("^(.*) <.*>(.*)$", "\\1\\2", name)
paste0(inside, outside)
# [1] "2-acetyl-3-ethyl-Pyrazine" "4-tertbutyl-Cyclohexanol acetate"
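One thing to watch: if a name contains no "<...>" part, neither sub() call matches, both return the whole name, and paste0() duplicates it. A possible way around this, sketched as a function, is below. The function name move_bracketed_to_front and the input "Limonene" are made up for illustration. Like the code above, it assumes the "<" is preceded by a space.

```r
# Hypothetical helper name (not from the original answer); vectorised over `name`.
move_bracketed_to_front <- function(name) {
  # TRUE for names that actually contain a "<...>" part.
  has_brackets <- grepl("<.+>", name)
  # Text between "<" and ">", with any ", " separators removed.
  inside <- gsub(", ", "", sub("^.*<(.+)>.*$", "\\1", name))
  # Everything outside the brackets, dropping the space before "<".
  outside <- sub("^(.*) <.*>(.*)$", "\\1\\2", name)
  # Names without a "<...>" part are returned unchanged.
  ifelse(has_brackets, paste0(inside, outside), name)
}

move_bracketed_to_front(c("Pyrazine <2-acetyl-, 3-ethyl->",
                          "Cyclohexanol <4-tertbutyl-> acetate",
                          "Limonene"))
# [1] "2-acetyl-3-ethyl-Pyrazine" "4-tertbutyl-Cyclohexanol acetate" "Limonene"
```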
String rearrangement in R
Question
I am looking for two R functions that would perform the following string rearrangements: (1) move the parts following a ", " to the start of the string, e.g.
name="2,6-Octadien-1-ol, 3,7-dimethyl-, (E)-"
should yield
"(E)-3,7-dimethyl-2,6-Octadien-1-ol"
(Note that there could be any number of ", " in a string, or none at all, and that the parts after each ", " should be moved to the start of the string successively, starting from the end of the string.) What would be the most efficient way of achieving this in R (without using loops, etc.)?
(2) place the parts between "<" and ">" at the start of a string and remove any ", ". E.g.
name="Pyrazine <2-acetyl-, 3-ethyl->"
should yield
"2-acetyl-3-ethyl-Pyrazine"
(this is a simpler gsub problem, right?) The part between the "<" and ">" could be in any place in the string though. E.g.
name="Cyclohexanol <4-tertbutyl-> acetate"
should yield
"4-tertbutyl-Cyclohexanol acetate"
Any thoughts would be welcome!
cheers, Tom
Solution