Question

I have a string that mixes letters and numbers:

"The sample is 22mg"

I'd like to split the string wherever a number is immediately followed by a letter, like this:

"The sample is 22 mg"

I've tried this:

gsub('[0-9]+[[aA-zZ]]', '[0-9]+ [[aA-zZ]]', 'This is a test 22mg')

but I'm not getting the desired result.

Any suggestions?


Solution

You need to use capturing parentheses in the regular expression and group references in the replacement. For example:

gsub('([0-9])([[:alpha:]])', '\\1 \\2', 'This is a test 22mg')

The technique itself is not R-specific; R's help pages for regex and gsub (?regex and ?gsub) cover the details.
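As a rough sketch of how the capturing-group approach behaves on more than one input, the snippet below wraps the call in a small helper. The name split_number_unit and the sample strings are made up for illustration; only gsub itself comes from the answer.

```r
# Hypothetical helper name; not part of base R.
split_number_unit <- function(x) {
  # ([0-9]) captures a digit and ([[:alpha:]]) the letter right after it;
  # the replacement re-emits both captures with a space in between.
  gsub("([0-9])([[:alpha:]])", "\\1 \\2", x)
}

# Made-up inputs: several matches in one string, an already-spaced
# value, and letters followed by digits (which should stay untouched).
samples <- c("The sample is 22mg",
             "dose: 5ml and 10g",
             "already 22 mg",
             "abc123")

result <- split_number_unit(samples)
print(result)
```

Because gsub is vectorized and replaces every match, the same call works on a whole character vector, and only digit-then-letter boundaries are changed.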

Other tips

You need backreferencing:

> test <- "The sample is 22mg"
> gsub("([0-9])([a-zA-Z])","\\1 \\2",test)
[1] "The sample is 22 mg"

Anything in parentheses is captured. The captured groups are then referenced in the replacement as \1 (the first parenthesized group), \2, and so on. In R source the backslash must be doubled: the first backslash escapes the second for R's string parser, so that a single literal backslash reaches the regular expression engine.
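The escaping point can be checked directly. The following is only an illustrative sketch; the variable name fixed is an assumption, and the rest reuses the call from this answer.

```r
test <- "The sample is 22mg"

# As typed in R source, "\\1" parses to two characters:
# a backslash followed by the digit 1.
print(nchar("\\1"))

# writeLines shows the text the regex engine actually receives: \1 \2
writeLines("\\1 \\2")

# With a single backslash, R reads "\1" as an octal escape, i.e. one
# control character with code 1, so it is not a group reference.
print(utf8ToInt("\1"))

# Doubled backslashes give the intended backreferences.
fixed <- gsub("([0-9])([a-zA-Z])", "\\1 \\2", test)
print(fixed)
```

Writing the replacement with single backslashes would therefore insert control characters instead of the captured digit and letter.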

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow