Question

In my data frame many column names end with ".y" as in the example:

dat <- data.frame(x1=sample(c(0:1)), id=sample(10), av1.y = sample(10) , av2.y = sample(10) , av3.y = sample(10),av4.y=sample(10))
dat

I would like to get rid of the last two characters of all the column names that end with .y and leave the others unchanged in order to have a data frame like this:

colnames(dat) <- c("x1","id","av1","av2","av3","av4")
dat

How can I achieve this without re-typing all the column names? I found a way to do it for a single string, but I don't know how to apply it to a whole series of strings:

library(stringi)
stri_sub("av3.y",1,3)

Solution

One possibility is gsub with fixed = TRUE, which removes every literal ".y" wherever it occurs in a name:

gsub(pattern = ".y", replacement = "", x = names(dat), fixed = TRUE)
# [1] "x1"  "id"  "av1" "av2" "av3" "av4"

To match ".y" only at the end of the string, anchor the regular expression with $:

gsub(pattern = "\\.y$", replacement = "", x = names(dat))
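Neither call changes dat by itself; the result has to be assigned back to the column names. A minimal sketch, assuming a small stand-in data frame and a hypothetical name "my.year.y" (neither is from the question) to show where the two patterns differ:

```r
# Small stand-in for the data frame from the question (values are arbitrary)
dat <- data.frame(x1 = 1:2, id = 3:4, av1.y = 5:6, av2.y = 7:8)

# Strip a trailing ".y" and assign the result back to the column names
names(dat) <- sub("\\.y$", "", names(dat))
names(dat)

# Hypothetical name with ".y" in the middle as well as at the end
tricky <- "my.year.y"
fixed_result    <- gsub(".y", "", tricky, fixed = TRUE)  # removes both ".y" occurrences
anchored_result <- sub("\\.y$", "", tricky)              # removes only the trailing one
```

sub is enough here because the anchored pattern can match at most once per name.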

Other tips

The stri_sub function is made for the job :) Look at the docs. You can take a substring counting from the end of the string by using a negative index, like this:

stri_sub("abc1.y",1,-1) #whole string
## [1] "abc1.y"
stri_sub("abc1.y",1,-3) #without last two characters
## [1] "abc1"

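and it is also vectorized, so you can use this function on a vector :) Note that it drops the last two characters of every element, whether or not the element ends in ".y".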

stri_sub(c("abc1.y","V1.y","somethingreallylong.y"),1,-3)
## [1] "abc1"                "V1"                  "somethingreallylong"
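Because stri_sub trims by position, applying it to all column names would also shorten names such as "x1" and "id". One way around this, sketched here with a made-up vector of names, is to trim only the elements that actually end in ".y", using stri_endswith_fixed from the same package:

```r
library(stringi)

# Made-up column names: two without the suffix, two with it
nms <- c("x1", "id", "av1.y", "av2.y")

# Logical mask of the names that end in the literal string ".y"
has_suffix <- stri_endswith_fixed(nms, ".y")

# Drop the last two characters only where the suffix is present
nms[has_suffix] <- stri_sub(nms[has_suffix], 1, -3)
nms
```

For a data frame, the same steps would be applied to names(dat) and the result assigned back with names(dat) <- nms.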
Licensed under: CC-BY-SA with attribution