Question

I am working on a data set that has columns of codes like this:

icd9code
285.21
593.9
285.21
v04.81

In order to run the R comorbidities package, I need to change them to five-character codes without decimal points.

So they need to look like this:

icd9code
28521
59390
28521
v0481

What function can I use? In particular, how can I get a 0 appended at the end when the code has only 4 digits? Also, how can I convert codes that start with 'v'?

Solution

Here's a vectorized solution:

x <- c("285.21", "593.9", "285.21", "v04.81")

substr(gsub("\\.", "", paste0(x, "00000")), 1, 5)
# [1] "28521" "59390" "28521" "v0481"
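
The one-liner works in three steps: pad every code with zeros, drop the decimal point, then keep the first five characters. As a rough sketch, the same idea can be wrapped in a helper for reuse on a data frame column. The name `pad_icd9` and its `width` argument are hypothetical and not part of any package; the `NA` handling is an added assumption about what you would want for missing codes.

```r
# Hypothetical helper; name and arguments are illustrative only.
pad_icd9 <- function(codes, width = 5) {
  codes  <- as.character(codes)                     # also accepts factor columns
  zeros  <- paste(rep("0", width), collapse = "")   # "00000" when width = 5
  padded <- paste0(codes, zeros)                    # "593.9" -> "593.900000"
  no_dot <- gsub(".", "", padded, fixed = TRUE)     # "593.900000" -> "593900000"
  out    <- substr(no_dot, 1, width)                # keep the first `width` characters
  out[is.na(codes)] <- NA_character_                # assumption: missing codes stay missing
  out
}

x <- c("285.21", "593.9", "285.21", "v04.81")
pad_icd9(x)
# [1] "28521" "59390" "28521" "v0481"
```

Because the codes are handled as character strings throughout, codes that start with 'v' need no special treatment. For a data frame `df` (a placeholder name), this would be applied as `df$icd9code <- pad_icd9(df$icd9code)`.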

Other tips

It's not all that pretty, but it should work on all systems:

x <- scan(text="285.21 593.9 285.21 v04.81", what="character")
#[1] "285.21" "593.9"  "285.21" "v04.81"

res <- gsub("\\.","",x)
mapply(paste0, res, sapply(5-nchar(res),rep,x="0"))

#  28521    5939   28521   v0481 
#"28521" "59390" "28521" "v0481" 
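
If R 3.3.0 or later is available, `strrep()` is vectorized over its `times` argument, so the same padding can probably be written without `mapply`. This is a sketch rather than a drop-in replacement: it trades the "works on all systems" property for brevity, and `pmax()` is there only to guard against negative repeat counts if a code is already longer than five characters.

```r
x <- c("285.21", "593.9", "285.21", "v04.81")

res <- gsub(".", "", x, fixed = TRUE)         # drop the decimal point
pad <- strrep("0", pmax(0, 5 - nchar(res)))   # number of zeros each code still needs
out <- paste0(res, pad)
out
# [1] "28521" "59390" "28521" "v0481"
```

Unlike the `mapply` version, the result is an unnamed character vector.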

Here is another way to solve it, in case there are several columns that need the replacement. I'm sure there are better ways to do this, but the logic is clear: 1) split the strings in each column at the decimal point, 2) check the number of characters after the decimal point and pad accordingly.

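char <- data.frame(icd9code1 = c("285.21", "593.9", "285.21", "v04.81"),
                   icd9code2 = c("285.21", "593.9", "285.21", "v04.81"),
                   icd9code3 = c("285.21", "593.9", "285.21", "v04.81"),
                   stringsAsFactors = FALSE  # keep the codes as character (factor was the default before R 4.0)
                   )

for (col in seq_len(ncol(char))) {
  # Split each code at the decimal point, e.g. "593.9" -> c("593", "9")
  split_str <- strsplit(char[, col], "\\.")

  for (i in seq_len(nrow(char))) {
    no_dot <- gsub("\\.", "", char[i, col])
    if (nchar(split_str[[i]][2]) == 1) {
      # Only one digit after the decimal point: append a trailing zero
      char[i, col] <- paste0(no_dot, "0")
    } else {
      char[i, col] <- no_dot
    }
  }
}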

# > char
#   icd9code1 icd9code2 icd9code3
# 1     28521     28521     28521
# 2     59390     59390     59390
# 3     28521     28521     28521
# 4     v0481     v0481     v0481
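
The nested loops above can likely be avoided by applying a vectorized function to every column with `lapply`. The following is a minimal sketch that reuses the pad-and-truncate idea from the accepted solution; `fix_code` is a hypothetical name chosen for illustration.

```r
char <- data.frame(
  icd9code1 = c("285.21", "593.9", "285.21", "v04.81"),
  icd9code2 = c("285.21", "593.9", "285.21", "v04.81"),
  stringsAsFactors = FALSE
)

# Pad with zeros, drop the dot, keep the first five characters.
fix_code <- function(x) {
  substr(gsub(".", "", paste0(x, "00000"), fixed = TRUE), 1, 5)
}

# Assigning into `char[]` replaces every column but keeps the data.frame structure.
char[] <- lapply(char, fix_code)
```

To convert only some columns, index them on both sides, for example `char[cols] <- lapply(char[cols], fix_code)`, where `cols` is a character vector of column names.
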
License: CC-BY-SA with attribution
Not affiliated with StackOverflow