Question

I have a variable, and for some reason R has added an extra "X" at the beginning of each value. Is this a common occurrence that I could have avoided?

Anyhow, below is my data (currently the variable is stored in a list):

X1
X5
X33
X37
...

> str(rc1_output)
 chr [1:63, 1:3] "X1" "X5" "X33" "X37" "X52" "X645" "X646" ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:63] "X1" "X5" "X33" "X37" ...
  ..$ : chr [1:3] "" "Entropy" "Subseq."

> dput(head(rc1_output))
structure(c("X1", "X5", "X33", "X37", "X52", "X645", "0", "0", 
"0", "0", "0", "0", "0.256010845762264", "0.071412419435563", 
"0.071412419435563", "0.071412419435563", "0.071412419435563", 
"0.071412419435563"), .Dim = c(6L, 3L), .Dimnames = list(c("X1", 
"X5", "X33", "X37", "X52", "X645"), c("", "Entropy", "Subseq."
)))

How can I loop through all rows of the variable and remove the X?


Solution

Try substr or gsub:

x <- c("X1", "X354", "X234", "X2134")
substr(x, 2, nchar(x))
# [1] "1"    "354"  "234"  "2134"
gsub("^X", "", x)
# [1] "1"    "354"  "234"  "2134"
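The two calls differ in one respect that may matter: substr drops the first character whatever it is, while the anchored pattern only removes a leading "X". Because "^X" can match at most once per string, sub is enough. The sketch below also converts the result to integers, which assumes the values are meant to be numeric IDs (the question does not say so):

```r
x <- c("X1", "X354", "X234", "X2134")

# sub() replaces only the first match; with the ^ anchor there is at most one
stripped <- sub("^X", "", x)

# Assumption: the values are numeric IDs, so convert them to integers
ids <- as.integer(stripped)
print(ids)
```

If any element does not start with "X" followed by digits, as.integer returns NA for it with a warning, which makes unexpected values easy to spot.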

Update

It looks like just the first column (which is unnamed) and the rownames are affected. The same general approach applies:

> rc1_output[, 1] <- gsub("^X", "", rc1_output[, 1])
> rc1_output
           Entropy Subseq.            
X1   "1"   "0"     "0.256010845762264"
X5   "5"   "0"     "0.071412419435563"
X33  "33"  "0"     "0.071412419435563"
X37  "37"  "0"     "0.071412419435563"
X52  "52"  "0"     "0.071412419435563"
X645 "645" "0"     "0.071412419435563"

Repeat the process for rownames(rc1_output) if required, like this:

rownames(rc1_output) <- gsub("^X", "", rownames(rc1_output))
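Putting both steps together, here is a self-contained sketch that rebuilds the sample matrix from the dput() output in the question and strips the prefix from the first column and from the row names. The helper name strip_x is made up for this example:

```r
# Rebuild the sample matrix from the question's dput() output
rc1_output <- structure(
  c("X1", "X5", "X33", "X37", "X52", "X645",
    "0", "0", "0", "0", "0", "0",
    "0.256010845762264", "0.071412419435563", "0.071412419435563",
    "0.071412419435563", "0.071412419435563", "0.071412419435563"),
  .Dim = c(6L, 3L),
  .Dimnames = list(c("X1", "X5", "X33", "X37", "X52", "X645"),
                   c("", "Entropy", "Subseq."))
)

# Hypothetical helper: remove a single leading "X" from each string
strip_x <- function(v) sub("^X", "", v)

# Whole-vector assignment; no explicit loop over rows is needed
rc1_output[, 1] <- strip_x(rc1_output[, 1])
rownames(rc1_output) <- strip_x(rownames(rc1_output))

print(rc1_output)
```

Both sub and gsub are vectorized, so there is no need to loop through the rows as the question suggests.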

My guess, however, is that you can solve this problem more effectively at an earlier stage in your code somewhere. If we knew how this data came to be in this form in the first place, that would make it much easier to diagnose.
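One common source of such a prefix, offered only as a guess since the question does not show how rc1_output was created: read.csv, read.table and data.frame run column names through make.names by default (check.names = TRUE), and make.names prepends "X" to any name that starts with a digit. If that is what happened here, passing check.names = FALSE avoids the prefix in the first place. The CSV text below is invented for illustration:

```r
# Invented sample data: the header row consists of names that start with digits
csv_text <- "1,5,33\n0.25,0.07,0.07"

# Default behaviour: check.names = TRUE makes the names syntactically valid
default_names <- names(read.csv(text = csv_text))

# check.names = FALSE keeps the header exactly as written
raw_names <- names(read.csv(text = csv_text, check.names = FALSE))

# make.names() is the function that adds the "X"
fixed <- make.names(c("1", "5", "33"))

print(default_names)
print(raw_names)
```

With check.names = FALSE the columns have to be accessed with backticks or strings, for example df[["1"]], because names such as 1 are not valid R identifiers.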

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow