R split character vector using strsplit?
30-06-2021
Question
As a newbie in R, how should I correctly handle a variable whose elements can hold multiple values, like this:
x = c("1","1","1/2","2","2/3","1/3")
As you can see, the value 3 only appears in conjunction with other values.
To process x further, the best approach would be to obtain 3 vectors like:
X[1] = c(1,1,1,NA,NA,1)
because "1" appears in the 1st, 2nd, 3rd and 6th positions.
The same goes for X[2] and X[3].
All the information seems to be preserved by doing so. Am I wrong?
I have already tried strsplit, but it does not produce NAs for the values that are not already present in my vector.
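For reference, here is a small sketch of what strsplit returns on its own for this input. It uses fixed = TRUE (my choice, not part of the question) so that "/" is treated as a literal string. The result is a list of character vectors of varying length, with no NA placeholders for the values an element lacks:

```r
x <- c("1", "1", "1/2", "2", "2/3", "1/3")

# strsplit returns a list with one character vector per element of x
parts <- strsplit(x, "/", fixed = TRUE)

# The pieces have different lengths; nothing marks which values are absent
lengths(parts)
```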
Solution
This seems to work:
x = c("1","1","1/2","2","2/3","1/3")
#Split on your character. This may not be inclusive of all characters that
#need to be split on.
xsplit <- strsplit(x, "\\/")
#Find the unique items
xunique <- unique(unlist(xsplit))
#Iterate over each xsplit for all unique values
out <- sapply(xsplit, function(z)
  sapply(xunique, function(zz) zz %in% z)
)
#convert FALSE to NA
out[out == FALSE] <- NA
#Results in
> out
  [,1] [,2] [,3] [,4] [,5] [,6]
1 TRUE TRUE TRUE   NA   NA TRUE
2   NA   NA TRUE TRUE TRUE   NA
3   NA   NA   NA   NA TRUE TRUE
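If the numeric 1/NA vectors from the question are wanted instead of a logical matrix, a minimal variation might look like the following. This is only a sketch: the names parts, vals and X are illustrative, and the use of vapply and ifelse is my own choice rather than part of the answer above.

```r
x <- c("1", "1", "1/2", "2", "2/3", "1/3")

# Split each element on the literal "/" (fixed = TRUE avoids regex escaping)
parts <- strsplit(x, "/", fixed = TRUE)

# Sorted unique values across all elements; one indicator vector per value
vals <- sort(unique(unlist(parts)))

# For each value: 1 where it occurs in the corresponding element of x, NA otherwise
X <- lapply(vals, function(v) {
  hit <- vapply(parts, function(p) v %in% p, logical(1))
  ifelse(hit, 1, NA)
})
names(X) <- vals

X[["1"]]
# [1]  1  1  1 NA NA  1
```

Each entry of X has the same length as x, so the vectors can be bound to the original data with cbind or data.frame if needed.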
Other tips
An alternative is to use cSplit_e from my "splitstackshape" package.
x = c("1","1","1/2","2","2/3","1/3")
library(splitstackshape)
cSplit_e(data.frame(x), "x", "/")
#     x x_1 x_2 x_3
# 1   1   1  NA  NA
# 2   1   1  NA  NA
# 3 1/2   1   1  NA
# 4   2  NA   1  NA
# 5 2/3  NA   1   1
# 6 1/3   1  NA   1
(Note that the results here are transposed in comparison to the results in the accepted answer.)
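To make the transposition concrete, here is a base R sketch that builds the same kind of indicator matrix as the accepted answer and then transposes it into a one-row-per-element layout. The column names x_1, x_2, x_3 are set by hand to resemble the output above; this is not how cSplit_e works internally.

```r
x <- c("1", "1", "1/2", "2", "2/3", "1/3")
xsplit <- strsplit(x, "/", fixed = TRUE)
xunique <- unique(unlist(xsplit))

# Indicator matrix with values in rows and elements of x in columns
out <- sapply(xsplit, function(z) xunique %in% z)
rownames(out) <- xunique

# Transpose so each row corresponds to one element of x,
# then recode TRUE as 1 and FALSE as NA
wide <- ifelse(t(out), 1, NA)
colnames(wide) <- paste0("x_", xunique)

res <- data.frame(x, wide)
res
```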
Licensed under: CC-BY-SA with attribution