Question

I am trying to reshape data from wide format to long format. I have the following table:

 Sample 1    Sample 2   Sample 3 ...   Sample 18
 string1     string2    0              string3
 0           string1    0              0
 0           0          0              0

As you can see, several samples can contain the same string. The samples are the column names. I would like to collect the values into a single vector, without any zeros but keeping every instance of each string:

 string1
 string2
 string1
 string3

So far, I wrote the following code:

 reshape(SV37.refined, direction="long",varying=names(SV37.refined), v.names="Value", idvar ="Index", times=names(SV37.refined), timevar="Sample")

SV37.refined is the name of my data frame. However, I get:

1.Sample1   Sample1    string1     1
2.Sample1   Sample1    0           2
3.Sample1   Sample1    0           3
4.Sample2   Sample2    string2     4
5.Sample2   Sample2    string1     5
6.Sample2   Sample2    0           6

Do you have any idea how to get the vector I want?

Thank you very much for your time!

Solution

If it's not necessary to use reshape:

out <- unlist(lapply(SV37.refined, as.character))
out[out != "0"]
##  Sample11  Sample21  Sample22 Sample181 
## "string1" "string2" "string1" "string3" 

Or, if you're into one-liners:

Filter(function(x) x != "0", unlist(lapply(SV37.refined, as.character)))
##  Sample11  Sample21  Sample22 Sample181 
## "string1" "string2" "string1" "string3" 
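
To try the approach above without the original data, here is a minimal reproducible sketch. The data frame is a made-up three-column stand-in for SV37.refined (the real one has 18 sample columns), and stringsAsFactors = TRUE is only an assumption to mimic older R defaults. The as.character step keeps the result a plain character vector even when columns are stored as factors, and unname() is an optional extra that strips labels such as Sample11.

```r
# Hypothetical stand-in for SV37.refined (the real data has 18 sample columns)
SV37.refined <- data.frame(
  Sample1  = c("string1", "0", "0"),
  Sample2  = c("string2", "string1", "0"),
  Sample18 = c("string3", "0", "0"),
  stringsAsFactors = TRUE  # assumption: mimic older R, where strings became factors
)

# Convert every column to character, then flatten column by column
out <- unlist(lapply(SV37.refined, as.character))

# Drop the "0" placeholders; unname() removes labels such as "Sample11"
result <- unname(out[out != "0"])
result
## [1] "string1" "string2" "string1" "string3"
```

Because unlist() walks the columns in order, the strings come out sample by sample, with duplicates preserved.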

OTHER TIPS

Using reshape:

dat <- read.table(text="
Sample1 Sample2
string1 string2
0 string1
0 0", header=TRUE)

#  Sample1 Sample2
#1 string1 string2
#2       0 string1
#3       0       0

out <- reshape(
  dat,
  varying=c("Sample1","Sample2"),
  direction="long",
  times=1:2,
  v.names="Value",
  timevar="Sample" 
)

out[out$Value != 0,]

#    Sample   Value id
#1.1      1 string1  1
#1.2      2 string2  1
#2.2      2 string1  2
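
Since the question asks for a plain vector rather than a long data frame, the Value column of the reshaped result can be subset directly. This is a small sketch on the same two-column example data; the names long and vals are illustrative only, and stringsAsFactors = FALSE is set explicitly so the comparison against "0" behaves the same across R versions.

```r
# Same two-column example data as above
dat <- read.table(text = "
Sample1 Sample2
string1 string2
0 string1
0 0", header = TRUE, stringsAsFactors = FALSE)

# Wide to long: one row per (id, Sample) pair
long <- reshape(
  dat,
  varying   = c("Sample1", "Sample2"),
  direction = "long",
  times     = 1:2,
  v.names   = "Value",
  timevar   = "Sample"
)

# Keep only the Value column and drop the "0" placeholders
vals <- long$Value[long$Value != "0"]
vals
## [1] "string1" "string2" "string1"
```

The long format stacks all rows of Sample1 before Sample2, so the order matches the sample-by-sample order requested in the question.
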
Licensed under: CC-BY-SA with attribution