Question

I have a data.frame that looks like this:

  name1 feat   x  y perc.x perc.y
1   foo    z 100 10    0.1      1
2   bar    w 200 20    0.2      2
3   qux    x 300 30    0.3      3
4   qux    y 400 40    0.4      4
5   bar    v 500 50    0.5      5

It is generated with the following code:

name1 <- c("foo","bar","qux","qux","bar")
feat <- c("z","w","x","y","v")
x <- c(100,200,300,400,500)
y <- c(10,20,30,40,50)
perc.x <- c(0.1,0.2,0.3,0.4,0.5)
perc.y <- c(1,2,3,4,5)

df <- data.frame(name1,feat,x,y,perc.x,perc.y)
df

How can I create a melted data frame like this, with the x/y values and their matching perc values in separate columns?

   name1 feat variable value value2.perc
1    foo    z        x 100.0         0.1
2    bar    w        x 200.0         0.2
3    qux    x        x 300.0         0.3
4    qux    y        x 400.0         0.4
5    bar    v        x 500.0         0.5
6    foo    z        y  10.0         1.0
7    bar    w        y  20.0         2.0
8    qux    x        y  30.0         3.0
9    qux    y        y  40.0         4.0
10   bar    v        y  50.0         5.0

I tried this but failed:

library(reshape2)
melt(df)

Solution

A base R solution, using reshape:

reshape(df, direction='long', varying=list(c(3, 4), c(5, 6)))
    name1 feat time   x perc.x id
1.1   foo    z    1 100    0.1  1
2.1   bar    w    1 200    0.2  2
3.1   qux    x    1 300    0.3  3
4.1   qux    y    1 400    0.4  4
5.1   bar    v    1 500    0.5  5
1.2   foo    z    2  10    1.0  1
2.2   bar    w    2  20    2.0  2
3.2   qux    x    2  30    3.0  3
4.2   qux    y    2  40    4.0  4
5.2   bar    v    2  50    5.0  5

You may still want to tidy up the time variable a little, since it holds 1 and 2 rather than x and y.

EDIT: a better version, thanks to @mnel's brilliant comment:

reshape(df, direction='long', varying=list(c(3, 4), c(5, 6)),
        v.names = c('value','perc'), times = c('x','y'))

    name1 feat time value perc id
1.x   foo    z    x   100  0.1  1
2.x   bar    w    x   200  0.2  2
3.x   qux    x    x   300  0.3  3
4.x   qux    y    x   400  0.4  4
5.x   bar    v    x   500  0.5  5
1.y   foo    z    y    10  1.0  1
2.y   bar    w    y    20  2.0  2
3.y   qux    x    y    30  3.0  3
4.y   qux    y    y    40  4.0  4
5.y   bar    v    y    50  5.0  5
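
To get closer to the layout asked for in the question, the result can be tidied a little further. The following is only a sketch: it rebuilds the example data so that it runs on its own, selects the columns by name instead of by position, and uses reshape's timevar argument to call the key column variable. The object name long is my own choice, not something from the original answer.

```r
# Rebuild the example data so the snippet is self-contained
df <- data.frame(
  name1  = c("foo", "bar", "qux", "qux", "bar"),
  feat   = c("z", "w", "x", "y", "v"),
  x      = c(100, 200, 300, 400, 500),
  y      = c(10, 20, 30, 40, 50),
  perc.x = c(0.1, 0.2, 0.3, 0.4, 0.5),
  perc.y = c(1, 2, 3, 4, 5)
)

# Two groups of wide columns become two long columns: value and perc
long <- reshape(df, direction = "long",
                varying = list(c("x", "y"), c("perc.x", "perc.y")),
                v.names = c("value", "perc"),
                timevar = "variable",   # name of the key column
                times   = c("x", "y"))  # labels stored in the key column

long$id <- NULL          # drop the helper id column that reshape adds
rownames(long) <- NULL   # replace the "1.x"-style row names with 1..10
long
```

Selecting the columns by name means the call keeps working if the column order of df changes.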

OTHER TIPS

I only checked the first few entries, but I think this is what you're after:

library(reshape2)
m1 <- melt(df[,1:4], id.vars = 1:2)
m2 <- melt(df[,c(1:2,5:6)], id.vars = 1:2)
m2$variable <- substr(m2$variable, 6, 6)
merge(m1, m2, by = 1:3)
   name1 feat variable value.x value.y
1    bar    v        x     500     0.5
2    bar    v        y      50     5.0
3    bar    w        x     200     0.2
4    bar    w        y      20     2.0
5    foo    z        x     100     0.1
6    foo    z        y      10     1.0
7    qux    x        x     300     0.3
8    qux    x        y      30     3.0
9    qux    y        x     400     0.4
10   qux    y        y      40     4.0

The "melt twice and then merge" strategy should work...

As @agstudy points out, reshape is ideal for this type of thing. A close relative is stack, and you can use it similarly. You'll have to do some minimal cleanup on the variable names though.

data.frame(df[1:2], lapply(list(c(3, 4), c(5, 6)), function(x) stack(df[x])))
#    name1 feat values ind values.1  ind.1
# 1    foo    z    100   x      0.1 perc.x
# 2    bar    w    200   x      0.2 perc.x
# 3    qux    x    300   x      0.3 perc.x
# 4    qux    y    400   x      0.4 perc.x
# 5    bar    v    500   x      0.5 perc.x
# 6    foo    z     10   y      1.0 perc.y
# 7    bar    w     20   y      2.0 perc.y
# 8    qux    x     30   y      3.0 perc.y
# 9    qux    y     40   y      4.0 perc.y
# 10   bar    v     50   y      5.0 perc.y

Here, the first item in the data.frame call gets recycled to match the length that results from the stack.
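
If you would rather not clean up the names afterwards, a variant of the same idea is to stack each pair of columns separately and assemble the result with explicit column names. This is a sketch rather than the code from the answer above: the names vals, percs and out are made up for illustration, and the id columns are repeated with rep instead of relying on data.frame's recycling.

```r
# Rebuild the example data so the snippet is self-contained
df <- data.frame(
  name1  = c("foo", "bar", "qux", "qux", "bar"),
  feat   = c("z", "w", "x", "y", "v"),
  x      = c(100, 200, 300, 400, 500),
  y      = c(10, 20, 30, 40, 50),
  perc.x = c(0.1, 0.2, 0.3, 0.4, 0.5),
  perc.y = c(1, 2, 3, 4, 5)
)

# stack() returns a data frame with columns `values` and `ind`
vals  <- stack(df[c("x", "y")])
percs <- stack(df[c("perc.x", "perc.y")])

out <- data.frame(
  name1    = rep(df$name1, times = 2),  # repeat the id columns once per stacked column
  feat     = rep(df$feat, times = 2),
  variable = as.character(vals$ind),    # "x" or "y"
  value    = vals$values,
  perc     = percs$values
)
out
```

Both stack calls list their columns in the same order (x before y), so row i of vals and row i of percs refer to the same original row and variable.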

Licensed under: CC-BY-SA with attribution