Question

I am dealing with a dataset that is in wide format, as in

> data=read.csv("http://www.kuleuven.be/bio/ento/temp/data.csv")
> data
  factor1 factor2 count_1 count_2 count_3
1       a       a       1       2       0
2       a       b       3       0       0
3       b       a       1       2       3
4       b       b       2       2       0
5       c       a       3       4       0
6       c       b       1       1       0

where factor1 and factor2 are different factors which I would like to take along (in fact I have more than 2, but that shouldn't matter), and count_1 to count_3 are counts of aggressive interactions on an ordinal scale (3>2>1). I would now like to convert this dataset to long format, to get something like

   factor1 factor2 aggression
1        a       a          1
2        a       a          2
3        a       a          2
4        a       b          1
5        a       b          1
6        a       b          1
7        b       a          1
8        b       a          2
9        b       a          2
10       b       a          3
11       b       a          3
12       b       a          3
13       b       b          1
14       b       b          1
15       b       b          2
16       b       b          2
17       c       a          1
18       c       a          1
19       c       a          1
20       c       a          2
21       c       a          2
22       c       a          2
23       c       a          2
24       c       b          1
25       c       b          2

Would anyone happen to know how to do this without using for...to loops, e.g. using package reshape2? (I realize it should work using melt, but I just haven't been able to figure out the right syntax yet)

Edit: For those of you that would also happen to need this kind of functionality, here is Ananda's answer below wrapped into a little function:

    widetolong.ordinal<-function(data,factors,responses,responsename) {
    library(reshape2)
    data$ID=1:nrow(data) # add an ID to preserve row order
    dL=melt(data, id.vars=c("ID", factors)) # `melt` the data
    dL=dL[order(dL$ID), ] # sort the molten data
    dL[,responsename]=match(dL$variable,responses) # convert reponses to ordinal scores
    dL[,responsename]=factor(dL[,responsename],ordered=T)
    dL=dL[dL$value != 0, ] # drop rows where `value == 0`
    out=dL[rep(rownames(dL), dL$value), c(factors, responsename)] # use `rep` to "expand" `data.frame` & drop unwanted columns
    rownames(out) <- NULL
    return(out)
    }

    # example
    data <- read.csv("http://www.kuleuven.be/bio/ento/temp/data.csv")
    widetolong.ordinal(data,c("factor1","factor2"),c("count_1","count_2","count_3"),"aggression")
Was it helpful?

Solution

melt from "reshape2" will only get you part of the way through this problem. To go the rest of the way, you just need to use rep from base R:

data <- read.csv("http://www.kuleuven.be/bio/ento/temp/data.csv")
library(reshape2)

## Add an ID if the row order is importantt o you
data$ID <- 1:nrow(data)

## `melt` the data
dL <- melt(data, id.vars=c("ID", "factor1", "factor2"))

## Sort the molten data, if necessary
dL <- dL[order(dL$ID), ]

## Extract the numeric portion of the "variable" variable
dL$aggression <- gsub("count_", "", dL$variable)

## Drop rows where `value == 0`
dL <- dL[dL$value != 0, ]

## Use `rep` to "expand" your `data.frame`.
## Drop any unwanted columns at this point.
out <- dL[rep(rownames(dL), dL$value), c("factor1", "factor2", "aggression")]

This is what the output finally looks like. If you want to remove the funny row names, just use rownames(out) <- NULL.

out
#      factor1 factor2 aggression
# 1          a       a          1
# 7          a       a          2
# 7.1        a       a          2
# 2          a       b          1
# 2.1        a       b          1
# 2.2        a       b          1
# 3          b       a          1
# 9          b       a          2
# 9.1        b       a          2
# 15         b       a          3
# 15.1       b       a          3
# 15.2       b       a          3
# 4          b       b          1
# 4.1        b       b          1
# 10         b       b          2
# 10.1       b       b          2
# 5          c       a          1
# 5.1        c       a          1
# 5.2        c       a          1
# 11         c       a          2
# 11.1       c       a          2
# 11.2       c       a          2
# 11.3       c       a          2
# 6          c       b          1
# 12         c       b          2
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top