Question

I am facing a reshaping problem with a dataframe. It has many more rows and columns. Simplified, it structure looks like this:

rownames    year    x1  x2  x3
a           2000    2   6   11
b           2000    0   4   2
c           2000    0   3   5
a           2010    2   6   11
b           2010    0   0   0
c           2020    4   1   8
a           2020    10  1   7
b           2020    8   4   10
c           2020    22  1   16

I would like to come out with a dataframe that has one single row for the variable "year", copy the x1, x2, x3 values in subsequent columns, and rename the columns with a combination between the rowname and the x-variable. It should look like this:

year  a_x1  a_x2  a_x3  b_x1  b_x2  b_x3  c_x1  c_x2  c_x3
2000  2     6     11    0     4     2     0     3     5
2010  2     6     11    0     0     0     4     1     8
2020  10    1     7     8     4     10    22    1     16

I thought to use subsequent cbind() functions, but since I have to do it for thousands of rows and hundreds columns, I hope there is a more direct way with the reshape package (with which I am not so familiar yet)

Thanks in advance!

Was it helpful?

Solution

First, I hope that rownames is a data.frame column and not the data.frame's rownames. Otherwise you'll encounter problems due to the non-uniqueness of the values.

I think your main problem is, that your data.frame is not entirely molten:

library(reshape2)

dt <- melt( dt, id.vars=c("year", "rownames") )
head(dt)
   year rownames variable value
1  2000        a       x1     2
2  2000        b       x1     0
3  2000        c       x1     0
4  2010        a       x1     2
...

dcast( dt, year ~ rownames + variable )
  year a_x1 a_x2 a_x3 b_x1 b_x2 b_x3 c_x1 c_x2 c_x3
1 2000    2    6   11    0    4    2    0    3    5
2 2010    2    6   11    0    0    0    4    1    8
3 2020   10    1    7    8    4   10   22    1   16

EDIT:

As @spdickson points out, there is also an error in your data avoiding a simple aggregation. Combinations of year, rowname have to be unique of course. Otherwise you need an aggregation function which determines the resulting values of non-unique combinations. So we assume that row 6 in your data should read c 2010 4 1 8.

OTHER TIPS

You can try using reshape() from base R without having to melt your dataframe further:

df1 <- read.table(text="rownames    year    x1  x2  x3
a           2000    2   6   11
b           2000    0   4   2
c           2000    0   3   5
a           2010    2   6   11
b           2010    0   0   0
c           2010    4   1   8
a           2020    10  1   7
b           2020    8   4   10
c           2020    22  1   16",header=T,as.is=T)

reshape(df1,direction="wide",idvar="year",timevar="rownames")
#   year x1.a x2.a x3.a x1.b x2.b x3.b x1.c x2.c x3.c
# 1 2000    2    6   11    0    4    2    0    3    5
# 4 2010    2    6   11    0    0    0    4    1    8
# 7 2020   10    1    7    8    4   10   22    1   16
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top