Question

I have a dataframe, dfregion, which looks as follows:

dput(dfregion)
structure(list(region = structure(c(1L, 2L, 3L, 3L, 1L), .Label = c("East", 
"New England", "Southeast"), class = "factor"), words = structure(c(4L, 
 2L, 1L, 3L, 5L), .Label = c("buildings, tallahassee", "center, mass, visitors", 
"god, instruct, estimated", "seeks, metropolis, convey", "teaching, academic, metropolis"
), class = "factor")), .Names = c("region", "words"), row.names = c(NA, 
-5L), class = "data.frame")

       region                          words
1        East      seeks, metropolis, convey
2 New England         center, mass, visitors
3   Southeast         buildings, tallahassee
4   Southeast       god, instruct, estimated
5        East teaching, academic, metropolis

I am trying to "melt" or "reshape" this data frame so that there is one row per region, with the words for that region pasted together.

The following code is what I have tried:

dfregionnew<-dcast(dfregion, region ~ words,fun.aggregate= function(x) paste(x) )

dfregionnew<-dcast(dfregion, region ~ words, paste)

dfregionnew <- melt(dfregion,id=c("region"),variable_name="words")

Finally, I did this, although I am not sure it is the best way to accomplish what I want:

dfregionnew<-ddply(dfregion, .(region), mutate, index= paste0('words', 1:length(region)))
dfregionnew<-dcast(dfregionnew, region~ index, value.var ='words')

The result is a data frame reshaped in the right way, but each "words" entry ends up in a separate column. I then tried to paste these columns together and got various errors:

dfregionnew$new<-lapply(dfregionnew[,2:ncol(dfregionnew)], paste, sep=",")
dfregionnew$new<-ldply(apply(dfregionnew, 1, function(x) data.frame(x = paste(x[2:ncol(dfregionnew)], sep=",", collapse=NULL))))
dfregionnew$new <- apply( dfregionnew[ , 2:ncol(dfregionnew) ] , 1 , paste , sep = "," )

I was able to solve that problem with something like the following:

dfregionnew$new <- apply( dfregionnew[ , 2:5] , 1 , paste , collapse = "," )

My real question is whether this can be done in one step with melt or dcast, without having to paste the columns together after they are output. I am keen to improve my R skills and would welcome faster or better practices. Thanks in advance!


Solution

It sounds like you just want to paste the values in the "words" column together, in which case you should be able to just use aggregate, as follows:

aggregate(words ~ region, dfregion, paste)
#        region                                                     words
# 1        East seeks, metropolis, convey, teaching, academic, metropolis
# 2 New England                                    center, mass, visitors
# 3   Southeast          buildings, tallahassee, god, instruct, estimated

No melting or dcasting required....
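
One caveat, which only matters depending on how the result is used afterwards: with plain paste and no collapse, aggregate hands back each group's values as a vector, so when the groups differ in size the words column should come back as a list column that merely prints as comma-separated text. A minimal sketch, assuming you want a single character string per region, is to pass collapse through to paste. The names dfregion_demo and res are made up for this illustration, and the data is rebuilt by hand from the dput output in the question.

```r
# Rebuild the example data from the question (factors, as in the dput output).
dfregion_demo <- data.frame(
  region = c("East", "New England", "Southeast", "Southeast", "East"),
  words  = c("seeks, metropolis, convey", "center, mass, visitors",
             "buildings, tallahassee", "god, instruct, estimated",
             "teaching, academic, metropolis"),
  stringsAsFactors = TRUE
)

# Extra arguments given to aggregate() are forwarded to FUN,
# so collapse = ", " reaches paste() and each group becomes one string.
res <- aggregate(words ~ region, data = dfregion_demo,
                 FUN = paste, collapse = ", ")

str(res$words)  # a plain character vector, one element per region
```

With collapse in place, res$words can be used directly in string functions or written out with write.csv without any further unlisting.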


If you do want to use dcast from "reshape2", you can try something like this:

dcast(dfregion, region ~ "WORDS", value.var="words", 
      fun.aggregate=function(x) paste(x, collapse = ", "))
#        region                                                     WORDS
# 1        East seeks, metropolis, convey, teaching, academic, metropolis
# 2 New England                                    center, mass, visitors
# 3   Southeast          buildings, tallahassee, god, instruct, estimated
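
The dcast call relies on collapse rather than sep, which may also explain why the earlier attempts with sep = "," did not join anything: sep goes between the separate arguments passed to paste, while collapse joins the elements of a single vector into one string. A small illustration, using a made-up vector x:

```r
x <- c("words1", "words2", "words3")

# sep only matters when several arguments are pasted element-wise;
# with a single vector nothing is joined and the length stays 3.
with_sep <- paste(x, sep = ",")

# collapse joins the elements of the result into one string.
with_collapse <- paste(x, collapse = ",")

length(with_sep)  # 3
with_collapse     # "words1,words2,words3"
```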
Licensed under: CC-BY-SA with attribution