Question

I have a dataset that looks like this:

CATA 1 10101
CATA 2 11101
CATA 3 10011
CATB 1 10100
CATB 2 11100
CATB 3 10011

etc.

and I want to combine these different rows into a single, long row like this:

CATA 101011110110011
CATB 101001110010011

I've tried doing this with melt() and then dcast(), but it doesn't seem to work. Does anyone have some simple pieces of code to do this?


Solution

Look at the paste command and specifically the collapse argument. It's not clear what should happen if/when you have different values for the first column, so I won't venture to guess. Update your question if you get stuck.

    dat <- data.frame(V1 = "CATA", V2 = 1:3, V3 = c(10101, 11101, 10011))
    paste(dat$V3, collapse = "")
    [1] "101011110110011"

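Note that if any value can start with 0, you should store that column as character from the start (for example, when reading the file). Once a value has been parsed as a number, its leading zeros are gone and converting it back to character will not restore them.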
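To illustrate the leading-zero caveat, here is a minimal sketch. The values are made up for the example (they are not from the question's data), and the second code is assumed to be meant as "00111".

```r
# Hypothetical values: the second code is intended to be "00111".
as_numeric   <- c(10101, 111)          # "00111" read as a number becomes 111
as_character <- c("10101", "00111")    # stored as text, zeros preserved

num_result <- paste(as_numeric, collapse = "")    # zeros already lost
chr_result <- paste(as_character, collapse = "")  # zeros kept

print(num_result)
print(chr_result)
```

The numeric version silently produces a shorter string, which is why the column type matters before paste() is ever called.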

EDIT: to address multiple values for the first column

Use plyr's ddply function which expects a data.frame as an input and a grouping variable(s). We then use the same paste() trick as before along with summarize().

    library(plyr)
    dat <- data.frame(V1 = sample(c("CATA", "CATB"), 10, TRUE)
                    , V2 = 1:10
                    , V3 = sample(0:100, 10, TRUE)
                    )

    ddply(dat, "V1", summarize, newCol = paste(V3, collapse = ""))

      V1         newCol
    1 CATA          16110
    2 CATB 19308974715042
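If you would rather not depend on plyr, the same grouping idea can be sketched in base R with tapply(). This is an alternative technique, not the ddply approach above. It rebuilds the question's six rows by hand, and stores V3 as character on the assumption that the codes are labels rather than numbers.

```r
# The question's data, with V3 kept as character (an assumption made here
# so that any leading zeros would survive).
dat <- data.frame(
  V1 = rep(c("CATA", "CATB"), each = 3),
  V2 = rep(1:3, times = 2),
  V3 = c("10101", "11101", "10011", "10100", "11100", "10011"),
  stringsAsFactors = FALSE
)

# Apply paste(..., collapse = "") to V3 within each level of V1.
combined <- tapply(dat$V3, dat$V1, paste, collapse = "")

# Reshape the named result into a two-column data.frame.
result <- data.frame(V1 = names(combined),
                     newCol = as.vector(combined),
                     stringsAsFactors = FALSE)
print(result)
```

Rows are concatenated in the order they appear within each group, so sort by V2 first if the input order is not guaranteed.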

Other answers

Assuming all possible elements in V1 of dat are known,

    elements <- c("CATA", "CATB", "CATC")
    final_list <- character(0)
    for (el in elements) {
      # Exact match on V1; grep() would also match substrings such as "CATAB"
      k <- which(dat$V1 == el)
      if (length(k) == 0) next  # skip elements that do not occur in dat
      m <- paste(el, paste(dat[k, 3], collapse = ""))
      final_list <- c(final_list, m)
    }

@Chase's answer is much better!

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow