Domanda

I have a dataset that looks like this:

CATA 1 10101
CATA 2 11101
CATA 3 10011
CATB 1 10100
CATB 2 11100
CATB 3 10011

etc.

and I want to combine these different rows into a single, long row like this:

CATA 101011110110011
CATB 101001110010011

I've tried doing this with melt() and then dcast(), but it doesn't seem to work. Does anyone have some simple pieces of code to do this?

È stato utile?

Soluzione

Look at the paste command and specifically the collapse argument. It's not clear what should happen if/when you have different values for the first column, so I won't venture to guess. Update your question if you get stuck.

dat <- data.frame(V1 = "CATA", V2 = 1:3, V3 = c(10101, 11101, 10011))
paste(dat$V3, collapse= "")
[1] "101011110110011"

Note that you may want to convert the data to character first to prevent leading zeros from being trimmed.

EDIT: to address multiple values for the first column

Use plyr's ddply function which expects a data.frame as an input and a grouping variable(s). We then use the same paste() trick as before along with summarize().

    library(plyr)
    dat <- data.frame(V1 = sample(c("CATA", "CATB"), 10, TRUE)
                    , V2 = 1:10
                    , V3 = sample(0:100, 10, TRUE)
                    )

    ddply(dat, "V1", summarize, newCol = paste(V3, collapse = ""))

    V1         newCol
1 CATA          16110
2 CATB 19308974715042

Altri suggerimenti

Assuming all possible elements in V1 of dat are known,

elements <- c("CATA","CATB","CATC")
i <- 1
final_list <- c()
while (i <= length(elements)){
k <- grep(elements[i], dat$V1, ignore.case = FALSE, fixed = TRUE, value = FALSE)
m <- paste(dat$V1[k[1]], " ", paste(dat[k,3], collapse=""), sep="")
final_list <- c(final_list,m)
i=i+1
}

@Chase answer is much better !

Autorizzato sotto: CC-BY-SA insieme a attribuzione
Non affiliato a StackOverflow
scroll top