Summarizing multiple dummies in R

https://stackoverflow.com/questions/21942676

14-10-2022

Question

Say I've asked 10 people which fruits they like. More than one answer is possible. The results are entered into R like so:

set.seed(234078)
df <- data.frame(q1.banana = sample(0:1, 10, replace = TRUE),
                 q1.apple  = sample(0:1, 10, replace = TRUE),
                 q1.melon  = sample(0:1, 10, replace = TRUE))

This gives:

> df
   q1.banana q1.apple q1.melon
1          0        0        1
2          0        1        1
3          1        1        0
4          1        0        0
5          0        1        1
6          0        0        0
7          1        0        0
8          0        0        0
9          0        1        1
10         0        0        1

How can I summarize the information in a table like the following?

q1.*    Freq
banana     3
apple      4
melon      5

After searching, I've found a couple of ideas, such as using interaction(q1.banana, q1.apple, q1.melon), but that gives a different kind of output. I would also really appreciate an answer that uses a wildcard, because my real data is expected to have a few dozen dummies and I don't want to type them all out.

Solution

One option is to reshape the data to long format with reshape2 and then sum the values per variable with plyr.

set.seed(234078)
df <- data.frame(q1.banana = sample(0:1, 10, replace = TRUE),
                 q1.apple  = sample(0:1, 10, replace = TRUE),
                 q1.melon  = sample(0:1, 10, replace = TRUE))

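library(reshape2)
library(plyr)

# Melt to long format: one row per (variable, value) pair.
# Naming all columns as measure variables avoids the "No id variables" message.
df1 <- melt(df, measure.vars = names(df))

# Sum the 0/1 values within each variable to get the counts
ddply(df1, .(variable), summarize, Freq = sum(value))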
   variable Freq
1 q1.banana    3
2  q1.apple    4
3  q1.melon    5

Another option is colSums, which sums each dummy column directly:

> colSums(df)
q1.banana  q1.apple  q1.melon 
        3         4         5
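
Since the question asks for a wildcard, here is a minimal sketch of how colSums could be combined with a pattern match on the column names. It assumes every dummy for the question shares the "q1." prefix; the names q1_cols, counts, and q1_summary are made up for illustration. The data frame is typed out from the printed table above rather than regenerated, because sample() draws for a given seed can differ between R versions.

```r
# Data typed out from the table printed in the question
df <- data.frame(
  q1.banana = c(0, 0, 1, 1, 0, 0, 1, 0, 0, 0),
  q1.apple  = c(0, 1, 1, 0, 1, 0, 0, 0, 1, 0),
  q1.melon  = c(1, 1, 0, 0, 1, 0, 0, 0, 1, 1)
)

# "Wildcard": pick every column whose name starts with "q1."
q1_cols <- grep("^q1\\.", names(df), value = TRUE)

# Column sums count the 1s in each selected dummy;
# drop = FALSE keeps a data frame even if only one column matches
counts <- colSums(df[, q1_cols, drop = FALSE])

# Strip the prefix and arrange the counts as a two-column table
q1_summary <- data.frame(
  fruit = sub("^q1\\.", "", names(counts)),
  Freq  = unname(counts)
)
print(q1_summary)
```

With a few dozen dummies only the pattern needs to change, for example "^q2\\." for a second question, so no column has to be listed by hand.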

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow