Transforming the two categorical variables into summary proportion data

https://stackoverflow.com/questions/21209777

29-09-2022

Question

In R, what is the most efficient way to go from:

   gender soda
1       f    y
2       f    y
3       f    n
4       m    n
5       f    y
6       m    n
7       m    n
8       f    y
9       m    y
10      m    n

to:

         y   n
m       0.2 0.8
f       0.8 0.2

I use the following command:

> tmp<-ddply(subdata,.(gender), summarise, y=length(soda[soda=="y"])/length(soda),n=length(soda[soda=="n"])/length(soda))
> rownames(tmp)<-tmp$gender
> tmp$gender<-NULL
> tmp
    y   n
f 0.8 0.2
m 0.2 0.8

But I feel there must be a more idiomatic expression that I am not aware of. Is there?

Solution

You can use table and prop.table:

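> prop.table(table(subdata), 1)

      soda
gender   n   y
     f 0.2 0.8
     m 0.8 0.2

The function table counts the occurrences of each gender/soda combination. prop.table then converts those counts to relative frequencies along the first margin (1: rows), so each gender's row sums to 1, which matches the per-gender proportions computed by the ddply call in the question.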
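
For anyone who wants to reproduce this end to end, here is a minimal sketch. It assumes subdata is a data frame holding exactly the ten rows shown in the question; the names counts, props and result are only illustrative. Because the ddply call computes proportions within each gender, the sketch normalises across rows (margin = 1). With this particular data the column margin happens to give the same numbers, but in general it would not.

```r
# Rebuild the example data from the question (assumed: exactly these 10 rows).
subdata <- data.frame(
  gender = c("f", "f", "f", "m", "f", "m", "m", "f", "m", "m"),
  soda   = c("y", "y", "n", "n", "y", "n", "n", "y", "y", "n")
)

# Count every gender/soda combination.
counts <- table(subdata)

# Divide each row by its row total, so each gender sums to 1.
props <- prop.table(counts, margin = 1)

# Optional: drop the table class, put the columns in y, n order and
# convert to a data frame with gender as row names, like the ddply result.
result <- as.data.frame(unclass(props)[, c("y", "n")])
print(result)
#     y   n
# f 0.8 0.2
# m 0.2 0.8
```

If a table object is enough, the last step can be skipped; props[, c("y", "n")] already gives the columns in the order the question asked for.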

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow