Question

Program being used

I am using the statistical program R to analyze some data and have what is likely a fairly simple question.

Background to the problem

I have a variable full of numeric values called study_data$LN_reviewed. I also have a variable called study_data$Gender that has the sex of each subject in the study. I would like to compute some simple summary statistics stratified by gender. This is easy to do using the code shown below:

> by(study_data$LN_reviewed, study_data$Gender, summary)

study_data$Gender: FEMALE
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   2.00   13.00   19.00   27.77   35.50  125.00
------------------------------------------------
study_data$Gender: MALE
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   1.00   12.00   19.00   26.98   34.00  122.00

My question

How can I get R to display this information in an easier-to-digest format? Specifically, I would like a table with two rows, labeled "FEMALE" and "MALE", and six columns, labeled "Min.", "1st Qu.", "Median", "Mean", "3rd Qu.", and "Max.", as shown below.

       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
FEMALE 2.00   13.00   19.00   27.77   35.50  125.00
MALE   1.00   12.00   19.00   26.98   34.00  122.00

I have spent some time trying to solve it on my own and have been unable to find the solution.


Solution

do.call(rbind , by(study_data$LN_reviewed, study_data$Gender, summary))
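Why this works, as a minimal sketch: by() returns one summary() result per gender in a list-like object, and do.call() hands all of them to a single rbind() call, so each group becomes one row. The toy data frame below is a hypothetical stand-in for study_data, with invented values.

```r
# Hypothetical stand-in for study_data; the values are invented for illustration.
toy <- data.frame(
  Gender      = c("FEMALE", "FEMALE", "FEMALE", "MALE", "MALE", "MALE"),
  LN_reviewed = c(2, 19, 125, 1, 19, 122)
)

# by() yields one summary() result per gender, held in a list-like object.
per_group <- by(toy$LN_reviewed, toy$Gender, summary)

# do.call() passes every group to one rbind() call; the group names
# become row names and the statistic names become column names.
result <- do.call(rbind, per_group)
print(result)
```

The result is an ordinary numeric matrix, so a single value can be picked out with, for example, result["FEMALE", "Median"].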

Other suggestions

This is what plyr is for (or dplyr, for large data frames): the split-apply-combine paradigm.

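library(plyr)

# Return the six summary statistics for one group as a one-row data frame.
# check.names = FALSE keeps labels such as "1st Qu." from being renamed.
summary_by_gender <- function(d) {
    ss <- summary(d$LN_reviewed)
    data.frame(t(unclass(ss)), check.names = FALSE)
}

ddply(study_data, .(Gender), summary_by_gender)

(A slight hack, check.names = FALSE, is needed to prevent data.frame from renaming the summary column names. Something like that; I can't test it on your data.)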
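The split-apply-combine idea can also be sketched in base R, without plyr. This is not from the original answers, only an illustration of the three steps under the assumption that a plain matrix is an acceptable result; the toy data frame is again a hypothetical stand-in for study_data.

```r
# Hypothetical stand-in for study_data; the values are invented for illustration.
toy <- data.frame(
  Gender      = c("FEMALE", "FEMALE", "FEMALE", "MALE", "MALE", "MALE"),
  LN_reviewed = c(2, 19, 125, 1, 19, 122)
)

# Split: one numeric vector per gender.
groups <- split(toy$LN_reviewed, toy$Gender)

# Apply and combine: sapply() runs summary() on each group and binds the
# results into a matrix with one column per group; t() flips it so that
# each gender is a row and each statistic is a column.
stats <- t(sapply(groups, summary))
print(stats)
```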

Licensed under: CC-BY-SA with attribution