Question

I am trying to find the summary statistics for different factor levels.

data.frame(apply(final_data[Company=="BPO",c(66:84)],2,summary))  

Now I have different values for company - i can repeat the statement for different values. I know it can be automated - using apply family (ddply,tapply,sapply), but I am not getting it right.

Was it helpful?

Solution

You could split on company and then use your function:

spl = split(final_data, final_data$Company)
list.of.summaries = lapply(spl, function(x) data.frame(apply(x[,66:84], 2, summary)))

OTHER TIPS

You may want to think about using the by or tapply functions. This will allow you to skip the explicit call to split. Here's an example, since you haven't provided data.

# some example data
set.seed(1)
df <- data.frame(x = as.factor(rep(1:5, each=10)), y1=rnorm(50), y2=rnorm(50))

# with `tapply`
a <- do.call(rbind, sapply(df[,2:3], function(i) tapply(i, df$x, summary)))
# with `by`
a <- do.call(rbind, sapply(df[,2:3], function(i) by(i, df$x, summary)))

Here's the output:

> a
         Min.  1st Qu.    Median    Mean 3rd Qu.   Max.
 [1,] -0.8356 -0.54620  0.256600  0.1322  0.5537 1.5950
 [2,] -2.2150 -0.03775  0.491900  0.2488  0.9132 1.5120
 [3,] -1.9890 -0.39760  0.009218 -0.1337  0.5694 0.9190
 [4,] -1.3770 -0.32140 -0.056560  0.1207  0.6693 1.3590
 [5,] -0.7075 -0.23120  0.126100  0.1341  0.6619 0.8811
 [6,] -1.1290 -0.55080  0.103000  0.1435  0.5268 1.9800
 [7,] -1.8050 -0.02243  0.171000  0.4512  1.2720 2.4020
 [8,] -1.2540 -0.67980 -0.221100 -0.2477  0.2372 0.6107
 [9,] -1.5240 -0.26190  0.300000  0.1274  0.5380 1.1780
[10,] -1.2770 -0.56560  0.042540  0.1123  1.0450 1.5870

You might also want to combine this with the variable and level names to know what's going on:

b <- expand.grid(level=levels(df$x),var=names(df[,2:3]))
cbind(a,b)

Here's the output of that:

> cbind(b,a)
   level var    Min.  1st Qu.    Median    Mean 3rd Qu.   Max.
1      1  y1 -0.8356 -0.54620  0.256600  0.1322  0.5537 1.5950
2      2  y1 -2.2150 -0.03775  0.491900  0.2488  0.9132 1.5120
3      3  y1 -1.9890 -0.39760  0.009218 -0.1337  0.5694 0.9190
4      4  y1 -1.3770 -0.32140 -0.056560  0.1207  0.6693 1.3590
5      5  y1 -0.7075 -0.23120  0.126100  0.1341  0.6619 0.8811
6      1  y2 -1.1290 -0.55080  0.103000  0.1435  0.5268 1.9800
7      2  y2 -1.8050 -0.02243  0.171000  0.4512  1.2720 2.4020
8      3  y2 -1.2540 -0.67980 -0.221100 -0.2477  0.2372 0.6107
9      4  y2 -1.5240 -0.26190  0.300000  0.1274  0.5380 1.1780
10     5  y2 -1.2770 -0.56560  0.042540  0.1123  1.0450 1.5870
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top