Question

Here are 20 lines of the data I'm working on:

Zv9_NA110   6176    7276    5'to3'IntronExon    0   +   1100    0 0 0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
Zv9_NA110   10126   11226   5'to3'IntronExon    0   +   1100    0 0 0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   7   9   9   15  18  18  18  18  18  18  18  18  18  18  18  18  18  18  18  18  18  18  13  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
Zv9_NA110   11219   12319   5'to3'ExonIntron    0   +   1100    0 0 0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
Zv9_NA110   14887   15987   5'to3'IntronExon    0   +   1100    1 0 0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   7   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9
Zv9_NA110   18923   20023   5'to3'IntronExon    0   +   1100    0 0 0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
Zv9_NA110   21069   22169   5'to3'ExonIntron    0   +   1100    0 135   115 65  54  45  36  27  16  9   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
Zv9_NA113   1615    2715    5'to3'IntronExon    0   -   1100    0 0 0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
Zv9_NA113   2335    3435    5'to3'ExonIntron    0   -   1100    0 0 0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   7   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9   3   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
Zv9_NA113   5398    6498    5'to3'IntronExon    0   -   1100    0 0 0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
Zv9_NA113   7173    8273    5'to3'ExonIntron    0   -   1100    0 0 0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
Zv9_NA118   11674   12774   5'to3'IntronExon    0   +   1100    0 0 0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
Zv9_NA118   12711   13811   5'to3'ExonIntron    0   +   1100    0 0 0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
Zv9_NA123   38151   39251   5'to3'ExonIntron    0   -   1100    0 1061  958 844 796 695 600 464 346 265 210 150 133 94  81  72  46  18  4   0   0   0   0   0   0   0   0   0   7   9   9   9   11  21  35  43  58  91  108 180 268 406 547 712 833 882 960 1094    1172    1245    1331    1432    1510    1604    1711    1810    1830    1837    1823    1781    1690    1638    1560    1489    1257    854 731 631 589 551 497 439 404 369 301 231 168 123 76  58  50  42  28  20  11  9   9   24  27  27  27  27  27  25  18  18  18  18  18  18  18  18  18  18  18  18  18  14  5   0   0
Zv9_NA124   2578    3678    5'to3'ExonIntron    0   +   1100    0 423   407 401 377 357 345 324 304 249 185 111 54  30  12  0   0   0   0   0   0   0   0   0   0   0   0   0   1   9   9   9   9   14  18  25  27  27  27  27  27  27  27  27  27  27  27  26  18  18  18  18  18  18  16  4   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   8   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9   2   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
Zv9_NA129   4939    6039    5'to3'IntronExon    0   +   1100    226 0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   4   9   9   9   9   9   9   9   9   9   9   9   9   14  34  45  60  97  128 175 293 395 524 621 764 894 1036    1164    1334    1469    1639    1801    1885    1983
Zv9_NA132   12589   13689   5'to3'ExonIntron    0   -   1100    0 0 0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
Zv9_NA132   13634   14734   5'to3'IntronExon    0   -   1100    0 0 0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
Zv9_NA132   14481   15581   5'to3'ExonIntron    0   -   1100    0 0 0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   5   9   9   9   9   9   9   9   9   9   5   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
Zv9_NA132   19534   20634   5'to3'IntronExon    0   -   1100    0 0 0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
Zv9_NA132   28708   29808   5'to3'ExonIntron    0   -   1100    0 0 0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   5   9   15  18  24  27  42  46  73  112 142 157 162 162 162 162 162 162 162 162 159 153 153 153 153 153 150 144 132 112 76  52  30  25  1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0

I get this into R as follows:

> dat <- read.table("dat.dat",header=F)

I need to get the averages of columns 9 through 118, grouped by column 4.

This works:

> all_means <- aggregate(cbind(V9,V10,V11)~V4,data=dat,FUN=mean)

                V4 V9  V10 V11
1 5'to3'ExonIntron  0 0.00   0
2 5'to3'IntronExon  0 0.75   1

But there's no way I'm typing this out to V118.

I've tried this:

> aggregate(cbind(9:118)~V4,data=blah,FUN=mean)

But I get this error:

Error in model.frame.default(formula = cbind(9:118) ~ V4, data = blah) : 
  variable lengths differ (found for 'V4')

Is there something dumb I'm missing?


Solution 3

You can use the data frame method of aggregate:

## S3 method for class 'data.frame'
aggregate(x, by, FUN, ..., simplify = TRUE)

With your data read into a data frame DF (here txt is assumed to hold the sample rows as text):

DF <- read.table(text = txt, header = FALSE, stringsAsFactors = FALSE)
result <- aggregate(DF[, 9:118], by = list(DF[, 4]), FUN = mean)

# pander is used only to print the result table nicely; it is not needed for the aggregation
library(pander)
pandoc.table(result)
## 
## ----------------------------------------------------
##     Group.1       V9    V10   V11   V12   V13   V14 
## ---------------- ----- ----- ----- ----- ----- -----
## 5'to3'ExonIntron 161.9  148   131  122.7 109.7 98.1 
## 
## 5'to3'IntronExon  0.0    0     0    0.0   0.0   0.0 
## ----------------------------------------------------
## 
## Table: Table continues below
## 
##  
## -----------------------------------------------
##  V15   V16   V17   V18   V19   V20   V21   V22 
## ----- ----- ----- ----- ----- ----- ----- -----
## 81.5  66.6  52.3  39.5  26.1  18.7  12.4   9.3 
## 
##  0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 
## -----------------------------------------------
## 
## Table: Table continues below
## 
##  
## -----------------------------------------------
##  V23   V24   V25   V26   V27   V28   V29   V30 
## ----- ----- ----- ----- ----- ----- ----- -----
##  7.2   4.6   1.8   0.4    0     0     0    0.5 
## 
##  0.0   0.0   0.0   0.0    0     0     0    0.0 
## -----------------------------------------------
## 
## Table: Table continues below
## 
##  
## -----------------------------------------------
##  V31   V32   V33   V34   V35   V36   V37   V38 
## ----- ----- ----- ----- ----- ----- ----- -----
##  0.9   1.5   1.8   2.4   2.7    5    6.4   9.1 
## 
##  0.0   0.0   0.0   0.0   0.0    0    0.0   0.0 
## -----------------------------------------------
## 
## Table: Table continues below
## 
##  
## -----------------------------------------------
##  V39   V40   V41   V42   V43   V44   V45   V46 
## ----- ----- ----- ----- ----- ----- ----- -----
##  13   16.2  19.2  21.5   23   24.7   28   29.7 
## 
##   0    0.0   0.0   0.0    0    0.0    0    0.0 
## -----------------------------------------------
## 
## Table: Table continues below
## 
##  
## -----------------------------------------------
##  V47   V48   V49   V50   V51   V52   V53   V54 
## ----- ----- ----- ----- ----- ----- ----- -----
## 36.9  45.7  59.5  73.3  89.2  101.3 106.2  114 
## 
##  0.0   0.0   0.0   0.0   0.0   0.0   0.0    0  
## -----------------------------------------------
## 
## Table: Table continues below
## 
##  
## -----------------------------------------------
##  V55   V56   V57   V58   V59   V60   V61   V62 
## ----- ----- ----- ----- ----- ----- ----- -----
## 127.3  134  140.7 148.1 156.2 160.4 167.4 175.7
## 
##  0.0    0    0.0   0.0   0.0   0.0   0.0   0.0 
## -----------------------------------------------
## 
## Table: Table continues below
## 
##  
## -----------------------------------------------
##  V63   V64   V65   V66   V67   V68   V69   V70 
## ----- ----- ----- ----- ----- ----- ----- -----
## 183.9 183.1 183.7 182.3 178.1  169  163.8 156.7
## 
##  0.0   0.0   0.0   0.0   0.0    0    0.0   0.0 
## -----------------------------------------------
## 
## Table: Table continues below
## 
##  
## -----------------------------------------------
##  V71   V72   V73   V74   V75   V76   V77   V78 
## ----- ----- ----- ----- ----- ----- ----- -----
## 149.8 126.6 86.3  74.0  64.0  59.8  56.0  50.6 
## 
##  0.7   0.9   0.9   1.5   1.8   1.8   1.8   1.8 
## -----------------------------------------------
## 
## Table: Table continues below
## 
##  
## -----------------------------------------------
##  V79   V80   V81   V82   V83   V84   V85   V86 
## ----- ----- ----- ----- ----- ----- ----- -----
## 45.6  42.2  38.7  31.9  24.9  18.6  14.1   9.4 
## 
##  1.8   1.8   1.8   1.8   1.8   1.8   2.2   2.7 
## -----------------------------------------------
## 
## Table: Table continues below
## 
##  
## -----------------------------------------------
##  V87   V88   V89   V90   V91   V92   V93   V94 
## ----- ----- ----- ----- ----- ----- ----- -----
##  7.6   6.2   5.1   3.7   2.9   2.5   2.7   2.7 
## 
##  2.7   2.7   2.7   2.7   2.7   2.7   2.2   0.9 
## -----------------------------------------------
## 
## Table: Table continues below
## 
##  
## --------------------------------------------------
##  V95   V96   V97   V98   V99   V100   V101   V102 
## ----- ----- ----- ----- ----- ------ ------ ------
##  4.2   4.5   4.5   4.5   4.5   4.5    4.3    2.5  
## 
##  0.9   0.9   0.9   1.4   4.1   5.4    6.9    10.6 
## --------------------------------------------------
## 
## Table: Table continues below
## 
##  
## -------------------------------------------------------
##  V103   V104   V105   V106   V107   V108   V109   V110 
## ------ ------ ------ ------ ------ ------ ------ ------
##  1.8    1.8    1.8    1.8    1.8    1.8    1.8    1.8  
## 
##  13.7   18.4   30.2   40.4   53.3   63.0   77.3   90.3 
## -------------------------------------------------------
## 
## Table: Table continues below
## 
##  
## -------------------------------------------------------
##  V111   V112   V113   V114   V115   V116   V117   V118 
## ------ ------ ------ ------ ------ ------ ------ ------
##  1.8    1.8    1.8    1.8    1.4    0.5    0.0    0.0  
## 
## 104.5  117.3  134.3  147.8  164.8  181.0  189.4  199.2 
## -------------------------------------------------------
## 
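
As a side note, the result above labels the grouping column Group.1. Here is a minimal sketch of one way to give it a proper name. It uses a made-up toy data frame (toy and its values are purely illustrative, not the real data); passing a named list to by carries the name through to the result.

```r
# Toy data standing in for the real file (values are invented for illustration)
toy <- data.frame(
  V4  = c("ExonIntron", "IntronExon", "ExonIntron", "IntronExon"),
  V9  = c(1, 2, 3, 4),
  V10 = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

# Naming the element of `by` names the grouping column in the output
res <- aggregate(toy[, c("V9", "V10")], by = list(V4 = toy$V4), FUN = mean)
print(res)
```

With the real data, the same idea would be by = list(V4 = DF[, 4]).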

Other tips

You have a number of options.

Create a formula using . and pass a subset of the data:

aggregate( . ~ V4, data = dat[,c(4,9:118)], FUN = mean)

You could also create the vector of column names using paste0:

nn <- paste0('V', 9:118)

and refer to the columns by name:

aggregate( . ~ V4, data = dat[,c('V4',nn)], FUN = mean)

There isn't much point in using cbind here, given that the formula approach works, but for the sake of example:

aggregate( do.call(cbind,lapply(nn, as.name)) ~ V4, data = dat, FUN = mean)

But this is messy: it doesn't name the columns nicely, and it is hard to follow.
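
For what it's worth, the original attempt fails because cbind(9:118) never looks at the data: it is just a one-column matrix holding the integers 9 to 118, so the formula ends up pairing 110 values with however many rows V4 has. A minimal sketch on a made-up toy data frame (toy and its values are illustrative only):

```r
# Toy data standing in for the real file (values are invented for illustration)
toy <- data.frame(
  V4  = c("a", "b", "a", "b"),
  V9  = c(1, 2, 3, 4),
  V10 = c(10, 20, 30, 40)
)

# cbind(9:118) is a 110 x 1 matrix of integers, unrelated to the columns of toy
print(dim(cbind(9:118)))

# The formula therefore pairs 110 values with the 4 rows of V4 and errors out
msg <- tryCatch(
  aggregate(cbind(9:118) ~ V4, data = toy, FUN = mean),
  error = function(e) conditionMessage(e)
)
print(msg)

# Subsetting the columns and using `.` on the left-hand side works
res <- aggregate(. ~ V4, data = toy[, c("V4", "V9", "V10")], FUN = mean)
print(res)
```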

If speed is an issue in general (it shouldn't be for this operation) and you want to use the data.table package, this is done as follows:

Safer solution

Thanks to mnel's comment, I would use this:

library(data.table)
dat <- as.data.table(dat)
dat[,lapply(.SD,mean),by="V4",.SDcols=paste0("V", 9:118)]

Old solution

dat[,lapply(.SD,mean),by="V4",.SDcols=9:118]
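
To see the name-based version in isolation, here is a minimal sketch on a made-up toy table (toy, cols and the values are illustrative only, and it assumes the data.table package is installed). Note that by keeps the groups in order of first appearance, whereas aggregate sorts them.

```r
library(data.table)

# Toy data standing in for the real file (values are invented for illustration)
toy <- data.table(
  V4  = c("b", "a", "b", "a"),
  V9  = c(1, 2, 3, 4),
  V10 = c(10, 20, 30, 40)
)

# Select the columns to average by name rather than by position
cols <- paste0("V", 9:10)
res <- toy[, lapply(.SD, mean), by = "V4", .SDcols = cols]
print(res)
```

If sorted groups are wanted, keyby = "V4" can be used in place of by = "V4".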
Licensed under: CC-BY-SA with attribution