Question

I am working with a large list results.list that contains 22 tables (23544 obs of 6 variables).

I want to sort each table by a specific column, FDR (false discovery rate), and select the first 100 rows. I can do this manually for one table using simple R commands:

attach(results.list$adult.OLFvsVTA)
sort(FDR)
detach(results.list$adult.OLFvsVTA)
adult.OLFvsVTA100<-adult.OLFvsVTA[1:100,]

I want to combine the top 100 rows from all 22 tables. I do not want the FDR values in the combined vector; rather, I want to combine the top 100 rows by one column named genes. I would like to automate this process using an apply function, but despite a series of attempts I cannot get it to work. I created another vector called r.names, containing the names of all 22 tables in my list, which I was planning to feed into my apply function. I have read several apply help pages without success. Any help would be appreciated.


Solution

do.call(rbind, lapply(results.list, function(dd) { dd[with(dd, order(FDR)),][(1:100), ]}))

Assuming results.list is a list of data frames, we apply (lapply is the apply variant for lists) a function to each data frame that sorts it by FDR and grabs the first 100 rows (the function(dd) {....} idiom for sorting by a column is borrowed from another Stack Overflow post). The result is a list of data frames. We then call do.call, which takes a function and a list and passes the elements of the list as the arguments to that function. In this case the function is rbind, which takes the 100-row tables, however many there are, and stacks them into one big table. Let me know if you want further explanation.
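To make the steps easier to follow, here is a minimal sketch on toy data. It assumes each table has columns named genes and FDR, as in the question. The helper make_table and the table names tableB and tableC are made up for illustration, and the toy list holds 3 tables of 250 rows rather than 22 tables of 23544 rows. The last step, pulling out only the genes column, is one possible reading of what the question asks for.

```r
# Toy stand-in for results.list (hypothetical data, not the asker's).
set.seed(1)
make_table <- function(n) {
  data.frame(genes = paste0("gene", seq_len(n)),
             FDR = runif(n),
             stringsAsFactors = FALSE)
}
results.list <- list(adult.OLFvsVTA = make_table(250),
                     tableB = make_table(250),
                     tableC = make_table(250))

# Step 1: sort each table by FDR (ascending) and keep its first 100 rows.
# The result is a list of data frames, one per input table.
top100 <- lapply(results.list, function(dd) head(dd[order(dd$FDR), ], 100))

# Step 2: stack the per-table results into one data frame.
# do.call(rbind, top100) is equivalent to rbind(top100[[1]], top100[[2]], ...).
combined <- do.call(rbind, top100)

# Optional: keep only the gene names, as a single character vector.
top.genes <- unlist(lapply(top100, function(dd) dd$genes), use.names = FALSE)
```

Using head(..., 100) instead of [1:100, ] is a small deviation from the one-liner above: it behaves the same for tables with at least 100 rows, but does not pad with NA rows if a table happens to be shorter.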

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow