Question

The goal here is to concatenate the first 6 columns from several files into a data frame in R. I am puzzled by why method (1) works but method (2) does not work. To me, both methods should be equivalent.

Answers or debugging hints are both very welcome.

Method (1)

ret <- sapply(fn, function(x) { (read.table(x, header = FALSE)) })
ret <- lapply(ret, function(x) {x[, 1:6]})

Method (1) gives the correct output:

> head(ret)
 $`../pool.11421.poolFile`
    V1    V2     V3    V4  V5              V6
 1   1 M5132 ACAGTG 11421 351 1,2,3,4,5,6,7,8
 2   2 M6764 ACTGAT 11421 351 1,2,3,4,5,6,7,8
 3   3 M5597 AGTCAA 11421 351 1,2,3,4,5,6,7,8
 4   4 M5636 AGTTCC 11421 351 1,2,3,4,5,6,7,8
 5   5 M2463 ATCACG 11421 351 1,2,3,4,5,6,7,8
 6   6 M5792 ATGTCA 11421 351 1,2,3,4,5,6,7,8
 7   7 M6799 ATTCCT 11421 351 1,2,3,4,5,6,7,8

Method (2)

ret <- sapply(fn, function(x) { (read.table(x, header = FALSE))[, 1:6]})

Method (2) gives the wrong output:

> head(ret)
        ../pool.11421.poolFile ../pool.11422.poolFile ../pool.11423.poolFile
 V1 Integer,23             Integer,48             Integer,48
 V2 Character,23           Character,48           Character,48
 V3 Character,23           Character,48           Character,48
 V4 Integer,23             Integer,48             Integer,48
 V5 Integer,23             Integer,48             Integer,48
 V6 Character,23           Character,48           Character,48

Solution

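Your second method returns a matrix rather than a list of data frames. sapply has a simplify argument, which defaults to TRUE. When it is TRUE, simplify2array is called and R tries to convert the results to a vector or an array. A data frame is a list of columns, so after the [, 1:6] subset every result has length 6, and the results are bound into a 6-row matrix whose cells each hold one column. That is what the Integer,23 and Character,48 entries show. In method (1) the unsubset tables presumably had differing numbers of columns, so nothing could be simplified and sapply returned a plain list. See ?sapply for details.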
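The effect can be reproduced without any files. Here is a minimal sketch using two made-up data frames; the names `a`, `b`, `dfs`, `whole`, `simplified` and `kept` are illustrative and not from the question.

```r
# Two small data frames standing in for the files (hypothetical data).
# They deliberately have different numbers of columns: 3 and 4.
a <- data.frame(V1 = 1:2, V2 = c("x", "y"), V3 = c(TRUE, FALSE))
b <- data.frame(V1 = 1:3, V2 = c("p", "q", "r"),
                V3 = c(TRUE, TRUE, FALSE), V4 = 4:6)
dfs <- list(a = a, b = b)

# Unsubset: the results have lengths 3 and 4, so sapply() cannot simplify
# and returns a plain list of data frames.
whole <- sapply(dfs, function(x) x)
is.matrix(whole)        # FALSE

# Subset to 2 columns: every result now has length 2, so sapply() binds
# them into a 2 x 2 matrix whose cells are the individual columns.
simplified <- sapply(dfs, function(x) x[, 1:2])
is.matrix(simplified)   # TRUE

# With simplify = FALSE the data frames are returned untouched in a list.
kept <- sapply(dfs, function(x) x[, 1:2], simplify = FALSE)
is.data.frame(kept$a)   # TRUE
```

Printing `simplified` should give the same kind of summary cells as in the question, since each cell is a whole column rather than a single value.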

Try instead:

ret <- sapply(fn, function(x) read.table(x, header = FALSE)[, 1:6], simplify = FALSE)
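Since the stated goal is a single data frame, the list of per-file pieces still has to be stacked. The following is a minimal sketch, assuming the first six columns have compatible types across files. The temporary files and the names `files`, `parts` and `combined` are made up for illustration, and `lapply` is swapped in for `sapply` because it always returns a list.

```r
# Create two throwaway files with 7 and 8 whitespace-separated columns
# (hypothetical data standing in for the pool files).
files <- c(tempfile(fileext = ".txt"), tempfile(fileext = ".txt"))
writeLines(c("1 M1 ACAGTG 11421 351 1,2 extra",
             "2 M2 ACTGAT 11421 351 1,2 extra"), files[1])
writeLines(c("1 M3 AGTCAA 11422 351 3,4 extra more",
             "2 M4 AGTTCC 11422 351 3,4 extra more",
             "3 M5 ATCACG 11422 351 3,4 extra more"), files[2])

# lapply() always returns a list, so no simplification can occur.
parts <- lapply(files, function(f) read.table(f, header = FALSE)[, 1:6])

# Stack the per-file pieces row-wise into one data frame.
combined <- do.call(rbind, parts)
dim(combined)   # 5 6

# Clean up the temporary files.
unlink(files)
```

If the source file of each row matters, one option is to add a column holding the file name to each piece before the `rbind` step.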
Licensed under: CC-BY-SA with attribution