Listing files in different directories which have the same file naming format [R]

StackOverflow https://stackoverflow.com/questions/22888280

  •  28-06-2023

Question

I want to cbind several files together using:

do.call("cbind",lapply(sample_list, FUN=function(files){read.table(files, header=TRUE, sep="\t", stringsAsFactors=FALSE)}))

However, the files in my sample_list (e.g., 1c.QC.dat) are in different directories, although the directories follow the same pattern:

/home/Project1/Files/Sample_*/*.QC.dat

where * is the sample ID.

Is there a way to list these files easily?


Solution

Let's first select our Sample_* directories.

main_dir <- '/home/Project1/Files/'
directories <- list.files(main_dir, pattern = '^Sample_')
directories <- Filter(function(x) file.info(file.path(main_dir, x))$isdir, directories)
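The directory check matters because a regular file can also start with Sample_. A minimal sketch on a throwaway tree under tempdir() (the names Sample_1c, Sample_2c and Sample_notes.txt are made up for illustration), using dir.exists as an equivalent test to file.info(...)$isdir:

```r
# Hypothetical demo tree: two sample directories plus a stray file
main_dir <- file.path(tempdir(), "filter_demo")
dir.create(file.path(main_dir, "Sample_1c"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(main_dir, "Sample_2c"), showWarnings = FALSE)
file.create(file.path(main_dir, "Sample_notes.txt"))

# Everything whose name starts with Sample_, files included
candidates <- list.files(main_dir, pattern = "^Sample_")

# dir.exists() is TRUE only for directories, so the stray file is dropped
directories <- candidates[dir.exists(file.path(main_dir, candidates))]
print(directories)
```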

We now have a character vector of directory names beginning with Sample_. Now we can read in our data frames:

dfs <- lapply(directories, function(subdir) {
  path <- file.path(main_dir, subdir)
  # QC files directly inside this sample directory
  files <- list.files(path, pattern = '\\.QC\\.dat$')
  subdfs <- lapply(files, function(filename)
    read.table(file.path(path, filename), header=TRUE, sep="\t", stringsAsFactors=FALSE)
  )
  do.call(rbind, subdfs) # stack the tables from this directory
})

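Finally, we bind them into one giant data frame (rbind stacks the tables row-wise; swap in cbind here and above to place them side by side, as in the question):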

dfs <- do.call(rbind, dfs) # Notice we used the same trick twice
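Since the question asks for cbind, here is a small sketch of that variant. It assumes every table has the same number of rows, and the two data frames are made-up stand-ins for what read.table would return. Naming the list elements by sample lets cbind prefix each column with its sample name, which keeps identically named columns apart:

```r
# Hypothetical stand-ins for the per-sample data frames read from disk
per_sample <- list(
  Sample_1c = data.frame(depth = c(10, 12), qual = c(30, 31)),
  Sample_2c = data.frame(depth = c(8, 9), qual = c(28, 35))
)

# With a named list, cbind prefixes each column with the list element's name
wide <- do.call(cbind, per_sample)
print(names(wide))
```

With the real data, something like names(dfs) <- directories before the final do.call would give the same effect.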

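A shorter option is to use the recursive = TRUE parameter on list.files. Note that pattern is matched against file names only, not against the directory part of the returned relative paths, so the Sample_ prefix has to be checked in a separate step:

main_dir <- '/home/Project1/Files/'
files <- list.files(main_dir, pattern = '\\.QC\\.dat$', recursive = TRUE)
files <- grep('^Sample_[^/]*/[^/]+$', files, value = TRUE) # keep only Sample_*/<file>
dfs <- do.call(rbind, lapply(files, function(filename)
  read.table(file.path(main_dir, filename), header=TRUE, sep="\t", stringsAsFactors=FALSE)
))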
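
The wildcard pattern from the question can also be handed to Sys.glob, which expands * in directory and file names much as a shell would. The following is a minimal sketch rather than part of the original answer. It builds a throwaway tree under tempdir() so that it runs anywhere; the sample IDs 1c and 2c and the columns gene and count are made up for illustration. With the real data, the pattern would be '/home/Project1/Files/Sample_*/*.QC.dat'.

```r
# Hypothetical demo tree standing in for /home/Project1/Files/
main_dir <- file.path(tempdir(), "glob_demo")
for (id in c("1c", "2c")) {
  sample_dir <- file.path(main_dir, paste0("Sample_", id))
  dir.create(sample_dir, recursive = TRUE, showWarnings = FALSE)
  write.table(data.frame(gene = c("a", "b"), count = c(1, 2)),
              file.path(sample_dir, paste0(id, ".QC.dat")),
              sep = "\t", row.names = FALSE, quote = FALSE)
}

# Expand the wildcards; the result is a character vector of full paths
sample_list <- Sys.glob(file.path(main_dir, "Sample_*", "*.QC.dat"))

# Same reading and binding step as in the question
tables <- lapply(sample_list, read.table, header = TRUE, sep = "\t",
                 stringsAsFactors = FALSE)
combined <- do.call(cbind, tables)
print(dim(combined))
```

Because Sys.glob returns full paths, no file.path call is needed before read.table.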
Licensed under: CC-BY-SA with attribution