Question

I have to load many files and transform their data. Each file contains only one data.table; however, the tables have various names.

I would like to run a single script over all of the files. To do so, I must assign the unknown data.table to a common name, say blob.

What is the R way of doing this? At present, my best guess (which seems like a hack, but works) is to load the data.table into a new environment, and then: assign('blob', get(objects(envir=newEnv)[1], envir=newEnv)).

In a reproducible context this is:

newEnv <- new.env()
assign('a', 1:10, envir = newEnv)
assign('blob', get(objects(envir = newEnv)[1], envir = newEnv))

Is there a better way?

Solution 2

I assume that you saved the data.tables using save() somewhat like this:

library(data.table)
d1 <- data.table(value=1:10)
save(d1, file="data1.rdata")

and your problem is that when you load the file you don't know the name (here: d1) that you used when saving the file. Correct?

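I suggest using saveRDS() and readRDS() instead for saving/loading single objects. readRDS() returns the object itself rather than restoring it under its saved name, so you can assign it to whatever name you like: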

d1 <- data.table(value=1:10)
saveRDS(d1, file="data1.rds")
blob <- readRDS("data1.rds")
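If the files already exist in save() format and cannot be re-saved, the environment approach from the question can be wrapped in a small function. The following is only a sketch: load_single is a hypothetical helper name, the temporary file stands in for one of the real files, and a plain data.frame is used so the example runs without data.table installed. It relies on load() returning the names of the objects it restored.

```r
# Hypothetical helper: load a save()-format file that holds exactly one
# object and return that object, whatever name it was saved under.
load_single <- function(path) {
  env <- new.env()
  nm <- load(path, envir = env)  # load() returns the names of the restored objects
  stopifnot(length(nm) == 1L)    # fail loudly if the file holds more than one object
  get(nm, envir = env)
}

# Demo with a temporary file standing in for one of the real files
d1 <- data.frame(value = 1:10)
tmp <- tempfile(fileext = ".rdata")
save(d1, file = tmp)

blob <- load_single(tmp)
```

Because the object is returned rather than assigned, the caller picks the name, and nothing from the file leaks into the global environment.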

OTHER TIPS

The R way is to create a single object, i.e. a single list of data tables.

Here is some pseudocode that contains three steps:

  • Use list.files() to get a character vector of all file names in a folder.
  • Use lapply() and read.csv() to read your files and create a list of data frames. Replace read.csv() with read.table() or whatever is appropriate for your data.
  • Use lapply() again, this time with as.data.table() to convert the data frames to data tables.

The pseudocode:

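library(data.table)
# full.names = TRUE returns full paths, so read.csv() can find the files
files <- list.files("path/to/files", full.names = TRUE)
dat <- lapply(files, read.csv)
dat <- lapply(dat, as.data.table)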

Your result should be a single list, called dat, containing a data table for each of your original files.
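To make the list approach concrete, here is a self-contained sketch that writes two small CSV files to a temporary folder, reads them into a named list, and applies the same transformation to each element. The folder, the file names, and the column names x and y are made up for illustration, and base data frames are used so that it runs without data.table; as.data.table() could be applied to each element in the same way.

```r
# Create a temporary folder with two example files (illustrative data only)
demo_dir <- tempfile("blob_demo")
dir.create(demo_dir)
write.csv(data.frame(x = 1:3), file.path(demo_dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(x = 4:6), file.path(demo_dir, "b.csv"), row.names = FALSE)

# Read every CSV in the folder into one list, named by file
files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
dat <- lapply(files, read.csv)
names(dat) <- basename(files)

# Run the same transformation over every table in the list
dat <- lapply(dat, function(d) transform(d, y = x * 2))
```

Naming the list elements by file keeps track of where each table came from, which is the information the original variable names would otherwise have carried.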

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow