Question

I am working on a project that imports all CSV files from a given folder and merges them into one file. I was able to import the rows and columns I wanted from each file in the folder, but now I need help merging them all into one file. I do not know how many files I will eventually end up with (probably around 120), so I do not want to merge them one by one.

Here is what I have so far:

 # Import All files
 rowsToUse <- c(9:104,657:752)
 colsToUse <- c(15,27,28,29,30,33,35)
 filenames <- list.files("save", pattern="*.csv", full.names=TRUE)
 for (i in seq_along(filenames)) {
   assign(paste("df", i, sep = "."), read.csv(filenames[i])[!is.na(30),][rowsToUse,colsToUse])
 }

 # Merge into one file
 for (i in seq_along(filenames)) {
   df<-rbind(df.[i])
 }

The first part of the code creates a series of data frames labeled df.1, df.2, etc. I would like them to end up in one final data frame called df. All files are identical in structure.

I would really appreciate some help if someone has a few extra minutes! Thank you!

Solution

Since you have already read the files in, you can try the following:

do.call(rbind, mget(ls(pattern = "df")))

The ls(pattern = "df") call should capture all of your "df.1", "df.2", and so on. Hopefully you don't have other objects whose names match the same pattern (an object named just df would match too), but if you do, experiment with a stricter pattern, such as "^df\\.[0-9]+$", until the command lists just your data.frames.

mget() will bring all of these into a list on which you can use do.call(rbind, ...).
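
As a rough illustration of the approach above, here is a minimal sketch using toy data frames in place of the real imports. The environment env, the toy columns id and value, and the numeric reordering step are assumptions added for the example. The reordering is there because ls() returns names sorted alphabetically, so "df.10" would otherwise come before "df.2".

```r
# Toy stand-ins for the df.1, df.2, ... objects created by the question's loop.
# They live in a separate environment so nothing else can match the pattern.
env <- new.env()
for (i in 1:11) {
  assign(paste("df", i, sep = "."),
         data.frame(id = i, value = i * 10),
         envir = env)
}

# Anchored pattern: matches only "df." followed by digits.
nms <- ls(envir = env, pattern = "^df\\.[0-9]+$")

# ls() sorts alphabetically ("df.10" before "df.2"), so reorder numerically.
nms <- nms[order(as.integer(sub("^df\\.", "", nms)))]

# Collect the objects into a named list and stack them row-wise.
df <- do.call(rbind, mget(nms, envir = env))

# Drop the row names inherited from the list names.
rownames(df) <- NULL
```

If the row order of the final data frame does not matter to you, the reordering line can be dropped.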

OTHER TIPS

Those all seem complicated ;). The answers above seem to be operating on "we have a list of objects with very similar names, how do we handle that". Answer: they don't need to have very similar names. They don't even have to be different objects.

If you read the files in not through a for loop, but through lapply(), you get a single object that contains all of the data frames - each one as a single element. These can then trivially be extracted. So you'd have something that looks like...

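#Grab a list of filenames. list.files() treats pattern as a regular expression,
#so "\\.csv$" matches only names that end in .csv.
filenames <- list.files("save", pattern="\\.csv$", full.names=TRUE)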

#Iterate through that list of names, using lapply(), reading the data in.
list_of_data_frames <- lapply(filenames, function(x){

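    #Read the data in. Note: !is.na(30) always evaluates to TRUE, so that
    #subscript (kept from the question) retains every row.
    to_return <- read.csv(x)[!is.na(30),][c(9:104,657:752),c(15,27,28,29,30,33,35)]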

    #Return it. You could save lines of code (and processor time!) by just reading
    #straight into return(), but it would be a lot less clear.
    return(to_return)
})

#Now use do.call to turn it into a single data frame.
data.df <- do.call("rbind", list_of_data_frames)
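
To try the lapply() approach without the real files, here is a small self-contained sketch. The temporary folder, the file names part1.csv to part3.csv, the toy columns, and the row and column selections are all made up for the example; substitute your own folder and indices.

```r
# Hypothetical setup: write three small CSV files to a temporary folder.
folder <- file.path(tempdir(), "csv_merge_demo")
dir.create(folder, showWarnings = FALSE)
for (i in 1:3) {
  write.csv(data.frame(a = 1:4, b = letters[1:4], file = i),
            file.path(folder, sprintf("part%d.csv", i)),
            row.names = FALSE)
}

# List the files; the pattern is a regular expression matching ".csv" at the end.
filenames <- list.files(folder, pattern = "\\.csv$", full.names = TRUE)

# Hypothetical selections standing in for rowsToUse and colsToUse.
rows_to_use <- 2:3
cols_to_use <- c(1, 3)

# Read every file and keep only the chosen rows and columns.
list_of_data_frames <- lapply(filenames, function(x) {
  read.csv(x)[rows_to_use, cols_to_use]
})

# Stack the list into one data frame and reset the row names.
merged <- do.call(rbind, list_of_data_frames)
rownames(merged) <- NULL
```

Because the number of files is never hard-coded, the same code should work whether the folder holds 3 files or 120.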
Licensed under: CC-BY-SA with attribution