Question

I've been working on an R project (projectA) that I want to hand over to a colleague. What would be the best way to handle workspace references in the scripts? To illustrate, let's say projectA consists of several R scripts that each read input from and write output to certain directories (dirs). All dirs are contained within my local Dropbox. The I/O part of the scripts looks as follows:

# Script 1. 
# Give input and output names and dirs:
dat1Dir       <- "D:/Dropbox/ProjectA/source1/"
dat1In        <- "foo1.asc"
dat2Dir       <- "D:/Dropbox/ProjectA/source2/"
dat2In        <- "foo2.asc"
outDir        <- "D:/Dropbox/ProjectA/output1/"
outName       <- "fooOut1.asc"
# Read data 
setwd(dat1Dir)
dat1          <- read.table(dat1In)
setwd(dat2Dir)
dat2          <- read.table(dat2In)
# do stuff with dat1 and dat2 that result in new data foo
# Write new data foo to file
setwd(outDir)
write.table(foo, outName)

# Script 2. 
# Give input and output names and dirs
dat1Dir       <- "D:/Dropbox/ProjectA/output1/"
dat1In        <- "fooOut1.asc"
outDir        <- "D:/Dropbox/ProjectA/output2/"
outName       <- "fooOut2.asc"

Etc. Each script reads and writes data from/to file, and subsequent scripts read the output of previous scripts. The question is: how can I ensure that the directory strings remain valid after transfer to another user?

Let's say we copy the ProjectA folder, including subfolders, to another PC, where it is stored at, e.g., C:/Users/foo/my documents/. Ideally, I would have a function FindDir() that finds the location of the lowest common folder in the project, here "ProjectA", so that I can replace every directory string with:

dat1Dir       <- paste(FindDir(), "ProjectA/source1/", sep= "")

So that:

# At my own PC
dat1Dir       <- paste(FindDir(), "ProjectA/source1/", sep= "")
> "D:/Dropbox/ProjectA/source1/"

# At my colleague's PC
dat1Dir       <- paste(FindDir(), "ProjectA/source1/", sep= "")
> "C:/Users/foo/my documents/ProjectA/source1/"

Or perhaps there is a different way? Our work IT infrastructure currently does not allow using a shared disk. I'll put helper functions in an 'official' R project (i.e., hosted on R-Forge), but I'd like to use scripts when many I/O parameters are required and because the code can easily be viewed and commented.

Many thanks in advance!

Solution

You should be able to do this by using relative directory paths. This is what I do for my R projects that I have in Dropbox and that I edit/run on both my Windows and OS X machines where the Dropbox folder is D:/Dropbox and /Users/robin/Dropbox respectively.

To do this, you'll need to

  1. Set the current working directory in R (either in the first line of your script, or interactively at the console before running), using setwd('/Users/robin/Dropbox') (see the full docs for that command). This is the only path that has to change from one machine to another.

  2. Change your paths to relative paths, which means they contain only the part of the path below the current directory: 'ProjectA/source1' if you've set your current directory to your Dropbox folder, or just 'source1' if you've set your current directory to the ProjectA folder (which is a better idea).

Then everything should just work!
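
As a rough sketch of the two steps above, something like the following should work. Note the assumptions: projectRoot points at a temporary stand-in folder so the example runs anywhere (on a real machine it would be the ProjectA folder inside Dropbox), the dummy input files and the cbind() call are placeholders for the real data and processing, and file.path() is used only as a convenient way to join path pieces.

```r
# Stand-in for the ProjectA folder. On a real machine this would be,
# e.g., "D:/Dropbox/ProjectA" or "/Users/robin/Dropbox/ProjectA".
projectRoot <- file.path(tempdir(), "ProjectA")
for (d in c("source1", "source2", "output1")) {
  dir.create(file.path(projectRoot, d), recursive = TRUE, showWarnings = FALSE)
}

# Dummy input files (placeholders) so the sketch runs end to end.
write.table(data.frame(x = 1:3), file.path(projectRoot, "source1", "foo1.asc"))
write.table(data.frame(y = 4:6), file.path(projectRoot, "source2", "foo2.asc"))

# Step 1: the only machine-specific line. Set the working directory once.
oldWd <- setwd(projectRoot)

# Step 2: every other path is relative to the project root.
dat1 <- read.table(file.path("source1", "foo1.asc"))
dat2 <- read.table(file.path("source2", "foo2.asc"))
foo  <- cbind(dat1, dat2)  # placeholder for the real processing
write.table(foo, file.path("output1", "fooOut1.asc"))

# Restore the previous working directory.
setwd(oldWd)
```

With this layout there is no need to call setwd() before every read or write, and a colleague only has to adjust the single line that sets the project root.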

You may also be interested in an R package that I love called ProjectTemplate. It gives you really nice functionality for making self-contained projects for this sort of work in R, and they're entirely reproducible, movable between computers and so on. I've written an introductory blog post which may be useful.

Licensed under: CC-BY-SA with attribution