Question

I am trying to implement simple authentication in R that keeps credentials out of the source code under revision control. I'm aware of the approach using options() and getOption(), but it would force me to remove the project-level .Rprofile from revision control. I would prefer another approach: export the credentials as environment variables via the .bashrc of an R-associated Linux user (ruser), then read them into global variables in the project-specific .Rprofile, like this:

CB_API_KEY <<- Sys.getenv("CB_API_KEY")

However, accessing such global variables in R modules fails with the message "object 'CB_API_KEY' not found". I suspect the reason is that I source .Rprofile through a separate call to R CMD BATCH in the Makefile, while the R modules that access these global variables are executed by a separate call to Rscript, also from the Makefile. It therefore appears that the global environment of the first R session is lost, hence the access failures. I would appreciate your comments and advice on this issue.

UPDATE: The following are the contents of the project-specific .Rprofile and of the project's top-level and sub-project-level Makefiles, respectively.

.Rprofile:

# Execute global R profile first...
source("~/.Rprofile")

# ...then local project R setup

# Retrieve SRDA (SourceForge) credentials
SRDA_USER <<- Sys.getenv("SRDA_USER")
SRDA_PASS <<- Sys.getenv("SRDA_PASS")

# Retrieve CrunchBase API key
CB_API_KEY <<- Sys.getenv("CB_API_KEY")

# Another approach is to use options() and getOption(),
# but it requires removing this file from source control
options(SRDA_USER = "XXX", SRDA_PASS = "YYY", CB_API_KEY = "ZZZ")

Top-level Makefile:

# Major variable definitions

PROJECT="diss-floss"
HOME_DIR="~/diss-floss"
REPORT={$(PROJECT)-slides}

COLLECTION_DIR=import
PREPARATION_DIR=prepare
ANALYSIS_DIR=analysis
RESULTS_DIR=results
PRESENTATION_DIR=present

RSCRIPT=Rscript


# Targets and rules

all: rprofile collection preparation analysis results presentation

rprofile:
        R CMD BATCH ./.Rprofile

collection:
        cd $(COLLECTION_DIR) && $(MAKE)

preparation: collection
        cd $(PREPARATION_DIR) && $(MAKE)

analysis: preparation
        cd $(ANALYSIS_DIR) && $(MAKE)

results: analysis
        cd $(RESULTS_DIR) && $(MAKE)

presentation: results
        cd $(PRESENTATION_DIR) && $(MAKE)


## Phony targets and rules (for commands that do not produce files)

#.html
.PHONY: demo clean

# run demo presentation slides
demo: presentation
        # knitr(Markdown) => HTML page
        # HTML5 presentation via RStudio/RPubs or Slidify
        # OR
        # Shiny app

# remove intermediate files
clean:
        rm -f tmp*.bz2 *.Rdata

Sub-project-level Makefile:

# Major variable definitions

RSCRIPT=Rscript
#RSCRIPT=R CMD BATCH
R_OPTS=--no-save --no-restore --verbose
#R_OUT=> outputFile.Rout 2>&1

# --no-save --no-restore --verbose myRfile.R > outputFile.Rout 2>&1


# Targets and rules

collection: importFLOSSmole \
            importSourceForge \
            importAngelList \
            importCrunchBase

importFLOSSmole: getFLOSSmoleDataXML.R
        $(RSCRIPT) $(R_OPTS) $<

importSourceForge: getSourceForgeData.R
        $(RSCRIPT) $(R_OPTS) $<

importAngelList: getAngelListData.R
        $(RSCRIPT) $(R_OPTS) $<

importCrunchBase: getCrunchBaseDataAPI.R
        $(RSCRIPT) $(R_OPTS) $<

.PHONY: clean

# remove intermediate files
clean:
        rm -f tmp*.bz2 *.Rdata .Rout

Directory structure is typical:

+ `ruser` home directory
|____+ project's home directory
     |____ `import` sub-directory
     |____ project's other sub-directories

Thank you!


Solution 2

OK, it looks like I figured out how to solve this problem. I created another .Rprofile in the project's sub-directory (import), where processing requires authentication, and moved into it the code that reads the environment variables' values into R global variables. I tested it and haven't seen the previous error message (knocking on wood!). An additional benefit is that the code is now better separated into functional areas and, thus, cleaner.

Lesson learned (and the reason for all this trouble):

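Every R session sources the .Rprofile file in its [current] working directory at startup. (I wrongly assumed that an R session sources .Rprofile only once, on the initial run, and that subsequent R sessions cannot do that.) Since the sub-project Makefile runs Rscript from the import sub-directory, a .Rprofile that lives only in the project's root is not picked up by those sessions.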
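A minimal sketch of what such a sub-directory .Rprofile might contain, assuming the credentials are exported as environment variables as described in the question. The helper name get_credential and the DEMO_CB_API_KEY variable are hypothetical, introduced only for illustration. The helper stops early with a clear message rather than silently storing an empty string.

```r
# Hypothetical helper (not from the original post): read a required
# credential from the environment and stop early if it is missing.
get_credential <- function(name) {
  value <- Sys.getenv(name, unset = NA_character_)
  if (is.na(value) || !nzchar(value)) {
    stop("Environment variable '", name, "' is not set", call. = FALSE)
  }
  value
}

# Demo only: a placeholder variable stands in for the real key, which
# would normally be exported from ~/.bashrc rather than set inside R.
Sys.setenv(DEMO_CB_API_KEY = "placeholder-key")
DEMO_CB_API_KEY <- get_credential("DEMO_CB_API_KEY")
```

In a real .Rprofile the demo lines would reduce to something like CB_API_KEY <- get_credential("CB_API_KEY"). A plain <- is enough there, because .Rprofile is evaluated in the global environment.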

OTHER TIPS

If I've understood this correctly, you have a Makefile that calls R in batch mode (R CMD BATCH) to source your .Rprofile file, and then later runs another R session via Rscript, where it can't find the variable that you define in your .Rprofile file.

You are correct in noticing that global variables aren't persisted between sessions.
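The difference between session-local globals and inherited environment variables can be checked with a short experiment. This is only a sketch: the names DEMO_GLOBAL and DEMO_API_KEY are made up for the demonstration, and it assumes the Rscript binary sits next to the running R installation (located via R.home("bin")).

```r
rscript <- file.path(R.home("bin"), "Rscript")

# Session 1: define a global variable, then exit.
system2(rscript, c("--vanilla", "-e", shQuote('DEMO_GLOBAL <- "secret"')))

# Session 2: a fresh process that knows nothing about session 1.
seen <- system2(
  rscript,
  c("--vanilla", "-e",
    shQuote('writeLines(as.character(exists("DEMO_GLOBAL")))')),
  stdout = TRUE
)

# Environment variables, by contrast, are inherited by child processes.
Sys.setenv(DEMO_API_KEY = "placeholder-key")
inherited <- system2(
  rscript,
  c("--vanilla", "-e", shQuote('writeLines(Sys.getenv("DEMO_API_KEY"))')),
  stdout = TRUE
)
```

Here seen should report that DEMO_GLOBAL does not exist in the second session, while inherited should carry the placeholder value. This suggests that calling Sys.getenv() directly in the module that needs the credential may be simpler than relaying it through a global variable.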

However, unless you are calling Rscript with the --no-init-file or --vanilla arguments, the .Rprofile file in the current working directory should be sourced on startup.

Adding messages to your .Rprofile will let you know if and when it gets called.
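One way to try this out without touching the real project is sketched below. It writes a throwaway .Rprofile into a temporary directory and starts Rscript from there, once normally and once with --no-init-file. The directory, the message text, and the run_in helper are all hypothetical, and the sketch assumes the R_PROFILE_USER environment variable is not set (if it is, R uses that file instead of the one in the working directory).

```r
# Throwaway directory holding a .Rprofile that announces itself.
demo_dir <- tempfile("rprofile-demo-")
dir.create(demo_dir)
writeLines(
  'message("[.Rprofile] sourced from ", getwd())',
  file.path(demo_dir, ".Rprofile")
)

rscript <- file.path(R.home("bin"), "Rscript")

# Hypothetical helper: run Rscript with `dir` as the working directory
# and capture everything it prints (message() writes to stderr).
run_in <- function(dir, args) {
  old <- setwd(dir)
  on.exit(setwd(old))
  system2(rscript, args, stdout = TRUE, stderr = TRUE)
}

with_profile <- run_in(demo_dir, c("-e", shQuote("invisible(NULL)")))
without_profile <- run_in(
  demo_dir,
  c("--no-init-file", "-e", shQuote("invisible(NULL)"))
)
```

The message should appear in with_profile but not in without_profile. The same message() line placed in the project's own .Rprofile files would show which of them, if any, each Makefile target actually picks up.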

Licensed under: CC-BY-SA with attribution