Question

I am working on a piece of code that, among other variables, passes the header from a bash-script to R. This may seem silly or stupid, but for my particular needs, it is exactly what I want. So, I have a bash-script:

#!/bin/bash
Rscript script.R "c("column1","column2","column3")"

I have simplified it, but the essentials are there: it starts an instance of Rscript, with the desired header passed as an argument. The R-script contains the following pieces of relevant code:

args<-commandArgs(TRUE) # enable arguments
header <- args[1] # store the first argument in a variable

Now, I want to change the header of my data to the header that I passed as an argument. The following pieces of code all work as desired when I run them from a GUI (in my case, Rstudio):

(1) colnames(data) <- header
(2) colnames(data) <- paste(header, sep=" ")
(3) for (i in 1:length(header)){colnames(data)[i] <- header[i]}

All these commands chop up the header into 3 pieces, so that all three columns get a new header (respectively "column1", "column2" and "column3"). However, if I run this from my bash-script as described above (calling Rscript), it does not work. Instead, it gives this output:

 c(column1,column2,column3)                                      Chromosome
1                                                            rs10          7
2                                                       rs1000000         12
3                                                      rs10000010          4
4                                                      rs10000012          4
5                                                      rs10000013          4
6                                                      rs10000017          4
   Position 
1  92221824 
2 125456933 
3  21227772 
4   1347325 
5  36901464 
6  84997149 

...and clearly, this is not what I want. None of the three commands listed above works as desired now. This confuses me, since I expect results from my code to be the same regardless of the way I run it, be it Rstudio or Rscript.

Does anyone have an explanation / solution for this? Any ideas are much appreciated.

Solution

The problem is that every command-line argument reaches R as a single string, so args[1] is a character vector of length 1, not a vector of three column names. To turn that string into a vector, you have to parse and evaluate it, which you can do with eval and parse.

Here is an example script.R:

args<-commandArgs(TRUE)
header<-eval(parse(text=args[1]))

data<-data.frame(one=1:10,two=1:10,three=1:10)
colnames(data)<-header
head(data)

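Here is how you would pass the argument in bash. Note the single quotes inside the double quotes: nested double quotes are removed by the shell, so R would receive c(column1,column2,column3) without any quotes, which is exactly what shows up in your output: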

Rscript script.R "c('col1','col2','col3')"

Which would return:

#   col1 col2 col3
# 1    1    1    1
# 2    2    2    2
# 3    3    3    3
# 4    4    4    4
# 5    5    5    5
# 6    6    6    6
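
To see the difference without leaving R, here is a minimal sketch that simulates the argument instead of reading it from the command line. The variable raw and the stopifnot check are my additions for illustration, not part of the script above. Keep in mind that eval(parse()) runs whatever R code the argument contains, so this only makes sense for input you trust.

```r
# Simulated first command-line argument (what args[1] would hold).
raw <- "c('col1','col2','col3')"

# Unparsed, the argument is one string: a character vector of length 1.
length(raw)

# Parsed and evaluated, it becomes a character vector of three names.
header <- eval(parse(text = raw))
length(header)

data <- data.frame(one = 1:10, two = 1:10, three = 1:10)

# Fail early if the argument did not yield exactly one name per column.
stopifnot(is.character(header), length(header) == ncol(data))
colnames(data) <- header
head(data)
```

With the check in place, an argument that arrives as a single unparsed string stops the script with an error instead of silently renaming only part of the header.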

OTHER TIPS

My guess is that the shell has no notion of an R vector, so it passes the whole expression to R as a single string. R therefore does not treat it as a vector of column names (or as a vector of more than one element at all), and the for loop has nothing to split up.

How many column names are we talking about? If it is really only a few, I would pass them as separate command-line arguments and then combine them into a vector. If it is a lot, I would pass them as one long string with a well-defined separator and use text processing to split them up, à la:

myvector <- "col1,col2,col3,col4"
# strsplit returns a list with one element per input string;
# [[1]] extracts the character vector of names
mycolnames <- strsplit(myvector, ",")[[1]]

Without being able to reproduce your data and script exactly, I can't give you a more precise answer, but hopefully this helps. This is how I do it when I need to pass a list to R via shell scripts.
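
The other option mentioned above, passing each column name as its own command-line argument, might look roughly like the sketch below. The helper apply_header is a hypothetical name introduced for illustration, and the arguments are simulated so the snippet runs on its own; in a real script.R you would read them with commandArgs(trailingOnly = TRUE) instead.

```r
# Hypothetical helper: rename the columns of `data` from a character vector
# of names, checking that there is exactly one name per column.
apply_header <- function(data, header) {
  stopifnot(is.character(header), length(header) == ncol(data))
  colnames(data) <- header
  data
}

# In script.R this line would be: args <- commandArgs(trailingOnly = TRUE)
# Here the arguments are simulated (an assumption for the example).
args <- c("col1", "col2", "col3")

data <- data.frame(one = 1:10, two = 1:10, three = 1:10)
data <- apply_header(data, args)
head(data)
```

From bash, the call would then be Rscript script.R col1 col2 col3. Each name arrives as a separate element of the argument vector, so nothing needs to be parsed or evaluated.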

Licensed under: CC-BY-SA with attribution