Question

I am using the RODBC library to bring data into R. I have a long query that I want to pass a variable to, much like this SO user.

The problem is that R keeps the line breaks in my multi-line query as newline characters ('\n').

The accepted solution for that question suggests breaking the text into chunks and joining them with paste(). That works, but ideally I'd like to keep the whitespace intact, since that makes it easier to test and verify the query in the database before pasting it into R.

In other languages I'm familiar with, there's a simple line continuation character. Indeed, several of the comments on the accepted answer are looking for an approach similar to Python's \.

I found an aside about a workaround using strwrap deep in the bowels of an R discussion list, so in the interest of making the internet better I will post it here. However, if someone can point me toward a more elegant or straightforward solution, I will happily accept your answer.

Solution 3

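# Collapse each run of whitespace (spaces, tabs, newlines) into a single space.
# Replacing with "" instead would glue the SQL keywords together.
query <- gsub(pattern='\\s+', replacement=" ", x=query)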

OTHER TIPS

I don't know if you will find this helpful or not, but I have eventually gravitated towards keeping my SQL separate from my R scripts. I find that keeping queries in my R script, except for very short ones, gets unreadable very quickly.

These days, I tend to keep queries that are more than a single line in their own separate .sql file. Then I can keep them nice and formatted and readable in a nice text editor, and read them into R as needed via something like this:

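# Read an entire .sql file into a single character string.
read_sql <- function(path){
    stopifnot(file.exists(path))
    # file.info()$size is the file size in bytes, enough to read the whole file
    sql <- readChar(path, nchars = file.info(path)$size)
    sql
}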

For binding parameters into the queries, I just keep a %s where the parameter will go in the .sql file, and then add in the parameters in R using sprintf.

I've been much happier this way, as I was finding that cluttering up my R scripts with really long paste statements and multi-line character objects was making my code really hard to read.
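
As a rough sketch of the workflow described in this answer: the file contents, table name, column names, and the "Spring 2013" value below are all made up for illustration, and a temporary file stands in for a real .sql file. The read_sql() definition is repeated so the snippet runs on its own.

```r
# Helper in the spirit of read_sql() above, repeated so this runs standalone.
read_sql <- function(path) {
  stopifnot(file.exists(path))
  readChar(path, nchars = file.info(path)$size)
}

# Stand-in for a real .sql file; the table and columns are hypothetical.
sql_path <- tempfile(fileext = ".sql")
writeLines(c(
  "SELECT id",
  "      ,termname",
  "FROM scores",
  "WHERE termname = '%s'"
), sql_path)

# Read the template, then fill the %s placeholder with the parameter.
template <- read_sql(sql_path)
query <- sprintf(template, "Spring 2013")
cat(query)
```

Two things to keep in mind with this approach: sprintf does plain text substitution with no escaping, so it is best kept to trusted values, and any literal % in the SQL (for example in a LIKE pattern) has to be written as %% in the file.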

R's strwrap will destroy whitespace, including newline characters, per the documentation.

Essentially, you can get the desired behavior by initially letting R introduce line breaks/newline \ns, and then immediately stripping them out.

#make query using PASTE
query_1 <- paste("SELECT map.ps_studentid 
      ,students.first_name || ' ' || students.last_name AS full_name
      ,map.testritscore
      ,map.termname
      ,map.measurementscale
FROM map$comprehensive_with_growth map
JOIN students 
  ON map.ps_studentid = students.id
WHERE map.termname = '",map_term,"'", sep='')

#remove newline characters introduced above.  
#width is an arbitrary big number-
#it just needs to be longer than your string.
query_1 <- strwrap(query_1, width=10000, simplify=TRUE)

#execute the query (XE is an open RODBC connection)
map_njask <- sqlQuery(XE, query_1)

Try using sprintf to get variable substitution, and then collapsing the newlines and extra whitespace into single spaces.

See my answer to a similar question for details.
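
A minimal sketch of that idea, assuming map_term holds the value to substitute (the "Spring 2013" value is only an example, and the query is a shortened version of the one in the question):

```r
map_term <- "Spring 2013"  # example value, assumed for illustration

# Keep the query formatted across lines, with %s where the parameter goes.
template <- "SELECT map.ps_studentid
      ,map.testritscore
FROM map$comprehensive_with_growth map
WHERE map.termname = '%s'"

# Substitute the parameter first ...
query <- sprintf(template, map_term)

# ... then collapse every run of whitespace (spaces, tabs, newlines)
# into a single space so the query is sent as one line.
query <- gsub("\\s+", " ", query)
```

Note that the gsub step also collapses whitespace inside quoted SQL literals, so this is only safe when no literal depends on repeated spaces or embedded line breaks.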

Licensed under: CC-BY-SA with attribution