Question

I can download a file from the internet easily enough using code such as this:

myurl <- "http://www.jatma.or.jp/toukei/xls/13_01.xls"
download.file(myurl, destfile = myfilepath, mode = 'wb')

However, usually I want to check the date the file was last modified before I download it. I can do this very easily in Perl using the LWP::Simple package. I've poked through the documentation for RCurl (which I admit I understand only poorly) and the closest thing I can find is the basicHeaderGatherer function.

library(RCurl)

if (url.exists("http://www.jatma.or.jp/toukei/xls/13_01.xls")) {
  h <- basicHeaderGatherer()
  foo <- getURL("http://www.jatma.or.jp/toukei/xls/13_01.xls",
                headerfunction = h$update)
  names(h$value())
  h$value()
}

h$value()[3]

By using the code above I can eventually access the 'Last-Modified' attribute, but not without generating errors as per the output below. How can I clean up my code to avoid this error and access the 'Last-Modified' attribute in a straightforward manner?

(Please note: this answer looks promising but it generates similar error messages to those shown below, so it doesn't resolve this particular issue.)

Error in curlPerform(curl = curl, .opts = opts, .encoding = .encoding) (from #3) : 
  embedded nul in string: '  \021ࡱ\032 \0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0>\0\003\0  \t\0\006\0\0\0\0\0\0\0\0\0\0\0\001\0\0\09\0\0\0\0\0\0\0\0\020\0\0    \0\0\0\0    \0\0\0\08\0\0\0                                                                                                                                                                                                                                                                                                                                                                                                                                                \t\b\020\0\0\006\005\0g2 \a \0\002\0\006\006\0\0 \0\002\0 \004 \0\002\0\0\0 \0\0\0\\\0p\0\003\0\0CVC                                                                                                          B\0\002\0 \004a\001\002\0\0\0 \001\0\0=\001\002\0$\0 \0\002\0\021\0\031\0\002\0\0\0\022\0\002\0\0\0\023\0\002\0\0\0 \001\002\0\0\0 \001\002\0\0\0=\0\022\0  \017\0xKX/8\0\0\0\
> h$value()[3]
                  Last-Modified 
"Fri, 06 Dec 2013 05:33:53 GMT" 
> 

Solution

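The error comes from getURL() trying to return the binary .xls body as a character string. url.exists() requests only the headers, so asking it to return them with .header = TRUE avoids fetching the body at all:

library(RCurl)
url.exists("http://www.jatma.or.jp/toukei/xls/13_01.xls", .header = TRUE)["Last-Modified"]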
# Last-Modified 
# "Fri, 06 Dec 2013 05:33:53 GMT"
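
Since the goal is to check the date before downloading, the header string still has to be turned into a date-time and compared with something. The following is only a sketch of one way to do that: parse_http_date() and needs_download() are hypothetical helper names, not RCurl functions, and the sketch assumes the server sends the usual "Fri, 06 Dec 2013 05:33:53 GMT" form and that the local file's modification time is an acceptable stand-in for "when I last downloaded it".

```r
# Hypothetical helper: parse an HTTP date such as
# "Fri, 06 Dec 2013 05:33:53 GMT" without relying on the session locale.
parse_http_date <- function(x) {
  x <- unname(x)
  if (length(x) != 1 || is.na(x)) return(as.POSIXct(NA))
  pattern <- paste0("^[A-Za-z]{3}, ([0-9]{2}) ([A-Za-z]{3}) ([0-9]{4}) ",
                    "([0-9]{2}:[0-9]{2}:[0-9]{2}) GMT$")
  parts <- regmatches(x, regexec(pattern, x))[[1]]
  if (length(parts) == 0) return(as.POSIXct(NA))   # unexpected format
  month <- match(parts[3], month.abb)              # month.abb is always English
  if (is.na(month)) return(as.POSIXct(NA))
  stamp <- sprintf("%s-%02d-%s %s", parts[4], month, parts[2], parts[5])
  as.POSIXct(stamp, format = "%Y-%m-%d %H:%M:%S", tz = "GMT")
}

# Hypothetical helper: TRUE when there is no local copy, when the remote
# date is unknown, or when the remote file is newer than the local copy.
needs_download <- function(remote_time, destfile) {
  if (!file.exists(destfile)) return(TRUE)
  if (is.na(remote_time)) return(TRUE)
  remote_time > file.mtime(destfile)
}

# Usage with the header returned by url.exists() (requires network access):
# hdr <- url.exists(myurl, .header = TRUE)
# if (needs_download(parse_http_date(hdr["Last-Modified"]), myfilepath))
#   download.file(myurl, destfile = myfilepath, mode = "wb")
```

The month is looked up in month.abb rather than parsed with "%b" because "%b" depends on the session locale. If the header is missing or in an unexpected format, the sketch falls back to downloading rather than skipping.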
Licensed under: CC-BY-SA with attribution