Frage

Using R.exe or Rterm.exe, this gives an excellent progress meter.

page=getURL(url="ftp.wcc.nrcs.usda.gov", noprogress=FALSE) 

In Rgui I am limited to:

page=getURL(url="ftp.wcc.nrcs.usda.gov", 
            noprogress=FALSE, progressfunction=function(down,up) print(down))

which gives a very limited set of download information.

Is there a way to improve this?

War es hilfreich?

Lösung

I start doubting that with standard R commands it is possible to reprint overwriting the current line, which is what RCurl does in non-GUI mode.

I am glad to tell that I was wrong. At least for a single line, \r can do the trick. In fact:

conc=function(){
    cat(" abcd")
    cat(" ABCD", '\n')

}
conc()

# abcd ABCD 

But:

over=function(){
    cat(" abcd")
    cat("\r ABCD", "\n")
}
over()

# ABCD

That given, I wrote this progressDown function, which can monitor download status rewriting always on the same same line:

library(RCurl) # Don't forget

### Callback function for curlPerform
progressDown=function(down, up, pcur, width){
    total=as.numeric(down[1]) # Total size as passed from curlPerform
    cur=as.numeric(down[2])   # Current size as passed from curlPerform
    x=cur/total
    px= round(100 * x)
    ## if(!is.nan(x) &&  px>60) return(pcur) # Just to debug at 60%
    if(!is.nan(x) && px!=pcur){
        x= round(width * x)
        sc=rev(which(total> c(1024^0, 1024^1, 1024^2, 1024^3)))[1]-1
        lb=c('B', 'KB', 'MB', 'GB')[sc+1]
        cat(paste(c(
            "\r  |", rep.int(".", x), rep.int(" ", width - x),
            sprintf("| %g%s of %g%s %3d%%",round(cur/1024^sc, 2), lb, round(total/1024^sc, 2), lb, px)),
                  collapse = ""))
        flush.console() # if the outptut is buffered, it will go immediately to console
        return(px)
    }
    return(pcur)
}

Now we can use the callback with curlPerform

curlProgress=function(url, fname){
    f = CFILE(fname, mode="wb")
    width= getOption("width") - 25   # you can make here your line shorter/longer
    pcur=0
    ret=curlPerform(url=url, writedata=f@ref,  noprogress=FALSE,
        progressfunction=function(down,up) pcur<<-progressDown(down, up, pcur, width),
        followlocation=T)
        close(f)
        cat('\n Download', names(ret), '- Ret', ret, '\n') # is success? 
}

Running it with a small sample binary:

curlProgress("http://www.nirsoft.net/utils/websitesniffer-x64.zip", "test.zip")

the intermediate output at 60% is (no # protection):

  |.................................                      | 133.74KB of 222.75KB  60%

where KB, will be adjusted to B, KB, MB, GB, based on total size.

Final output with success status, is:

  |.......................................................| 222.61KB of 222.75KB 100%
 Download OK - Ret 0 

Note, the output line width is relative to R width option (which controls the maximum number of columns on a line) and can be customised changing the curlProgress line:

width= getOption("width") - 25

This is enough for my needs and solves my own question.

Andere Tipps

Here's a simple example using txtProgressBar. Basically, just do a HEAD request first to get the file size of the file you want to retrieve, then setup a txtProgressBar with that as its max size. Then you use the progressfunction argument to curlPerform to call setTxtProgressBar. It all works very nicely (unless there is no "content-length" header, in which case this code works by just not printing a progress bar).

url <- 'http://stackoverflow.com/questions/21731548/rcurl-display-progress-meter-in-rgui'

h <- basicTextGatherer()
curlPerform(url=url, customrequest='HEAD',
            header=1L, nobody=1L, headerfunction=h$update)

if(grepl('Transfer-Encoding: chunked', h$value())) {
    size <- 1
} else {
    size <- as.numeric(strsplit(strsplit(h$value(),'\r\nContent-Type')[[1]][1],
                                                   'Content-Length: ')[[1]][2])
}

bar <- txtProgressBar(0, size)
h2 <- basicTextGatherer()
get <- curlPerform(url=url, noprogress=0L,
                   writefunction=h2$update, 
                   progressfunction=function(down,up)
                       setTxtProgressBar(bar, down[2]))

h2$value() # return contents of page

The output is just a bunch of ====== across the console.

What about:

curlProgress=function(url, fname){
    f = CFILE(fname, mode="wb")
    prev=0
    ret=curlPerform(url=url, writedata=f@ref,  noprogress=FALSE,
        progressfunction=function(a,b){
            x=round(100*as.numeric(a[2])/as.numeric(a[1]))
            if(!is.nan(x) && x!=prev &&round(x/10)==x/10) prev<<-x else x='.'
            cat(x)      
        }, followlocation=T)
    close(f)
    cat(' Download', names(ret), '- Ret', ret, '\n')
}

?
It prints dots or percent download divisible by 10 and breaks line on 50%.
And with a small 223 KB file:

curlProgress("http://www.nirsoft.net/utils/websitesniffer-x64.zip", "test.zip")

it sounds like this:

................10...............20................30...............40...............50 
..............................70...............80...............90...............100... Download OK - Ret 0 

I start doubting that with standard R commands it is possible to reprint overwriting the current line, which is what RCurl does in non-GUI mode.

Lizenziert unter: CC-BY-SA mit Zuschreibung
Nicht verbunden mit StackOverflow
scroll top