Question

Given a file on a webserver (e.g., http://foo.com/bar.zip -> only accessible through HTTP), is there any way to get the date attributes (e.g., date [created, modified]) without downloading the entire archive in the first place?

Right now, I download the archive and read the attributes programmatically. Trouble is that the archive is dozens of MiB so it seems like a waste of resources to download the entire thing and end up reading off just a couple of bytes of information.

I realize that bandwidth is practically free, but I don't like to be wasteful in any case.

Was it helpful?

Solution

Try to read Last-Modified from header

OTHER TIPS

Be sure to use a HTTP HEAD request instead of a HTTP GET request to read the HTTP headers only. If you do a HTTP GET, you will download the whole file nevertheless, even if you decide just to inspect the HTTP headers.

Just for the sake of simplicity, here's a compilation of the existing (perfect) answers from @ihorko and @JanThomä, that uses curl. Other option are available too, of course, but here's a fully functional answer.

Use curl with the -I option:

-I, --head
(HTTP/FTP/FILE) Fetch the HTTP-header only! HTTP-servers feature the command HEAD which this uses to get nothing but the header of a document. When used on an FTP or FILE file, curl displays the file size and last modification time only.

Also, the -s option is nice here:

-s, --silent
Silent or quiet mode. Don't show progress meter or error messages. Makes Curl mute. It will still output the data you ask for, potentially even to the terminal/stdout unless you redirect it.

Hence, something like this would do the trick:

curl -sI http://foo.com/bar.zip | grep 'Last-Modified' | cut -d' ' -f 2-
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top