Question

I have some files uploaded at a filehoster which I want to download programmatically, using Delphi. They don't require any captchas or the like; normally you simply press a button and you get the file. Let's take this as an example.

Now I thought I could simply take the URL the "Download Now" button points at, use a TIdHTTP.Get request and save the result with a MemoryStream / FileStream / whatever. Copying the link address leads to this site, which, when entered into my browser, pops up the download prompt.

var
  MemStream: TMemoryStream;
  code: string;       // added for solution
  number: integer;    // added for solution
begin
  with TIdHTTP.Create(nil) do
  try
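    HandleRedirects := true;
    // added: fetch the HTML of the original page first, since the number is parsed
    // from it. PageURL is a placeholder for that page's address (not shown here).
    code := Get(PageURL);                                             // added
    System.Delete(code,1,AnsiPos('var n =',code)+7);                  // added
    number := StrToInt(AnsiLeftStr(code,AnsiPos(' ',code)-1)) + 1;    // added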
    MemStream := TMemoryStream.Create;
    try
      // Get('http://www56.zippyshare.com/d/5862319/604061/bgAvgTable.png', MemStream);
      Get(TIdURI.URLEncode('http://www56.zippyshare.com/d/5862319/' + IntToStr(number)
        + '/bgAvgTable.png'), MemStream);       // added for solution
      MemStream.SaveToFile('test.png');
    finally
      MemStream.Free;
    end;
  finally
    Free;
  end;
end;

However, using a checking tool I found that it contains a 302 redirect to the original site. Thus, when performing the GET request, I have to set HandleRedirects to avoid error messages, and I get the HTML code of the original site rather than the file I had expected.

So I am kind of confused about 1) how I get the file from my browser even though the URL only contains a 302 redirect to the previous page, and 2) how I can achieve the same from within my code. Any chance one of you might educate me a little there? ;)

EDIT

Thanks to your input I could find the issue. It turns out that the address I have to use is generated using a random number, which is to be found in the original page's source. So requesting that page first to get the number does the trick. I have edited the code accordingly.

Solution

File hosting sites use various tricks to make sure you are not hotlinking, so that they can show you advertisements and perhaps a counter. These can include:

  • simple analysis of the HTTP Referer field in the request
  • setting and checking session-unique cookies
  • using HTML forms with hidden one-time values, where the Download button is not a link but the form's submit action
  • generating a one-time hashed URL that encodes parameters such as your IP and your browser name
  • maybe more

Tools like USDownloader and JDownloader go to great lengths to circumvent these measures.

While zippyshare seems to be more liberal, it still cannot afford hotlinking and has to implement at least some measures of self-defense. When analysing the traffic, start with an absolutely fresh browser that loads the zippyshare page for the first time in its life, and check everything.

When I reload the page a few times, I see that the number "604061" is different each time: the link keeps changing after every reload. You probably have to load the page, parse the link, set the HTTP referer and only then download the file.

You do not show the HTTP traffic logs, so it is hard to tell for sure.
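
The sequence suggested above (load the page, parse the link, set the referer, then download) could be sketched roughly as follows. This is only a sketch under assumptions, not a tested recipe for any particular hoster: ExtractNumberAfter and DownloadViaPage are hypothetical helpers, the 'var n =' marker and the "+ 1" are taken from the code in the question, and PageURL, LinkPrefix, FileName and OutFile are placeholders the caller has to supply.

```pascal
program FetchSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, IdHTTP;

// Hypothetical helper: returns the integer that follows Marker in Html,
// or -1 when the marker or the digits are missing.
function ExtractNumberAfter(const Html, Marker: string): Integer;
var
  P, Start: Integer;
begin
  Result := -1;
  P := Pos(Marker, Html);
  if P = 0 then
    Exit;
  Inc(P, Length(Marker));
  // Skip blanks between the marker and the digits.
  while (P <= Length(Html)) and (Html[P] = ' ') do
    Inc(P);
  Start := P;
  while (P <= Length(Html)) and (Html[P] >= '0') and (Html[P] <= '9') do
    Inc(P);
  if P > Start then
    Result := StrToIntDef(Copy(Html, Start, P - Start), -1);
end;

// Hypothetical helper: all four arguments are placeholders supplied by the caller.
procedure DownloadViaPage(const PageURL, LinkPrefix, FileName, OutFile: string);
var
  Http: TIdHTTP;
  Html: string;
  N: Integer;
  Stream: TMemoryStream;
begin
  Http := TIdHTTP.Create(nil);
  try
    Http.HandleRedirects := True;
    Http.AllowCookies := True;  // reuse cookies from the page for the download
    // 1) Load the page, as a browser would before the button is pressed.
    Html := Http.Get(PageURL);
    // 2) Parse the changing number out of the page source.
    N := ExtractNumberAfter(Html, 'var n =');
    if N < 0 then
      raise Exception.Create('Marker not found in page source');
    // 3) Present the page as the referrer of the download request.
    Http.Request.Referer := PageURL;
    // 4) Only now request the file itself ("+ 1" as in the question's code).
    Stream := TMemoryStream.Create;
    try
      Http.Get(LinkPrefix + IntToStr(N + 1) + '/' + FileName, Stream);
      Stream.SaveToFile(OutFile);
    finally
      Stream.Free;
    end;
  finally
    Http.Free;
  end;
end;

begin
  // Parser check on a made-up snippet; no network access needed.
  Writeln(ExtractNumberAfter('var n = 604060;', 'var n ='));  // prints 604060
  // A real call would look like this (the first two arguments are placeholders):
  // DownloadViaPage(PageURL, LinkPrefix, 'bgAvgTable.png', 'test.png');
end.
```

Using the same TIdHTTP instance for both requests is deliberate: if the site also checks session cookies, the download request then carries whatever the page request received. Whether that is enough depends on which of the checks listed above the site actually performs.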

OTHER TIPS

The server may be checking for some trace in the request to prevent the file from being downloaded programmatically.

It may be anything the hostmaster wants to check, from a wide range of possibilities, but the most typical check is the referrer.

When you navigate in a web browser from one page to another using a link, the browser adds the first page as a referrer to the second page in the request header.

Indy lets you set a referrer:

IdHTTP1.Request.Referer := 'http://www.any.other.page';

If the check fails, the server script simply redirects the request to the download page. This is done to show advertising or to fulfill other goals of the file hosting service.
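
To see whether a given referrer passes the check, you could switch off automatic redirect handling and look at where the server wants to send you. The following is a minimal sketch, assuming that Indy raises EIdHTTPProtocolException for a 3xx answer when HandleRedirects is False (which matches the error messages described in the question) and that Response.Location is already filled at that point. IsRedirectCode and RedirectTargetOf are hypothetical helper names.

```pascal
program RedirectSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, IdHTTP;

// Hypothetical helper: True for HTTP status codes in the 3xx range.
function IsRedirectCode(Code: Integer): Boolean;
begin
  Result := (Code div 100) = 3;
end;

// Hypothetical helper: requests Url with the given referrer, without following
// redirects. Returns the Location header of a 3xx answer, or '' if the server
// answered without redirecting.
function RedirectTargetOf(const Url, Referer: string): string;
var
  Http: TIdHTTP;
  Body: TMemoryStream;
begin
  Result := '';
  Http := TIdHTTP.Create(nil);
  Body := TMemoryStream.Create;
  try
    Http.HandleRedirects := False;
    Http.Request.Referer := Referer;
    try
      Http.Get(Url, Body);
    except
      on E: EIdHTTPProtocolException do
        if IsRedirectCode(E.ErrorCode) then
          Result := Http.Response.Location  // where the server redirects to
        else
          raise;  // any other HTTP error is passed on unchanged
    end;
  finally
    Body.Free;
    Http.Free;
  end;
end;

begin
  if ParamCount >= 1 then
    Writeln(RedirectTargetOf(ParamStr(1), ParamStr(2)))
  else
    Writeln('usage: RedirectSketch <url> [referer]');
end.
```

An empty result means the request went through without a redirect. A result pointing back at the download page suggests the check failed for that referrer, or that the site checks something else as well.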

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow