Question

Given a URL (String ref), I am attempting to retrieve the redirected URL as follows:

        HttpURLConnection con = (HttpURLConnection)new URL(ref).openConnection();
        con.setInstanceFollowRedirects(false);
        con.setRequestProperty("User-Agent","");
        int responseType = con.getResponseCode()/100;
        while (responseType == 1)
        {
            Thread.sleep(10);
            responseType = con.getResponseCode()/100;
        }
        if (responseType == 3)
            return con.getHeaderField("Location");
        return con.getURL().toString();

I am having several (conceptual and technical) problems with it:

Conceptual problem:

  • It works in most cases, but I don't quite understand how.
  • All methods of the 'con' instance are called AFTER the connection is opened (when 'con' is instantiated).
  • So how do they affect the actual result?
  • How come calling 'setInstanceFollowRedirects' affects the returned value of 'getHeaderField'?
  • Is there any point calling 'getResponseCode' over and over until the returned value is not 1xx?
  • Bottom line, my general question here: is there another request/response sent through the connection every time one of these methods is invoked?

Technical problem:

  • Sometimes the response-code is 3xx, but 'getHeaderField' does not return the "final" URL.
  • I tried calling my code with the returned value of 'getHeaderField' until the response-code was 2xx.
  • But in most other cases where the response-code is 3xx, 'getHeaderField' DOES return the "final" URL, and if I call my code with this URL then I get an empty string.

Can you please advise how to approach the two problems above in order to have a "100% proof" code for retrieving the "final" URL?

Please ignore cases where the response-code is 4xx or 5xx (or anything else other than 1xx / 2xx / 3xx for that matter).

Thanks

Solution

Conceptual problems:

0.) Can one URLConnection or HttpURLConnection object be reused?

No, you cannot reuse such an object. One connection object serves exactly one request/response exchange: you cannot use it to retrieve another URL, nor to fetch the same URL a second time (speaking on the network level).

If you want to fetch another URL, or fetch the same URL again, you have to call the openConnection() method of the URL class again to instantiate a new connection object.

1.) When is the URLConnection actually connected?

The method name openConnection() is misleading. It only instantiates the connection object. It does not do anything on the network level.

The interaction on the network level starts in this line, which implicitly connects the connection (= the TCP socket under the hood is opened and data is sent and received):

        int responseType = con.getResponseCode()/100;

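Alternatively, you can call connect() to open the connection explicitly. Either way, the setters (setInstanceFollowRedirects, setRequestProperty) must be called before the connection is established, which is why they can still affect the result. Once the response has been received, getResponseCode() and getHeaderField() only read from that stored response; calling them again does not send another request.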
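
To make the timing visible, here is a minimal sketch that counts the requests arriving at a throwaway local server. The class name LazyConnectDemo and the helper countRequests() are made up for this illustration, and it assumes a JDK that ships the built-in com.sun.net.httpserver package.

```java
import com.sun.net.httpserver.HttpServer;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URL;
import java.util.concurrent.atomic.AtomicInteger;

public class LazyConnectDemo {

    /**
     * Returns the number of requests the server has seen
     * [0] after openConnection() and the setters,
     * [1] after the first getResponseCode(),
     * [2] after further getResponseCode()/getHeaderField() calls.
     */
    public static int[] countRequests() throws Exception {
        AtomicInteger hits = new AtomicInteger();

        // Local test server on a free port; it counts every request it receives.
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            hits.incrementAndGet();
            exchange.sendResponseHeaders(204, -1); // 204 No Content, no body
            exchange.close();
        });
        server.start();

        try {
            URL url = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/").toURL();

            // Only creates the object; nothing is sent yet.
            HttpURLConnection con = (HttpURLConnection) url.openConnection();
            con.setInstanceFollowRedirects(false);
            int afterOpen = hits.get();

            // First read of response data: connects, sends the request, reads the response.
            con.getResponseCode();
            int afterFirstRead = hits.get();

            // Further reads are answered from the response already received.
            con.getResponseCode();
            con.getHeaderField("Location");
            int afterRepeatedReads = hits.get();

            con.disconnect();
            return new int[] {afterOpen, afterFirstRead, afterRepeatedReads};
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws Exception {
        int[] counts = countRequests();
        System.out.println(counts[0] + " " + counts[1] + " " + counts[2]);
    }
}
```

On a standard JDK this should report 0, then 1, then 1 again: no request at openConnection(), one request at the first getResponseCode(), and no additional request afterwards.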

2.) How does setInstanceFollowRedirects work?

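setInstanceFollowRedirects(true) causes the URLs to be fetched "under the hood" again and again until there is a non-redirect response. The response code of that final response is what your call to getResponseCode() returns. With setInstanceFollowRedirects(false), as in your code, the first response is handed to you as-is, so getResponseCode() returns the 3xx code and getHeaderField("Location") returns the redirect target. Note that HttpURLConnection does not automatically follow redirects that switch protocols (for example from http to https).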

UPDATE:
Yes, this lets you write simple code if you do not want to handle the redirects yourself. You can switch on redirect following and then read the final response from the location you are redirected to, as if no redirect had taken place.
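
As a rough illustration of that approach, the sketch below lets the connection follow a redirect on its own and then asks it for the URL it ended up at via getURL(). The class AutoFollowDemo, the helper run() and the paths /old and /new are invented for this example, and it again assumes the JDK's built-in com.sun.net.httpserver package for a local test server. That getURL() reflects the followed redirect is the behaviour of the standard JDK implementation, so treat it as an assumption on other runtimes.

```java
import com.sun.net.httpserver.HttpServer;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;

public class AutoFollowDemo {

    /** Returns {final status code, final path} after an automatically followed redirect. */
    public static String[] run() throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        // /old answers with a 302 pointing at /new.
        server.createContext("/old", exchange -> {
            exchange.getResponseHeaders().add("Location", "/new");
            exchange.sendResponseHeaders(302, -1);
            exchange.close();
        });
        // /new is the final, non-redirect response.
        server.createContext("/new", exchange -> {
            exchange.sendResponseHeaders(204, -1);
            exchange.close();
        });
        server.start();

        try {
            String base = "http://127.0.0.1:" + server.getAddress().getPort();
            HttpURLConnection con =
                    (HttpURLConnection) URI.create(base + "/old").toURL().openConnection();
            con.setInstanceFollowRedirects(true); // must be set before the first read

            int status = con.getResponseCode();         // status of the final response
            String finalUrl = con.getURL().toString();  // URL after following redirects
            con.disconnect();

            return new String[] {String.valueOf(status), finalUrl.substring(base.length())};
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws Exception {
        String[] result = run();
        System.out.println(result[0] + " " + result[1]);
    }
}
```

Keep in mind that this shortcut stops at the first redirect that changes protocol, because such redirects are not followed automatically.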

OTHER TIPS

I would be more careful when evaluating the response code. Not every 3xx code is a redirect, so checking getResponseCode()/100 == 3 is too coarse. For example, 304 just means "Not Modified."

Look at the original definitions here.
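
Putting the points above together (a new connection object per URL, redirect following switched off, and only genuine redirect codes treated as redirects), a manual resolver could look roughly like the sketch below. The class RedirectResolver, its method names and the limit of 10 hops are assumptions made for this example, not anything prescribed by the API. It also resolves a relative Location value against the URL that returned it, since servers sometimes send relative redirects.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

public class RedirectResolver {

    private static final int MAX_HOPS = 10; // arbitrary safety limit against redirect loops

    /** True only for status codes that redirect to the Location header (304 is excluded). */
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
            || status == 307 || status == 308;
    }

    /** Resolves a possibly relative Location value against the URL that returned it. */
    static String resolveLocation(String base, String location) {
        // URI.create throws IllegalArgumentException if the value contains illegal characters.
        return URI.create(base).resolve(location).toString();
    }

    /** Follows redirects one hop at a time and returns the first URL that does not redirect. */
    public static String finalUrl(String start) throws IOException {
        String current = start;
        for (int hop = 0; hop < MAX_HOPS; hop++) {
            // A connection object cannot be reused, so open a new one for every hop.
            HttpURLConnection con =
                    (HttpURLConnection) URI.create(current).toURL().openConnection();
            con.setInstanceFollowRedirects(false);
            try {
                int status = con.getResponseCode();
                String location = con.getHeaderField("Location");
                if (!isRedirect(status) || location == null) {
                    return current;
                }
                current = resolveLocation(current, location);
            } finally {
                con.disconnect();
            }
        }
        throw new IOException("More than " + MAX_HOPS + " redirects starting at " + start);
    }
}
```

A call such as RedirectResolver.finalUrl(ref) would then replace the original snippet. Unlike the built-in redirect following, this loop also follows redirects from http to https, because each hop is an ordinary new request.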

Licensed under: CC-BY-SA with attribution