Question

I have to log in to an HTTPS web page and download a file using Java. I know all the URLs beforehand:

baseURL = // a https URL;
urlMap = new HashMap<String, URL>();
urlMap.put("login", new URL(baseURL, "exec.asp?login=username&pass=XPTO"));
urlMap.put("logout", new URL(baseURL, "exec.asp?page=999"));
urlMap.put("file", new URL(baseURL, "exec.asp?file=111"));

If I open these links in a web browser such as Firefox, they all work.

Now when I do:

urlConnection = urlMap.get("login").openConnection();
urlConnection.connect();
BufferedReader in = new BufferedReader(
    new InputStreamReader(urlConnection.getInputStream()));
String inputLine;
while ((inputLine = in.readLine()) != null)
    System.out.println(inputLine);
in.close();

I just get the login page HTML back again, and I cannot proceed to the file download.

Thanks!

Solution

I agree with Alnitak that the problem is likely storing and returning cookies.

Another good option I have used is HttpClient from Jakarta Commons (its successor is Apache HttpComponents HttpClient), which keeps track of cookies for you.

It's worth noting, as an aside, that if this is a server you control, sending the username and password in the query string is not secure, even over HTTPS: the full URL can end up in server logs and browser history. HttpClient supports sending parameters using POST, which you should consider.
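For illustration, here is a minimal sketch of sending the credentials in a POST body instead of the query string. It uses the JDK's own HttpURLConnection rather than HttpClient, and it assumes the server accepts a form-encoded POST for the login page, which the question does not confirm. The class and method names (FormPost, formEncode, post) are made up for this example.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.util.Map;

// Hypothetical helper, not part of any library.
class FormPost {

    // Encodes parameters as application/x-www-form-urlencoded ("k1=v1&k2=v2").
    static String formEncode(Map<String, String> params)
            throws UnsupportedEncodingException {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                .append('=')
                .append(URLEncoder.encode(e.getValue(), "UTF-8"));
        }
        return body.toString();
    }

    // Sends the parameters in the request body and returns the HTTP status code.
    // Assumption: the server accepts POST for this URL.
    static int post(URL url, Map<String, String> params) throws IOException {
        byte[] body = formEncode(params).getBytes("UTF-8");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type",
                "application/x-www-form-urlencoded");
        conn.setFixedLengthStreamingMode(body.length);
        OutputStream out = conn.getOutputStream();
        try {
            out.write(body);
        } finally {
            out.close();
        }
        return conn.getResponseCode();
    }
}
```

With a map holding the login and pass values, something like FormPost.post(new URL(baseURL, "exec.asp"), params) would keep the credentials out of the URL, provided the server reads them from the request body.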

OTHER TIPS

As has been noted, you must maintain the session cookie between requests (see CookieHandler).

Here is a sample implementation:

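import java.io.IOException;
import java.net.CookieHandler;
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Install once before the first request:
//     CookieHandler.setDefault(new MyCookieHandler());
class MyCookieHandler extends CookieHandler {

    // Cookies received so far, keyed by host name.
    private final Map<String, List<String>> cookies = new HashMap<String, List<String>>();

    // Called before each request: returns the stored cookies for the target host.
    @Override
    public Map<String, List<String>> get(URI uri,
            Map<String, List<String>> requestHeaders) throws IOException {
        String host = uri.getHost();
        Map<String, List<String>> ret = new HashMap<String, List<String>>();
        synchronized (cookies) {
            List<String> store = cookies.get(host);
            if (store != null) {
                // Hand out a copy so later put() calls cannot change it.
                ret.put("Cookie",
                        Collections.unmodifiableList(new ArrayList<String>(store)));
            }
        }

        return Collections.unmodifiableMap(ret);
    }

    // Called after each response: remembers every cookie the server set.
    @Override
    public void put(URI uri, Map<String, List<String>> responseHeaders)
            throws IOException {
        List<String> newCookies = responseHeaders.get("Set-Cookie");
        if (newCookies != null) {
            String host = uri.getHost();
            synchronized (cookies) {
                List<String> store = cookies.get(host);
                if (store == null) {
                    store = new ArrayList<String>();
                    cookies.put(host, store);
                }
                for (String cookie : newCookies) {
                    // Keep only "name=value"; attributes such as Path or
                    // Expires must not be sent back in the Cookie header.
                    int semi = cookie.indexOf(';');
                    store.add(semi < 0 ? cookie : cookie.substring(0, semi));
                }
            }
        }
    }

}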

Whatever else may be preventing the login request from getting you logged in, it's unlikely that you'll be able to proceed to the download page unless you store and send back any cookies that the login page sets.

That's because HTTP itself is stateless, so in your current code there's no way for the remote server to tell that the second request (the download) comes from the same user that just logged in.
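As a rough sketch of what "store and return the cookies" can look like with nothing but the JDK: since Java 6 there is a ready-made java.net.CookieManager that can be installed as the default CookieHandler. The class and method names below (SessionDownload, loginAndDownload) are invented for the example, and it assumes the site tracks the login purely through cookies, which may or may not be the whole story for your server.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of any library.
class SessionDownload {

    // Requests the login URL, then downloads the file URL to 'target',
    // sending back whatever cookies the login response set.
    static void loginAndDownload(URL login, URL file, Path target)
            throws IOException {
        // JVM-wide cookie store; HTTP(S) URLConnections consult it on
        // every request and response.
        CookieHandler.setDefault(new CookieManager(null, CookiePolicy.ACCEPT_ALL));

        // Request 1: cookies from the response headers go into the store.
        URLConnection loginConn = login.openConnection();
        InputStream loginIn = loginConn.getInputStream();
        try {
            byte[] buf = new byte[8192];
            while (loginIn.read(buf) != -1) {
                // Drain the login page; its content is not needed here.
            }
        } finally {
            loginIn.close();
        }

        // Request 2: matching cookies are added to the request automatically.
        URLConnection fileConn = file.openConnection();
        InputStream fileIn = fileConn.getInputStream();
        try {
            Files.copy(fileIn, target, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            fileIn.close();
        }
    }
}
```

With the question's map this would be called roughly as SessionDownload.loginAndDownload(urlMap.get("login"), urlMap.get("file"), Paths.get("download.bin")), where the target file name is just a placeholder.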

I'd say have a look at Java CURL: http://sourceforge.net/projects/javacurl. I have used it before to log in to an HTTPS website and download files. It has features such as spoofing the browser ID, which might solve your issue of getting redirected back to the login page.

Although they provide an Eclipse plugin for it, I have used it without the plugin and it works fine.

Alternatively, you could use wget and call it from Java.
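If you go the wget route, a sketch along these lines might do, assuming wget is installed and on the PATH. It relies on wget's --save-cookies, --keep-session-cookies and --load-cookies options to carry the session from the login request to the download. The class and method names (WgetDownload, loginCommand, downloadCommand, run) and the file names are placeholders made up for this example.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of any library.
class WgetDownload {

    // First call: fetch the login URL and save all cookies, including
    // session cookies, to cookieFile.
    static List<String> loginCommand(String loginUrl, String cookieFile,
            String outputFile) {
        return Arrays.asList("wget",
                "--save-cookies", cookieFile,
                "--keep-session-cookies",
                "-O", outputFile,
                loginUrl);
    }

    // Second call: send the saved cookies along with the file request.
    static List<String> downloadCommand(String fileUrl, String cookieFile,
            String outputFile) {
        return Arrays.asList("wget",
                "--load-cookies", cookieFile,
                "-O", outputFile,
                fileUrl);
    }

    // Runs the command and returns the exit code (0 means wget succeeded).
    static int run(List<String> command)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }
}
```

You would run the login command first and the download command second, checking that each returns 0. Quote the URLs as plain strings; ProcessBuilder passes them as single arguments, so the ? and & characters need no shell escaping.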

Perhaps you want to try HttpUnit. Although it was written with testing of websites in mind, it may be usable for your problem.

From their website:

"... Written in Java, HttpUnit emulates the relevant portions of browser behavior, including form submission, JavaScript, basic http authentication, cookies and automatic page redirection, and allows Java test code to examine returned pages either as text, an XML DOM, or containers of forms, tables, and links."

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow