Question

I am writing a Java program that uses Apache HttpComponents to load a page and print its HTML to the console. However, the program prints only part of the HTML before throwing this error: Exception in thread "main" java.net.SocketException: socket closed. The portion of the HTML displayed before the exception is exactly the same every time I run the program, and the error occurs in this simplified example with Google, Yahoo and Craigslist:

String USERAGENT = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.22 (KHTML, like Gecko) Chrome/25.0.1364.172 Safari/537.22";
DefaultHttpClient client = new DefaultHttpClient();
HttpGet get = new HttpGet("http://www.craigslist.org");
get.setHeader(HTTP.USER_AGENT,USERAGENT);
HttpResponse page = client.execute(get);
get.releaseConnection();
InputStream stream = page.getEntity().getContent();
try{
    BufferedReader br = new BufferedReader(new InputStreamReader(stream));
    String line = "";
    while ((line = br.readLine()) != null){
        System.out.println(line);
    }
}
finally{
    EntityUtils.consume(page.getEntity());
}

Solution

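I've found that get.releaseConnection(); should not be called until after I've finished reading the HTML. In the code above it runs before the response body has been read, so the connection is given up while the stream returned by page.getEntity().getContent() still depends on it, and the later reads fail with the socket closed exception. Moving the call into the finally block, immediately after EntityUtils.consume(page.getEntity());, fixes the above code.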
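The ordering problem is not specific to HttpClient, and it can be reproduced with plain java.net sockets. The following is only a minimal sketch of the idea, not the library's internals: the class ReleaseOrderDemo and its methods are hypothetical names, and a loopback ServerSocket stands in for the web server.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketException;
import java.nio.charset.StandardCharsets;

class ReleaseOrderDemo {

    // Hypothetical helper: opens a loopback connection whose server side
    // has already written `payload` and closed its end.
    static Socket connectTo(String payload) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
            try (Socket peer = server.accept();
                 OutputStream out = peer.getOutputStream()) {
                out.write(payload.getBytes(StandardCharsets.UTF_8));
            }
            return client;
        }
    }

    // Reads the stream line by line, as the question's loop does.
    static String readAll(InputStream stream) throws IOException {
        BufferedReader br = new BufferedReader(
                new InputStreamReader(stream, StandardCharsets.UTF_8));
        StringBuilder sb = new StringBuilder();
        String line;
        while ((line = br.readLine()) != null) {
            sb.append(line).append('\n');
        }
        return sb.toString();
    }

    // Wrong order: the connection is released first, then the stream is read.
    // Returns true if the read failed with a SocketException.
    static boolean releaseThenRead(String payload) throws IOException {
        Socket socket = connectTo(payload);
        InputStream stream = socket.getInputStream();
        socket.close(); // plays the role of get.releaseConnection()
        try {
            readAll(stream);
            return false;
        } catch (SocketException e) {
            return true;
        }
    }

    // Right order: read everything, then release in a finally block.
    static String readThenRelease(String payload) throws IOException {
        Socket socket = connectTo(payload);
        try {
            return readAll(socket.getInputStream());
        } finally {
            socket.close();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("release-then-read failed: " + releaseThenRead("line 1\nline 2\n"));
        System.out.print(readThenRelease("line 1\nline 2\n"));
    }
}
```

In the HttpClient code from the question, the equivalent change is to keep the read loop where it is and call get.releaseConnection(); only in the finally block, after the entity has been consumed.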

Licensed under: CC-BY-SA with attribution