Question

Thanks for reading.

I've run into an annoying problem and would really appreciate some help. I am using HttpComponents (the new version of the former HttpClient) in Java to open some URLs and scrape their contents, and I use multiple threads to improve performance.

Here is the problem:

1. The threads share one HttpClient

1) Definition

private static final ThreadSafeClientConnManager cm = new ThreadSafeClientConnManager();
private static HttpHost proxy = new HttpHost("127.0.0.1",8086,"http");
private static DefaultHttpClient http = new DefaultHttpClient(cm);

2) In my initialization function

cm.setMaxTotal(100);
http.getParams().setParameter(ConnRoutePNames.DEFAULT_PROXY, proxy);

3) This is the function each thread calls

public static String getUrl(String url, String Chareset)
{
    HttpGet get = new HttpGet(url);//uri
    get.setHeader("Content-Type", "text/html");
    get.setHeader("User-Agent","Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.0; .NET CLR 1.1.4322; .NET CLR 2.0.50215;)");
    get.setHeader("Accept-Charset", Chareset+";q=0.7,*;q=0.7");//"utf-8;q=0.7,*;q=0.7");
    get.getParams().setParameter("http.socket.timeout",new Integer(CONNECTION_TIMEOUT));//20000

    String result = "";
    try {
        HttpResponse response = http.execute(get);
        if (response.getStatusLine().getStatusCode() != 200){//statusCode != HttpStatus.SC_OK) {
            System.err.println("HttpGet Method failed: "
                    + response.getStatusLine());//httpGet.getStatusLine()
        }
        HttpEntity entity = response.getEntity();
        if (entity != null) {
            result = EntityUtils.toString(entity);
            EntityUtils.consume(entity);
            entity = null;
        }
    } catch(java.net.SocketException ee) {
        ee.printStackTrace();
        Logger.getLogger(DBManager.class.getName()).log(Level.SEVERE, null, ee);
    } catch (IOException e) {
        //throw new Exception(e);
        Logger.getLogger(DBManager.class.getName()).log(Level.SEVERE, null, e);//TODO Debug
    } finally {
        get.abort();//releaseConnection();//TODO http.getConnectionManager().shutdown();?
        get = null;
    }
    return result;
}

4) I then create 10 threads that call getUrl(), but after about 1000 loops the requests start failing:

**HttpGet Method failed: HTTP/1.0 503 Service Unavailable**

But when I open the same URL in IE through the same proxy, it loads successfully, so I don't think anything is wrong with my proxy.

So what's wrong?

2. Then I moved the creation of the HttpClient into getUrl(), so that the threads no longer share an HttpClient, like this:

public static String getUrl(String url, String Chareset)
{
    HttpGet get = new HttpGet(url);//uri
    get.setHeader("Content-Type", "text/html");
    get.setHeader("User-Agent","Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.0; .NET CLR 1.1.4322; .NET CLR 2.0.50215;)");
    get.setHeader("Accept-Charset", Chareset+";q=0.7,*;q=0.7");//"utf-8;q=0.7,*;q=0.7");
    get.getParams().setParameter("http.socket.timeout",new Integer(CONNECTION_TIMEOUT));//20000

    DefaultHttpClient http = new DefaultHttpClient(cm);//threads don't share it
    http.getParams().setParameter(ConnRoutePNames.DEFAULT_PROXY, proxy);

    String result = "";
    try {
        HttpResponse response = http.execute(get);
        if (response.getStatusLine().getStatusCode() != 200){//statusCode != HttpStatus.SC_OK) {
            System.err.println("HttpGet Method failed: "
                    + response.getStatusLine());//httpGet.getStatusLine()
        }
        HttpEntity entity = response.getEntity();
        if (entity != null) {
            result = EntityUtils.toString(entity);
            EntityUtils.consume(entity);
            entity = null;
        }
    } catch(java.net.SocketException ee) {
        ee.printStackTrace();
        Logger.getLogger(DBManager.class.getName()).log(Level.SEVERE, null, ee);
    } catch (IOException e) {
        //throw new Exception(e);
        Logger.getLogger(DBManager.class.getName()).log(Level.SEVERE, null, e);//TODO Debug
    } finally {
        get.abort();//releaseConnection();//TODO http.getConnectionManager().shutdown();?
        get = null;
        http = null;//clean almost all the resources
    }
    return result;
}

Then, after about 600 loops of 10 threads, a different failure occurs:

**Exception in thread "Thread-11" java.lang.OutOfMemoryError: Java heap space**

The exception occurs at the line result = EntityUtils.toString(entity);

So I really need some help.

Thanks!


Solution

The answer given by Guillaume sounds perfectly reasonable to me. As far as your second problem is concerned, the reason for the OutOfMemoryError is quite simple: DefaultHttpClient objects are very expensive, and by creating a new instance for each and every request you deplete your system resources much faster. Besides, EntityUtils#toString should generally be avoided for anything other than simple tests. HTTP response messages should be consumed as a content stream, without buffering the entire response body in memory.
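To illustrate the streaming advice, here is a minimal sketch that uses only the JDK. The class StreamingBodyReader and the ChunkHandler callback are hypothetical names invented for this example, not part of HttpClient. In the question's code the InputStream would presumably come from entity.getContent() on the one shared client, and closing that stream should hand the connection back to the connection manager, so get.abort() would not be needed on the success path.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class StreamingBodyReader {

    /** Hypothetical callback that receives the body one chunk at a time. */
    interface ChunkHandler {
        void onChunk(char[] buffer, int length) throws IOException;
    }

    /**
     * Reads the stream in fixed-size chunks and passes each chunk to the handler,
     * so this method never holds more than one buffer of the body at a time.
     * The stream is closed when reading finishes or fails.
     * Returns the total number of characters read.
     */
    static long stream(InputStream in, Charset charset, ChunkHandler handler) throws IOException {
        long total = 0;
        char[] buffer = new char[8192];
        try (Reader reader = new InputStreamReader(in, charset)) {
            int n;
            while ((n = reader.read(buffer)) != -1) {
                handler.onChunk(buffer, n);
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a response body; a real caller would pass the entity's content stream.
        InputStream body = new ByteArrayInputStream(
                "<html>hello</html>".getBytes(StandardCharsets.UTF_8));
        final int[] openBrackets = {0};
        long chars = stream(body, StandardCharsets.UTF_8, (buf, len) -> {
            for (int i = 0; i < len; i++) {
                if (buf[i] == '<') {
                    openBrackets[0]++;
                }
            }
        });
        System.out.println(chars + " characters read, " + openBrackets[0] + " '<' seen");
    }
}
```

Whether this helps depends on what the scraper does with each page: if the handler just appends every chunk to a String, the memory use ends up the same as with EntityUtils#toString.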

OTHER TIPS

503 means Service Unavailable: the server is currently unable to handle the request. This could be because you are hitting the same service over and over, and it either ends up failing under the load or starts denying you service because of it.
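If the 503 really is caused by load, one common mitigation is to back off and retry instead of hammering the service. The sketch below is only an illustration of that idea: RetryOn503 and executeWithBackoff are hypothetical names, and the request is modelled as an IntSupplier that returns the HTTP status code, which in the question's code would wrap the http.execute(get) call.

```java
import java.util.function.IntSupplier;

final class RetryOn503 {

    /**
     * Runs the request up to maxAttempts times, sleeping between attempts
     * while the status is 503. The delay doubles after every failed attempt.
     * Returns the last status code seen.
     */
    static int executeWithBackoff(IntSupplier request, int maxAttempts, long initialDelayMillis)
            throws InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        int status = request.getAsInt();
        for (int attempt = 1; attempt < maxAttempts && status == 503; attempt++) {
            Thread.sleep(delay);   // give the service time to recover
            delay *= 2;            // exponential backoff
            status = request.getAsInt();
        }
        return status;
    }

    public static void main(String[] args) throws InterruptedException {
        // Fake request for demonstration: fails twice with 503, then succeeds.
        final int[] calls = {0};
        int status = executeWithBackoff(() -> ++calls[0] < 3 ? 503 : 200, 5, 10L);
        System.out.println("status " + status + " after " + calls[0] + " calls");
    }
}
```

Lowering the number of threads or adding a fixed pause between requests would be simpler alternatives; which one is appropriate depends on what the proxy and the target site tolerate.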

The second error is quite clear: the JVM has run out of heap space. Either your program is leaking memory, or you need to increase the heap size using -Xmx256m, -Xmx512m, -Xmx1G, etc. There are plenty of answers on SO for these issues.
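Before raising -Xmx, it can help to check how much heap the JVM actually has and how much is in use. This is a small sketch using the standard Runtime API; the class name HeapInfo and its methods are made up for this example. Logging these values every few hundred requests is one rough way to see whether usage keeps growing, which would point to a leak rather than a heap that is simply too small.

```java
final class HeapInfo {

    private static final long MIB = 1024L * 1024L;

    /** Maximum heap the JVM will try to use (what -Xmx controls), in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / MIB;
    }

    /** Heap currently occupied by objects, including garbage not yet collected, in MiB. */
    static long usedHeapMiB() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MIB;
    }

    public static void main(String[] args) {
        System.out.println("heap used: " + usedHeapMiB() + " MiB of max " + maxHeapMiB() + " MiB");
    }
}
```

Running it with, for example, java -Xmx512m HeapInfo should report a maximum close to 512 MiB, although the exact figure can differ slightly between JVMs.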

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow