Question

I'm currently working on improving the performance of some Java I/O code. I have some doubts about reading and writing streams over the network with java.io, as described below. Several competing explanations keep going through my mind, and I would like to settle all of them.

Code

URL url = new URL("http://example.com/connector/url2Service");  

URLConnection urlConnection = url.openConnection(); // Position 1

HttpURLConnection httpURLConnection = (HttpURLConnection)urlConnection;

String requestStr = buildRequestString();// Position 2

ByteArrayOutputStream rqByteArrayOutputStream = new ByteArrayOutputStream();
rqByteArrayOutputStream.write(((String)requestStr).getBytes()); // Position 3

httpURLConnection.setDoOutput(true);
httpURLConnection.setUseCaches(false);
httpURLConnection.setDoInput(true);
httpURLConnection.setRequestMethod("POST");

rqByteArrayOutputStream.writeTo(httpURLConnection.getOutputStream()); // Position 4

// Waiting for the response.

InputStream inputStream = httpURLConnection.getInputStream(); // Position 5

ByteArrayOutputStream rsByteArrayOutputStream = new ByteArrayOutputStream();
byte[] buffer = new byte[4096];
int length = 0;

while ((length = inputStream.read(buffer)) != -1) { // Position 6
    rsByteArrayOutputStream.write(buffer, 0, length);// Position 7
}

String response  = new String(rsByteArrayOutputStream.toByteArray());// Position 8

My understanding

  • Position 1 : This provides an object for communicating with the remote resource, but the connection has not been established yet.
  • Position 2 : Build and get the request.
  • Position 3 : Write the bytes to the ByteArrayOutputStream.
  • Position 4 : This is where communication with the server starts. We are writing bytes, so the server can START to read them. When execution leaves this line, we have finished sending the request object.
  • Position 5 : When we leave this line, the server has finished sending the response object, so we can start reading it.
  • Position 6 : Reading the response in 4096-byte chunks.
  • Position 7 : Writing the bytes just read into a ByteArrayOutputStream.
  • Position 8 : Reading of the response is complete and it has been converted to a String.

My Problems

  1. At what point can we say that the request upload is complete? (I believe it is complete when execution leaves Position 4.)
  2. At what point can we say that the response download is complete? (I am unsure between Positions 5 and 8.)
  3. When we leave Position 5, does that mean the response is fully downloaded, or has the download only just started?
  4. Up to which point does the network (bandwidth) affect performance? (Position 5, 6, 7...)
  5. I'm currently tuning the InputStream reading code. Do you have any suggestions?


Solution 2

  1. The request may be held in a system buffer while the connection is being established, so sending ("upload") of the request isn't guaranteed to be complete until you pass position 6 for the first time. At that point, you know the POST request has been received by the server, since it is now sending a response. Passing position 4 just means that your program has handed the request to buffers on your own machine.

  2. The response has been received - "download" completed - after the last time through position 6, when it returns -1 for end of stream.

  3. When you exit position 5, the response may or may not have been received. All you are doing is gaining explicit access to an input stream that already exists. It's possible the response has already been completely received and has been buffered in the system TCP buffer, and it's possible that none of the response has been received yet.

  4. You will have to have network resources open from position 4 through the last iteration of position 6. In your code, you continue to leave the connection open indefinitely. You could save network resources by closing the connection after you exit the position 6/7 while loop.

  5. (a) Don't bother with rqByteArrayOutputStream. Instead, wrap the output stream from the connection in an OutputStreamWriter and write the request string directly to that. This saves a few lines of code and removes one copy of the request data.

     (b) Call close() on the output stream and the input stream when you are done with them, to permit freeing of network resources.
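
Suggestions (a) and (b) might look roughly like the sketch below. The class name DirectPost, the helper writeRequest, the UTF-8 charset, and the 8192-byte buffer are assumptions made for illustration and are not part of the original code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; all names here are illustrative only.
final class DirectPost {

    // (a) Encode the request straight onto the target stream,
    // without collecting it in an intermediate ByteArrayOutputStream.
    static void writeRequest(OutputStream out, String request, Charset charset) throws IOException {
        Writer writer = new OutputStreamWriter(out, charset);
        writer.write(request);
        writer.flush(); // push the encoder's pending bytes into 'out'; the caller closes 'out'
    }

    // (b) try-with-resources closes both streams, even if an exception is thrown.
    static String post(URL url, String request) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setDoOutput(true);
        connection.setUseCaches(false);
        connection.setRequestMethod("POST");
        try {
            try (OutputStream out = connection.getOutputStream()) {
                writeRequest(out, request, StandardCharsets.UTF_8); // charset is an assumption
            }
            ByteArrayOutputStream response = new ByteArrayOutputStream();
            try (InputStream in = connection.getInputStream()) {
                byte[] buffer = new byte[8192];
                int length;
                while ((length = in.read(buffer)) != -1) { // -1 marks end of the response body
                    response.write(buffer, 0, length);
                }
            }
            return new String(response.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            // Releases the connection; this may prevent keep-alive reuse,
            // so drop this call if you issue many requests to the same host.
            connection.disconnect();
        }
    }
}
```

A call such as DirectPost.post(new URL("http://example.com/connector/url2Service"), buildRequestString()) would then replace positions 1 through 8, assuming the response is text in the same charset.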

OTHER TIPS

At what point can we say that the request upload is complete? (I believe it is complete when execution leaves Position 4.)

Yes, but you've already wasted some time with the ByteArrayOutputStream. Just write the request directly to the connection output stream. Save memory and latency.

At what point can we say that the response download is complete? (I am unsure between Positions 5 and 8.)

When you receive -1 from the read() method.

When we leave Position 5, does that mean the response is fully downloaded, or has the download only just started?

Neither. The request has been written and you haven't started to download anything, so nothing has been downloaded. Some of the response may have started arriving in the socket receive buffer, but you can't see that.

Up to which point does the network (bandwidth) affect performance? (Position 5, 6, 7...)

None of those. You are using network bandwidth until you get the -1.
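
If you want to check this on your own connection, one option is to wrap the response stream in a small counting filter and note when the -1 arrives. This is only a sketch: the class CountingInputStream and its accessor names are made up for illustration, and the timings it reports include whatever your own loop does between reads.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper: counts bytes and records when end of stream was first seen.
final class CountingInputStream extends FilterInputStream {
    private final long createdNanos = System.nanoTime();
    private long bytesRead;
    private long eofNanos = -1; // -1 until read() has returned -1

    CountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        record(b == -1 ? -1 : 1);
        return b;
    }

    // FilterInputStream.read(byte[]) delegates here, so both bulk variants are covered.
    @Override
    public int read(byte[] buffer, int offset, int length) throws IOException {
        int n = super.read(buffer, offset, length);
        record(n);
        return n;
    }

    private void record(int n) {
        if (n == -1) {
            if (eofNanos < 0) {
                eofNanos = System.nanoTime() - createdNanos;
            }
        } else {
            bytesRead += n;
        }
    }

    long bytesRead() { return bytesRead; }

    boolean sawEndOfStream() { return eofNanos >= 0; }

    // Nanoseconds from construction to the first -1, or -1 if not reached yet.
    long nanosToEndOfStream() { return eofNanos; }
}
```

Wrapping the stream obtained at position 5, for example new CountingInputStream(httpURLConnection.getInputStream()), and logging nanosToEndOfStream() after the loop gives a rough figure for how long the body took to arrive.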

I'm currently tuning the InputStream reading code. Do you have any suggestions?

Use a bigger buffer. There's not much else you can do.
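
As a rough sketch of what that could look like, the read loop can take the buffer size as a parameter so different sizes are easy to try. The class name StreamDrain, the method readAll, and the 64 KB figure are assumptions for illustration; whether a larger buffer actually helps depends on your network and payload, so measure it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper; names and sizes are illustrative only.
final class StreamDrain {

    // Reads until read() returns -1, using a caller-chosen buffer size.
    static byte[] readAll(InputStream in, int bufferSize) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream(bufferSize);
        byte[] buffer = new byte[bufferSize];
        int length;
        while ((length = in.read(buffer)) != -1) {
            collected.write(buffer, 0, length);
        }
        return collected.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the network stream so the sketch runs without a server.
        byte[] payload = new byte[100_000];
        byte[] copy = readAll(new ByteArrayInputStream(payload), 64 * 1024);
        System.out.println(copy.length); // prints 100000
    }
}
```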

The JDK's HTTP-related classes are based on the old java.io implementation; check the code of HttpURLConnection in OpenJDK. (Details on the OpenJDK roots, based on Oracle JDK 7.) Using the newer java.nio implementation, which is built more directly on operating system primitives, gives a whole new performance range to work with. NIO ("New I/O") supports non-blocking, event-driven I/O.

Netty is a high-performance networking library that builds on the java.nio primitives. I have been part of a project where Netty was used to achieve five-figure performance on Sun Blade hardware. Example HTTP code is available from the Netty project.

Another great article, by Martin Thompson from LMAX, covers Java serialization and deals with the performance of ByteArrayOutputStream. It can give a deeper understanding of Java performance and of the performance of built-in JDK classes.

Is this the information you were looking for, or do you have to stick strictly to java.io because of a legacy JDK code base?

Licensed under: CC-BY-SA with attribution