Question

I'm having trouble reading a "chunked" response when using a StreamReader to read the stream returned by GetResponseStream() of an HttpWebResponse:

// response is an HttpWebResponse
StreamReader reader = new StreamReader(response.GetResponseStream());
string output = reader.ReadToEnd(); // throws exception...

When reader.ReadToEnd() is called, I get the following System.IO.IOException: "Unable to read data from the transport connection: The connection was closed."

The above code works just fine when the server returns a "non-chunked" response.

The only way I've been able to get it to work is to use HTTP/1.0 for the initial request (instead of HTTP/1.1, the default), but this seems like a lame work-around.
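
For reference, forcing HTTP/1.0 looks something like this (a sketch; url stands in for the request URI):

HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
request.ProtocolVersion = HttpVersion.Version10; // servers shouldn't send chunked responses to an HTTP/1.0 client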

Any ideas?


@Chuck

Your solution works pretty well. It still throws the same IOException on the last Read(), but after inspecting the contents of the StringBuilder it looks like all the data has been received. So perhaps I just need to wrap the Read() in a try-catch and swallow the "error".


Solution

Haven't tried this with a "chunked" response, but would something like this work?

StringBuilder sb = new StringBuilder();
byte[] buf = new byte[8192];
Stream resStream = response.GetResponseStream();
int count;
do
{
    // Read() returns the number of bytes actually read; 0 means end of stream
    count = resStream.Read(buf, 0, buf.Length);
    if (count != 0)
    {
        sb.Append(Encoding.ASCII.GetString(buf, 0, count));
    }
} while (count > 0);
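
If the last Read() still throws, as noted in the update above, a hedged variant wraps the loop in a try-catch and treats the IOException as end-of-stream. This is only safe if you've confirmed the content is actually complete:

try
{
    do
    {
        count = resStream.Read(buf, 0, buf.Length);
        if (count != 0)
        {
            sb.Append(Encoding.ASCII.GetString(buf, 0, count));
        }
    } while (count > 0);
}
catch (IOException)
{
    // Some servers close the connection instead of sending the final
    // zero-length chunk; everything read so far is already in sb.
}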

OTHER TIPS

I am working on a similar problem. The .NET HttpWebRequest and HttpWebResponse handle cookies and redirects automatically, but they do not handle chunked content in the response body automatically.

This is perhaps because chunked content may contain more than simple data (e.g. chunk extensions and trailing headers).

Simply reading the stream and ignoring the EOF exception will not work, as the stream contains more than the desired content. The stream consists of chunks, and each chunk begins by declaring its size. If the stream is simply read from beginning to end, the final data will contain the chunk metadata (and in the case of gzipped content, it will fail the CRC check when decompressing).

To solve the problem, it is necessary to parse the stream manually: removing the chunk size from each chunk (as well as the CR LF delimiters), detecting the final chunk, and keeping only the chunk data. There is likely a library out there that does this; I have not found it yet.
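
A minimal sketch of that manual de-chunking, assuming you have the raw body stream positioned just after the response headers (DecodeChunked and ReadLine are illustrative helpers, not part of the framework):

using System;
using System.IO;
using System.Text;

// Sketch: decode a chunked body from a raw stream.
static byte[] DecodeChunked(Stream raw)
{
    MemoryStream body = new MemoryStream();
    while (true)
    {
        // Each chunk starts with its size in hex, optionally followed
        // by extensions after a ';', and terminated by CRLF.
        string sizeLine = ReadLine(raw);
        int semi = sizeLine.IndexOf(';');
        if (semi >= 0) sizeLine = sizeLine.Substring(0, semi);
        int size = Convert.ToInt32(sizeLine.Trim(), 16);
        if (size == 0) break; // final (zero-length) chunk

        byte[] chunk = new byte[size];
        int read = 0;
        while (read < size)
        {
            int n = raw.Read(chunk, read, size - read);
            if (n == 0) throw new IOException("Unexpected end of stream");
            read += n;
        }
        body.Write(chunk, 0, size);
        ReadLine(raw); // consume the CRLF trailing each chunk
    }
    // Any trailing headers follow the final chunk, ended by a blank line.
    while (ReadLine(raw).Length > 0) { }
    return body.ToArray();
}

// Reads one CRLF-terminated ASCII line from the stream, byte by byte.
static string ReadLine(Stream s)
{
    StringBuilder sb = new StringBuilder();
    int b;
    while ((b = s.ReadByte()) != -1 && b != '\n')
    {
        if (b != '\r') sb.Append((char)b);
    }
    return sb.ToString();
}

Reading the size line byte by byte is deliberate: a buffered reader could consume bytes belonging to the next chunk.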

Useful resources:

http://en.wikipedia.org/wiki/Chunked_transfer_encoding
http://tools.ietf.org/html/rfc2616#section-3.6.1

Craig, without seeing the stream you're reading it's a little hard to debug, but maybe you could change the setting of the count variable to this:

count = resStream.Read(buf, 0, buf.Length - 1);

It's a bit of a hack, but if the last read is killing you and it's not returning any data, then theoretically this will avoid the problem. I still wonder why the stream is doing that.

I've had the same problem (which is how I ended up here :-). I eventually tracked it down to the fact that the chunked stream wasn't valid: the final zero-length chunk was missing. I came up with the following code, which handles both valid and invalid chunked streams.

using (StreamReader sr = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
{
    StringBuilder sb = new StringBuilder();

    try
    {
        // Read one character at a time so everything received before
        // the connection drops is preserved.
        while (!sr.EndOfStream)
        {
            sb.Append((char)sr.Read());
        }
    }
    catch (System.IO.IOException)
    {
        // Thrown when the stream is truncated (missing final zero-length
        // chunk); all the data read so far is already in the builder.
    }

    string content = sb.ToString();
}