Uncompress a GZIPed HTTP response in Java

https://stackoverflow.com/questions/2474193

21-09-2019

Question

I'm trying to uncompress a GZIPed HTTP response using GZIPInputStream. However, I always get the same exception when I try to read the stream: java.util.zip.ZipException: invalid bit length repeat

My HTTP request headers:

GET www.myurl.com HTTP/1.0\r\n
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; fr; rv:1.9.2) Gecko/20100115 Firefox/3.6\r\n
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n
Accept-Language: fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3\r\n
Accept-Encoding: gzip,deflate\r\n
Accept-Charset: ISO-8859-1,UTF-8;q=0.7,*;q=0.7\r\n
Keep-Alive: 115\r\n
Connection: keep-alive\r\n
X-Requested-With: XMLHttpRequest\r\n
Cookie: Some Cookies\r\n\r\n

At the end of the HTTP response headers I get path=/Content-Encoding: gzip, followed by the gzipped response.

I tried two similar pieces of code to uncompress it:

UPDATE: in the code below, tBytes = (the string after 'path=/Content-Encoding: gzip').getBytes ();

GZIPInputStream  gzip = new GZIPInputStream (new ByteArrayInputStream (tBytes));

StringBuffer  szBuffer = new StringBuffer ();

byte  tByte [] = new byte [1024];

while (true)
{
    int  iLength = gzip.read (tByte, 0, 1024); // <-- Error comes here

    if (iLength < 0)
        break;

    szBuffer.append (new String (tByte, 0, iLength));
}

And this one, which I found on this forum:

InputStream     gzipStream = new GZIPInputStream   (new ByteArrayInputStream (tBytes));
Reader          decoder    = new InputStreamReader (gzipStream, "UTF-8");//<- I tried ISO-8859-1 and get the same exception
BufferedReader  buffered   = new BufferedReader    (decoder);

I guess this is an encoding error.

Best regards,

bill0ute

Solution

You don't show how you obtain the tBytes that you use to set up the gzip stream here:

GZIPInputStream  gzip = new GZIPInputStream (new ByteArrayInputStream (tBytes));

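One explanation is that you are including the entire HTTP response in tBytes. Instead, it should contain only the content that follows the HTTP headers.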

Another explanation is that the response is chunked.
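If the response does carry Transfer-Encoding: chunked, the chunk framing has to be removed before the bytes are handed to GZIPInputStream. Below is a minimal sketch of that step, assuming the whole chunked body is already in memory as a byte array. The class name ChunkedBodyDecoder and its methods are hypothetical, and trailer headers after the last chunk are simply ignored.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the original question.
final class ChunkedBodyDecoder {

    // Strips the chunk-size lines and CRLF separators, returning the reassembled payload.
    static byte[] decode(byte[] chunked) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int pos = 0;
        while (true) {
            int lineEnd = indexOfCrlf(chunked, pos);
            if (lineEnd < 0) throw new IllegalArgumentException("Missing chunk-size line");
            String sizeLine = new String(chunked, pos, lineEnd - pos, StandardCharsets.US_ASCII);
            // Drop optional chunk extensions such as "1a;name=value".
            int semicolon = sizeLine.indexOf(';');
            if (semicolon >= 0) sizeLine = sizeLine.substring(0, semicolon);
            int size = Integer.parseInt(sizeLine.trim(), 16); // chunk sizes are hexadecimal
            pos = lineEnd + 2;
            if (size == 0) break; // last chunk; any trailers are ignored in this sketch
            if (pos + size > chunked.length) throw new IllegalArgumentException("Truncated chunk");
            out.write(chunked, pos, size);
            pos += size + 2; // skip the chunk data and the CRLF that follows it
        }
        return out.toByteArray();
    }

    // Returns the index of the next CRLF at or after 'from', or -1 if there is none.
    static int indexOfCrlf(byte[] data, int from) {
        for (int i = from; i + 1 < data.length; i++) {
            if (data[i] == '\r' && data[i + 1] == '\n') return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] raw = "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(new String(decode(raw), StandardCharsets.US_ASCII));
    }
}
```

The decoded bytes, not the raw chunked stream, are what should be wrapped in a ByteArrayInputStream and passed to GZIPInputStream.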

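EDIT: You are taking the data after the Content-Encoding line as the message body. However, according to the HTTP 1.1 specification, header fields do not come in any particular order, so this is very dangerous.

As explained in this part of the HTTP specification, the message body of a request or response does not come after a particular header field but after the first empty line:

Request (section 5) and Response (section 6) messages use the generic message format of RFC 822 [9] for transferring entities (the payload of the message). Both types of message consist of a start-line, zero or more header fields (also known as "headers"), an empty line (i.e., a line with nothing preceding the CRLF) indicating the end of the header fields, and possibly a message-body.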

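You still haven't shown exactly how you compose tBytes, but at this point I think you are mistakenly including the empty line in the data you try to uncompress. The message body starts after the CRLF characters of the empty line.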
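To make the "body starts after the first empty line" rule concrete, here is a rough sketch that locates the first CRLF CRLF in the raw response and gunzips only what follows it. It assumes the whole response is held as a byte array and that the body is gzip-encoded and not chunked. It works on raw bytes on purpose, since converting compressed data to a String and calling getBytes() can alter the bytes. RawResponseBody, bodyOffset, gunzipBody and fakeResponse are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration; not the asker's actual code.
final class RawResponseBody {

    // Returns the index of the first body byte (just past the first CRLF CRLF), or -1 if absent.
    static int bodyOffset(byte[] response) {
        for (int i = 0; i + 3 < response.length; i++) {
            if (response[i] == '\r' && response[i + 1] == '\n'
                    && response[i + 2] == '\r' && response[i + 3] == '\n') {
                return i + 4;
            }
        }
        return -1;
    }

    // Gunzips everything after the header block; assumes a gzip body that is not chunked.
    static byte[] gunzipBody(byte[] response) {
        int offset = bodyOffset(response);
        if (offset < 0) throw new IllegalArgumentException("No blank line after headers");
        try (GZIPInputStream gzip = new GZIPInputStream(
                new ByteArrayInputStream(response, offset, response.length - offset))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            int n;
            while ((n = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Demo only: builds a made-up raw response whose body is the gzip-compressed text.
    static byte[] fakeResponse(String text) {
        try {
            ByteArrayOutputStream body = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(body)) {
                gz.write(text.getBytes(StandardCharsets.UTF_8));
            }
            ByteArrayOutputStream response = new ByteArrayOutputStream();
            response.write(("HTTP/1.0 200 OK\r\nContent-Encoding: gzip\r\n"
                    + "Set-Cookie: a=b; path=/\r\n\r\n").getBytes(StandardCharsets.US_ASCII));
            response.write(body.toByteArray());
            return response.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] response = fakeResponse("hello gzip");
        System.out.println(new String(gunzipBody(response), StandardCharsets.UTF_8));
    }
}
```

Note that the fake response puts Content-Encoding before another header, so splitting on the Content-Encoding line would hand header text to the decompressor, while splitting on the empty line does not.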

May I suggest that you use the HttpClient library to extract the message body instead?
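As an illustration of letting a library do the header parsing, here is a sketch that uses the JDK's built-in java.net.http.HttpClient (Java 11 and later), which is a substitute for, not the same thing as, the HttpClient library suggested above. The JDK client does not decompress gzip on its own, so the sketch checks the Content-Encoding header itself. GzipAwareFetch, isGzip and fetch are hypothetical names, and the UTF-8 charset is an assumption.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.zip.GZIPInputStream;

// Hypothetical helper for illustration.
final class GzipAwareFetch {

    // True when the Content-Encoding header value says the body is gzip-compressed.
    static boolean isGzip(Optional<String> contentEncoding) {
        return contentEncoding.map(v -> v.trim().equalsIgnoreCase("gzip")).orElse(false);
    }

    // Fetches the URL and returns the (decompressed, if needed) body as text.
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Accept-Encoding", "gzip")
                .GET()
                .build();
        // The client separates headers from the body, so no manual splitting is needed.
        HttpResponse<InputStream> response =
                client.send(request, HttpResponse.BodyHandlers.ofInputStream());
        InputStream raw = response.body();
        try (InputStream in = isGzip(response.headers().firstValue("Content-Encoding"))
                ? new GZIPInputStream(raw) : raw) {
            // Assumes UTF-8; a real client would read the charset from Content-Type.
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```

Calling GzipAwareFetch.fetch with a full URL (scheme included) returns the page text whether or not the server chose to compress it.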

Other tips

Well, here is the problem that I can see:

int iLength = gzip.read (tByte, 0, 1024);

Use the following to fix it:

byte[] buff = new byte[1024];
byte[] emptyBuff = new byte[1024];
StringBuffer unGzipRes = new StringBuffer();

int byteCount = 0;
while ((byteCount = gzip.read(buff, 0, 1024)) > 0) {
    // only append the buff elements that contain data
    unGzipRes.append(new String(Arrays.copyOf(buff, byteCount), "utf-8"));

    // empty the buff for re-usability and prevent dirty data
    // attached at the end of the buff
    System.arraycopy(emptyBuff, 0, buff, 0, 1024);
}
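A variation on the same read loop, offered as a sketch and not as a correction of the code above: collect the decompressed bytes first and decode them to a String once at the end. Decoding each 1024-byte chunk separately can split a multi-byte UTF-8 character across two reads. GunzipToString and its methods are hypothetical names, and UTF-8 is an assumption about the page's charset.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration.
final class GunzipToString {

    // Decompresses gzip data fully, then decodes the bytes as UTF-8 in one step.
    static String gunzip(byte[] compressed) {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            int n;
            while ((n = gzip.read(buffer, 0, buffer.length)) != -1) {
                out.write(buffer, 0, n); // only the n bytes actually read are kept
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Demo only: compresses text so the example has something to decompress.
    static byte[] gzip(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(gunzip(gzip("round trip")));
    }
}
```

Because only the first n bytes of the buffer are written on each pass, there is no need to clear the buffer between reads.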

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow