Why does one method of string compression return an empty string but another one doesn't?

StackOverflow https://stackoverflow.com/questions/11192407

  •  16-06-2021

Question

I'm using GZipStream to compress a string, and I've modified two different examples to see what works. The first code snippet, which is a heavily modified version of the example in the documentation, simply returns an empty string.

public static String CompressStringGzip(String uncompressed)
{
    String compressedString;
    // Convert the uncompressed source string to a stream stored in memory
    // and create the MemoryStream that will hold the compressed string
    using (MemoryStream inStream = new MemoryStream(Encoding.Unicode.GetBytes(uncompressed)),
                        outStream = new MemoryStream())
    {
        using (GZipStream compress = new GZipStream(outStream, CompressionMode.Compress))
        {
            inStream.CopyTo(compress);
            StreamReader reader = new StreamReader(outStream);
            compressedString = reader.ReadToEnd();
        }
    }
    return compressedString;
}

When I debug it, all I can tell is that nothing is read from reader, which is why compressedString is empty. However, the second method I wrote, modified from a CodeProject snippet, is successful.

public static String CompressStringGzip3(String uncompressed)
{
    //Transform string to byte array
    String compressedString;
    byte[] uncompressedByteArray = Encoding.Unicode.GetBytes(uncompressed);

    using (MemoryStream outStream = new MemoryStream())
    {
        using (GZipStream compress = new GZipStream(outStream, CompressionMode.Compress))
        {
            compress.Write(uncompressedByteArray, 0, uncompressedByteArray.Length);
            compress.Close();
        }
        byte[] compressedByteArray = outStream.ToArray();
        StringBuilder compressedStringBuilder = new StringBuilder(compressedByteArray.Length);
        foreach (byte b in compressedByteArray)
            compressedStringBuilder.Append((char)b);
        compressedString = compressedStringBuilder.ToString();
    }
    return compressedString;
}

Why is the first code snippet not successful while the other one is? Even though they're slightly different, I don't know why the minor changes in the second snippet allow it to work. The sample string I'm using is SELECT * FROM foods f WHERE f.name = 'chicken';


Solution

I ended up using the following code for compression and decompression:

public static String Compress(String decompressed)
{
    byte[] data = Encoding.UTF8.GetBytes(decompressed);
    using (var input = new MemoryStream(data))
    using (var output = new MemoryStream())
    {
        using (var gzip = new GZipStream(output, CompressionMode.Compress, true))
        {
            input.CopyTo(gzip);
        }
        return Convert.ToBase64String(output.ToArray());
    }
}

public static String Decompress(String compressed)
{
    byte[] data = Convert.FromBase64String(compressed);
    using (MemoryStream input = new MemoryStream(data))
    using (GZipStream gzip = new GZipStream(input, CompressionMode.Decompress))
    using (MemoryStream output = new MemoryStream())
    {
        gzip.CopyTo(output);
        return Encoding.UTF8.GetString(output.ToArray());
    }
}

The explanation for a part of the problem comes from this question. Although I fixed the problem by changing the code to what I included in this answer, these lines (in my original code):

foreach (byte b in compressedByteArray)
            compressedStringBuilder.Append((char)b);

are problematic, because as dlev aptly phrased it:

You are interpreting each byte as its own character, when in fact that is not the case. Instead, you need the line:

string decoded = Encoding.Unicode.GetString(compressedByteArray);

The basic problem is that you are converting to a byte array based on an encoding, but then ignoring that encoding when you retrieve the bytes.

Therefore, the problem is solved, and the new code I'm using is much more succinct than my original code.
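
As a rough illustration of why the code in this answer goes through Base64 instead of a text encoding, the sketch below checks whether a few raw bytes survive a round trip. The class and method names (BinaryToTextDemo, SurvivesUtf8, SurvivesBase64) are hypothetical, and UTF-8 is used here only as an example of a text encoding; it is an assumption for illustration, not part of the original code.

```csharp
using System;
using System.Linq;
using System.Text;

public static class BinaryToTextDemo
{
    // True when decoding the bytes as UTF-8 text and re-encoding returns the same bytes.
    public static bool SurvivesUtf8(byte[] data) =>
        Encoding.UTF8.GetBytes(Encoding.UTF8.GetString(data)).SequenceEqual(data);

    // True when a Base64 round trip returns the same bytes.
    public static bool SurvivesBase64(byte[] data) =>
        Convert.FromBase64String(Convert.ToBase64String(data)).SequenceEqual(data);

    public static void Main()
    {
        // Start of a gzip stream: magic number 0x1F 0x8B, then 0x08 (deflate).
        byte[] gzipHeader = { 0x1F, 0x8B, 0x08 };

        // 0x8B is not valid UTF-8 on its own, so the decoder substitutes U+FFFD.
        Console.WriteLine(SurvivesUtf8(gzipHeader));   // False
        Console.WriteLine(SurvivesBase64(gzipHeader)); // True
    }
}
```

Compressed output is arbitrary binary data, so any conversion to a string has to be one that is defined for every byte value, which is what Base64 provides.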

Other tips

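My first suggestion was to move the read outside the second using statement. As the update below explains, the sample now keeps the read inside the using block and resets outStream.Position before reading: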

using (GZipStream compress = new GZipStream(outStream, CompressionMode.Compress)) 
{ 
    inStream.CopyTo(compress); 
    outStream.Position = 0;
    StreamReader reader = new StreamReader(outStream); 
    compressedString = reader.ReadToEnd(); 
}

CopyTo() is not flushing the results to the underlying MemoryStream.

Update

It seems that GZipStream closes and disposes its underlying stream when it is disposed (not the way I would have designed the class). I've updated the sample above and tested it.
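
A minimal sketch of another way to handle the disposal behaviour described above, assuming the three-argument GZipStream constructor with leaveOpen set to true (the same overload the accepted answer uses). The class and method names (GzipFlushDemo, CompressToBytes, DecompressToString) are hypothetical. The compressed bytes are collected only after the GZipStream has been disposed, so whatever it was still buffering has been written by then.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GzipFlushDemo
{
    public static byte[] CompressToBytes(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        using (var outStream = new MemoryStream())
        {
            // leaveOpen: true keeps outStream usable after gzip is disposed.
            using (var gzip = new GZipStream(outStream, CompressionMode.Compress, true))
            {
                gzip.Write(raw, 0, raw.Length);
            } // Disposing gzip writes any remaining buffered data to outStream.

            // ToArray returns the whole buffer regardless of outStream.Position.
            return outStream.ToArray();
        }
    }

    public static string DecompressToString(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return Encoding.UTF8.GetString(output.ToArray());
        }
    }

    public static void Main()
    {
        string sql = "SELECT * FROM foods f WHERE f.name = 'chicken';";
        byte[] compressed = CompressToBytes(sql);
        Console.WriteLine(DecompressToString(compressed) == sql);
    }
}
```

MemoryStream.ToArray also works on a closed stream and ignores the current position, which is consistent with the second snippet in the question succeeding even though it never rewinds outStream.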

License: CC-BY-SA with attribution
Not affiliated with StackOverflow