質問

I'm currently building an analysis application that handles large amounts of data. A typical case would looks like this: the user selects a folder with about 600 measurement files that each contain about 40.000 to 100.000 values. The application reads these values into an object that internally works as a data cache, so that the files must not be read on every access.

This works very well, but I noticed that the memory consumption is very high and it may eventually get too big. During my tests the application crashed when its memory consumption exceeded 2GB of RAM.

The data structure that holds the data is as simple as possible, it basically only consists of some dictionaries that contain the data in a 2-level nested way, nothing complex. I was wondering if there is a convenient way of storing this object in a compressed form in RAM. I know that this would bring down performance, but that is totally acceptable in my case.

Is there a way to do something like that allows me to use my objects as usual? Or do I have to implement compression on my own within my object?

Thanks for your thoughts and recommendations!

役に立ちましたか?

解決

It really depends on the type of that you're working with. One possibility is to compress your objects, keeping them as a compressed byte[] instead of raw object format using an Extension Method.

You could combine that along with making your process work x64 bit:

public static byte[] SerializeAndCompress(this object obj) 
{
    using (MemoryStream ms = new MemoryStream()) 
    using (GZipStream zs = new GZipStream(ms, CompressionMode.Compress, true))
    {
        BinaryFormatter bf = new BinaryFormatter();
        bf.Serialize(zs, obj);
        return ms.ToArray();
    }
}

public static T DecompressAndDeserialize<T>(this byte[] data)
{
    using (MemoryStream ms = new MemoryStream(data)) 
    using (GZipStream zs = new GZipStream(ms, CompressionMode.Decompress, true))
    {
        BinaryFormatter bf = new BinaryFormatter();
        return (T)bf.Deserialize(zs);
    }
}

他のヒント

Code of "Yuval Itzchakov" has an error!; he execute ms.ToArray(); before close the compressor. This will cause error in DecompressAndDeserialize method.

this code will work:

public static byte[] SerializeAndCompress(this object obj)
        {
            using (MemoryStream ms = new MemoryStream())
            {
                using (GZipStream zs = new GZipStream(ms, CompressionMode.Compress, true))
                {
                    BinaryFormatter bf = new BinaryFormatter();
                    bf.Serialize(zs, obj);
                }
                return ms.ToArray();
            }

        }:

For me it only worked when I changed it that way. With the above example I go an "serializationexception is not marked as serializable". Don't forget to add "[Serializable()]" to your class.

public static byte[] SerializeAndCompress(this object obj) 
{
    var ms = new MemoryStream();
    using (GZipStream zs = new GZipStream(ms, CompressionMode.Compress, true))
    {
        BinaryFormatter bf = new BinaryFormatter();
        bf.Serialize(zs, obj);
    }
    return ms.ToArray();
}

public static T DecompressAndDeserialize<T>(this byte[] data)
{
    using (MemoryStream ms = new MemoryStream(data)) 
    using (GZipStream zs = new GZipStream(ms, CompressionMode.Decompress, true))
    {
        BinaryFormatter bf = new BinaryFormatter();
        return (T)bf.Deserialize(zs);
    }
}
ライセンス: CC-BY-SA帰属
所属していません StackOverflow
scroll top