Question

Should I prefer binary serialization over ASCII/text serialization if performance is an issue?

Has anybody tested it on a large amount of data?

Solution

I used Boost.Serialization to store matrices and vectors representing lookup tables, plus some metadata (strings), with an in-memory size of about 200 MB. IIRC, loading from disk into memory took 3 minutes with the text archive vs. 4 seconds with the binary archive on WinXP.
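To illustrate where the difference comes from, here is a minimal standard-library sketch. It is not Boost's actual archive format. The function names (save_text, load_text, save_binary, load_binary) are made up for this example. The text path has to format and parse every number, while the binary path copies the raw bytes of the array.

```cpp
#include <cstddef>
#include <cstdint>
#include <iomanip>
#include <istream>
#include <limits>
#include <ostream>
#include <sstream>
#include <vector>

// Text format: element count, then each value as decimal text.
// Every value is formatted on save and parsed again on load.
void save_text(std::ostream& os, const std::vector<double>& v) {
    // max_digits10 digits are enough to round-trip a double exactly.
    os << std::setprecision(std::numeric_limits<double>::max_digits10);
    os << v.size() << '\n';
    for (double x : v) os << x << '\n';
}

std::vector<double> load_text(std::istream& is) {
    std::size_t n = 0;
    is >> n;
    std::vector<double> v(n);
    for (double& x : v) is >> x;
    return v;
}

// Binary format: element count, then the raw bytes of the array.
// Not portable across machines with different endianness or type sizes.
void save_binary(std::ostream& os, const std::vector<double>& v) {
    const std::uint64_t n = v.size();
    os.write(reinterpret_cast<const char*>(&n), sizeof n);
    os.write(reinterpret_cast<const char*>(v.data()),
             static_cast<std::streamsize>(n * sizeof(double)));
}

std::vector<double> load_binary(std::istream& is) {
    std::uint64_t n = 0;
    is.read(reinterpret_cast<char*>(&n), sizeof n);
    std::vector<double> v(static_cast<std::size_t>(n));
    is.read(reinterpret_cast<char*>(v.data()),
            static_cast<std::streamsize>(n * sizeof(double)));
    return v;
}
```

Both pairs round-trip the same data. The binary output is always 8 bytes per double plus the count, whereas the text size depends on how many digits each value needs.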

OTHER TIPS

I benchmarked this for a problem that involved loading a large class containing thousands of nested archived classes.

To change the format, use the archive classes

boost::archive::binary_oarchive
boost::archive::binary_iarchive

instead of

boost::archive::text_oarchive
boost::archive::text_iarchive

The code for loading the (binary) archive looks like:

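#include <fstream>
#include <boost/archive/binary_iarchive.hpp>

std::ifstream ifs("filename", std::ios::binary);  // open the file in binary mode
boost::archive::binary_iarchive input_archive(ifs);
Class* p_object = nullptr;  // the archive allocates the object when loading through a pointer
input_archive >> p_object;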

The file sizes and wall-clock times for an optimised gcc build of the above code snippet are:

  • ascii: 820MB (100%), 32.2 seconds (100%).
  • binary: 620MB (76%), 14.7 seconds (46%).

This is from a solid state drive, without any stream compression.

So the gain in speed is larger than the reduction in file size alone would suggest: the binary file is 24% smaller, but it loads in less than half the time.
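If you want to take this kind of measurement on your own data, a small wall-clock timer is enough. This is only a sketch. wall_seconds is a made-up helper name, not part of Boost or of the benchmark above.

```cpp
#include <chrono>

// Runs `work` once and returns the elapsed wall-clock time in seconds.
template <class F>
double wall_seconds(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Wrap the text load and the binary load in separate lambdas and compare the two results. Run each a few times, since the operating system's file cache can make a second read of the same file look much faster than the first.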

I suggest you look into protobuf (Protocol Buffers) if performance is an issue.

"Protocol Buffers" from .Net

Licensed under: CC-BY-SA with attribution