What's the best compression algorithm for data dumps?
06-09-2020
Question
I'm creating data dumps from my site for others to download and analyze. Each dump will be a giant XML file.
I'm trying to figure out the best compression algorithm that:
- Compresses efficiently (CPU-wise)
- Makes the smallest possible file
- Is fairly common
I know the basics of compression, but I have no idea which algorithm fits the bill. I'll be using MySQL and Python to generate the dump, so I'll need something with a good Python library.
Solution
gzip at its default compression level should be fine for most cases; higher levels produce smaller files but cost more CPU time. bzip2 compresses better than gzip but is also slower. There is always a trade-off between CPU time and compression ratio, and the default levels of either algorithm are a reasonable middle ground. Both formats are widely supported and have standard-library support in Python.
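As a rough sketch of the trade-off, Python's standard-library `gzip` and `bz2` modules can compress the same data at chosen levels (the XML payload below is a made-up stand-in for a real dump):

```python
import bz2
import gzip

# Toy stand-in for a real dump; repetitive XML compresses very well.
xml_data = b"<dump>" + b"<row id='1'>example</row>" * 10_000 + b"</dump>"

# gzip: fast and universally supported; 6 is the default level.
gz = gzip.compress(xml_data, compresslevel=6)

# bz2: typically smaller output than gzip, but slower to compress.
bz = bz2.compress(xml_data, compresslevel=9)

print(f"original: {len(xml_data)}  gzip: {len(gz)}  bz2: {len(bz)}")

# Round-trip check: decompression recovers the original bytes.
assert gzip.decompress(gz) == xml_data
assert bz2.decompress(bz) == xml_data
```

For files too large to hold in memory, `gzip.open(...)` and `bz2.open(...)` provide the same file-object interface, so the dump can be streamed out row by row as it is generated.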
Licensed under: CC-BY-SA with attribution