Question

Suppose we have file A that has been compressed by the the method B and the output-file is C, now if I am not wrong We can not compress C more by method B, but there might another method=algorithm D that might compress C more and produce compressed file E.

Is this kind method of using more than one compression-method in each iteration of compression standard? if there is a paper/survey paper on performance of such kind of compression, please, comment or post an answer.

I mean, is there any study where 2 or 3 methods were apply to same file, if there is a paper/survey paper on performance of such kind of compression, please, comment or post an answer.

Was it helpful?

Solution

Here's a partial "yes" answer to your question.

Consider a file containing:

0000000000
0000000001
0000000010
0000000011
0000000100
...
1111111111

That is, it is all binary numbers that are $b$ bits long (in this case $b=10$, but in the exercise that follows, try something smaller), in order, with each number separated by (say) a newline character.

The standard BWT compression algorithm (e.g. the one that BZip is based on) performs a Burrows-Wheeler transform, followed by move-to-front coding, then followed by some kind of entropy coding (possibly using run-length encoding).

This file compresses "better" if you apply the BWT twice, applying the following stages to the BWT of the BWT instead of just the BWT. I encourage you to compute both BWTs on a small example to see why.

Now this is only one stage of the full compression algorithm, and so we haven't found a file that compresses better if you compress it twice. But this BWT "adversary" is certainly suggestive.

I don't know if there are any papers on this. I discovered this one a few years ago (but I'm sure I'm not the only person who has noticed it).

OTHER TIPS

The essential problem is that most files are NOT compressible (see the counting argument). And an already compressed file is much less likely to be compressible.

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange
scroll top