I know most compression methods rely on the data repeating in order to be effective. For example, the string "AAAAAaaaQWERTY" could be represented as "5A3aQWERTY" losslessly, or as something like "8aqwerty" lossily (these are just illustrations, not actual working methods). As far as I know, all compression algorithms count on the repetition of *constant* strings of characters.

Here comes the problem with the string "abcdefghijklmnopqrstuvwxyz". Nothing repeats here, but as you can probably see, the information in the string can be represented far more compactly. In regex-like notation it would be "[a-z]", or maybe "for(x=0;x<26;++x){ascii(97+x)}".

Consider also the string "0149162536496481100121" - it can be represented by "for(x=0;x<12;++x){x*x}".

The string "ABEJQZer" can be represented by "for(x=0;x<8;++x){ascii(65+x*x)}".
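To make these concrete, here is a minimal runnable sketch (Python, chosen just for illustration) of the three generator programs; the assertions check that each short program reproduces its string exactly.

```python
def alphabet():
    # "abcdefghijklmnopqrstuvwxyz" <- for(x=0;x<26;++x){ascii(97+x)}
    return "".join(chr(97 + x) for x in range(26))

def squares():
    # "0149162536496481100121" <- for(x=0;x<12;++x){x*x}
    return "".join(str(x * x) for x in range(12))

def letter_squares():
    # "ABEJQZer" <- for(x=0;x<8;++x){ascii(65+x*x)}
    return "".join(chr(65 + x * x) for x in range(8))

assert alphabet() == "abcdefghijklmnopqrstuvwxyz"
assert squares() == "0149162536496481100121"
assert letter_squares() == "ABEJQZer"
```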

The last two were examples of knowing an algorithm which can reproduce the original string. I know that, in general, efficient algorithms take far less space than the data they can produce.

Like in SVG images (which contain only drawing algorithms in the file), where the size can be smaller than the equivalent JPEG.

My question is: is there a form of compression which takes the data and tries to find efficient algorithms that can represent it? Like vectorizing a raster image (as http://vectormagic.com/ does), but working with other kinds of data too.

Consider audio data (it can be compressed lossily) - some audio editors (Audacity, for example) have project files containing information like "generate a 120 Hz constant frequency with 0.8 amplitude from time 0 to time 2 minutes 45.6 seconds" (Audacity stores this info in XML format). This metadata takes very little memory, and when the project is exported to WAV or MP3, the program "renders" the information into actual samples in the exported format.
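As a rough sketch of the rendering direction (Python with NumPy; the function and parameter names here are made up for illustration, not Audacity's actual schema), turning such a description into samples is just evaluating a formula:

```python
import numpy as np

SAMPLE_RATE = 44100  # assumed sample rate for this sketch

def render_tone(freq_hz, amplitude, duration_s):
    # Expand a tiny parametric description into raw audio samples.
    t = np.arange(int(duration_s * SAMPLE_RATE)) / SAMPLE_RATE
    return amplitude * np.sin(2 * np.pi * freq_hz * t)

# ~40 bytes of description expand into 88200 samples:
samples = render_tone(freq_hz=120.0, amplitude=0.8, duration_s=2.0)
print(len(samples))  # 88200
```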

In that case the compressor should reverse the rendering process. It should take a WAV or MP3 file, figure out which algorithms can represent the samples (if it's lossy, the algorithms only need to produce some approximation of the samples - like vectormagic.com approximates the image) and produce the compressed file.
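Here is a minimal sketch of that reverse direction, under the big simplifying assumption that the input is a single steady tone: an FFT peak recovers the frequency and amplitude parameters from the samples.

```python
import numpy as np

SAMPLE_RATE = 44100

def analyze_tone(samples):
    # Find the dominant frequency bin and convert its magnitude
    # back to a sine amplitude (|X[k]| ~= amplitude * N / 2).
    spectrum = np.fft.rfft(samples)
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / SAMPLE_RATE)
    peak = int(np.argmax(np.abs(spectrum)))
    return freqs[peak], 2.0 * np.abs(spectrum[peak]) / len(samples)

# Round trip: 88200 samples collapse back into two parameters.
t = np.arange(2 * SAMPLE_RATE) / SAMPLE_RATE
freq, amp = analyze_tone(0.8 * np.sin(2 * np.pi * 120.0 * t))
print(freq, amp)  # ~120.0, ~0.8
```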

I understand that compression time would be unbelievably long, but do such (or similar) compression algorithms exist?


Solution

All compression methods are like that: the output is a set of parameters for a fixed algorithm that renders the input, or something similar to the input.

For example, the MP3 audio codec breaks the input into blocks of 576 samples, converts each block into frequency-amplitude space, and prunes what cannot be heard by a human being. The output is equivalent to "during the next 13 milliseconds play frequencies x, y, z with amplitudes a, b, c". This works well for audio data, and the similar approach used in JPEG works well for photographic images.
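A toy illustration of that idea (Python with NumPy; real codecs add psychoacoustic models, quantization and entropy coding on top): transform a block into frequency space, keep only the strongest components as the "parameters", and resynthesize an approximation from them.

```python
import numpy as np

def compress_block(block, keep=8):
    # The "compressed" output: the K strongest (bin, amplitude) pairs.
    spectrum = np.fft.rfft(block)
    strongest = np.argsort(np.abs(spectrum))[-keep:]
    return [(int(k), spectrum[k]) for k in strongest], len(block)

def decompress_block(params, n):
    # Resynthesize an approximation from the kept parameters only.
    spectrum = np.zeros(n // 2 + 1, dtype=complex)
    for k, v in params:
        spectrum[k] = v
    return np.fft.irfft(spectrum, n)

n = 576                                   # one MP3-sized block
t = np.arange(n)
block = 0.8 * np.sin(2 * np.pi * 5 * t / n) + 0.3 * np.sin(2 * np.pi * 20 * t / n)
params, n = compress_block(block)
print(np.max(np.abs(block - decompress_block(params, n))))  # ~0 here
```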

Similar methods can be applied to the cases you mention. Sequences such as 987654 or 010409162536, for example, are generated by successive values of polynomials, and can be represented as the coefficients of those polynomials: the first one as (9, -1) for 9-x, and the second one as (1, 2, 1) for 1+2x+x².
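Recovering those coefficients from the sequences is an ordinary polynomial fit; a quick sketch in Python with NumPy:

```python
import numpy as np

x = np.arange(6)
seq1 = [9, 8, 7, 6, 5, 4]        # "987654"
seq2 = [1, 4, 9, 16, 25, 36]     # "010409162536" read in two-digit groups

# np.polyfit returns coefficients with the highest power first.
print(np.polyfit(x, seq1, 1))    # [-1.  9.]     -> 9 - x
print(np.polyfit(x, seq2, 2))    # [ 1.  2.  1.] -> 1 + 2x + x^2
```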

The choice of algorithm(s) used to generate the input tends to be fixed for simplicity, and tailored to the use case. For example, if you are processing photographic images taken with a digital camera, there is little point in even attempting to produce vector output.

Other tips

When trying to losslessly compress some data, you always start by creating a model. For example, when compressing text in a human language, you assume that there are actually not that many distinct words, and that they repeat over and over. But then many algorithms try to learn the parameters of the model on the go: the algorithm doesn't rely on what those words actually are; it tries to find them in the given input. So the algorithm doesn't depend on the particular language used, but it does depend on the fact that it is a human language, which follows certain patterns.
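A minimal sketch of such an adaptive model (Python; the +1 smoothing is a simplistic stand-in for what real coders use): the model starts out knowing nothing, updates its word counts as it reads, and repeated words become progressively cheaper to encode.

```python
from collections import Counter
import math

def adaptive_cost_bits(words):
    # Ideal code length under a model whose counts are learned on the go.
    counts, total, bits = Counter(), 0, 0.0
    for w in words:
        p = (counts[w] + 1) / (total + len(counts) + 1)  # crude +1 smoothing
        bits += -math.log2(p)   # ideal code length for this word right now
        counts[w] += 1
        total += 1
    return bits

text = "the cat sat on the mat and the cat ran".split()
print(adaptive_cost_bits(text))  # repeats get cheaper as the model learns
```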

In general, there is no perfect algorithm which can compress anything losslessly; this is mathematically proven by a simple counting argument. For any algorithm there exists some input for which the compression result is bigger than the input itself.
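The counting (pigeonhole) argument is easy to verify: there are 2^n bit strings of length n, but only 2^n - 1 strings strictly shorter, so no lossless (injective) compressor can shrink them all.

```python
# 2**n inputs of length n, but only 2**n - 1 strictly shorter outputs:
n = 16
print(2 ** n, sum(2 ** k for k in range(n)))  # 65536 vs 65535
```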

You can also try data de-duplication: http://en.m.wikipedia.org/wiki/Data_deduplication. It's a slightly different and more intelligent kind of data compression.
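A minimal sketch of the idea (Python; real systems typically use content-defined chunk boundaries rather than fixed offsets): store each distinct chunk once and represent the file as a list of chunk references.

```python
import hashlib

def dedup(data, chunk_size=4096):
    # Store each distinct chunk once; the "file" is a list of chunk ids.
    store, recipe = {}, []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)
        recipe.append(digest)
    return store, recipe

data = b"A" * 4096 * 10 + b"B" * 4096    # ten identical chunks plus one
store, recipe = dedup(data)
print(len(recipe), len(store))           # 11 chunks referenced, 2 stored
```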
