Question

Can someone explain this?

>>> import pickle
>>> pickle.loads(b'\x80\x03X\x01\x00\x00\x00.q\x00h\x00\x86q\x01.') == pickle.loads(b'\x80\x03X\x01\x00\x00\x00.q\x00X\x01\x00\x00\x00.q\x01\x86q\x02.')
True

>>> pickle.loads(b'\x80\x03X\x01\x00\x00\x00.q\x00h\x00\x86q\x01.')
('.', '.')
>>> pickle.loads(b'\x80\x03X\x01\x00\x00\x00.q\x00X\x01\x00\x00\x00.q\x01\x86q\x02.')
('.', '.')

There seem to be a long and a short pickled form for tuples that contain the same element more than once.

Other examples:

>>> pickle.loads(b'\x80\x03X\x01\x00\x00\x00#q\x00X\x01\x00\x00\x00#q\x01\x86q\x02.')
('#', '#')
>>> pickle.loads(b'\x80\x03X\x01\x00\x00\x00#q\x00h\x00\x86q\x01.')
('#', '#')

>>> pickle.loads(b'\x80\x03X\x01\x00\x00\x00$q\x00X\x01\x00\x00\x00$q\x01\x86q\x02.')
('$', '$')
>>> pickle.loads(b'\x80\x03X\x01\x00\x00\x00$q\x00h\x00\x86q\x01.')
('$', '$')

I'm trying to index items by their pickle, but lookups fail because the pickles of equal items seem to change.

I'm using Python 3.3.2 on Ubuntu.


Solution

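Pickles aren't unique. The pickle format is actually a tiny programming language, and different programs (pickles) can produce the same output (the unpickled object). In your examples, the short form stores the string once and reuses it through a memo reference (the h\x00 bytes), while the long form encodes the string twice. From the docs: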

Since the pickle data format is actually a tiny stack-oriented programming language, and some freedom is taken in the encodings of certain objects, it is possible that the two modules [pickle and cPickle] produce different data streams for the same input objects. However it is guaranteed that they will always be able to read each other’s data streams.
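To see how the two pickles from the question differ, you can list the opcodes each one runs. This is only a sketch: it uses pickletools.genops from the standard library, and the helper name opcode_names is made up for the illustration.

```python
import pickle
import pickletools

# The two byte strings from the question.
short = b'\x80\x03X\x01\x00\x00\x00.q\x00h\x00\x86q\x01.'
long_ = b'\x80\x03X\x01\x00\x00\x00.q\x00X\x01\x00\x00\x00.q\x01\x86q\x02.'


def opcode_names(data):
    """Return the opcode names of a pickle, in execution order."""
    return [opcode.name for opcode, arg, pos in pickletools.genops(data)]


short_ops = opcode_names(short)
long_ops = opcode_names(long_)

print(short_ops)
# ['PROTO', 'BINUNICODE', 'BINPUT', 'BINGET', 'TUPLE2', 'BINPUT', 'STOP']
print(long_ops)
# ['PROTO', 'BINUNICODE', 'BINPUT', 'BINUNICODE', 'BINPUT', 'TUPLE2', 'BINPUT', 'STOP']

# Different programs, same result.
print(pickle.loads(short) == pickle.loads(long_))  # True
```

The short pickle pushes the string once and fetches it back from the memo with BINGET; the long one pushes a second copy with another BINUNICODE. pickletools.dis(short) prints the same information in a more readable, annotated form.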

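There's even a pickletools.optimize function that takes a pickle and returns a shorter, equivalent pickle by removing unused PUT opcodes. You're going to need to redesign your program so that it doesn't rely on equal objects having identical pickles.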
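Note that pickletools.optimize doesn't give you a canonical form either. Here is a rough sketch showing that, plus one possible redesign: key the index on the unpickled value instead of the bytes. It assumes your items are hashable (the tuples of strings in the question are); the index dict and the "payload" value are placeholders, not anything from your program.

```python
import pickle
import pickletools

# The two byte strings from the question.
short = b'\x80\x03X\x01\x00\x00\x00.q\x00h\x00\x86q\x01.'
long_ = b'\x80\x03X\x01\x00\x00\x00.q\x00X\x01\x00\x00\x00.q\x01\x86q\x02.'

# optimize() only drops PUT opcodes that no GET refers to;
# it does not merge the two encodings into one.
opt_short = pickletools.optimize(short)
opt_long = pickletools.optimize(long_)
still_differ = opt_short != opt_long
print(still_differ)  # True

# Hypothetical redesign: index by the value itself, not by its pickle.
index = {}
index[pickle.loads(short)] = "payload"

# A lookup that starts from the other pickle now finds the same entry.
found = index.get(pickle.loads(long_))
print(found)  # payload
```

If the items aren't hashable, you'd need some other key derived from their contents, but the point stays the same: compare values, not pickle bytes.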

Licensed under: CC-BY-SA with attribution