Update 6/8/17
Three years on, my PR, a temporary fix that enforces the output order, is still pending. Stream-Framework might reconsider its design of using content as the key for notifications. GitHub issue #153 references this.
Question
See the following sample:
import pickle
x = {'order_number': 'X', 'deal_url': 'J'}
pickle.dumps(x)
pickle.dumps(pickle.loads(pickle.dumps(x)))
pickle.dumps(pickle.loads(pickle.dumps(pickle.loads(pickle.dumps(x)))))
Results:
(dp0\nS'deal_url'\np1\nS'J'\np2\nsS'order_number'\np3\nS'X'\np4\ns.
(dp0\nS'order_number'\np1\nS'X'\np2\nsS'deal_url'\np3\nS'J'\np4\ns.
(dp0\nS'deal_url'\np1\nS'J'\np2\nsS'order_number'\np3\nS'X'\np4\ns.
Clearly, the serialized output changes after each load/dump round trip: the key order alternates between two forms. When I remove a character from either key, this doesn't happen. I discovered this because Stream-Framework uses the pickled output as the key for storing notifications in its k/v store. I will open a pull request once we have a better understanding of what is going on here. I have found two workarounds that prevent it:
A - Sort the items and convert the result back to a dictionary (yes, this somehow produces the intended side effect)
import operator
sorted_x = dict(sorted(x.iteritems(), key=operator.itemgetter(1)))
B - Remove the underscores from the keys (but I'm not sure this always works)
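A variant of workaround A that might be less dependent on side effects is to pickle the sorted items themselves rather than a dict rebuilt from them. The sketch below is only an illustration under assumptions: `stable_key` is a hypothetical helper (not part of pickle or Stream-Framework), it sorts by key rather than by value as in A, and it assumes all keys are mutually comparable (e.g. all strings). pickle does not promise canonical output in general, so treat this as a way to remove the dependence on dict iteration order, not as a guarantee.

```python
import pickle


def stable_key(d, protocol=0):
    """Hypothetical helper: pickle a dict's items in sorted-key order."""
    # sorted() returns a list of (key, value) tuples ordered by key, so the
    # serialized bytes no longer depend on the dict's iteration order.
    # Assumption: the keys are mutually comparable (e.g. all strings).
    return pickle.dumps(sorted(d.items()), protocol)


x = {'order_number': 'X', 'deal_url': 'J'}
key_before = stable_key(x)

# Round-trip the dict itself through pickle, then derive the key again.
y = pickle.loads(pickle.dumps(x))
key_after = stable_key(y)

# Compare the two keys instead of eyeballing the pickled strings.
print(key_before == key_after)
```

The original mapping can be rebuilt with `dict(pickle.loads(key_before))`; the trade-off is that the stored payload is a list of pairs, not a dict.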
So what causes this mysterious dictionary ordering in the pickle output?
Proof that sorting the dict before each dump makes it produce the same result:
import operator
x = dict(sorted(x.iteritems(), key=operator.itemgetter(1)))
pickle.dumps(x)
"(dp0\nS'order_number'\np1\nS'X'\np2\nsS'deal_url'\np3\nS'J'\np4\ns."
x = pickle.loads(pickle.dumps(x))
x = dict(sorted(x.iteritems(), key=operator.itemgetter(1)))
pickle.dumps(x)
"(dp0\nS'order_number'\np1\nS'X'\np2\nsS'deal_url'\np3\nS'J'\np4\ns."