Is there some way to test whether two pkl files contain the same data in Python?

Question 1

It's not even guaranteed that two objects that compare equal with == pickle the same:

>>> import pickle
>>> x = (1,)
>>> y = (x, x)
>>> z = ((1,), (1,))
>>> y == z
True
>>> pickle.dumps(y) == pickle.dumps(z)
False
>>> {-1, -2} == {-2, -1}
True
>>> pickle.dumps({-1, -2}) == pickle.dumps({-2, -1})
False

Serializing objects to compare their serialized forms is not a workable general-purpose equality comparison. If you want to define your own concept of equality, writing your own equality comparison function is probably your best bet.
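For two .pkl files, one way to put this into practice is to unpickle both and compare the loaded objects rather than the bytes. What follows is only a minimal sketch: `same_pickled_data` and its `equal` parameter are names made up for illustration, and it assumes each file holds a single pickled object from a source you trust (unpickling untrusted data can run arbitrary code).

```python
import operator
import pickle


def same_pickled_data(path_a, path_b, equal=operator.eq):
    """Unpickle both files and compare the loaded objects.

    `equal` is a hypothetical hook: pass your own two-argument function
    when == does not match your notion of equality.
    """
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        obj_a = pickle.load(file_a)
        obj_b = pickle.load(file_b)
    return equal(obj_a, obj_b)
```

Called as `same_pickled_data("a.pkl", "b.pkl")`, this relies on whatever == means for the loaded objects, so it is only as good as their `__eq__`.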

Question 2

If the object doesn't implement __eq__, then it's probably not valid to do an equals comparison, because == then falls back to comparing object identity.

If you have some way of defining whether they are equal, simply define your own comparison function that looks at the attributes of the two objects and returns True if they are equal. For example:

 def cmp(obj_a, obj_b):
     # Compare every attribute that matters for equality; add more as needed.
     return obj_a.att1 == obj_b.att1 and obj_a.att2 == obj_b.att2
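If the list of attributes gets long, the same idea can be written generically. This is just a sketch under my own assumptions: `attrs_equal` and the `Record` class are hypothetical names, and `att1`/`att2` stand in for whatever attributes your objects actually have.

```python
def attrs_equal(obj_a, obj_b, attr_names):
    """Return True if both objects agree on every named attribute."""
    return all(
        getattr(obj_a, name) == getattr(obj_b, name) for name in attr_names
    )


class Record(object):
    # Hypothetical class with no __eq__, so == compares identity only.
    def __init__(self, att1, att2):
        self.att1 = att1
        self.att2 = att2


first = Record(1, "x")
second = Record(1, "x")
print(first == second)                               # False: different objects
print(attrs_equal(first, second, ["att1", "att2"]))  # True: same attribute values
```

Passing the attribute names explicitly keeps the definition of equality visible at the call site instead of burying it in the function.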

With respect to pickle, it makes no guarantees about the contents of its raw data, only that unpickling it will give you back an equivalent object.
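A small round trip illustrates the point. This is a sketch only: whether the two byte strings match depends on the interpreter, so the first result is deliberately left unstated.

```python
import pickle

a = {-1, -2}
b = {-2, -1}

bytes_a = pickle.dumps(a)
bytes_b = pickle.dumps(b)

# The raw bytes may or may not match; pickle does not promise either way.
print(bytes_a == bytes_b)

# The unpickled objects compare equal to each other and to the originals.
print(pickle.loads(bytes_a) == pickle.loads(bytes_b))
```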

Question 3

There is a good standard-library module called filecmp (file compare) that I've used a few times. I'm not really a programming whiz, so I don't want to give you some wack advice. In my limited experience with this sort of application, the module works well roughly 90% of the time. Here is the code I used:

  import filecmp

  # These paths are from my machine; point them at your own two files.
  injury_compare = filecmp.cmp('/Users/MacBookPro15/injuryc', '/Users/MacBookPro15/injury')

  print("injury files are %s" % injury_compare)

The compare returns True or False. I also think there is something that returns a "+" for a differing line, so you could work with that too; as far as I can tell that comes from the standard difflib module (or the diff command mentioned at the end) rather than from filecmp itself. Basically, if you get a "+" back, the files are different.

I could also recommend the bash/Linux utility hexdump, which shows you the low-level bytes in a pretty spartan but illustrative fashion. It's simple too: hexdump file1. Even someone like me, who lacks even a modicum of understanding of what hexdump outputs, can still discern some patterns without knowing exactly what the bytes mean.

There is also a diff command in bash/Linux, which I think you run like this (not 100 percent sure, but it sounds familiar): diff file1 file2
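Here is a rough sketch of how the byte-level check and a hexdump-style diff could look in Python 3. The helper names (`files_identical`, `hex_lines`, `hex_diff`) are made up for illustration. Keep in mind that this only tells you whether the bytes match, not whether the pickled data is equal.

```python
import difflib
import filecmp


def files_identical(path_a, path_b):
    """Byte-for-byte comparison; shallow=False forces reading the contents."""
    return filecmp.cmp(path_a, path_b, shallow=False)


def hex_lines(path, width=16):
    """Return the file's bytes as hex strings, `width` bytes per line."""
    with open(path, "rb") as handle:
        data = handle.read()
    return [data[i:i + width].hex() for i in range(0, len(data), width)]


def hex_diff(path_a, path_b):
    """Unified diff of the two hex dumps; an empty list if the bytes match."""
    return list(
        difflib.unified_diff(
            hex_lines(path_a), hex_lines(path_b),
            fromfile=path_a, tofile=path_b, lineterm="",
        )
    )
```

Lines in the `hex_diff` result that start with "+" or "-" mark the chunks that differ between the two files.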

Sorry I can't articulate some of the finer points but I hope something there helps. Good Luck!