My guess is that the HTML file contains the text representation of the data (escape sequences such as \x80 written out as literal characters) rather than the actual binary data.
For instance, take a look at the following code:
>>> t = '\x80'
>>> t
'\x80'
But say I create a text file whose contents are the four characters \x80 and do:
with open('file') as f:
    t = f.read()
print repr(t)
I would get back:
'\\x80'
If this is the case, you could use eval to get the desired result:
result = bz2.decompress(eval('"' + parsedString + '"'))
Just make sure that you only do this for trusted data, since eval will execute any Python expression it is given.
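If the data might not be trusted, one possible alternative is ast.literal_eval, which only accepts literals and refuses to run arbitrary expressions. The sketch below is written for Python 3 and is not taken from the question: unescape_to_bytes is a hypothetical helper name, and the sample data is built in the script purely for illustration. It assumes the escaped text contains no unescaped single quotes, raw newlines, or non-ASCII characters. (On Python 2, parsedString.decode('string_escape') should do a similar job.)

```python
import ast
import bz2


def unescape_to_bytes(escaped_text):
    # Hypothetical helper: turns text such as the four characters \x80
    # into the single byte 0x80 without calling eval.
    # Assumption: escaped_text has no unescaped single quotes, raw newlines,
    # or non-ASCII characters, so it can be wrapped in a bytes literal.
    value = ast.literal_eval("b'" + escaped_text + "'")
    if not isinstance(value, bytes):
        # Input such as "', 1, b'" would evaluate to a tuple; reject it.
        raise ValueError("input did not evaluate to a bytes literal")
    return value


# Illustration only: compress some data, then write every byte as a \xNN
# escape to mimic a file that holds the text form instead of the raw bytes.
original = b"hello world"
compressed = bz2.compress(original)
escaped = "".join("\\x%02x" % byte for byte in compressed)

result = bz2.decompress(unescape_to_bytes(escaped))
print(result)
```

Unlike eval, literal_eval raises an exception on anything that is not a plain literal, so a malicious string cannot run code this way. It can still fail on malformed input, so wrap the call in a try/except if the file contents are not under your control.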