Question

I'm having some problems with Python's built-in gzip library. I've looked through almost every other Stack Overflow question about it, and none of the answers seem to work.

My problem is that when I try to decompress, I get an IOError.

I'm getting:

Traceback (most recent call last):
  File "mymodule.py", line 61, in
    return gz.read()
  File "/usr/lib/python2.7/gzip.py", line 245, in read
    self._read(readsize)
  File "/usr/lib/python2.7/gzip.py", line 287, in _read
    self._read_gzip_header()
  File "/usr/lib/python2.7/gzip.py", line 181, in _read_gzip_header
    raise IOError, 'Not a gzipped file'
IOError: Not a gzipped file

This is my code to send it over the network. Some of the choices may not make sense here: it normally runs in a while loop and is memory efficient, but I simplified it for this question.

buffer = cStringIO.StringIO(output) #output is from a subprocess call
small_buffer = cStringIO.StringIO()
small_string = buffer.read() #need a string to write to buffer 
gzip_obj = gzip.GzipFile(fileobj=small_buffer,compresslevel=6, mode='wb')
gzip_obj.write(small_string)
compressed_str = small_buffer.getvalue()

blowfish = Blowfish.new('abcd', Blowfish.MODE_ECB)
remainder = '|'*(8 - (len(compressed_str) % 8))
compressed_str += remainder
encrypted = blowfish.encrypt(compressed_str)
#i send it over smb, then retrieve it later

Then this is the code that retrieves it:

#buffer is a cStringIO object filled with data from  retrieval
decrypter = Blowfish.new('abcd', Blowfish.MODE_ECB)
value = buffer.getvalue()
decrypted = decrypter.decrypt(value)
buff = cStringIO.StringIO(decrypted)
buff.seek(0)
gz = gzip.GzipFile(fileobj=buff)
return gz.read()

The error is raised on this line:

return gz.read()

Solution

Edit: I think you forgot to remove the padding before unzipping it. The code below works for me, and it gives me the same error if I don't remove the padding.

Edit 2: Padding specification: with the way you are doing the padding, I think you need to pass along the size of the padding, since I assume the compressed data could itself end with the pipe character. According to RFC 3852, Section 6.3, you should pad with the binary representation (not ASCII digits) of the number of padding bytes needed. I have updated the code below to follow my interpretation of the specification.
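
The padding rule described above can be sketched as a pair of helpers. This is only a minimal sketch, written for Python 3 byte strings (the code in this thread targets Python 2.7). The names `pad`, `unpad`, and `BLOCK_SIZE` are hypothetical and not part of gzip or PyCrypto; the block size of 8 is the Blowfish block size used in the question.

```python
BLOCK_SIZE = 8  # Blowfish operates on 8-byte blocks


def pad(data, block_size=BLOCK_SIZE):
    """Append n bytes, each with value n, so the result fills whole blocks."""
    n = block_size - (len(data) % block_size)  # always 1..block_size, never 0
    return data + bytes([n]) * n


def unpad(data, block_size=BLOCK_SIZE):
    """Strip padding added by pad(); the last byte says how many bytes to drop."""
    if not data or len(data) % block_size != 0:
        raise ValueError("padded data must be a non-empty multiple of the block size")
    n = data[-1]
    if n < 1 or n > block_size or data[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding")
    return data[:-n]


# Data that already ends in '|' survives the round trip, which
# stripping trailing '|' characters could not guarantee.
original = b"ends with a pipe|"
padded = pad(original)
restored = unpad(padded)
```

Because the padding length is stored in the padding itself, the receiver does not need to be told separately how many bytes to drop.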

import gzip
import cStringIO
from Crypto.Cipher import Blowfish

#gzip and encrypt
small_buffer = cStringIO.StringIO()
small_string = "test data"
with gzip.GzipFile(fileobj=small_buffer,compresslevel=6, mode='wb') as gzip_obj:
    gzip_obj.write(small_string)
compressed_str = small_buffer.getvalue()
blowfish = Blowfish.new('better than bad', Blowfish.MODE_ECB) #ECB made explicit, as in the question
#remainder = '|'*(8 - (len(compressed_str) % 8))
pad_bytes = 8 - (len(compressed_str) % 8)
padding = chr(pad_bytes)*pad_bytes
compressed_str += padding
encrypted = blowfish.encrypt(compressed_str)
print("encrypted: {}".format(encrypted))



#decrypt and ungzip (pretending to be in a separate space here)
value = encrypted
blowfish = Blowfish.new('better than bad', Blowfish.MODE_ECB) #same key and mode as the sender
decrypted = blowfish.decrypt(value)
buff = cStringIO.StringIO(decrypted)
buff.seek(-1,2) #move to the last byte
pad_bytes = ord(buff.read(1)) #get the size of the padding from the last byte
buff.truncate(len(buff.getvalue()) - pad_bytes) #probably a better way to do this.
buff.seek(0)
with gzip.GzipFile(fileobj=buff) as gz:
    back_home = gz.read()
print("back home: {}".format(back_home))
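
For readers without PyCrypto installed, the failure can probably be reproduced with the standard library alone, since decrypting simply returns the padded bytes that were encrypted. The sketch below is Python 3 (the thread's code is Python 2.7) and leaves the cipher out entirely. The helper names `compress` and `decompress` and the sample payload are illustrative assumptions, and the exact error message may differ between Python versions.

```python
import gzip
import io


def compress(data):
    """Gzip data into an in-memory buffer."""
    buf = io.BytesIO()
    # Leaving the with-block closes the GzipFile, which flushes the
    # compressor and writes the gzip trailer before getvalue() is called.
    with gzip.GzipFile(fileobj=buf, mode="wb", compresslevel=6) as gz:
        gz.write(data)
    return buf.getvalue()


def decompress(blob):
    """Read a gzip stream back from an in-memory buffer."""
    with gzip.GzipFile(fileobj=io.BytesIO(blob)) as gz:
        return gz.read()


payload = b"test data"
compressed = compress(payload)

# Pad to a multiple of 8 with '|' characters, as in the question.
padded = compressed + b"|" * (8 - len(compressed) % 8)

try:
    decompress(padded)
    padding_error = None
except OSError as exc:  # the trailing bytes are not a valid gzip member
    padding_error = exc

# Dropping the padding first restores the original payload.
recovered = decompress(padded[: len(compressed)])
```

Here `padding_error` ends up holding the exception raised by the padded input, while `recovered` equals the original payload once the padding is removed.
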
Licensed under: CC-BY-SA with attribution