Is there a special trick to downloading a zip file and writing it to disk with Python?

StackOverflow https://stackoverflow.com/questions/576238

  •  05-09-2019
  •  | 
  •  

Question

I am FTPing a zip file from a remote FTP site using Python's ftplib. I then attempt to write it to disk. The file write works, however most attempts to open the zip using WinZip or WinRar fail; both apps claim the file is corrupted. Oddly however, when right clicking and attempting to extract the file using WinRar, the file will extract.

So to be clear, the file write will work, but will not open inside the popular zip apps, but will decompress using those same apps. Note that the Python zipfile module never fails to extract the zips.

Here is the code that I'm using to get the zip file from the FTP site (please ignore the bad tabbing, that's not the issue).

filedata = None
def appender(chunk):
    global filedata
    filedata += chunk


def getfile(filename):
  try:
      ftp = None

      try:
          ftp = FTP(address)
          ftp.login('user', 'password')

      except Exception, e:
          print e

      command = 'RETR ' + filename

      idx = filename.rfind('/')
      path = filename[0:idx]
      ftp.cwd(path)
      fileonly = filename[idx+1:len(filename)]

      ftp.retrbinary('RETR ' + filename, appender)

      global filedata
      data = filedata

      ftp.close()

      filedata = ''
      return data

  except Exception, e:
      print e

data = getfile('/archives/myfile.zip')    
file = open(pathtoNTFileShare, 'wb')
file.write(data)
file.close()
Was it helpful?

Solution

Pass file.write directly inside the retrbinary function instead of passing appender. This will work and it will also not use that much RAM when you are downloading a big file.

If you'd like the data stored inside a variable though, you can also have a variable named:

blocks = []

Then pass to retrbinary instead of appender:

blocks.append

Your current appender function is wrong. += will not work correctly when there is binary data because it will try to do a string append and stop at the first NULL it sees.

As mentioned by @Lee B you can also use urllib2 or Curl. But your current code is almost correct if you make the small modifications I mentioned above.

OTHER TIPS

I've never used that library, but urllib2 works fine, and is more straightforward. Curl is even better.

Looking at your code, I can see a couple of things wrong. Your exception catching only prints the exception, then continues. For fatal errors like not getting an FTP connection, they need to print the message and then exit. Also, your filedata starts off as None, then your appender uses += to add to that, so you're trying to append a string + None, which gives a TypeError when I try it here. I'm surprised it's working at all; I would have guessed that the appender would throw an exception, and so the FTP copy would abort.

While re-reading, I just noticed another answer about use of += on binary data. That could well be it; python tries to be smart sometimes, and could be "helping" when you join strings with whitespace or NULs in them, or something like that. Your best bet there is to have the file open (let's call it outfile), and use your appender to just outfile.write(chunk).

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top