Question

I'm learning Python and trying to write a script to sync two directories: one is on an FTP server, the other is on my local disk. So far I have working code, but I have a question or two about it :)

import os
from ftplib import FTP

h_local_files = [] # create local dir list
h_remote_files = [] # create remote dir list

h_local = 'C:\\something\\bla\\' # local dir

ftp = FTP('ftp.server.com')
ftp.login('user', 'pass')

if os.listdir(h_local) == []:
    print 'LOCAL DIR IS EMPTY'
else:
    print 'BUILDING LOCAL DIR FILE LIST...'
    for file_name in os.listdir(h_local):
        h_local_files.append(file_name) # populate local dir list

ftp.sendcmd('CWD /some/ftp/directory')
print 'BUILDING REMOTE DIR FILE LIST...\n'
for rfile in ftp.nlst():
    if rfile.endswith('.jpg'): # i need only .jpg files
        h_remote_files.append(rfile) # populate remote dir list

h_diff = sorted(list(set(h_remote_files) - set(h_local_files))) # difference between two lists

for h in h_diff:
    with open(os.path.join(h_local,h), 'wb') as ftpfile:
        s = ftp.retrbinary('RETR ' + h, ftpfile.write) # retrieve file
        print 'PROCESSING', h
        if str(s).startswith('226'): # comes from ftp status: '226 Transfer complete.'
            print 'OK\n' # print 'OK' if transfer was successful
        else:
            print s # if error, print retrbinary's return

This piece of code should make two python lists: a list of files in local directory and a list of files in ftp directory. After removing duplicates from lists, the script should download 'missing' files to my local directory.

For now, this piece of code is doing what I need, but I have noticed that when I run it my output is not acting how I imagine it would act :)

For example, my current output goes:

PROCESSING 2012-01-17_07.05.jpg
OK

# LONG PAUSE HERE

PROCESSING 2012-01-17_07.06.jpg
OK

# LONG PAUSE HERE

PROCESSING 2012-01-17_07.06.jpg
OK

etc...

but I imagine that it should work like this:

PROCESSING 2012-01-17_07.05.jpg
# LONG PAUSE HERE (WHILE DOWNLOADING)
OK

PROCESSING 2012-01-17_07.06.jpg
# LONG PAUSE HERE (WHILE DOWNLOADING)
OK

PROCESSING 2012-01-17_07.06.jpg
# LONG PAUSE HERE (WHILE DOWNLOADING)
OK

etc...

As I said, I just started to learn Python, and maybe I'm doing some stuff here completely wrong (if str(s).startswith('226')????). Maybe I cannot achieve this with ftplib only? So in the end my questions are:

What am I doing wrong here? :)
How to produce 'proper' output and is there a way to print some kind of status while downloading a file (at least a line of dots), for example:

PROCESSING 2012-01-17_07.05.jpg
..........
OK

PROCESSING 2012-01-17_07.06.jpg
......
OK

PROCESSING 2012-01-17_07.06.jpg
...............
OK

etc...

Thanks a lot for helping!

Solution

retrbinary blocks until the transfer is complete. In your loop, print 'PROCESSING', h comes after the call to retrbinary, so nothing is printed while the file downloads; once the call returns, PROCESSING and OK are printed one right after the other.
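To get the order you expected, the announcement has to be printed before the blocking call. The sketch below shows one way to arrange that; `download_missing` and its parameters are names made up for this illustration, and it is written for Python 3 (the question's code is Python 2). It also relies on ftplib raising an exception for 4xx/5xx replies, which means the `'226'` string check should not be needed.

```python
import ftplib
import os
import sys


def download_missing(ftp, names, local_dir, out=sys.stdout):
    """Download each name into local_dir, announcing it before the transfer."""
    for name in names:
        # Announce first: retrbinary() does not return until the file is complete.
        out.write('PROCESSING %s\n' % name)
        out.flush()
        target = os.path.join(local_dir, name)
        try:
            with open(target, 'wb') as local_file:
                ftp.retrbinary('RETR ' + name, local_file.write)
        except ftplib.all_errors as exc:
            # ftplib raises on error replies, so there is no status string to test.
            out.write('FAILED: %s\n\n' % exc)
            # Remove the partial file so the next sync still sees it as missing.
            if os.path.exists(target):
                os.remove(target)
        else:
            out.write('OK\n\n')
```

With the question's variables this would be called as `download_missing(ftp, h_diff, h_local)`.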

If you want to print a . for each block of data received, you need to provide a callback function that does this. Here is the docstring for retrbinary:

    """Retrieve data in binary mode.  A new port is created for you.

    Args:
      cmd: A RETR command.
      callback: A single parameter callable to be called on each
                block of data read.
      blocksize: The maximum number of bytes to read from the
                 socket at one time.  [default: 8192]
      rest: Passed to transfercmd().  [default: None]

    Returns:
      The response code.
    """

So, you need to provide a different callback that both writes the file and prints out '.'

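import sys # At the top of your module.

# Modify your retrbinary call. Both calls go in a tuple so each one runs on
# every block; chaining them with "and" would skip the dot whenever write()
# returns None, as file.write() does in Python 2.
ftp.retrbinary('RETR ' + h, lambda s: (ftpfile.write(s), sys.stdout.write('.')))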

You may have to edit that snippet of code, but it ought to give you an idea of what to do.
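If the lambda gets hard to read, the same idea can be written as a small named callback. This is only a sketch: `make_progress_writer` and `on_block` are hypothetical names, not part of ftplib. It also flushes after each dot, on the assumption that stdout may be line-buffered and would otherwise hold the dots back until the next newline.

```python
import sys


def make_progress_writer(local_file, out=sys.stdout):
    """Return a retrbinary() callback that saves each block and prints a dot."""
    def on_block(block):
        local_file.write(block)  # save the received data first
        out.write('.')           # then show one dot per block
        out.flush()              # push the dot out now instead of at the next newline
    return on_block
```

Inside the question's `with open(...)` block it would be used as `ftp.retrbinary('RETR ' + h, make_progress_writer(ftpfile))`, followed by printing a newline before `OK`. The number of dots depends on how many blocks arrive, so it tracks the file size only roughly; passing a larger `blocksize` to retrbinary should give fewer dots.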

Licensed under: CC-BY-SA with attribution