Syncing directories with python's ftplib
-
14-11-2019 - |
Question
I'm learning python and trying to write a code to sync two directories: one is on ftp server, the other is on my local disk. So far, I wrote a working code but I have a question or two about it :)
import os
from ftplib import FTP
h_local_files = [] # create local dir list
h_remote_files = [] # create remote dir list
h_local = 'C:\\something\\bla\\' # local dir
ftp = FTP('ftp.server.com')
ftp.login('user', 'pass')
if os.listdir(h_local) == []:
print 'LOCAL DIR IS EMPTY'
else:
print 'BUILDING LOCAL DIR FILE LIST...'
for file_name in os.listdir(h_local):
h_local_files.append(file_name) # populate local dir list
ftp.sendcmd('CWD /some/ftp/directory')
print 'BUILDING REMOTE DIR FILE LIST...\n'
for rfile in ftp.nlst():
if rfile.endswith('.jpg'): # i need only .jpg files
h_remote_files.append(rfile) # populate remote dir list
h_diff = sorted(list(set(h_remote_files) - set(h_local_files))) # difference between two lists
for h in h_diff:
with open(os.path.join(h_local,h), 'wb') as ftpfile:
s = ftp.retrbinary('RETR ' + h, ftpfile.write) # retrieve file
print 'PROCESSING', h
if str(s).startswith('226'): # comes from ftp status: '226 Transfer complete.'
print 'OK\n' # print 'OK' if transfer was successful
else:
print s # if error, print retrbinary's return
This piece of code should make two python lists: a list of files in local directory and a list of files in ftp directory. After removing duplicates from lists, the script should download 'missing' files to my local directory.
For now, this piece of code is doing what I need, but I have noticed that when I run it my output is not acting how I imagine it would act :)
For example, my current output goes:
PROCESSING 2012-01-17_07.05.jpg
OK
# LONG PAUSE HERE
PROCESSING 2012-01-17_07.06.jpg
OK
# LONG PAUSE HERE
PROCESSING 2012-01-17_07.06.jpg
OK
etc...
but I imagine that it should work like this:
PROCESSING 2012-01-17_07.05.jpg
# LONG PAUSE HERE (WHILE DOWNLOADING)
OK
PROCESSING 2012-01-17_07.06.jpg
# LONG PAUSE HERE (WHILE DOWNLOADING)
OK
PROCESSING 2012-01-17_07.06.jpg
# LONG PAUSE HERE (WHILE DOWNLOADING)
OK
etc...
As I said, I just started to learn python, and maybe I'm doing some stuff here completely wrong (if str(s).startswith('226')
????). Maybe I cannot achieve this withftplib
only? So in the end my questions are:
What am I doing wrong here? :)
How to produce 'proper' output and is there a way to print some kind of status while downloading a file (at least a line of dots), for example:
PROCESSING 2012-01-17_07.05.jpg
..........
OK
PROCESSING 2012-01-17_07.06.jpg
......
OK
PROCESSING 2012-01-17_07.06.jpg
...............
OK
etc...
Thanks a lot for helping!
Solution
retrybinary blocks until it is complete. This is why you see Processing ZZZ\n OK
immediately, because it occurs after the call to retrbinary has completed.
If you want to print .
for each call, then you need to provide a callback function to do this. here is the docstring for retrbinary:
"""Retrieve data in binary mode. A new port is created for you.
Args:
cmd: A RETR command.
callback: A single parameter callable to be called on each
block of data read.
blocksize: The maximum number of bytes to read from the
socket at one time. [default: 8192]
rest: Passed to transfercmd(). [default: None]
Returns:
The response code.
"""
So, you need to provide a different callback that both writes the file and prints out '.'
import sys # At the top of your module.
# Modify your retrbinary
ftp.retrbinary('RETR ' + h, lambda s: ftpfile.write(s) and sys.stdout.write('.'))
You may have to edit that snippet of code, but it ought to give you an idea of what to do.