Question

You can use ftplib for full FTP support in Python. However the preferred way of getting a directory listing is:

# File: ftplib-example-1.py

import ftplib

ftp = ftplib.FTP("www.python.org")
ftp.login("anonymous", "ftplib-example-1")

data = []

ftp.dir(data.append)

ftp.quit()

for line in data:
    print "-", line

Which yields:

$ python ftplib-example-1.py
- total 34
- drwxrwxr-x  11 root     4127         512 Sep 14 14:18 .
- drwxrwxr-x  11 root     4127         512 Sep 14 14:18 ..
- drwxrwxr-x   2 root     4127         512 Sep 13 15:18 RCS
- lrwxrwxrwx   1 root     bin           11 Jun 29 14:34 README -> welcome.msg
- drwxr-xr-x   3 root     wheel        512 May 19  1998 bin
- drwxr-sr-x   3 root     1400         512 Jun  9  1997 dev
- drwxrwxr--   2 root     4127         512 Feb  8  1998 dup
- drwxr-xr-x   3 root     wheel        512 May 19  1998 etc
...

I guess the idea is to parse the results to get the directory listing. However this listing is directly dependent on the FTP server's way of formatting the list. It would be very messy to write code for this having to anticipate all the different ways FTP servers might format this list.

Is there a portable way to get an array filled with the directory listing?

(The array should only have the folder names.)

Was it helpful?

Solution

Try to use ftp.nlst(dir).

However, note that if the folder is empty, it might throw an error:

files = []

try:
    files = ftp.nlst()
except ftplib.error_perm, resp:
    if str(resp) == "550 No files found":
        print "No files in this directory"
    else:
        raise

for f in files:
    print f

OTHER TIPS

The reliable/standardized way to parse FTP directory listing is by using MLSD command, which by now should be supported by all recent/decent FTP servers.

import ftplib
f = ftplib.FTP()
f.connect("localhost")
f.login()
ls = []
f.retrlines('MLSD', ls.append)
for entry in ls:
    print entry

The code above will print:

modify=20110723201710;perm=el;size=4096;type=dir;unique=807g4e5a5; tests
modify=20111206092323;perm=el;size=4096;type=dir;unique=807g1008e0; .xchat2
modify=20111022125631;perm=el;size=4096;type=dir;unique=807g10001a; .gconfd
modify=20110808185618;perm=el;size=4096;type=dir;unique=807g160f9a; .skychart
...

Starting from python 3.3, ftplib will provide a specific method to do this:

I found my way here while trying to get filenames, last modified stamps, file sizes etc and wanted to add my code. It only took a few minutes to write a loop to parse the ftp.dir(dir_list.append) making use of python std lib stuff like strip() (to clean up the line of text) and split() to create an array.

ftp = FTP('sick.domain.bro')
ftp.login()
ftp.cwd('path/to/data')

dir_list = []
ftp.dir(dir_list.append)

# main thing is identifing which char marks start of good stuff
# '-rw-r--r--   1 ppsrt    ppsrt      545498 Jul 23 12:07 FILENAME.FOO
#                               ^  (that is line[29])

for line in dir_list:
   print line[29:].strip().split(' ') # got yerself an array there bud!
   # EX ['545498', 'Jul', '23', '12:07', 'FILENAME.FOO']

There's no standard for the layout of the LIST response. You'd have to write code to handle the most popular layouts. I'd start with Linux ls and Windows Server DIR formats. There's a lot of variety out there, though.

Fall back to the nlst method (returning the result of the NLST command) if you can't parse the longer list. For bonus points, cheat: perhaps the longest number in the line containing a known file name is its length.

I happen to be stuck with an FTP server (Rackspace Cloud Sites virtual server) that doesn't seem to support MLSD. Yet I need several fields of file information, such as size and timestamp, not just the filename, so I have to use the DIR command. On this server, the output of DIR looks very much like the OP's. In case it helps anyone, here's a little Python class that parses a line of such output to obtain the filename, size and timestamp.

import datetime

class FtpDir:
    def parse_dir_line(self, line):
        words = line.split()
        self.filename = words[8]
        self.size = int(words[4])
        t = words[7].split(':')
        ts = words[5] + '-' + words[6] + '-' + datetime.datetime.now().strftime('%Y') + ' ' + t[0] + ':' + t[1]
        self.timestamp = datetime.datetime.strptime(ts, '%b-%d-%Y %H:%M')

Not very portable, I know, but easy to extend or modify to deal with various different FTP servers.

This is from Python docs

>>> from ftplib import FTP_TLS
>>> ftps = FTP_TLS('ftp.python.org')
>>> ftps.login()           # login anonymously before securing control 
channel
>>> ftps.prot_p()          # switch to secure data connection
>>> ftps.retrlines('LIST') # list directory content securely
total 9
drwxr-xr-x   8 root     wheel        1024 Jan  3  1994 .
drwxr-xr-x   8 root     wheel        1024 Jan  3  1994 ..
drwxr-xr-x   2 root     wheel        1024 Jan  3  1994 bin
drwxr-xr-x   2 root     wheel        1024 Jan  3  1994 etc
d-wxrwxr-x   2 ftp      wheel        1024 Sep  5 13:43 incoming
drwxr-xr-x   2 root     wheel        1024 Nov 17  1993 lib
drwxr-xr-x   6 1094     wheel        1024 Sep 13 19:07 pub
drwxr-xr-x   3 root     wheel        1024 Jan  3  1994 usr
-rw-r--r--   1 root     root          312 Aug  1  1994 welcome.msg

That helped me with my code.

When I tried feltering only a type of files and show them on screen by adding a condition that tests on each line.

Like this

elif command == 'ls':
    print("directory of ", ftp.pwd())
    data = []
    ftp.dir(data.append)

    for line in data:
        x = line.split(".")
        formats=["gz", "zip", "rar", "tar", "bz2", "xz"]
        if x[-1] in formats:
            print ("-", line)
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top