Question

I have an FTP server that holds all of my tar files. They are 500 MB or larger, and there are many of them. All I need is to extract a single file from one of these tars, each of which contains multiple files.

My initial idea was to download each tar file and extract the single file I need, but that seems inefficient.

I'm using Python as my programming language.


Solution

This answer is not specific to Python, because the problem is not specific to Python: in theory you can read only the part of the tar file where your data is. With FTP (and also with Python's ftplib) this is possible: first issue a REST command to specify the start position within the file, then RETR to start the download, and once you have received the amount of data you need, close the data connection.
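As a sketch of that REST/RETR idea with ftplib: `FTP.transfercmd()` accepts a `rest=` offset, returns the raw data socket, and lets you close it once you have enough bytes. The host, login, and path you would pass in are placeholders here; this only shows the mechanics of a ranged read.

```python
from ftplib import FTP


def fetch_range(ftp: FTP, path: str, offset: int, length: int) -> bytes:
    """Fetch `length` bytes of `path` starting at `offset` via REST + RETR.

    `ftp` must be an already-connected, logged-in ftplib.FTP instance.
    """
    # transfercmd() sends REST <offset> for us and opens the data connection.
    conn = ftp.transfercmd(f"RETR {path}", rest=offset)
    chunks = []
    remaining = length
    while remaining > 0:
        data = conn.recv(min(remaining, 8192))
        if not data:  # server closed the connection early (EOF)
            break
        chunks.append(data)
        remaining -= len(data)
    # Closing the data connection mid-transfer aborts the download; the
    # server's resulting error response is expected, so swallow it.
    conn.close()
    try:
        ftp.voidresp()
    except Exception:
        pass
    return b"".join(chunks)
```

A caller would do something like `fetch_range(ftp, "archive.tar", 0, 512)` to grab just the first tar header.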

But tar is a file format without a central index: each file in a tar is prefixed with a small header carrying its name, size, and other metadata. So to get a specific file you must read the first header, check whether it is the file you want, and if not, skip over the size of the unwanted file and try the next one. With lots of small files in the tar this will be less efficient than downloading the complete archive (or at least downloading up to the relevant part, parsing it as you go), because opening a new data connection for each read causes a lot of overhead. But if the tar contains large files, this approach can work.

You are completely out of luck, however, if it is not a TAR (*.tar) but a TGZ (*.tgz or *.tar.gz) file. These are compressed tar files, and to get any part of the archive you must first decompress everything that comes before it. So in this case there is no way around downloading the file, or at least downloading everything up to the relevant part.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow