So I have a Python program that pulls access logs from remote servers and processes them. There are separate log files for each day. The files on the servers are in this format:
access.log
access.log-20130715
access.log-20130717
The file "access.log" is the log file for the current day and is appended to throughout the day. The files with a timestamp appended are archived logs and are never modified. So if any file in the directory changes, it is for one of two reasons: (1) data was appended to "access.log", or (2) "access.log" was archived and an empty file took its place. Every minute or so, my program checks the most recent modification time across all files in the directory; if it has changed, the program pulls down "access.log" and any newly archived files.
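The check itself can be sketched roughly like this (a pure helper with names of my own invention; the mtimes would come from whatever directory-listing call the SFTP library provides, so none of this is Chilkat-specific):

```python
def detect_changes(mtimes, last_max_mtime, known_files):
    """Decide whether anything needs to be downloaded.

    mtimes: dict mapping filename -> modification time (epoch seconds),
            as returned by a remote directory listing.
    last_max_mtime: the newest mtime seen on the previous check.
    known_files: set of archived filenames already downloaded.

    Returns (changed, new_archives, new_max_mtime).
    """
    new_max = max(mtimes.values())
    changed = new_max > last_max_mtime
    # Any timestamped file we have not seen before is a newly archived log.
    new_archives = sorted(
        name for name in mtimes
        if name != "access.log" and name not in known_files
    )
    return changed, new_archives, new_max
```

If `changed` is true, the program re-downloads "access.log" and fetches each file in `new_archives` once.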
All of this currently works fine. However, if a lot of data is added to the log file throughout the day, repeatedly downloading the whole file just to get the new data at the end creates a lot of network traffic, which I would like to avoid. Is there any way to download only part of a file? If I have already processed, say, 1 GB of the file and another 500 bytes are then appended, is there a way to download only those 500 bytes at the end?
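Conceptually, the bookkeeping I have in mind looks something like this (a sketch with names of my own; I don't know whether Chilkat exposes an offset-based read, which is really what I'm asking):

```python
def tail_range(processed_bytes, remote_size):
    """Decide which byte range of "access.log" to download.

    processed_bytes: how many bytes of the file we have already handled.
    remote_size: the file's current size on the server.

    Returns (offset, length). If the remote file is smaller than what we
    have already processed, it must have been rotated (archived and
    replaced by a fresh file), so we start over from offset 0.
    """
    if remote_size < processed_bytes:
        return 0, remote_size
    return processed_bytes, remote_size - processed_bytes

# With a library that exposes seekable remote file handles (paramiko's
# SFTPClient is one example), the download step would look roughly like:
#
#     offset, length = tail_range(processed, sftp.stat("access.log").st_size)
#     with sftp.open("access.log", "rb") as f:
#         f.seek(offset)
#         data = f.read(length)
```

The interesting question is whether something equivalent to that `seek`/`read` pair exists in the library I am already using.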
I am using Python 3.2, my local machine is running Windows, and the remote servers all run Linux. I am using Chilkat for making SSH and SFTP connections. Any help would be greatly appreciated!