Question

I am analyzing data from a log file at frequent intervals and processing it accordingly. The input log file grows indefinitely: a long-running process writes to it, and it is owned by the root user.

I have all the necessary permissions on the log file. What I want to do is move only the file contents written up to that point (take the file contents and clear the file) without disturbing the other process, preferably through a Python script.

[EDIT] That is, I need to cut and paste all the contents from the primary log file up to that point in time and put them into another (secondary) log file. I will use this secondary log file for my data analysis. In the meantime, if the long-running process writes anything to the primary log file, it must not be lost. It is not a problem if that new data ends up in the secondary log file along with the other contents.

[EDIT 2] The main problem I face is clearing the file contents once they have been fetched from the primary log file. I need to ensure that no log line is lost while I read from the primary log, write the contents to the secondary log, and remove them from the primary file.

I looked into the TimedRotatingFileHandler, but it doesn't help me in this regard. Any other suggestions?

Thanks


Solution

The Linux way to tail a file is simple. Run this command on your log file as soon as the logging process starts:

  tail -f log_file_name.log >> /tmp/new_file_name.log &


[EDIT] To also remove each copied line from the original log (note the `echo` makes this a dry run that only prints the `sed` command for each line; drop the `echo` to actually execute it):

  tail -f log_file_name.log >> /tmp/new_file_name.log | tail -f /tmp/new_file_name.log | xargs -I TailOutput echo sed -i '/TailOutput/d' log_file_name.log

Then you can use new_file_name.log for whatever analysis you want, and your original log file stays intact. I understand this is getting a little twisted, but that's the best I can think of right now!
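Since you preferred a Python script: here is a minimal sketch of the copy-then-truncate approach (the file names are placeholders, and the caveats are noted in the comments). It opens the primary log in place, appends everything written so far to the secondary log, then truncates the primary without replacing its inode, so the writer's open file handle stays valid:

```python
import shutil

def rotate_contents(primary, secondary):
    """Append everything currently in `primary` to `secondary`,
    then truncate `primary` in place."""
    with open(primary, "r+b") as src, open(secondary, "ab") as dst:
        shutil.copyfileobj(src, dst)  # copy current contents up to EOF
        src.truncate(0)               # clear the primary log in place
        # Caveats: a line written between the copy and the truncate would
        # be lost (the window is tiny; file locking can close it), and if
        # the writer did not open the file in append mode it will keep
        # writing at its old offset, leaving a NUL-filled gap.
```

Run `rotate_contents("log_file_name.log", "/tmp/new_file_name.log")` on whatever interval suits your analysis; each call moves the lines accumulated since the previous call.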

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow