Ideas on how to retrieve and analyze server logs with Python?
13-10-2019
Question
To start off, this desktop app is really an excuse for me to learn Python and how a GUI works.
I'm trying to help my clients visualize how much bandwidth they are going through, when it's happening, and where their visitors are. All of this would be displayed with graphs or whatever would be most convenient. (Down the road, I'd like to add CPU/memory usage.)
I was thinking the easiest way would be for the app to connect via SFTP, download the specified log, and then use regular expressions to filter out the necessary information.
I was thinking of starting out with:
- Python 2.6
- PySide
- Paramiko

I was looking at Twisted for the SFTP part, but I thought keeping it simple for now would be a better choice.
Does this seem right? Should I be trying to use SFTP? Or should I try to have a subdomain on my site push the logs to the client (e.g. app.mysite.com)?
How about regular expressions to parse the logs?
Solution
SFTP (or shelling out to rsync) seems like a reasonable way to retrieve the logs. As for parsing them, regular expressions are what most people tend to use. However, there are other approaches too. For instance:
- Parse Apache logs to an SQLite database.
- Using pyparsing to parse logs. This one parses a different kind of log file, but the approach is still interesting.
- Parsing Apache access logs with Python. The author actually wrote a little parser, which is available in an apachelogs module.
You get the idea.
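To give a feel for the regex route: a minimal parse of an Apache common-format access-log line might look like the sketch below. The sample line and field names are illustrative, and the pattern only covers the standard common-log fields (it would need extending for the combined format's referrer and user-agent):

```python
import re

# Apache common log format: host, identity, user, [timestamp],
# "request", status, response size in bytes ("-" when absent).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Return a dict of fields from one access-log line, or None."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# An illustrative log line, not real traffic:
sample = ('203.0.113.7 - - [13/Oct/2019:06:25:24 +0000] '
          '"GET /index.html HTTP/1.1" 200 5316')
print(parse_line(sample)["status"])  # → 200
```

The `size` field is what you would sum per host or per hour to get the bandwidth figures you want to graph, and `host` is what you would feed to a GeoIP lookup for the visitor-location part.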