Question

I develop and maintain a paywalled publication with 2000+ users. The most common support request relates to logging in. Most of the time these can be solved with a couple of support emails. Every once in a while, though, there's that odd user who just can't log in. As a last resort, the support person resets the user's password, verifies that they can log in themselves, and sends the new credentials off to the user. Every now and then we get a user who still cannot log in. At that point I'm out of troubleshooting tools.

So I'd like to have a tool that:

  1. Logs all HTTP requests in full (except for users' passwords).

  2. Lets me search the log for a POST request to my login page containing the user's name.

  3. Lets me look at all requests from the IP address that I found in step 2 within a certain timeframe, and then analyse those requests very closely.

And I need to be able to do smart log rotation, like: "Hang on to everything you can fit into 30 GB, then start throwing out the old stuff".

Our publication happens to be built with Django and nginx, but I don't think that the tool I'm looking for will be specific to those tools. And I definitely don't want to throw all the request data in the same SQL database as my Django app.

So far I've found Logstash, but I haven't looked at it closely enough to know if it's right for me. The important thing to me isn't getting nice graphs of overall usage, user trends, conversion funnels, etc. What I need is a better way to troubleshoot a problem that's affecting a single user.

Solution

I think the best option is to use the combination of Logstash (event collection) + Elasticsearch (event storage) + Kibana (analytics). All three are really good open source projects with a lot of documentation and very active communities.

If you need commercial support for any of them, you can request it from: http://www.elasticsearch.org/

Logstash is flexible enough to parse many log file formats out of the box. Moreover, storing all your logs in Elasticsearch lets you write custom queries and build reports on top of them.

You can check out a Kibana demo at: http://demo.kibana.org/

Links: http://www.elasticsearch.org/overview/kibana/ http://logstash.net/

OTHER TIPS

As a temporary measure, this probably doesn't require a sophisticated solution.

I successfully used this quick and dirty Django middleware for a quite similar purpose: https://gist.github.com/Suor/7870909

Have you given Sentry a look? https://getsentry.com/welcome/ and https://github.com/getsentry/raven-python. It might be overkill for your issue, though. Why not just implement more verbose logging in your authentication methods and set up a separate logger just for authentication failures?

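import os

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'verbose': {
            'format': ('[%(levelname)s] - %(asctime)s - %(module)s: '
                       '%(process)d %(thread)d %(message)s')
        },
    },
    'handlers': {
        # The 'auth' logger below references this handler, so it must be defined.
        'console': {
            'level': 'DEBUG',
            'class': 'logging.StreamHandler',
            'formatter': 'verbose',
        },
        'auth': {
            'level': 'DEBUG',
            'class': 'logging.handlers.RotatingFileHandler',
            'formatter': 'verbose',
            # someloglocation is a placeholder for your log directory.
            'filename': os.path.join(someloglocation, 'auth.log'),
            'maxBytes': 1024 * 1024 * 3,  # roll over at about 3 MB
            'backupCount': 10,            # keep 10 rotated files
        },
    },
    'loggers': {
        'auth': {
            'handlers': ['console', 'auth'],
            'level': 'DEBUG',
            'propagate': True,
        },
    },
}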
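
With RotatingFileHandler, disk usage is capped at roughly maxBytes × (backupCount + 1), which is about 33 MB for the values above. If you want the "keep whatever fits into N GB" behaviour from the question, one way is to derive maxBytes from a total budget. The following is only a sketch: make_capped_handler and its parameters are names made up for illustration, and the cap assumes no single log line is larger than maxBytes.

```python
import logging
import logging.handlers


def make_capped_handler(path, budget_bytes, files=11):
    """Build a RotatingFileHandler whose files together stay within budget_bytes.

    Hypothetical helper: the budget is split evenly across `files` files
    (one active file plus files - 1 rotated backups).
    """
    max_bytes = budget_bytes // files
    return logging.handlers.RotatingFileHandler(
        path,
        maxBytes=max_bytes,
        backupCount=files - 1,
    )


if __name__ == "__main__":
    # Example: a 30 GB budget spread over 300 files of about 100 MB each.
    handler = make_capped_handler("auth.log", 30 * 1024 ** 3, files=300)
    logging.getLogger("auth").addHandler(handler)
```

Once the budget is full, the oldest backup is dropped at each rollover, which matches "start throwing out the old stuff".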

It seems easier to switch the log level between DEBUG and WARNING than to implement a logging setup that's overkill for your needs. As for searching by IP or username:

grep '<ip address>' auth.log | grep -Ev ' (200|301|302|304) '
grep '<username>' auth.log | grep -Ev ' (200|301|302|304) '

That, however, requires that you're logging all of that information in the first place. I don't think there's a one-size-fits-all model for this, because with middleware you're going to log everything for every request, not just your authentication views. By logging more verbosely where you're authenticating, you'll get results that are actually relevant to the login problem.

There are a couple of ways this can be handled within Django.

Add a middleware to the Django app that logs the request and all the data you need from it, i.e. request.POST if it is a POST and request.GET if it is a GET.

In a file called middleware.py:

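import logging

logger = logging.getLogger('app')


class RequestLoggingMiddleware(object):
    """Old-style middleware: Django calls process_request before the view."""

    def process_request(self, request):
        # What repr(request) contains varies by Django version; check that
        # it does not include form data such as passwords before relying on it.
        logger.debug(request)
        logger.debug('Logged Request')
        # Returning None lets Django continue processing the request.
        return None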

In your settings.py, add middleware.RequestLoggingMiddleware to the MIDDLEWARE_CLASSES.
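
On Django 1.10 and later, middleware is listed in MIDDLEWARE instead and is written as a callable that wraps get_response. A minimal sketch of the same idea in that style, which also masks passwords as the question asks, might look like this. The field names in SENSITIVE_FIELDS, the scrub helper, and the log message layout are assumptions for illustration, not anything Django prescribes.

```python
import logging

logger = logging.getLogger("app")

# Assumed names of form fields whose values must never reach the log.
SENSITIVE_FIELDS = {"password", "password1", "password2"}


def scrub(post_data):
    """Return a plain dict copy of POST data with sensitive values masked."""
    return {
        key: "***" if key.lower() in SENSITIVE_FIELDS else value
        for key, value in post_data.items()
    }


class RequestLoggingMiddleware:
    """New-style middleware: a callable wrapping the rest of the stack."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        response = self.get_response(request)
        # Log after the view has run so the status code is available.
        logger.debug(
            "%s %s from %s -> %s POST=%r",
            request.method,
            request.path,
            request.META.get("REMOTE_ADDR"),
            response.status_code,
            scrub(request.POST),
        )
        return response
```

Note that behind nginx, REMOTE_ADDR is usually the proxy's address, so the client IP may have to come from a forwarded header that your nginx configuration sets.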

Some related options are listed in the question "Is there a Django middleware/plugin that logs all my requests in a organized fashion?" and at https://github.com/kylef/django-request

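The other option is to add a log handler for requests that error out. Lowering the level from ERROR to WARNING also captures 4XX responses; as the documentation quoted below notes, this logger only reports 4XX and 5XX responses, so it will not record successful requests.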

From the official docs:

django.request: Log messages related to the handling of requests. 5XX responses are raised as ERROR messages; 4XX responses are raised as WARNING messages.

Messages to this logger have the following extra context:

status_code: The HTTP response code associated with the request.
request: The request object that generated the logging message.

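Add the logger below to the 'loggers' section of your logging dict config. The handlers it names ('mail_admins', 'console', 'file') must be defined in the 'handlers' section.

'django.request': {
    'handlers': ['mail_admins', 'console', 'file'],
    'level': 'ERROR',
    'propagate': False,
},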

Once you have set up request logging and your logs are being collected somewhere, there are different ways to analyze them. I use Logentries, which collects my logs and provides an interface where I can filter by time and do a grep-like search. When this is not adequate, I download the logs as a tar archive and use a locally running version of Splunk that has better search tools. But as long as you can filter by time and find the relevant request logs, you should be able to debug what is happening.
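
If the logs end up as plain files, the two-step search from the question (find the login POST for a username, then pull everything from that IP around the same time) can also be scripted. The sketch below assumes one JSON object per log line with the fields time (ISO 8601), ip, method, path, and post; that layout, the /login/ path, and the function names are made up for illustration and depend entirely on how you write your logs.

```python
import json
from datetime import datetime, timedelta


def parse_lines(lines):
    """Yield one dict per non-empty JSON line, with 'time' parsed to datetime."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        record["time"] = datetime.fromisoformat(record["time"])
        yield record


def find_login_posts(records, username, login_path="/login/"):
    """POST requests to the login page whose form data names this user."""
    return [
        r for r in records
        if r["method"] == "POST"
        and r["path"] == login_path
        and r.get("post", {}).get("username") == username
    ]


def requests_from_ip(records, ip, center, window=timedelta(minutes=10)):
    """All requests from one IP within +/- window of a point in time."""
    return [
        r for r in records
        if r["ip"] == ip and abs(r["time"] - center) <= window
    ]


if __name__ == "__main__":
    with open("requests.log") as fh:  # hypothetical log file name
        records = list(parse_lines(fh))
    for hit in find_login_posts(records, "alice"):
        for r in requests_from_ip(records, hit["ip"], hit["time"]):
            print(r["time"], r["method"], r["path"])
```

Loading everything into a list is fine for a quick investigation; for tens of gigabytes you would want to stream the file or filter by date first.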

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow