Question

My supervisor wants me to fix the "concurrency problem in logs", by which he means that the log files we are producing have overlapping timestamps at the boundary between one file and the next. That is:

  • The first log file ends with:

    [03/Dec/2013:13:55:19]---------------------
    [03/Dec/2013:13:55:20]---------------------
    [03/Dec/2013:13:55:20]---------------------
    
  • The second file starts with:

    [03/Dec/2013:13:40:16]---------------------
    [03/Dec/2013:13:40:16]---------------------
    [03/Dec/2013:13:40:23]---------------------
    

We use a rotating file handler, and the second file's timestamps should start where the first one's end, but they don't. How can I make log records appear in the right order across file rotation?

"Logger" class, which just uses Python logging module:

import logging

class logger:
    def __init__(self, logger_name='prod'):
        self.error_logger = logging.getLogger(logger_name + '_error')

    def error(self, msg='', level='error'):
        if msg:
            getattr(self.error_logger, level)(msg)

    def log(self, msg='', level='info'):
        if msg:
            getattr(self.error_logger, level)(msg)
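For illustration, call sites look roughly like this (the messages here are made up):

log = logger()                                 # logs through 'prod_error'
log.log('request received')                    # dispatches to error_logger.info()
log.error('validation failed')                 # dispatches to error_logger.error()
log.error('low disk space', level='warning')   # any logging method name works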

Formatting of the logs:

import sys
import time
import logging

class our_formatter(logging.Formatter):

    def find_topmost_stack_frame(self):
        # Walk the interpreter stack until a frame named __call__ is reached.
        i = 0
        stack = []
        while True:
            try:
                fr = sys._getframe(i)
                if fr.f_code.co_name == '__call__':
                    break
                stack.append(fr)
            except ValueError:
                # Ran past the bottom of the stack.
                break
            i += 1
        return "%s:%s" % (stack[-4].f_code.co_filename, stack[-4].f_lineno)

    def format(self, record):
        try:
            if record.done:
                return record.msg
        except AttributeError:
            record.done = False

        rtime = time.strftime("%d/%b/%Y:%H:%M:%S", time.localtime(record.created))
        from tools.user_management import user_pack
        email = user_pack().get_email()

        if record.levelno > 20:
            if email:
                record.msg = '[%s][user:%s][%s] {%s} %s' % (rtime, email, record.levelname,
                                                            self.find_topmost_stack_frame(), record.msg)
            else:
                record.msg = '[%s][%s] {%s} %s' % (rtime, record.levelname,
                                                   self.find_topmost_stack_frame(), record.msg)
        else:
            if email:
                record.msg = '[%s][user:%s][%s] %s' % (rtime, email, record.levelname, record.msg)
            else:
                record.msg = '[%s][%s] %s' % (rtime, record.levelname, record.msg)

        record.done = True
        return logging.Formatter.format(self, record)
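For reference, a record above level 20 (WARNING and higher) comes out of this formatter looking like this (the values are invented):

[03/Dec/2013:13:55:19][user:someone@example.com][ERROR] {some/module.py:42} something went wrong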

And finally, the logger configuration:

import logging
import logging.handlers
import cherrypy

log = cherrypy.log
log.error_file = None

maxBytes = getattr(log, "rot_maxBytes", 10000000)
backupCount = getattr(log, "rot_backupCount", 1000)
fname = getattr(log, "rot_error_file", "logs/error.log")

logger = logging.getLogger()
logger.setLevel(0)

# Make a new RotatingFileHandler for the error log.
h = logging.handlers.RotatingFileHandler(fname, 'a', maxBytes, backupCount)
h.setFormatter(rest_formatter())  # rest_formatter is defined elsewhere
log.error_log.addHandler(h)

# Set up the custom ReST logger.
logger = logging.getLogger("rest_error")
logger.addHandler(h)

# Set up our custom logger. Note: this opens the same file a second time.
ha = logging.handlers.RotatingFileHandler(fname, 'a', maxBytes, backupCount)
ha.setFormatter(our_formatter())
logger = logging.getLogger("prod_error")
logger.addHandler(ha)

The application is multithreaded; however, the built-in logging module should be thread-safe (I was reading parts of its source today, and it certainly uses locks).
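To convince myself of that, I can run a minimal standalone sketch like the one below (file name invented): records from many threads come out intact, because Handler.handle() wraps emit() in a per-handler lock.

import logging
import threading

logging.basicConfig(filename='thread_test.log', level=logging.INFO,
                    format='%(asctime)s %(threadName)s %(message)s')

def worker(n):
    for i in range(100):
        logging.info('thread %d, record %d', n, i)

threads = [threading.Thread(target=worker, args=(n,)) for n in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Every line in thread_test.log is whole: each handler's emit() runs
# under that handler's lock, so records never interleave mid-line.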

The problem occurs only between the beginning of one file and the end of the previous one (never in the middle of a file), so I suspect some kind of file buffering or space reservation by the logger; still, I would expect the order to be preserved, since there should be only one logger instance per configured file handler.

We have A LOT of logging. By a lot I mean there are often 10+ log records every second.


Solution

The problem lay in having multiple handlers for a single file. The loggers used different handler instances, so they were writing to the same file at the same time; occasionally one of them would roll the file over while the other kept writing, and for some period they were writing to two distinct files.
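The race is easy to reproduce with two independent RotatingFileHandler instances pointed at the same path (the sizes and names below are made up): each keeps its own file descriptor and makes its own rollover decision, so after one of them rolls the file over, the other keeps appending to the renamed file until its own threshold trips.

import logging
import logging.handlers

h1 = logging.handlers.RotatingFileHandler('same.log', 'a', maxBytes=500, backupCount=3)
h2 = logging.handlers.RotatingFileHandler('same.log', 'a', maxBytes=500, backupCount=3)

log = logging.getLogger('race_demo')
log.setLevel(logging.INFO)
log.addHandler(h1)
log.addHandler(h2)

for i in range(200):
    log.info('record %03d', i)
# Both handlers emit every record. Around each rollover they disagree about
# which file is current, so the end of same.log.1 can carry later timestamps
# than the start of same.log.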

Removing the "ha" handler seems to have fixed the problem!
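Concretely, the fix amounts to sharing the single handler instance instead of opening the file a second time; a minimal sketch of that shape (assuming prod_error should keep writing to the same file):

# One handler instance means one file descriptor and one rollover
# decision, no matter how many loggers are attached to it.
h = logging.handlers.RotatingFileHandler(fname, 'a', maxBytes, backupCount)
h.setFormatter(our_formatter())

logging.getLogger('rest_error').addHandler(h)
logging.getLogger('prod_error').addHandler(h)  # share h; no separate 'ha'

Note that a shared handler also means a shared formatter, so if rest_error and prod_error need different layouts, that distinction has to live inside the formatter rather than in separate handlers.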

Other tips

Reading your question, it seems to me that you need to flush the write buffer.

To force the system to write something to a file right away, instead of waiting for it (the system) to flush the buffer's contents on its own, you do it like this:

from os import fsync

with open(filename, 'a') as fh:
    fh.write(something)
    fh.flush()             # push Python's internal buffer to the OS
    fsync(fh.fileno())     # ask the OS to commit it to disk
    # continued code

This is a necessary step when the data to write is tiny.
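In logging terms, the same idea can be folded into a handler subclass (a sketch, not a standard library class):

import logging
import os

class FsyncFileHandler(logging.FileHandler):
    # Flush and fsync after every record so nothing lingers in a buffer.
    def emit(self, record):
        logging.FileHandler.emit(self, record)
        if self.stream is not None:
            self.stream.flush()
            os.fsync(self.stream.fileno())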

However, when a file is closed, the write buffer is normally flushed and its contents are written to the file before it is fully closed.
So I don't know whether this answer really brings you anything useful.

Licensed under: CC-BY-SA with attribution