Question

I built a web service using Tornado and it runs day and night. I start the service with:

nohup python my_service.py &

The service log is written to nohup.out. However, the file keeps growing over time. How can I manage it more conveniently? For example, is there an automatic way to generate log files with proper names and sizes? Such as:

service_log_1.txt
service_log_2.txt
service_log_3.txt
...

Thanks.


Solution 4

@jujaro's answer is quite helpful and I tried the logging module in my web service. However, there are still some restrictions on using logging with Tornado; see my other question about that.

As a result, I used crontab on Linux to create a cron job that runs at midnight (edit the crontab with crontab -e in a Linux shell):

59 23 * * * source /home/zfz/cleanlog.sh

This cron job launches my script cleanlog.sh at 23:59 every day.

The contents of cleanlog.sh:

fn=$(date +%F_service_log.out)
cat /home/zfz/nohup.out >> "/home/zfz/log/$fn"
echo '' > /home/zfz/nohup.out

This script appends the log to a file named with the current date, and echo '' clears nohup.out so it cannot grow too large. Here are the log files split from nohup.out so far:

-rw-r--r-- 1 zfz zfz  54474342 May 22 23:59 2013-05-22_service_log.out
-rw-r--r-- 1 zfz zfz  23481121 May 23 23:59 2013-05-23_service_log.out
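A small variation on the script above, shown here on temporary files so it can be run anywhere (the real paths would be /home/zfz/nohup.out and /home/zfz/log/ as in the answer): `: > file` truncates the log to zero bytes, whereas `echo '' >` leaves a single newline behind, and an appending writer such as nohup keeps working across the truncation.

```shell
#!/bin/sh
# Sketch of the same copy-then-truncate rotation, demonstrated on
# temporary files instead of the real service paths.
LOG=$(mktemp)                 # stands in for /home/zfz/nohup.out
DESTDIR=$(mktemp -d)          # stands in for /home/zfz/log
printf 'some service output\n' >> "$LOG"

DEST="$DESTDIR/$(date +%F)_service_log.out"
cat "$LOG" >> "$DEST"         # append current contents to the dated file
: > "$LOG"                    # truncate in place; zero bytes, no stray newline

wc -c < "$LOG"                # the live log is empty again
cat "$DEST"                   # the rotated copy keeps the output
```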

Other tips

Yes, there is. Put a cron job in place which truncates the file (with something like "cat /dev/null > nohup.out"). How often you have to run this job depends on how much output your process generates.

But if you do not need the output of the job at all (maybe it's garbage anyway; only you can answer that) you can prevent the file nohup.out from being written in the first place. Right now you start the process like this:

nohup command &

replace this with

nohup command 2>/dev/null 1>/dev/null &

and the file nohup.out won't even get created.
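The two redirections can also be written more compactly with `>/dev/null 2>&1`, which points stdout at /dev/null and then sends stderr to the same place. A quick demo in a scratch directory, using `sleep` as a stand-in for the real command:

```shell
# With both streams redirected, nohup has nothing to capture
# and never creates nohup.out.
cd "$(mktemp -d)"
nohup sleep 1 >/dev/null 2>&1 &
wait
test ! -e nohup.out && echo "no nohup.out created"
```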

The reason why the output of the process is being directed to a file is:

Normally all processes (that is, commands you enter from the command line; there are exceptions, but they don't matter here) are attached to a terminal. By default (this is how Unix handles it) this is something which can display text and is connected to the host via a serial line. If you enter a command and then switch off the terminal you entered it from, the process gets terminated too, because it lost its terminal. Because serial communication traditionally borrowed its vocabulary from telephony (where it came from), the termination of a connection was not called an "interruption" or "termination" but a "hangup". So programs got terminated on hangups, and the program to prevent this was nohup, the "no-termination-upon-hangup" program.

But since such an orphaned process may well have no terminal to write to, nohup uses the file nohup.out as a "screen replacement", redirecting the output there which would normally go to the screen. If a command produces no output at all, though, nohup.out won't get created.

Simple:

nohup python my_service.py >service_log_1.txt 2>&1 &

If the output is produced by some library that you are using and you don't have control over it, you can redirect standard output to a file when the program starts. Then you can close and re-open the file periodically to prevent the files from growing forever. You can use a timestamp for the name of the file so it is always different. Something like:

import sys
import datetime

current_stdout = None

def reset_stdout():
    # Close the previous log file (if any) and point stdout at a fresh
    # timestamped file, e.g. 20130523235900.log
    global current_stdout
    if current_stdout:
        current_stdout.close()
    sys.stdout = open(
        "{}.log".format(
            datetime.datetime.now().strftime('%Y%m%d%H%M%S')
            ),
        "w")
    current_stdout = sys.stdout

reset_stdout()

# Then call this periodically, e.g. once a day or once a week:
reset_stdout()

If you prefer to reset the file depending on its size, you can monitor it periodically with something like this:

if current_stdout.tell() > 1000000:
    reset_stdout()

But if you do have control over the output that you are sending, I strongly recommend using the logging library. It has a lot of flexibility in what you can do with the output. The messages sent to the log are processed by objects called handlers. One of the handlers included in the library does exactly what you want; it is called RotatingFileHandler. From the documentation:

class logging.handlers.RotatingFileHandler(filename, mode='a', maxBytes=0, backupCount=0, encoding=None, delay=0)

"You can use the maxBytes and backupCount values to allow the file to rollover at a predetermined size. When the size is about to be exceeded, the file is closed and a new file is silently opened for output. Rollover occurs whenever the current log file is nearly maxBytes in length; if maxBytes is zero, rollover never occurs."

So you can do all the logging using something like this:

import logging
import logging.handlers
# Create a logger
log = logging.getLogger('example')
# Set the logger's level: all messages with a level of INFO or higher
# will be sent to the log
log.setLevel(logging.INFO)
# Create the handler and set 20 files to rotate and a maximum size of 1MB
handler = logging.handlers.RotatingFileHandler('log', maxBytes=1000000, backupCount=20)
# Now attach the handler to the logger object
log.addHandler(handler)

# Now you can send your output like:
log.info('text text text text text text text text text')
log.info(....
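Since the accepted solution rotates daily via cron, it is worth noting that the same library also ships a TimedRotatingFileHandler, which rolls the file over at midnight with no cron job at all. A minimal sketch (the logger name and file name here are arbitrary choices, not part of the original answer):

```python
import logging
import logging.handlers

log = logging.getLogger('example_timed')
log.setLevel(logging.INFO)

# Roll the file over at midnight, keeping 7 dated backups
# (service_log.txt.2013-05-22, service_log.txt.2013-05-23, ...).
handler = logging.handlers.TimedRotatingFileHandler(
    'service_log.txt', when='midnight', backupCount=7)
log.addHandler(handler)

log.info('this line goes to service_log.txt')
```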
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow