Custom Scheduler to have sequential + semi-sequential scripts with timeouts/kill switches?

Question 1

If I understand what you are trying to do, subprocess.Popen() is the way to go. Here's a simple class that I think provides all the functionality you want:

from time import sleep
import subprocess
import datetime
import os

class Worker:

    def __init__(self, cmd):

        print datetime.datetime.now(), ":: starting subprocess :: %s"%cmd
        self.cmd = cmd
        self.log = "[running :: %s]\n"%cmd
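        # NOTE: the pipes are only read once the process has exited or been killed,
        # so a child that writes more than the OS pipe buffer holds will block.
        self.subp = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)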
        self.start_time = datetime.datetime.now()

    def wait_to_finish(self, timeout_seconds = None):

        while True:
            retcode = self.subp.poll()
            if retcode is not None:
                self.get_process_output()
                self.log += "\n[subprocess finished, return code: %d]\n"%retcode
                print datetime.datetime.now(), ":: subprocess %s exited, retcode=%d"%(self.cmd, retcode)
                return
            else:
                # process hasn't finished yet
                sleep(1)
                if timeout_seconds is not None:
                    elapsed = datetime.datetime.now() - self.start_time
                    # .seconds alone wraps around after 24 hours, so count the days too
                    if elapsed.days * 86400 + elapsed.seconds > timeout_seconds:
                        print datetime.datetime.now(), ":: subprocess %s :: killing after %d seconds"%(self.cmd, timeout_seconds)
                        self.kill()
                        return

    def still_running(self):
        return (self.subp.poll() is None)

    def kill(self):
        self.subp.terminate()
        self.get_process_output()
        self.log += "\n[subprocess killed by explicit request]\n"
        return

    def get_process_output(self):
        out, err = self.subp.communicate()
        self.log += out
        self.log += err

You pass in the command and the class starts it in the background. You can then wait for it to finish, with an optional timeout (counted from the time the process was started). You can also get the process output and, if needed, explicitly kill the process.
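On Python 3 (an assumption; the class above is written for Python 2), most of the polling loop can be replaced by Popen.communicate(timeout=...), which keeps reading the pipes while it waits. A minimal sketch of that idea follows; run_with_timeout is a name made up for illustration, and text=True needs Python 3.7 or later:

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_seconds=None):
    """Run cmd and wait at most timeout_seconds for it to finish.

    Returns (returncode, stdout, stderr, timed_out). Hypothetical helper,
    not part of the Worker class above.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,  # decode output to str (Python 3.7+)
    )
    try:
        # communicate() drains both pipes while waiting, so a chatty
        # child cannot stall on a full pipe buffer.
        out, err = proc.communicate(timeout=timeout_seconds)
        timed_out = False
    except subprocess.TimeoutExpired:
        # The child is still running: kill it, then collect whatever
        # output it produced before it died.
        proc.kill()
        out, err = proc.communicate()
        timed_out = True
    return proc.returncode, out, err, timed_out


if __name__ == "__main__":
    # A quick command that finishes well within its timeout.
    print(run_with_timeout([sys.executable, "-c", "print('hello')"], 10))
    # A command that sleeps past its timeout and gets killed.
    print(run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], 1))
```

Unlike Worker, this helper blocks until the command ends or times out, so it suits the sequential steps rather than the ones that must run side by side.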

Here's a quick example showing its functionality:

# Start two subprocesses in the background
worker1 = Worker([r'c:\python26\python.exe', 'sub1.py'])
worker2 = Worker([r'c:\python26\python.exe', 'sub2.py'])

# Wait for both to finish, kill after 10 seconds timeout
worker1.wait_to_finish(timeout_seconds = 10)
worker2.wait_to_finish(timeout_seconds = 10)

# Start another subprocess giving it 5 seconds to finish
worker3 = Worker([r'c:\python26\python.exe', 'sub3.py'])
worker3.wait_to_finish(timeout_seconds = 5)

print "----LOG1----\n" + worker1.log
print "----LOG2----\n" + worker2.log
print "----LOG3----\n" + worker3.log

sub1.py:

from time import sleep
print "sub1 output: start"
sleep(5)
print "sub1 output: finish"

sub2.py:

print "sub2 output: start"
erroneous_command()

sub3.py:

from time import sleep
import sys
print "sub3 output: start, sleeping 15 sec"
sys.stdout.flush()
sleep(15)
print "sub3 output: finish"

Here's the output:

2013-11-06 15:31:17.296000 :: starting subprocess :: ['c:\\python26\\python.exe', 'sub1.py']
2013-11-06 15:31:17.300000 :: starting subprocess :: ['c:\\python26\\python.exe', 'sub2.py']
2013-11-06 15:31:23.306000 :: subprocess ['c:\\python26\\python.exe', 'sub1.py'] exited, retcode=0
2013-11-06 15:31:23.309000 :: subprocess ['c:\\python26\\python.exe', 'sub2.py'] exited, retcode=1
2013-11-06 15:31:23.310000 :: starting subprocess :: ['c:\\python26\\python.exe', 'sub3.py']
2013-11-06 15:31:29.314000 :: subprocess ['c:\\python26\\python.exe', 'sub3.py'] :: killing after 5 seconds
----LOG1----
[running :: ['c:\\python26\\python.exe', 'sub1.py']]
sub1 output: start
sub1 output: finish

[subprocess finished, return code: 0]

----LOG2----
[running :: ['c:\\python26\\python.exe', 'sub2.py']]
sub2 output: start
Traceback (most recent call last):
  File "sub2.py", line 2, in <module>
    erroneous_command()
NameError: name 'erroneous_command' is not defined

[subprocess finished, return code: 1]

----LOG3----
[running :: ['c:\\python26\\python.exe', 'sub3.py']]
sub3 output: start, sleeping 15 sec

[subprocess killed by explicit request]

As far as implementing the scheduling goes, I can suggest a couple of options, but the choice really depends on what your task is:

1) If you can specify the precise scheduling at any point in time, then you can implement a fully synchronous scheduler:

while True:
    # check time
    # check currently running processes :: workerX.still_running()
    #   -> if some are past their timeout, kill them workerX.kill()
    # start new subprocesses according to your scheduling logic
    sleep(1)

2) If you have several well-defined sequences of scripts that you just want to "fire and forget" every 10 seconds, put each sequence in its own .py script (with 'import Worker'), start all the sequences every 10 seconds, and periodically check which sequences have exited so you can collect their logs.

3) If your sequences are defined dynamically and you prefer the "fire-and-forget" approach, then threads would be the best approach.
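To make option 1 more concrete, here is a rough Python 3 sketch of a synchronous polling scheduler. It assumes each "sequence" is a list of (command, timeout) steps that must run one after another, while different sequences run side by side. The name run_sequences and the shape of its arguments are invented for this illustration, and child output is simply discarded to keep the sketch short:

```python
import subprocess
import sys
import time


def run_sequences(sequences, poll_interval=0.1):
    """Run the steps of each sequence in order; sequences run in parallel.

    `sequences` maps a name to a list of (cmd, timeout_seconds) steps.
    Returns a dict mapping each name to a list with one entry per step:
    the return code, or the string "killed" if the step timed out.
    """
    queues = {name: list(steps) for name, steps in sequences.items()}
    running = {}  # name -> (Popen object, deadline on the monotonic clock)
    results = {name: [] for name in sequences}

    while running or any(queues.values()):
        now = time.monotonic()

        # Check currently running processes; kill those past their deadline.
        for name, (proc, deadline) in list(running.items()):
            retcode = proc.poll()
            if retcode is not None:
                results[name].append(retcode)
                del running[name]
            elif now > deadline:
                proc.kill()
                proc.wait()  # reap the killed child
                results[name].append("killed")
                del running[name]

        # Start the next step of every sequence that is currently idle.
        for name, queue in queues.items():
            if name not in running and queue:
                cmd, timeout_seconds = queue.pop(0)
                proc = subprocess.Popen(
                    cmd,
                    stdout=subprocess.DEVNULL,  # output discarded in this sketch
                    stderr=subprocess.DEVNULL,
                )
                running[name] = (proc, time.monotonic() + timeout_seconds)

        time.sleep(poll_interval)

    return results


if __name__ == "__main__":
    quick = [sys.executable, "-c", "pass"]
    slow = [sys.executable, "-c", "import time; time.sleep(30)"]
    # "a" runs two quick steps back to back; "b" is killed after 1 second.
    print(run_sequences({"a": [(quick, 10), (quick, 10)], "b": [(slow, 1)]}))
```

In this sketch a timed-out step does not abort its sequence; the next step still starts. Whether that is the right policy depends on your scheduling logic.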

Question 2

As you already indicated in your question, you are actually asking two different questions (running in background, and enforcing a timeout). Fortunately, the short answer for both is one and the same:

Use Plumbum!

Plumbum greatly simplifies the shell-scripting-like elements of your Python script and, among other things, provides clean interfaces for running commands in the background and for enforcing timeouts.

Below is an example using plumbum for this.

In this example, the subprocesses will all run the same script -- subscript1.py. It does some printing, some sleeping, and it sometimes fails, randomly.

subscript1.py

import os, sys, time, random
print '[pid=%s] STARTING %s' % (os.getpid(), sys.argv[0])
for i in range(3):
    t = random.randint(1,5)
    print '[pid=%s] sleeping for %s seconds' % (os.getpid(), t)
    time.sleep(t)
# fail randomly
if t == 5:
    raise RuntimeError('random error...')
print '[pid=%s] DONE %s' % (os.getpid(), sys.argv[0])

Now, the main script below, main.py, demonstrates how to run subprocesses, in the foreground and background, with and without a timeout, wait for background processes to finish, and handle subprocess errors and timeouts.

main.py

import os, sys, time
from plumbum import FG, BG, ProcessExecutionError, ProcessTimedOut
from plumbum.cmd import python

cmd = python['subscript1.py']  # create the command to run (several times)

def run_subscript(cmd, is_bg = False):
    print '[pid=%s] main running command: %s (is_bg=%s)' % (os.getpid(), cmd, is_bg)
    if is_bg:
        return (cmd > sys.stdout) & BG  # run in background
    else:
        try:
            return cmd & FG  # run in foreground
        except ProcessExecutionError, e:
            print >>sys.stderr, e

# run a process in the foreground
run_subscript(cmd, is_bg = False)

# run two processes in the background, and one in the foreground
bg_proc1 = run_subscript(cmd, is_bg = True)
time.sleep(1)
bg_proc2 = run_subscript(cmd, is_bg = True)
time.sleep(1)
run_subscript(cmd, is_bg = False)

# wait for the background processes to finish
for bg_proc in ( bg_proc1, bg_proc2 ):
    try:
        bg_proc.wait()
    except ProcessExecutionError, e:
        print >>sys.stderr, e

# run a foreground process, which will time out
print '[pid=%s] main running command: %s (will time out)' % (os.getpid(), cmd)
try:
    cmd.run(timeout = 2)
except ProcessTimedOut, e:
    # command timed out
    print >>sys.stderr, e
except ProcessExecutionError, e:
    # command failed (but did not time out)
    print >>sys.stderr, e

Output:

% python main.py
[pid=77311] main running command: /usr/local/bin/python subscript1.py (is_bg=False)
[pid=77314] STARTING subscript1.py
[pid=77314] sleeping for 1 seconds
[pid=77314] sleeping for 5 seconds
[pid=77314] sleeping for 3 seconds
[pid=77314] DONE subscript1.py
[pid=77311] main running command: /usr/local/bin/python subscript1.py (is_bg=True)
[pid=77316] STARTING subscript1.py
[pid=77316] sleeping for 5 seconds
[pid=77311] main running command: /usr/local/bin/python subscript1.py (is_bg=True)
[pid=77317] STARTING subscript1.py
[pid=77317] sleeping for 1 seconds
[pid=77311] main running command: /usr/local/bin/python subscript1.py (is_bg=False)
[pid=77317] sleeping for 5 seconds
[pid=77318] STARTING subscript1.py
[pid=77318] sleeping for 5 seconds
[pid=77316] sleeping for 2 seconds
[pid=77316] sleeping for 4 seconds
[pid=77317] sleeping for 5 seconds
[pid=77318] sleeping for 2 seconds
[pid=77318] sleeping for 3 seconds
[pid=77316] DONE subscript1.py
[pid=77318] DONE subscript1.py
Command line: ['/usr/local/bin/python', 'subscript1.py']
Exit code: 1
Stderr:  | Traceback (most recent call last):
         |   File "subscript1.py", line 13, in <module>
         |     raise RuntimeError('random error...')
         | RuntimeError: random error...
[pid=77311] main running command: /usr/local/bin/python subscript1.py (will time out)
('Process did not terminate within 2 seconds', ['/usr/local/bin/python', 'subscript1.py'])

EDIT:

I now realize my sample code does not demonstrate running a command in the background and enforcing a timeout on it. For that, simply use cmd.bgrun(...) instead of cmd.run(...).

The error you are getting is about the redirection, and must be related to the fact that you are running on Windows. This is either a compatibility problem of plumbum on Windows, or my code is not quite right, i.e. there may be another way to use plumbum to make it work. Unfortunately, I don't have a Windows machine to test it on...

I hope this helps.

