How do you check the status of or kill an external process with Python

https://stackoverflow.com/questions/8711599

13-04-2021

Question

I have a Python script that runs on my web server. The main function is called, and when it returns the script just sleeps for a few seconds and then calls it again. Its purpose is to pick up any newly uploaded videos that users have added, convert them to WebM, pull out the middle frame as an image, and do a bunch of other funky stuff. I am using an external call to ffmpeg. The code clip below shows how I call it.

    duration = output[durationIndex+10:durationIndex+18]
    durationBits = duration.split(":")
    lengthInSeconds = (int(durationBits[0])*60*60) + (int(durationBits[1])*60) + (int(durationBits[2]))

    child = subprocess.Popen(["ffmpeg","-y","-i",sourceVideo,"-f","mjpeg","-vframes","1","-ss",str(lengthInSeconds/2),destination], shell=True, stderr=subprocess.PIPE)
    output = ""
    while True:
        out = child.stderr.read(1)
        if out == '' and child.poll() != None:
            break
        if out != '':
            output += out

    updateSQL = "update `videos_graduatevideo` set thumbnail = '" + str(destination) + "' where `original_video` = '" + sourceVideo + "'"
    cursor.execute(updateSQL)

This script is running on a Windows machine atm but I will probably deploy it on a Unix system when it is dev complete.

The problem is that I need this Python script to keep running. If something goes wrong with ffmpeg and my script hangs, user-uploaded videos will just sit in a "pending" status until I go poke the Python script. I know that a certain mov file I have makes ffmpeg hang indefinitely. Is there some way I can check how long a process has been running and then kill it off if it has been running for too long?

Solution

I agree with S. Lott that you would probably benefit from considering a message queue (MQ) for your architecture, but for this particular issue I think your use of Popen is OK.

For each process you create, save the creation time (something like datetime.datetime.today() would suffice). Then, every minute or so, go over the list of open processes and their start times and reap the ones that shouldn't be there using Popen.send_signal(signal), terminate(), or kill().

Example:

import time
from subprocess import Popen
from datetime import datetime, timedelta

jobs = []       # list of (start_time, process) tuples
max_life = 600  # in seconds

def reap_jobs(jobs):
  now = datetime.today()
  for started, proc in jobs:
    # kill any job that is still running past its allowed lifetime
    if proc.poll() is None and started < now - timedelta(seconds=max_life):
      proc.kill()
      # remove the job from the list if you want,
      # but remember not to do it while iterating over the list

for video in list_of_videos:
  started = datetime.today()  # don't call this "time": it would shadow the time module
  job = Popen(...)
  jobs.append((started, job))

while True:
  reap_jobs(jobs)
  time.sleep(60)
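
The reaping idea above can also be tried out as a small self-contained script. This is only a sketch under some assumptions: the helper name reap_expired is made up for illustration, a Python child that sleeps stands in for a hung ffmpeg, and time.monotonic() is used instead of datetime so that clock adjustments don't affect the age check. It relies on Popen.poll() returning None while the child is still running and the exit code once it has finished.

```python
import subprocess
import sys
import time

def reap_expired(jobs, max_life):
    """Kill jobs older than max_life seconds; return the jobs still running.

    jobs is a list of (start_time, Popen) tuples, where start_time
    comes from time.monotonic().
    """
    now = time.monotonic()
    still_running = []
    for started, proc in jobs:
        if proc.poll() is not None:
            continue  # already exited; poll() returned its exit code
        if now - started > max_life:
            proc.kill()  # forceful stop
            proc.wait()  # collect the exit status so no zombie is left
        else:
            still_running.append((started, proc))
    return still_running

# Hypothetical stand-in for a hung ffmpeg: a child that sleeps for 30 s.
slow_cmd = [sys.executable, "-c", "import time; time.sleep(30)"]
child = subprocess.Popen(slow_cmd)
jobs = [(time.monotonic(), child)]

time.sleep(0.2)
jobs = reap_expired(jobs, max_life=0.1)  # anything older than 0.1 s is killed
print(jobs)  # the killed job is no longer tracked
```

Returning a new list sidesteps the problem of removing items from a list while iterating over it.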

Other answers

Since the controlling script is the one that started it, and since you want it killed based on time, not system resource usage, it should be fairly simple. Below is your example code with some modifications; look for the lines with comments.

import time
timeout = 60 #child is allowed to run for 1 minute.
duration = output[durationIndex+10:durationIndex+18]
durationBits = duration.split(":")
lengthInSeconds = (int(durationBits[0])*60*60) + (int(durationBits[1])*60) + (int(durationBits[2]))

child = subprocess.Popen(["ffmpeg","-y","-i",sourceVideo,"-f","mjpeg","-vframes","1","-ss",str(lengthInSeconds/2),destination], shell=True, stderr=subprocess.PIPE)
killtime = time.time() + timeout #timestamp after which the child process should be killed
output = ""
while True:
    out = child.stderr.read(1)
    if out == '' and child.poll() != None:
        break
    if out != '':
        output += out
    if time.time() > killtime: #check if 60 seconds have passed
        child.kill() #tell the child to exit
        raise RuntimeError("Child process still going %i seconds after launch" % timeout) #raise an exception so that updateSQL doesn't get executed

updateSQL = "update `videos_graduatevideo` set thumbnail = '" + str(destination) + "' where `original_video` = '" + sourceVideo + "'"
cursor.execute(updateSQL)

You could change the RuntimeError to something else, or have it set a flag instead of raising an exception, depending on what else you need it to do. The child.kill() line will cause the child process to die, but it may not be the most graceful way to end it. If you deploy it on a POSIX system, you could use os.system('kill -s 15 %i' % child.pid) instead to end it more gracefully; child.terminate() sends the same SIGTERM signal without spawning a shell.
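
One caveat with the loop above is that child.stderr.read(1) blocks until the child writes something, so if ffmpeg hangs without producing output the time check may never be reached. On Python 3.3 or later, Popen.communicate() accepts a timeout argument that avoids this. The sketch below is not the original answer's code: the function name run_with_timeout and the grace parameter are made up for illustration, and a sleeping Python child stands in for ffmpeg. It first asks the child to stop with terminate() and only falls back to kill() if the child is still alive after the grace period.

```python
import subprocess
import sys

def run_with_timeout(cmd, timeout, grace=5):
    """Run cmd and return (returncode, stderr_text, timed_out)."""
    # text=True (Python 3.7+) decodes stderr to str instead of bytes
    child = subprocess.Popen(cmd, stderr=subprocess.PIPE, text=True)
    timed_out = False
    try:
        _, err = child.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        timed_out = True
        child.terminate()  # polite request: SIGTERM on POSIX
        try:
            _, err = child.communicate(timeout=grace)
        except subprocess.TimeoutExpired:
            child.kill()  # forceful stop if the child ignored the request
            _, err = child.communicate()
    return child.returncode, err, timed_out

# Hypothetical stand-in for a hung ffmpeg: a child that sleeps for 30 s.
hang = [sys.executable, "-c", "import time; time.sleep(30)"]
code, err, timed_out = run_with_timeout(hang, timeout=0.5)
if timed_out:
    print("child was stopped after the timeout, exit code", code)
```

In the original script, the caller could skip the SQL update whenever timed_out is true, which plays the same role as raising RuntimeError.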

There is a Python module, psutil, that provides an interface for retrieving information on all running processes and system utilization (CPU, disk, memory) in a portable way, implementing many functionalities offered by command-line tools such as ps, top, df, kill, free, lsof, netstat, ifconfig, nice, ionice, iostat, iotop, uptime and tty. It should help.

Take a look at God - A Process Monitor, which monitors the processes you specify and performs actions according to your monitoring conditions. For example, it can keep an eye on CPU usage and restart the process if CPU usage is above 50%:

# code in Ruby
# copied from the documentation
w.restart_if do |restart|   
  restart.condition(:cpu_usage) do |c|
    c.above = 50.percent
    c.times = 5
  end
end

Step 1. Don't use CGI scripts. Use a framework.

Step 2. Don't start the subprocess directly in the function which creates the response. Use celery.

This process is just running on the server all the time. It's independent of any framework and reads from the same DB that Django populates.

Step 2, again. Don't leave this subprocess running all the time. Use Celery so that it is started when a request arrives, handles that request (and only that request) and then stops.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow