Persistent python subprocess

https://stackoverflow.com/questions/8980050

12-11-2019

Question

Is there a way to make a subprocess call in Python "persistent"? I'm calling a program multiple times, and it takes a while to load each time. It would be great if I could leave that program open and communicate with it without killing it.

The cartoon version of my python script looks like this:

for text in textcollection:
    myprocess = subprocess.Popen(["myexecutable"],
                stdin = subprocess.PIPE, stdout = subprocess.PIPE,
                stderr = None)
    myoutputtext, err = myprocess.communicate(input=text)

I need to process each text separately, so joining it all into one large text file and processing it once is not an option.

Preferably, if there's an option like this

myprocess = subprocess.Popen(["myexecutable"],
            stdin = subprocess.PIPE, stdout = subprocess.PIPE,
            stderr = None)
for text in textcollection:
    myoutputtext, err = myprocess.communicate(input=text)

where I can leave the process open, I'd really appreciate it.

Solution

You can use myprocess.stdin.write() and myprocess.stdout.read() to communicate with your subprocess; you just need to handle buffering carefully so that your calls don't block.

If the output from your subprocess has a predictable structure, such as exactly one line of output per line of input, you should be able to communicate with it reliably using line buffering and myprocess.stdout.readline().

Here is an example:

>>> import subprocess
>>> p = subprocess.Popen(['cat'], bufsize=1, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
>>> p.stdin.write('hello world\n')
>>> p.stdout.readline()
'hello world\n'
>>> p.stdout.readline()        # THIS CALL WILL BLOCK
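
The session above passes str objects through the pipes, which is Python 2 behaviour; in Python 3 the pipes carry bytes unless text mode is requested. The following is a minimal Python 3.7+ sketch of the same line-based approach, not a definitive implementation. CHILD is a hypothetical stand-in for "myexecutable" (a small Python child that upper-cases each line and flushes at once), and start_worker(), ask() and stop_worker() are illustrative names. It assumes the child answers every input line with exactly one output line.

```python
import subprocess
import sys

# Hypothetical stand-in for "myexecutable": upper-cases each input line
# and flushes its output immediately so the parent never waits on a buffer.
CHILD = [sys.executable, "-u", "-c",
         "import sys\n"
         "for line in sys.stdin:\n"
         "    sys.stdout.write(line.upper())\n"
         "    sys.stdout.flush()\n"]


def start_worker():
    # text=True gives str pipes; bufsize=1 makes them line-buffered.
    return subprocess.Popen(CHILD, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, text=True, bufsize=1)


def ask(proc, text):
    """Send one line, then block until one reply line comes back."""
    proc.stdin.write(text + "\n")
    proc.stdin.flush()
    return proc.stdout.readline().rstrip("\n")


def stop_worker(proc):
    """Close stdin so the child sees end-of-file, then wait for it to exit."""
    proc.stdin.close()
    proc.stdout.close()
    return proc.wait()
```

The process is started once with start_worker(), queried with ask() for each text in turn, and shut down with stop_worker() at the end. If the child ever replies with zero lines or several lines, ask() blocks or falls out of step, so this only suits strictly line-for-line protocols.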

An alternative to this method for Unix is to put the file handle in non-blocking mode, which will allow you to call functions like myprocess.stdout.read() and have it return data if any is available, or raise an IOError if there isn't any data:

>>> p = subprocess.Popen(['cat'], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
>>> import fcntl, os
>>> fcntl.fcntl(p.stdout.fileno(), fcntl.F_SETFL, os.O_NONBLOCK)
0
>>> p.stdout.read()         # raises an exception instead of blocking
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IOError: [Errno 11] Resource temporarily unavailable

This would allow you to do something like this:

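import fcntl, os, time

# Put the subprocess's stdout in non-blocking mode (Unix only).
fcntl.fcntl(myprocess.stdout.fileno(), fcntl.F_SETFL, os.O_NONBLOCK)
for text in textcollection:
    myprocess.stdin.write(text + '\n')
    myprocess.stdin.flush()
    myoutputtext = ''    # reset once per input, not on every read attempt
    while True:
        try:
            myoutputtext += myprocess.stdout.read()
        except IOError:
            pass    # no data available yet
        if validate_output(myoutputtext):
            break
        time.sleep(.1)    # short sleep before attempting another read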

In this example, validate_output() is a function you would need to write that returns True if the data you have received so far is all of the output that you expect to get.
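
For Python 3 on Unix, the same polling idea can be sketched with os.set_blocking() and os.read(), which raises BlockingIOError when no data is ready. This is an illustration under assumptions rather than a definitive implementation: CHILD is a hypothetical stand-in for "myexecutable", read_until() is an illustrative name, and its is_complete argument plays the role of validate_output() above. A timeout is added so that a silent child cannot make the loop spin forever.

```python
import os
import subprocess
import sys
import time

# Hypothetical stand-in for "myexecutable": upper-cases each input line.
CHILD = [sys.executable, "-u", "-c",
         "import sys\n"
         "for line in sys.stdin:\n"
         "    sys.stdout.write(line.upper())\n"
         "    sys.stdout.flush()\n"]


def read_until(proc, is_complete, timeout=5.0, poll=0.05):
    """Collect bytes from proc.stdout until is_complete(data) is true."""
    fd = proc.stdout.fileno()
    os.set_blocking(fd, False)          # Unix: reads no longer block
    data = b""
    deadline = time.monotonic() + timeout
    while not is_complete(data):
        if time.monotonic() > deadline:
            raise TimeoutError("subprocess did not finish its reply in time")
        try:
            chunk = os.read(fd, 4096)
        except BlockingIOError:
            time.sleep(poll)            # nothing available yet, try again
            continue
        if not chunk:                   # empty read means end-of-file
            break
        data += chunk
    return data
```

A caller would write one request to proc.stdin, flush it, and then call read_until(proc, lambda data: data.endswith(b"\n")) or pass whatever completeness check fits the child's output format.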

OTHER TIPS

It is the call to communicate() that is killing your subprocess. According to the subprocess documentation, the communicate() method will:

Interact with process: Send data to stdin. Read data from stdout and stderr, until end-of-file is reached. Wait for process to terminate.
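
The effect described in that passage can be observed directly. The sketch below assumes Python 3.7+ and uses a hypothetical stand-in child (CHILD) in place of "myexecutable"; run_once() is an illustrative name. It sends one text through communicate() and reports the state of the process afterwards.

```python
import subprocess
import sys

# Hypothetical stand-in for "myexecutable": upper-cases each input line.
CHILD = [sys.executable, "-u", "-c",
         "import sys\n"
         "for line in sys.stdin:\n"
         "    sys.stdout.write(line.upper())\n"
         "    sys.stdout.flush()\n"]


def run_once(text):
    """Send text via communicate() and report what is left of the process."""
    proc = subprocess.Popen(CHILD, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, text=True)
    out, err = proc.communicate(input=text)
    # communicate() closed stdin and waited for exit, so the exit status
    # is already known and the pipe can no longer be written to.
    return out, proc.returncode, proc.stdin.closed
```

Once communicate() returns, the exit status is set and stdin is closed, so the same Popen object cannot be reused for the next text.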

What you want to do is interact directly with the Popen object's stdin and stdout attributes to communicate with the subprocess. However, the documentation advises against this, saying:

Warning: Use communicate() rather than .stdin.write, .stdout.read or .stderr.read to avoid deadlocks due to any of the other OS pipe buffers filling up and blocking the child process.

So you either need to implement your own workarounds for potential deadlocks, or hope that someone has written an asynchronous subprocess module for you.
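
In Python 3, the standard library's asyncio module includes subprocess support, which is one way to get asynchronous pipe handling without a third-party module. The following is a minimal sketch assuming Python 3.7+, not a definitive implementation: CHILD is a hypothetical stand-in for "myexecutable", process_all() is an illustrative name, and the child is assumed to answer each input line with exactly one output line.

```python
import asyncio
import sys

# Hypothetical stand-in for "myexecutable": upper-cases each input line.
CHILD = [sys.executable, "-u", "-c",
         "import sys\n"
         "for line in sys.stdin:\n"
         "    sys.stdout.write(line.upper())\n"
         "    sys.stdout.flush()\n"]


async def process_all(texts, timeout=5.0):
    """Feed every text to one long-lived child and collect the replies."""
    proc = await asyncio.create_subprocess_exec(
        *CHILD,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE)
    results = []
    for text in texts:
        proc.stdin.write((text + "\n").encode())
        await proc.stdin.drain()        # wait until the pipe accepts the data
        # wait_for() bounds the wait so a silent child cannot hang the loop
        line = await asyncio.wait_for(proc.stdout.readline(), timeout)
        results.append(line.decode().rstrip("\n"))
    proc.stdin.close()                  # end-of-file lets the child exit
    await proc.wait()
    return results
```

It would be driven with something like asyncio.run(process_all(textcollection)). The child is started once and reused for every text.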

Edit: Here's a quick'n'dirty example of how the asynchronous subprocess module could be used:

import asyncsubprocess

textcollection = ['to', 'be', 'or', 'not', 'to be', 'that is the', 'question']

myprocess = asyncsubprocess.Popen(["cat"],
     stdin = asyncsubprocess.PIPE,
     stdout = asyncsubprocess.PIPE,
     stderr = None)

for text in textcollection:
    bytes_sent, myoutput, err = myprocess.listen(text)
    print text, bytes_sent, myoutput, err

When I run this, it prints:

to 2 to 
be 2 be 
or 2 or 
not 3 not 
to be 5 to be 
that is the 11 that is the 
question 8 question

I think you're looking for

myprocess.stdin.write(text)

You could create a list of Popen objects and then call communicate() on each element in a second loop, something like this:

processes=[]
for text in textcollection:
    myprocess = subprocess.Popen(["myexecutable"],
                stdin = subprocess.PIPE, stdout = subprocess.PIPE,
                stderr = None)
    myprocess.stdin.write(text)
    processes.append(myprocess)

for proc in processes:
    myoutput, err=proc.communicate()
    #do something with the output here

This way, nothing waits on a process until after all of the Popen calls have started, so the processes load in parallel rather than one after another.

if os.name == 'nt':
    startupinfo = subprocess.STARTUPINFO()
    startupinfo.dwFlags |= subprocess._subprocess.STARTF_USESHOWWINDOW
    subprocess.call(os.popen(tempFileName), shell=True)
    os.remove(tempFileName)

Licensed under: CC-BY-SA with attribution
