Question

The documentation for Python's subprocess module says this about the communicate() method:

Note

The data read is buffered in memory, so do not use this method if the data size is large or unlimited.

How can I execute a process that produces a lot of data (i.e. communicate() is contraindicated) and still have access to its stderr output?

Solution

To read a subprocess's stdout and stderr separately, as soon as each becomes available and with no limit on the amount of output, you could use Twisted's reactor.spawnProcess():

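#!/usr/bin/env python
from __future__ import print_function  # keeps print() working on Python 2 as well

from twisted.internet import protocol
from twisted.internet import reactor

class ProcessProtocol(protocol.ProcessProtocol):
    def outReceived(self, data):
        # Called with each chunk of the child's stdout (bytes) as it arrives.
        print('got stdout:', data)
    def errReceived(self, data):
        # Called with each chunk of the child's stderr (bytes) as it arrives.
        print('got stderr:', data)
    def processEnded(self, reason):
        # The child has exited and its pipes are closed; stop the event loop.
        reactor.stop()

process = ProcessProtocol()
# 'cmd' is a placeholder executable; the list is the child's argv, including argv[0].
reactor.spawnProcess(process, 'cmd', ['cmd', 'arg 1', 'arg 2'])
reactor.run()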

Alternatives are to use threads (e.g., teed_call()) or OS-specific code: the fcntl module to make the pipes non-blocking on POSIX systems, or overlapped I/O with named pipes on Windows.
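The thread-based alternative can be sketched with the standard library alone. This is a minimal illustration, not the teed_call() implementation mentioned above: the names run_streaming, drain, on_stdout and on_stderr are hypothetical, and it assumes the child's output is line-oriented enough that handling it line by line is acceptable.

```python
import subprocess
import sys
import threading


def run_streaming(cmd, on_stdout, on_stderr):
    """Run cmd and hand each stdout/stderr line (bytes) to a callback as it arrives.

    Hypothetical helper: the function and callback names are illustrative.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

    def drain(pipe, callback):
        # Iterating a pipe yields one line at a time, so only the current
        # line is held in memory, never the whole output.
        for line in pipe:
            callback(line)
        pipe.close()

    # stderr is drained in a helper thread so the child can never block on a
    # full stderr pipe while this thread is busy reading stdout.
    err_thread = threading.Thread(target=drain, args=(proc.stderr, on_stderr))
    err_thread.start()
    drain(proc.stdout, on_stdout)
    err_thread.join()
    return proc.wait()


if __name__ == "__main__":
    child = "import sys; print('to stdout'); sys.stderr.write('to stderr\\n')"
    run_streaming(
        [sys.executable, "-c", child],
        lambda line: print("got stdout:", line),
        lambda line: print("got stderr:", line),
    )
```

Because neither pipe is ever left unread, the child cannot deadlock on a full pipe buffer, and memory use stays bounded by the length of a single line.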

OTHER TIPS

On Windows, communicate() solves this problem by using threads; on POSIX systems it multiplexes the pipes with select/poll instead. To take the threaded route yourself, you need an extra thread that reads stderr while the main thread does the main work (reading stdout). Alternatively, you can use select.select(), but on Windows it accepts only sockets, not pipes.
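For the select-style approach, here is a minimal POSIX-only sketch. It uses the standard selectors module (a thin wrapper over select/poll) rather than calling select.select() directly; run_multiplexed, on_stdout, on_stderr and chunk_size are hypothetical names chosen for illustration.

```python
import os
import selectors
import subprocess
import sys


def run_multiplexed(cmd, on_stdout, on_stderr, chunk_size=65536):
    """Run cmd and pass chunks of stdout/stderr (bytes) to callbacks as they arrive.

    POSIX only: on Windows, select-style calls do not accept pipes.
    Hypothetical helper: the function and parameter names are illustrative.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    sel = selectors.DefaultSelector()
    # The callback for each pipe is stored as the selector key's data.
    sel.register(proc.stdout, selectors.EVENT_READ, on_stdout)
    sel.register(proc.stderr, selectors.EVENT_READ, on_stderr)

    open_pipes = 2
    while open_pipes:
        for key, _events in sel.select():
            # os.read returns whatever is available (up to chunk_size)
            # without waiting for the buffer to fill.
            data = os.read(key.fd, chunk_size)
            if data:
                key.data(data)
            else:
                # An empty read means the child closed this end of the pipe.
                sel.unregister(key.fileobj)
                key.fileobj.close()
                open_pipes -= 1
    sel.close()
    return proc.wait()
```

A single thread services both pipes here, and at most chunk_size bytes are held at a time, so the total output size does not matter.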

Depending on your problem, you may be able to rearrange the code so that the process's stderr is piped into your Python code. This page has some pointers.

Licensed under: CC-BY-SA with attribution