How do I retrieve stderr for a shell command with a large data buffer?

Question 1

To read a subprocess's stdout and stderr separately, as soon as the output becomes available and without any limit on its size, you could use Twisted's reactor.spawnProcess():

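#!/usr/bin/env python
from __future__ import print_function  # keeps print() consistent on Python 2 and 3
from twisted.internet import protocol
from twisted.internet import reactor

class ProcessProtocol(protocol.ProcessProtocol):
    def outReceived(self, data):
        # Called with each chunk of the child's stdout as it arrives.
        print('got stdout:', data)
    def errReceived(self, data):
        # Called with each chunk of the child's stderr as it arrives.
        print('got stderr:', data)
    def processEnded(self, reason):
        # The child has exited and its pipes are closed; stop the event loop.
        reactor.stop()

process = ProcessProtocol()
# 'cmd' and its arguments are placeholders; the argument list starts with argv[0].
reactor.spawnProcess(process, 'cmd', ['cmd', 'arg 1', 'arg 2'])
reactor.run()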

An alternative is to use threads (e.g., teed_call()), or OS-specific code: the fcntl module to make the pipes non-blocking on POSIX systems, or overlapped I/O with named pipes on Windows.
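A minimal sketch of the POSIX route, assuming Python 3 and a Unix-like system: both pipes are made non-blocking with fcntl, and select.select() waits until either has data. The names set_nonblocking and read_both are made up for this illustration, and the 65536-byte read size is arbitrary.

```python
import fcntl
import os
import select
import subprocess
import sys


def set_nonblocking(fd):
    """Add O_NONBLOCK to the descriptor's status flags (POSIX only)."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)


def read_both(cmd):
    """Run cmd and return (stdout_bytes, stderr_bytes), draining both pipes as data arrives."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out_fd = proc.stdout.fileno()
    err_fd = proc.stderr.fileno()
    chunks = {out_fd: [], err_fd: []}
    for fd in chunks:
        set_nonblocking(fd)

    open_fds = set(chunks)
    while open_fds:
        # Block until at least one pipe is readable (or has reached EOF).
        ready, _, _ = select.select(list(open_fds), [], [])
        for fd in ready:
            try:
                data = os.read(fd, 65536)
            except BlockingIOError:
                continue  # nothing to read after all; wait again
            if data:
                chunks[fd].append(data)  # a real program would process the chunk here
            else:
                open_fds.discard(fd)  # an empty read means EOF on this pipe

    proc.stdout.close()
    proc.stderr.close()
    proc.wait()
    return b"".join(chunks[out_fd]), b"".join(chunks[err_fd])
```

Because neither pipe is left unread while the other fills up, the child cannot block on a full pipe buffer, however much it writes to either stream.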

Question 2

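communicate() solves this problem by reading stdout and stderr concurrently: it uses threads on Windows and waits on both pipes with poll/select on POSIX. To do the same yourself while still processing output as it arrives, you need an extra thread that reads stderr while the main thread does the main work (reading stdout). Alternatively, you can use select.select(), but on Windows it works only with sockets, not with pipes.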
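The extra-thread approach might look roughly like this, assuming Python 3. The function name run_with_stderr_thread is hypothetical, and collecting the chunks in lists stands in for whatever the real work on each stream is.

```python
import subprocess
import sys
import threading


def run_with_stderr_thread(cmd):
    """Run cmd; read stdout in the calling thread and stderr in a helper thread."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    err_chunks = []

    def drain_stderr():
        # read1() returns whatever is available (up to the given size), b"" at EOF.
        for chunk in iter(lambda: proc.stderr.read1(65536), b""):
            err_chunks.append(chunk)

    reader = threading.Thread(target=drain_stderr)
    reader.daemon = True  # do not keep the interpreter alive if the main thread dies
    reader.start()

    out_chunks = []
    for chunk in iter(lambda: proc.stdout.read1(65536), b""):
        out_chunks.append(chunk)  # the main work on stdout would go here

    reader.join()  # stderr has reached EOF once the thread finishes
    proc.stdout.close()
    proc.stderr.close()
    proc.wait()
    return b"".join(out_chunks), b"".join(err_chunks)
```

This works on Windows as well as POSIX, since it relies only on blocking reads in separate threads.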

Question 3

Depending on your type of problem, you may be able to rearrange the code to pipe the process's stderr into your Python code. This page has some pointers.
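If the output does not have to be handled while the command is still running, one simple way to get stderr into Python code is to open a pipe for it and let communicate() collect everything. This is only a sketch, assuming Python 3 and output that fits in memory; the helper name capture is made up for the example.

```python
import subprocess
import sys


def capture(cmd):
    """Run cmd to completion and return (returncode, stdout_bytes, stderr_bytes)."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # communicate() reads both pipes until EOF and then waits for the process,
    # so a full pipe buffer on one stream cannot stall the child.
    out, err = proc.communicate()
    return proc.returncode, out, err
```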