Python Popen, closing streams and multiple processes

https://stackoverflow.com/questions/617308

03-07-2019

Question

I have some data that I would like to gzip, uuencode and then print to standard out. What I basically have is:

compressor = Popen("gzip", stdin = subprocess.PIPE, stdout = subprocess.PIPE)
encoder    = Popen(["uuencode", "dummy"], stdin = compressor.stdout)

The way I feed data to the compressor is through compressor.stdin.write(stuff).

What I really need to do is to send an EOF to the compressor, and I have no idea how to do it.

At some point, I tried compressor.stdin.close() but that doesn't work -- it works well when the compressor writes to a file directly, but in the case above, the process doesn't terminate and stalls on compressor.wait().

Suggestions? In this case, gzip is an example and I really need to do something with piping the output of one process to another.

Note: The data I need to compress won't fit in memory, so communicate isn't really a good option here. Also, if I just run

compressor.communicate("Testing")

after the two lines above, it still hangs, and the traceback shows:

  File "/usr/lib/python2.4/subprocess.py", line 1041, in communicate
    rlist, wlist, xlist = select.select(read_set, write_set, [])

Solution

I suspect the issue is the order in which you open the pipes. uuencode is funny in that it will complain at launch if its incoming pipe isn't set up in just the right way (try launching it on its own in a Popen call, with just PIPE as the stdin and stdout, to see the explosion).

Try this:

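from subprocess import Popen, PIPE

# Start the encoder first so its stdin pipe exists before gzip is launched.
encoder = Popen(["uuencode", "dummy"], stdin=PIPE, stdout=PIPE)
# gzip writes straight into the encoder's stdin.
compressor = Popen("gzip", stdin=PIPE, stdout=encoder.stdin)

# communicate() writes the data, closes gzip's stdin (EOF) and waits for gzip.
compressor.communicate("UUencode me please")
# With no input, communicate() closes the encoder's stdin and collects its output.
encoded_text = encoder.communicate()[0]
print encoded_text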

begin 644 dummy
F'XL(`%]^L$D``PL-3<U+SD])5<A-52C(24TL3@4`;2O+"!(`````
`
end

You are right, btw... there is no way to send a generic EOF down a pipe. After all, each program really defines its own EOF. The way to do it is to close the pipe, as you were trying to do.

EDIT: I should be clearer about uuencode. As a shell program, its default behaviour is to expect console input. If you run it without a "live" incoming pipe, it will block waiting for console input. Because you opened the encoder second, before you had sent any material down the compressor pipe, the encoder was blocking, waiting for you to start typing. Jerub was right that something was blocking.
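The "close the pipe to signal EOF" idea can be sketched for data that arrives in chunks and never sits in memory all at once. This is a minimal sketch, not the answerer's code: it assumes Python 3 on a POSIX system (where close_fds defaults to True, so the second process does not inherit the first one's stdin), and the function name stream_through and its parameters are made up for illustration. The second process writes to a file object rather than a pipe, so the parent never has to drain its output while feeding the first process.

```python
import subprocess


def stream_through(chunks, first_cmd, second_cmd, out_file):
    """Feed byte chunks to first_cmd, pipe its output into second_cmd,
    and let second_cmd write to out_file (a real file object, or None
    to inherit the parent's standard output)."""
    first = subprocess.Popen(first_cmd,
                             stdin=subprocess.PIPE,
                             stdout=subprocess.PIPE)
    second = subprocess.Popen(second_cmd,
                              stdin=first.stdout,
                              stdout=out_file)
    # The parent does not read from this pipe itself, so drop its copy.
    first.stdout.close()

    for chunk in chunks:
        first.stdin.write(chunk)
    # Closing stdin is the "EOF": first_cmd sees end of input and can finish.
    first.stdin.close()

    # Return both exit codes so the caller can check for failures.
    return first.wait(), second.wait()
```

Under these assumptions, a call such as stream_through(chunks, ["gzip"], ["uuencode", "dummy"], None) would send the encoded result to standard out, provided both tools are installed.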

OTHER TIPS

This is not the sort of thing you should be doing directly in Python. There are eccentricities in how things work that make it a much better idea to do this with a shell. If you can just use subprocess.Popen("foo | bar", shell=True), then all the better.

What might be happening is that gzip has not yet been able to write out all of its output, and the process will not exit until its stdout writes have finished.

You can use strace to see which system call a process is blocking on. Use ps auxwf to find the gzip process, then run strace -p $pidnum to see what system call it is performing. Note that stdin is FD 0 and stdout is FD 1; you will probably see it reading or writing on those file descriptors.

If you just want to compress and don't need the file wrappers, consider using the zlib module:

import zlib
compressed = zlib.compress("text")
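Since the question says the data will not fit in memory, a one-shot zlib.compress call may not be enough on its own. A possible variation, sketched here for Python 3, is to compress incrementally with zlib.compressobj. The helper name compress_chunks is hypothetical, and the output is a raw zlib stream, not a gzip file.

```python
import zlib


def compress_chunks(chunks, level=6):
    """Yield compressed bytes for an iterable of byte chunks."""
    compressor = zlib.compressobj(level)
    for chunk in chunks:
        data = compressor.compress(chunk)
        if data:  # compress() may buffer input and return b""
            yield data
    # flush() emits whatever the compressor is still holding back.
    yield compressor.flush()
```

Each yielded block can be written out or passed on immediately, so only one chunk needs to be held at a time.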

Any reason why the shell=True and Unix pipes suggestions won't work?

from subprocess import Popen, PIPE

# Let the shell build the gzip -> uuencode pipeline.
pipes = Popen("gzip | uuencode dummy", stdin=PIPE, stdout=PIPE, shell=True)
for i in range(1, 100):
    pipes.stdin.write("some data")
pipes.stdin.close()  # closing stdin signals EOF to gzip
print pipes.stdout.read()

This seems to work.
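With both stdin and stdout set to PIPE, writing everything before reading anything can stall once the input is large enough to fill the pipe buffers. One way around that, sketched here for Python 3, is to feed the pipeline from a background thread while the main thread drains the output. The name run_shell_pipeline and the 64 KiB read size are illustrative choices, not part of the original answer.

```python
import subprocess
import threading


def run_shell_pipeline(command, chunks, block_size=65536):
    """Feed byte chunks to a shell pipeline and yield its output in blocks."""
    proc = subprocess.Popen(command, shell=True,
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE)

    def feed():
        # Runs in a background thread so the main thread can keep reading.
        try:
            for chunk in chunks:
                proc.stdin.write(chunk)
        finally:
            proc.stdin.close()  # closing stdin signals EOF to the pipeline

    writer = threading.Thread(target=feed)
    writer.start()

    # read() returns b"" at end of output, which stops the iteration.
    for block in iter(lambda: proc.stdout.read(block_size), b""):
        yield block

    writer.join()
    proc.stdout.close()
    proc.wait()
```

For the pipeline in this answer, the command would be "gzip | uuencode dummy", assuming both tools are installed.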

Licensed under: CC-BY-SA with attribution
