Question

I want to write to a single file from multiple processes. To be precise, I would rather not use the multiprocessing.Queue solution, because several of the submodules were written by other developers. However, each write to the file in those submodules is already paired with a write to a zmq queue. Is there a way I can redirect the zmq messages to a file? Specifically, I am looking for something along the lines of http://www.huyng.com/posts/python-logging-from-multiple-processes/ without using the logging module.


Solution

It's fairly straightforward. In one process, bind a PULL socket and open a file. Every time the PULL socket receives a message, it writes directly to the file.

import zmq

EOF = b'\x04'  # sentinel message telling the sink to close the file and exit

def file_sink(filename, url):
    """forward messages on zmq to a file"""
    socket = zmq.Context.instance().socket(zmq.PULL)
    socket.bind(url)
    written = 0
    with open(filename, 'wb') as f:
        while True:
            chunk = socket.recv()
            if chunk == EOF:
                break
            f.write(chunk)
            written += len(chunk)

    socket.close()
    return written

In the remote processes, create a Proxy object, whose write method just sends a message over zmq:

class FileProxy(object):
    """Proxy to a remote file over zmq"""
    def __init__(self, url):
        self.socket = zmq.Context.instance().socket(zmq.PUSH)
        self.socket.connect(url)

    def write(self, chunk):
        """write a chunk of bytes to the remote file"""
        self.socket.send(chunk)  # chunk must be bytes; encode str before sending

And, just for fun, if you call FileProxy.write(EOF), the sink process will close the file and exit.
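Tying the two pieces together, here is a minimal end-to-end sketch. It repeats the sink and proxy definitions so it is self-contained, runs the sink in a background thread instead of a separate process for brevity, and uses an arbitrary TCP port and filename chosen for the demo:

```python
import threading
import zmq

EOF = b'\x04'  # sentinel: close the file and exit


def file_sink(filename, url):
    """Write every message received on a PULL socket to a file."""
    socket = zmq.Context.instance().socket(zmq.PULL)
    socket.bind(url)
    written = 0
    with open(filename, 'wb') as f:
        while True:
            chunk = socket.recv()
            if chunk == EOF:
                break
            f.write(chunk)
            written += len(chunk)
    socket.close()
    return written


class FileProxy(object):
    """File-like proxy whose write() pushes chunks over zmq."""
    def __init__(self, url):
        self.socket = zmq.Context.instance().socket(zmq.PUSH)
        self.socket.connect(url)

    def write(self, chunk):
        self.socket.send(chunk)


url = 'tcp://127.0.0.1:5557'  # arbitrary port for this demo
sink = threading.Thread(target=file_sink, args=('demo.log', url))
sink.start()

proxy = FileProxy(url)
proxy.write(b'hello ')
proxy.write(b'world\n')
proxy.write(EOF)   # tells the sink to close the file and stop
sink.join()

print(open('demo.log', 'rb').read())  # → b'hello world\n'
```

Because zmq handles bind/connect in either order, the proxy can start pushing before the sink has finished binding; the messages are queued and delivered once the connection is up.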

If you want to write multiple files, you can do this fairly easily: either start multiple sinks, with one URL per file, or make the sink slightly more sophisticated and use multipart messages to indicate which file should be written.
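The multipart variant might look like the sketch below. It is an assumption about how you would structure the protocol, not part of the original answer: the first frame carries the filename, the second the chunk, and writers call send_multipart accordingly:

```python
import zmq

EOF = b'\x04'  # sentinel chunk: close all files and exit


def multi_file_sink(url):
    """Route each two-frame message [filename, chunk] to the named file."""
    socket = zmq.Context.instance().socket(zmq.PULL)
    socket.bind(url)
    files = {}  # filename (bytes) -> open file handle
    while True:
        name, chunk = socket.recv_multipart()
        if chunk == EOF:
            break
        if name not in files:
            files[name] = open(name.decode(), 'ab')
        files[name].write(chunk)
    for f in files.values():
        f.close()
    socket.close()
```

A writer would then send, e.g., `socket.send_multipart([b'out.log', chunk])`, and any frame pair whose chunk equals EOF shuts the sink down.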

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow