Question

I have a question about shared resource with file handle between processes. Here is my test code:

from multiprocessing import Process,Lock,freeze_support,Queue
import tempfile
#from cStringIO import StringIO

class File():
    def __init__(self):
        self.temp = tempfile.TemporaryFile()
        #print self.temp

    def read(self):
        print "reading!!!"
        s = "huanghao is a good boy !!"
        print >> self.temp,s
        self.temp.seek(0,0)

        f_content = self.temp.read()
        print f_content

class MyProcess(Process):
    def __init__(self,queue,*args,**kwargs):
        Process.__init__(self,*args,**kwargs)
        self.queue = queue

    def run(self):
        print "ready to get the file object"
        self.queue.get().read()
        print "file object got"
        file.read()

if __name__ == "__main__":
    freeze_support()
    queue = Queue()
    file = File()

    queue.put(file)
    print "file just put"

    p = MyProcess(queue)
    p.start()

Then I get a KeyError like below:

file just put
ready to get the file object
Process MyProcess-1:
Traceback (most recent call last):
  File "D:\Python26\lib\multiprocessing\process.py", line 231, in _bootstrap
    self.run()
  File "E:\tmp\mpt.py", line 35, in run
    self.queue.get().read()
  File "D:\Python26\lib\multiprocessing\queues.py", line 91, in get
    res = self._recv()
  File "D:\Python26\lib\tempfile.py", line 375, in __getattr__
    file = self.__dict__['file']
KeyError: 'file'

I think when I put the File() object into queue , the object got serialized, and file handle can not be serialized, so, i got the KeyError:

Anyone have any idea about that? if I want to share objects with file handle attribute, what should I do?

Was it helpful?

Solution

I have to object (at length, won't just fit in a commentl;-) to @Mark's repeated assertion that file handles just can't be "passed around between running processes" -- this is simply not true in real, modern operating systems, such as, oh, say, Unix (free BSD variants, MacOSX, and Linux, included -- hmmm, I wonder what OS's are left out of this list...?-) -- sendmsg of course can do it (on a "Unix socket", by using the SCM_RIGHTS flag).

Now the poor, valuable multiprocessing is fully right to not exploit this feature (even assuming there might be black magic to implement it on Windows too) -- most developers would no doubt misuse it anyway (having multiple processes access the same open file concurrently and running into race conditions). The only proper way to use it is for a process which has exclusive rights to open certain files to pass the opened file handles to another process which runs with reduced privileges -- and then never use that handle itself again. No way to enforce that in the multiprocessing module, anyway.

Back to @Andy's original question, unless he's going to work on Linux only (AND with local processes only, too) and willing to play dirty tricks with the /proc filesystem, he's going to have to define his application-level needs more sharply and serialize file objects accordingly. Most files have a path (or can be made to have one: path-less files are pretty rare, actually non-existent on Windows I believe) and thus can be serialized via it -- many others are small enough to serialize by sending their content over -- etc, etc.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top