I have a Python script that analyses large SQL trace files. To increase performance, the processing is distributed across multiple processes. During initialisation, the input file is loaded into an array, and this array is then split into 4 parts (one per process). Each part is sent to its process through a pipe.

If an array part is larger than about 200 MB (the whole array is about 800 MB), I get an "out of memory" exception.

The exception is raised by the send() call, with the following message:

    parent_conn.send(data)
MemoryError: out of memory
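
For context, the pattern can be reduced to something like the snippet below. The payload here is purely illustrative (a list of distinct strings that pickles to roughly 200 MB), not my actual data, and the marked send() stands in for the call that fails in my script:

    from multiprocessing import Pipe, Process

    def child(conn):
        # the child simply receives the object and reports how many elements arrived
        data = conn.recv()
        conn.send(len(data))

    if __name__ == "__main__":
        parent_conn, child_conn = Pipe()
        worker = Process(target=child, args=(child_conn,))
        worker.start()
        payload = ["x" * 200 + str(i) for i in range(1000000)]  # roughly 200 MB once pickled
        parent_conn.send(payload)   # stands in for the send() that raises MemoryError for me
        print(parent_conn.recv())
        worker.join()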

Is there a way to increase the possible size of a pipe?

What I've done:

from multiprocessing import Pipe, Process
import json

def main(self):
    conns = []
    threads = []
    thread_results = []

    for i in range(self.no_threads):
        data = AnalyserData(i, queries)   # queries is a large list with about 1 million elements
        parent_conn, child_conn = Pipe()
        process = Process(target=self.init_process, args=(child_conn,))
        process.start()
        conns.append(parent_conn)
        parent_conn.send(data)            # this is where the MemoryError is raised
        threads.append(process)

    for i in range(len(threads)):
        res = conns[i].recv()
        res.not_supported_queries = json.loads(res.not_supported_queries)
        thread_results.append(res)
        threads[i].join()

def init_process(self, conn):
    data = conn.recv()
    # do some processing...
    conn.send(result)
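
For what it's worth, send() pickles the object before writing it into the pipe, so the pickled size of one part is what each transfer has to hold; I check it roughly like this (AnalyserData and queries are the objects from the script above, the index 0 is arbitrary):

    import pickle

    # rough size of one part as it goes through the pipe (send() pickles it first)
    data = AnalyserData(0, queries)
    size_mb = len(pickle.dumps(data)) / (1024.0 * 1024.0)
    print("pickled size of one part: %.1f MB" % size_mb)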
