Question

I have a Python script that analyses large SQL trace files. To increase performance, the processing is distributed across multiple processes. During initialisation, the input file is loaded into an array, then this array is split into 4 parts (one for each of 4 processes). Each of these parts is sent to a process using a pipe, roughly as sketched below.
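A simplified sketch of the split step (the queries list here is just a hypothetical stand-in for the parsed trace data, not my real code):

# Hypothetical stand-in for the parsed trace file: ~1 million query strings.
queries = ["SELECT * FROM t WHERE id = %d" % i for i in range(1000000)]

no_processes = 4
chunk_size = len(queries) // no_processes

# One roughly equal slice per process; the last slice also takes the remainder.
chunks = [queries[i * chunk_size:] if i == no_processes - 1
          else queries[i * chunk_size:(i + 1) * chunk_size]
          for i in range(no_processes)]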

If an array part is larger than about 200 MB (the whole array is about 800 MB), I get an "Out of memory" exception.

The exception occurs when the send() method is called, with the following message:

parent_conn.send(data)
MemoryError: out of memory
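From what I can tell, send() pickles the whole object before writing it to the pipe, so a serialized copy of the chunk sits in memory on top of the original. A rough way to check how large that payload gets (sketch only; the chunk below is a hypothetical stand-in for one quarter of the real data):

import pickle

# Hypothetical chunk: roughly a quarter of the parsed queries.
chunk = ["SELECT * FROM t WHERE id = %d" % i for i in range(250000)]

# send() has to build a serialized blob like this in memory before
# anything reaches the pipe.
payload = pickle.dumps(chunk, protocol=pickle.HIGHEST_PROTOCOL)
print("pickled chunk size: %.1f MB" % (len(payload) / 1024.0 / 1024.0))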

Is there a way to increase the possible size of a pipe?

What I've done:

from multiprocessing import Pipe, Process
import json

def main(self):
    conns, threads, thread_results = [], [], []

    for i in range(self.no_threads):
        data = AnalyserData(i, queries)   # queries is a large array with about 1 million elements
        parent_conn, child_conn = Pipe()
        process = Process(target=self.init_process, args=(child_conn,))
        process.start()
        conns.append(parent_conn)
        parent_conn.send(data)            # MemoryError is raised here for large chunks
        threads.append(process)

    for i in range(len(threads)):
        res = conns[i].recv()
        res.not_supported_queries = json.loads(res.not_supported_queries)
        thread_results.append(res)
        threads[i].join()

def init_process(self, conn):
    data = conn.recv()
    # do some processing...
    conn.send(result)
