I need an efficient shared dictionary in a Python multiprocessing environment

https://stackoverflow.com/questions/22685434

22-06-2023

Problem

I have implemented a one-producer, multiple-consumer pattern using Python's multiprocessing package. The consumers should put their results in a dictionary. The keys of this dictionary are words and the values are big SciPy sparse matrices. Each consumer adds its value for each word it sees to the main vector for that word in the shared dictionary.

I have used Manager.dict() to implement this shared dictionary, but it is very slow. CPU utilization is about 15% for each process, and the whole thing is only a little faster than a single process. Each consumer fetches an item from the shared dictionary, adds a sparse matrix to the value of that item, and updates the item in the shared dictionary.

Is there a more efficient solution?
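
For reference, the update cycle described above might look roughly like the sketch below. Plain `{index: value}` dicts stand in for the SciPy sparse matrices so the example runs without SciPy, and `add_sparse`, `add_to_shared`, `consumer`, and `run_demo` are hypothetical names, not code from the original program. Every read and write on a `Manager().dict()` proxy pickles the whole value and sends it through the manager process, which is likely why large values make this pattern slow.

```python
from multiprocessing import Manager, Process


def add_sparse(a, b):
    """Return the element-wise sum of two {index: value} sparse vectors."""
    out = dict(a)
    for idx, val in b.items():
        out[idx] = out.get(idx, 0.0) + val
    return out


def add_to_shared(shared, lock, word, delta):
    """Fetch, add, and store back: the update cycle from the question."""
    with lock:  # without the lock, concurrent updates to one word can be lost
        current = shared.get(word, {})             # whole value copied out
        shared[word] = add_sparse(current, delta)  # whole value copied back


def consumer(shared, lock, items):
    """Apply every (word, delta) pair in `items` to the shared dictionary."""
    for word, delta in items:
        add_to_shared(shared, lock, word, delta)


def run_demo(n_workers=2):
    """Hypothetical driver: each worker adds {0: 1.0} to 'cat' ten times."""
    with Manager() as manager:
        shared = manager.dict()
        lock = manager.Lock()
        items = [("cat", {0: 1.0})] * 10
        workers = [Process(target=consumer, args=(shared, lock, items))
                   for _ in range(n_workers)]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        return dict(shared)  # copy out before the manager shuts down
```

Call `run_demo()` from under an `if __name__ == "__main__":` guard. Because the lock serializes every update, the consumers spend most of their time waiting on each other and on the manager process, which matches the low CPU utilization reported above.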

Solution

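import memcache  # provided by the python-memcached package

# Connect to a memcached server running locally on the default port.
memc = memcache.Client(['127.0.0.1:11211'], debug=1)
# Values that are not plain strings are pickled before being stored.
memc.set('top10candytypes', {1: 2, "3": [4, 5, 6]})

# Any process connected to the same server can read the value back.
bestCandy = memc.get('top10candytypes')
print(bestCandy)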

I'm no expert on memcache because I've just started using it myself. But it's handy as hell if you have multiple threads needing to access the same data, or if you simply need to store things efficiently without running out of RAM.
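
Applied to the word-vector problem from the question, the same `get`/`set` calls might be wrapped roughly as below. This is only a sketch under assumptions: `add_to_cached_vector` and the `"vec:"` key prefix are made-up names, plain `{index: value}` dicts stand in for the sparse matrices, and `client` can be any object with memcache-style `get` and `set`, such as the `memc` object above. Note that the get-then-set pair is not atomic, so two consumers updating the same word at once can still overwrite each other, and each update still serializes the whole value.

```python
def add_to_cached_vector(client, word, delta, prefix="vec:"):
    """Add `delta` ({index: value}) to the vector stored for `word`.

    `client` is any object with memcache-style get(key) and set(key, value).
    The "vec:" prefix is a hypothetical key-naming choice.
    """
    key = prefix + word
    current = client.get(key) or {}   # get() returns None on a cache miss
    for idx, val in delta.items():
        current[idx] = current.get(idx, 0.0) + val
    client.set(key, current)          # not atomic with the get() above
    return current
```

With a running memcached server this could be called as `add_to_cached_vector(memc, 'cat', {0: 1.0})`. Keep in mind that memcached limits the size of a single item (1 MB by default), so very large vectors may not fit without changing the server configuration.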

License: CC-BY-SA with attribution

Not affiliated with StackOverflow