Can you try threading instead? Multiprocessing mainly pays off when you are CPU bound. Also, boilerpipe already includes protection for use with threading, which suggests that it may need protection under multiprocessing as well.
If you really need mp, I will try to figure out how to patch boilerpipe.
Here is what I guess will be a drop-in replacement using threading. It uses multiprocessing.pool.ThreadPool (which is a "fake" multiprocessing pool). The only change is from Pool(...) to multiprocessing.pool.ThreadPool(...). The problem is that I'm not sure the boilerpipe multithreading test will detect the thread pool as having activeCount() > 1.
import multiprocessing
from multiprocessing.pool import ThreadPool # hidden ThreadPool class
# ...
proc_pool = ThreadPool(processes=4) # this is the only difference
for each_link in data:
    proc_pool.apply_async(process_link_for_feeds, args=(each_link, ), callback=store_results_to_db)
proc_pool.close()
proc_pool.join()
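
If you want to check the activeCount() question before touching the real code, here is a rough self-contained sketch. The bodies of process_link_for_feeds and store_results_to_db below are hypothetical stand-ins (no boilerpipe, no database), and the URLs are made up. The worker just reports threading.active_count() (the current name for activeCount()) as seen from inside the pool. ThreadPool workers are ordinary threading.Thread objects, so I would expect the count to be above 1, but whether boilerpipe's own check fires depends on how it performs that test, which I have not verified.

```python
import threading
from multiprocessing.pool import ThreadPool

def process_link_for_feeds(link):
    # Hypothetical stand-in for the real worker: return the link together
    # with the number of live threads seen from inside the pool worker.
    return link, threading.active_count()

results = []

def store_results_to_db(result):
    # Hypothetical stand-in for the real callback: collect results in a
    # list instead of writing to a database.
    results.append(result)

# Made-up input data, for illustration only.
data = ["http://example.com/a", "http://example.com/b", "http://example.com/c"]

proc_pool = ThreadPool(processes=4)
for each_link in data:
    proc_pool.apply_async(process_link_for_feeds, args=(each_link,),
                          callback=store_results_to_db)
proc_pool.close()
proc_pool.join()  # all callbacks have run once join() returns

for link, thread_count in sorted(results):
    print(link, thread_count)
```

If every printed count is greater than 1, a check based on the thread count should at least see the pool's threads.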