Python cannot use multiple cores (due to GIL) unless you spawn subprocesses (with multiprocessing
for example). Thus you won't get any performance boost with spawning threads for CPU bound tasks.
Here's an example of a script using multiprocessing
and queue
:
from Queue import Empty # <-- only needed to catch Exception
from multiprocessing import Process, Queue, cpu_count
def loadFile(d, i, queue):
# some other stuff
queue.put(result)
if name == "main":
queue = Queue()
no = cpu_count()
processes = []
for i,d in enumerate(sortedDirSet):
p = Process(target=self.loadFile, args=(d, i, queue))
p.start()
processes.append(p)
if i % no == 0:
for p in processes:
p.join()
processes = []
for p in processes:
p.join()
results = []
while True:
try:
# False means "don't wait when Empty, throw an exception instead"
data = queue.get(False)
results.append(data)
except Empty:
break
# You have all the data, do something with it
The other (more complicated) way would be to use pipe
instead of queue
.
It would be also more efficient to spawn processes, then create a job queue and send them (via pipe
) to subprocesses (so you won't have to create a process each time). But this would be even more complicated, so let's leave it like that.