Question

I have a problem limiting the number of threads. I want to do it using QThread. SpiderThread is a QThread subclass that crawls a URL, and I want at most X of these threads working at once. I did this earlier with a thread pool and QRunnable, but that is buggy in PySide when the number of URLs is large. So I have this simple code:

self.threads = []
for url in self.urls:
    th = SpiderThread(url)
    th.updateresultsSignal.connect(self.update_results)
    self.threads.append(th)
    th.start()

Does anyone have a working example of limiting threads using QThread?

Solution

So you want at most X threads running at any given time? How about a URL queue shared by 10 threads:

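self.threads = []
queue = Queue()  # thread-safe queue: from queue import Queue (Python 2: from Queue import Queue)
for url in self.urls:
    queue.put(url)  # fill the queue before any worker starts
for i in range(10):  # start 10 worker threads
    th = SpiderThread(queue)
    th.updateresultsSignal.connect(self.update_results)
    self.threads.append(th)
    th.start()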

Then, in each thread's run method, the thread takes a URL off the queue (which removes it from the queue), processes it, and takes the next one when it is done. In pseudocode:

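from queue import Empty  # Python 2: from Queue import Empty
from PySide.QtCore import QThread

class SpiderThread(QThread):
    def __init__(self, queue):
        super(SpiderThread, self).__init__()  # initialize the QThread base class
        self.queue = queue
    def run(self):
        max_wait = 0.1  # seconds (Queue.get takes its timeout in seconds, not milliseconds)
        while True:
            try:
                url = self.queue.get(True, max_wait)
            except Empty:
                break  # no more URLs, work completed!
            process(url)  # placeholder for the actual crawling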
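
To see the shared-queue idea working without Qt installed, here is a minimal sketch that uses the standard library's threading.Thread as a stand-in for QThread. The names crawl_all, num_workers, and the time.sleep call that replaces the real download are all assumptions made for illustration, not part of the original code. It also records the peak number of workers busy at once, so you can check that the limit holds.

```python
import threading
import time
from queue import Queue, Empty


def crawl_all(urls, num_workers=3):
    """Process every URL with at most num_workers threads.

    Returns (results, peak), where peak is the largest number of
    workers that were busy at the same moment.
    """
    work = Queue()
    for url in urls:
        work.put(url)  # fill the queue before any worker starts

    results = []             # stands in for the slot connected to the signal
    lock = threading.Lock()  # protects results, active and peak
    active = 0               # workers currently processing a URL
    peak = 0                 # highest value active ever reached

    def worker():
        nonlocal active, peak
        while True:
            try:
                url = work.get_nowait()  # queue is pre-filled, so no waiting
            except Empty:
                return                   # no more URLs, this worker is done
            with lock:
                active += 1
                peak = max(peak, active)
            time.sleep(0.01)             # placeholder for the real download
            with lock:
                active -= 1
                results.append(url)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, peak
```

With QThread the structure should stay the same: the body of worker becomes SpiderThread.run, and the results.append call becomes an emit on updateresultsSignal. The number of threads you create is the limit, because no thread is ever started per URL.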
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow