Question

When I hear about multithreaded programming, I think of it as a way to speed up my program, but is that not the case?

import eventlet
from eventlet.green import socket
from iptools import IpRangeList


class Scanner(object):
    def __init__(self, ip_range, port_range, workers_num):
        self.workers_num = workers_num or 1000
        self.ip_range = self._get_ip_range(ip_range)
        self.port_range = self._get_port_range(port_range)
        self.scaned_range = self._get_scaned_range()

    def _get_ip_range(self, ip_range):
        return [ip for ip in IpRangeList(ip_range)]

    def _get_port_range(self, port_range):
        return [r for r in range(*port_range)]

    def _get_scaned_range(self):
        for ip in self.ip_range:
            for port in self.port_range:
                yield (ip, port)

    def scan(self, address):
        try:
            return bool(socket.create_connection(address))
        except Exception:
            return False

    def run(self):
        pool = eventlet.GreenPool(self.workers_num)
        for status in pool.imap(self.scan, self.scaned_range):
            if status:
                yield True

    def run_std(self):
        for status in map(self.scan, self.scaned_range):
            if status:
                yield True


if __name__ == '__main__':
    s = Scanner(('127.0.0.1'), (1, 65000), 100000)
    import time
    now = time.time()
    open_ports = [i for i in s.run()]
    print 'Eventlet time: %s (sec) open: %s' % (time.time() - now,
                                                len(open_ports))
    del s
    s = Scanner(('127.0.0.1'), (1, 65000), 100000)
    now = time.time()
    open_ports = [i for i in s.run_std()]
    print 'CPython time: %s (sec) open: %s' % (time.time() - now,
                                               len(open_ports))

and results:

Eventlet time: 4.40343403816 (sec) open: 2
CPython time: 4.48356699944 (sec) open: 2

And my question is: if I run this code not on my laptop but on a server, with a larger number of workers, will it run faster than the plain CPython version? What are the advantages of threads?

ADD: So I rewrote the app using native CPython threads:

import socket
from threading import Thread
from Queue import Queue

from iptools import IpRangeList

class Scanner(object):
    def __init__(self, ip_range, port_range, workers_num):
        self.workers_num = workers_num or 1000
        self.ip_range = self._get_ip_range(ip_range)
        self.port_range = self._get_port_range(port_range)
        self.scaned_range = [i for i in self._get_scaned_range()]

    def _get_ip_range(self, ip_range):
        return [ip for ip in IpRangeList(ip_range)]

    def _get_port_range(self, port_range):
        return [r for r in range(*port_range)]

    def _get_scaned_range(self):
        for ip in self.ip_range:
            for port in self.port_range:
                yield (ip, port)

    def scan(self, q):
        while True:
            try:
                r = bool(socket.create_connection(q.get()))
            except Exception:
                r = False
            q.task_done()

    def run(self):
        queue = Queue()
        for address in self.scaned_range:
            queue.put(address)
        for i in range(self.workers_num):
            worker = Thread(target=self.scan, args=(queue,))
            worker.setDaemon(True)
            worker.start()
        queue.join()


if __name__ == '__main__':
    s = Scanner(('127.0.0.1'), (1, 65000), 5)
    import time
    now = time.time()
    s.run()
    print time.time() - now

and the result is:

 CPython threads: 1.4 sec

And I think this is a very good result. As a baseline, I take nmap's scanning time:

$ nmap 127.0.0.1 -p1-65000

Starting Nmap 5.21 ( http://nmap.org ) at 2012-10-22 18:43 MSK
Nmap scan report for localhost (127.0.0.1)
Host is up (0.00021s latency).
Not shown: 64986 closed ports
PORT      STATE SERVICE
53/tcp    open  domain
80/tcp    open  http
443/tcp   open  https
631/tcp   open  ipp
3306/tcp  open  mysql
6379/tcp  open  unknown
8000/tcp  open  http-alt
8020/tcp  open  unknown
8888/tcp  open  sun-answerbook
9980/tcp  open  unknown
27017/tcp open  unknown
27634/tcp open  unknown
28017/tcp open  unknown
39900/tcp open  unknown

Nmap done: 1 IP address (1 host up) scanned in 0.85 seconds

And my question now is: how are threads implemented in Eventlet? As far as I understand, these are not real threads but something specific to Eventlet, so why don't they speed the task up?

Eventlet is used by many major projects, such as OpenStack. But why? Just to run heavy database queries asynchronously, or for something else?

Was it helpful?

Solution

CPython threads:

  • Each CPython thread maps to an OS-level thread (a lightweight process/pthread).

  • If many CPython threads execute Python code concurrently: due to the global interpreter lock (GIL), only one CPython thread can interpret Python at a time. The remaining threads block on the GIL whenever they need to interpret Python instructions. With many Python threads this slows things down a lot.

  • Now, if your Python code spends most of its time inside networking operations (send, connect, etc.), there will be fewer threads fighting for the GIL to interpret code, so the effect of the GIL is not so bad (a small sketch of the contrast follows below).
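
To make that contrast concrete, here is a minimal sketch (not from the original answer; the loop size is arbitrary) showing that a pure-Python, CPU-bound function run in two threads takes about as long as running it twice in a row, because only one thread can interpret bytecode at a time:

# Minimal sketch: CPU-bound work does not speed up with CPython threads.
import time
from threading import Thread

def burn(n):
    # pure-Python busy loop; holds the GIL while it runs
    while n:
        n -= 1

N = 10 ** 7

start = time.time()
burn(N)
burn(N)
print('sequential: %.2f sec' % (time.time() - start))

start = time.time()
threads = [Thread(target=burn, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print('two threads: %.2f sec (similar or worse, because of the GIL)' % (time.time() - start))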

Eventlet/Green threads:

  • From the above we know that CPython has a performance limitation with threads. Eventlet tries to solve the problem by using a single thread running on a single core and using non-blocking I/O for everything.

  • Green threads are not real OS-level threads. They are a user-space abstraction for concurrency. Most importantly, N green threads map to 1 OS thread, which avoids GIL contention.

  • Green threads cooperatively yield to each other instead of being preemptively scheduled. For networking operations, the socket libraries are patched at runtime (monkey patching) so that all calls are non-blocking.

  • So even when you create a pool of eventlet green threads, you are actually creating only one OS-level thread, and that single OS thread executes all the green threads. The idea is that if all the networking calls are non-blocking, this can be faster than Python threads in some cases (see the sketch below).
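
As a rough illustration of the monkey-patching route mentioned above (a sketch only; the pool size, port range and timeout are arbitrary choices), the standard socket module can be patched so that many green threads wait on connect() cooperatively inside one OS thread:

# Minimal sketch: monkey-patch the stdlib, then scan with green threads.
import eventlet
eventlet.monkey_patch()  # socket, time, etc. become cooperative/non-blocking

import socket

def try_connect(address):
    try:
        sock = socket.create_connection(address, timeout=1)
        sock.close()
        return True
    except Exception:
        return False

pool = eventlet.GreenPool(500)
addresses = [('127.0.0.1', port) for port in range(1, 1025)]
open_count = sum(1 for ok in pool.imap(try_connect, addresses) if ok)
print('open ports: %d' % open_count)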

Summary

For your program above, "true" concurrency happens to be faster (the CPython version: 5 threads running on multiple processors) than the eventlet model (a single thread running on one processor).

There are some CPython workloads that will perform badly with many threads/cores (e.g. if you have 100 clients connecting to a server, and one thread per client). Eventlet is an elegant programming model for such workloads, so it's used in several places (a server sketch follows below).
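
For that kind of many-clients workload, the eventlet style looks roughly like this (a sketch in the spirit of eventlet's own examples; the port and pool size are arbitrary): one OS thread, one green thread per client.

# Minimal sketch: an echo server that gives each client its own green thread.
import eventlet

def handle(client):
    while True:
        data = client.recv(1024)
        if not data:
            break
        client.sendall(data)
    client.close()

server = eventlet.listen(('127.0.0.1', 6000))
pool = eventlet.GreenPool(1000)
while True:
    sock, address = server.accept()
    pool.spawn_n(handle, sock)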

Other tips

The title of your question is "What are the advantages of multithreaded programming in Python?", so I am giving you an example rather than trying to solve your problem. I have a Python program running on a Pentium Core Duo I bought in 2005, running Windows XP, that downloads 500 CSV files from finance.yahoo.com, each about 2 KB, one for each stock in the S&P 500. It uses urllib2. If I do not use threads it takes over 2 minutes; using standard Python threads (40 of them) it takes 3 to 4 seconds, with an average of around 1/4 second each (this is wall-clock time and includes compute and I/O). When I look at the start and stop times of each thread (wall clock) there is tremendous overlap. I have the same thing running as a Java program and the performance is almost identical between Python and Java. It is also the same in C++ using curllib, though curllib is just a tad slower than Java or Python. I am using standard Python version 2.2.6.
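
The worker-pool pattern described above looks roughly like this (a minimal sketch, not the author's actual script; the URLs, thread count and timeout are placeholders):

# Minimal sketch: a fixed pool of threads pulling download jobs off a queue.
import urllib2
from Queue import Queue
from threading import Thread

def worker(q, results):
    while True:
        url = q.get()
        try:
            results.append(urllib2.urlopen(url, timeout=10).read())
        except Exception:
            pass
        q.task_done()

urls = ['http://example.com/data%d.csv' % i for i in range(50)]  # placeholder URLs
q = Queue()
results = []
for _ in range(40):  # 40 worker threads, as in the description above
    t = Thread(target=worker, args=(q, results))
    t.setDaemon(True)
    t.start()
for url in urls:
    q.put(url)
q.join()  # wait until every queued download has been attempted
print('downloaded %d files' % len(results))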

Python has a Global Interpreter Lock (http://en.wikipedia.org/wiki/Global_Interpreter_Lock) which prevents two threads from ever executing Python bytecode at the same time.

If you're using something like Cython, the C portions can release the GIL and execute concurrently, which is why you see a speedup.

In pure python programs there's no performance benefit (in terms of amount of computation you can get done), but it's sometimes the easiest way to write code which does a lot of IO (e.g. leave a thread waiting for a socket read to finish while you do something else).

Using the threading or multiprocessing modules enables you to use the multiple cores that are prevalent in modern CPUs.

This comes with a price: added complexity in your program to regulate access to shared data (especially writes). If one thread were iterating over a list while another thread was updating it, the result would be undetermined. This also applies to the internal data of the Python interpreter.

Therefore standard CPython has an important limitation with regard to using threads: only one thread at a time can be executing Python bytecode.

If you want to parallelize a job that doesn't require a lot of communication between instances, multiprocessing (and especially multiprocessing.Pool) is often a better choice than threads, because those jobs run in different processes that do not influence each other.
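
A minimal sketch of the multiprocessing.Pool approach (the worker count and workload are arbitrary): CPU-bound work is spread across processes, so each process has its own GIL.

# Minimal sketch: CPU-bound jobs distributed across worker processes.
from multiprocessing import Pool

def burn(n):
    # pure-Python busy loop; each process runs it independently
    while n:
        n -= 1
    return n

if __name__ == '__main__':
    pool = Pool(processes=4)          # e.g. one worker process per core
    pool.map(burn, [10 ** 7] * 8)     # jobs run in parallel across processes
    pool.close()
    pool.join()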

The main advantages of multithreaded programming, regardless of programming language are:

  1. If you have a system with multiple CPUs or cores, then you can have all CPUs executing application code at the same time. So, for example, if you have a system with four CPUs, a process could potentially run up to 4 times faster with multithreading (though it is highly unlikely to be that fast in most cases, since typical applications require threads to synchronize their access to shared resources, creating contention).

  2. If the process needs to block for some reason (disk I/O, user input, network I/O), then while a thread or threads are blocked waiting for I/O completion, other thread(s) can be doing other work. Note that for this type of concurrency you do not need multiple CPUs or cores; a process running on a single CPU can also benefit greatly from threading.

Whether these benefits apply to your process will largely depend on what your process does. In some cases you will get a considerable performance improvement, in others you won't, and the threaded version might even be slower. Note that writing good and efficient multithreaded apps is hard.

Now, since you are asking about Python in particular, let's discuss how these benefits apply to Python.

  1. Due to the Global Interpreter Lock present in Python, running code in parallel on multiple CPUs is not possible. The GIL ensures that only one thread is interpreting Python code at a time, so there isn't really a way to take full advantage of multiple CPUs.

  2. If a Python thread performs a blocking operation, another thread gets the CPU and continues to run while the first thread is blocked waiting. When the blocking event completes, the blocked thread resumes. So this is a good reason to use multithreading in a Python script (though it isn't the only way to achieve this type of concurrency; non-blocking I/O can achieve similar results). See the short sketch after this list.
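
Here is a short sketch of that effect (time.sleep stands in for a blocking call such as a socket read; the numbers are arbitrary): blocking calls release the GIL, so five of them overlap across five threads and take about one second in total, not five.

# Minimal sketch: blocking operations release the GIL, so threads overlap.
import time
from threading import Thread

def blocking_io():
    time.sleep(1)  # the GIL is released while the thread is waiting

start = time.time()
threads = [Thread(target=blocking_io) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print('5 blocking calls overlapped: %.2f sec (about 1, not 5)' % (time.time() - start))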

Here are some examples that benefit from using multiple threads:

  • a GUI program that is doing a lengthy operation can have a thread that continues to keep the application window refreshed and responsive, maybe even showing a progress report on the long operation and a cancel button.

  • a process that needs to repeatedly read records from disk, do some processing on them, and finally write them back to disk can benefit from threading: while one thread is blocked waiting to get a record from disk, another thread can be processing a record that was already read, and yet another thread can be writing a processed record back to disk. Without threads, nothing else can happen while the process is reading or writing. For a language that does not have a GIL (say C++) the benefit is even greater, as you can also have multiple threads, each running on a different core, all processing different records (a sketch of this pipeline follows below).
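
A minimal sketch of that read/process/write pipeline using queues (the file names and the "processing" step are placeholders):

# Minimal sketch: read -> process -> write, one thread per stage, queues in between.
from Queue import Queue
from threading import Thread

STOP = object()  # sentinel passed down the pipeline to signal the end of input

def reader(path, out_q):
    with open(path) as f:
        for line in f:
            out_q.put(line)
    out_q.put(STOP)

def processor(in_q, out_q):
    while True:
        record = in_q.get()
        if record is STOP:
            out_q.put(STOP)
            break
        out_q.put(record.upper())  # stand-in for real processing

def writer(path, in_q):
    with open(path, 'w') as f:
        while True:
            record = in_q.get()
            if record is STOP:
                break
            f.write(record)

raw_q, done_q = Queue(maxsize=100), Queue(maxsize=100)
threads = [Thread(target=reader, args=('input.txt', raw_q)),
           Thread(target=processor, args=(raw_q, done_q)),
           Thread(target=writer, args=('output.txt', done_q))]
for t in threads:
    t.start()
for t in threads:
    t.join()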

I hope this helps!

Adding threads won't necessarily make a process faster, as there is overhead associated with managing the threads, which may outweigh any performance gain you get from them.

If you are running this on a machine with few CPUs, as opposed to one with many, you may well find that it runs slower as it swaps each thread in and out of execution. There may be other factors at play as well. If the threads need access to some other subsystem or hardware that can't handle concurrent requests (a serial port, for example), then multithreading won't help you improve performance.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow