Question

We're all aware of the horrors of the GIL, and I've seen a lot of discussion about the right time to use the multiprocessing module, but I still don't feel that I have a good intuition about when threading in Python (focusing mainly on CPython) is the right answer.

What are instances in which the GIL is not a significant bottleneck? What are the types of use cases where threading is the most appropriate answer?

Solution

Threading really only makes sense if you have a lot of blocking I/O going on. If that's the case, then some threads can sleep while other threads work. If threads are CPU-bound, you're not likely to see much benefit from multithreading.
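To make the "some threads sleep while others work" point concrete, here is a minimal sketch. It assumes `time.sleep` is a fair stand-in for a blocking read or network call (it releases the GIL the same way); `fake_io`, `run_threaded`, and `run_sequential` are hypothetical names made up for this illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(delay):
    # Stand-in for blocking I/O: the thread sleeps and gives up the GIL.
    time.sleep(delay)
    return delay


def run_sequential(n_tasks=4, delay=0.2):
    # One call after another: total time is roughly n_tasks * delay.
    start = time.perf_counter()
    results = [fake_io(delay) for _ in range(n_tasks)]
    return results, time.perf_counter() - start


def run_threaded(n_tasks=4, delay=0.2):
    # One thread per task: the waits overlap, so total time is roughly delay.
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        results = list(pool.map(fake_io, [delay] * n_tasks))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    _, sequential_time = run_sequential()
    _, threaded_time = run_threaded()
    print(f"sequential: {sequential_time:.2f}s, threaded: {threaded_time:.2f}s")
```

If `fake_io` were a pure-Python number-crunching loop instead, the threaded version would not be expected to come out ahead.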

Note that the multiprocessing module, while more difficult to code for, makes use of separate processes and therefore doesn't suffer the downsides of the GIL.
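For comparison, a rough sketch of the multiprocessing route for CPU-bound work might look like the following. The prime-counting function `count_primes` is just a made-up workload for illustration, and the chunk sizes are arbitrary.

```python
from multiprocessing import Pool


def count_primes(limit):
    # Deliberately naive, CPU-bound work: count primes below `limit`.
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


if __name__ == "__main__":
    # Each worker is a separate process with its own interpreter and GIL,
    # so the four calls can run on four cores at once.
    limits = [50_000, 60_000, 70_000, 80_000]
    with Pool(processes=4) as pool:
        print(pool.map(count_primes, limits))
```

The extra coding effort shows up in details like the `if __name__ == "__main__":` guard and the requirement that arguments and results be picklable.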

OTHER TIPS

Since you seem to be looking for examples, here are some off the top of my head and grabbed from searching for CPU-bound and I/O-bound examples (I can't seem to find many). I am no expert, so please feel free to correct anything I've miscategorized. It's also worth noting that advancing technology could move a problem from one category to another.

CPU Bound Tasks (use multiprocessing)

  • Numerical methods/approximations for mathematical functions (calculating digits of pi, etc.)
  • Image processing
  • Performing convolutions
  • Calculating transforms for graphics programming (possibly handled by GPU)
  • Audio/video compression/decompression

I/O Bound Tasks (threading is probably OK)

  • Sending data across a network
  • Writing to/reading from the disk
  • Asking for user input
  • Audio/video streaming

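The GIL prevents Python from running more than one thread's bytecode at a time: threads still exist, but only one of them executes Python code at any given moment.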

If a C extension releases the GIL before doing its work, other Python threads can continue while the C code runs, just as with the blocking I/O that other people have mentioned.

ctypes does this automatically for foreign function calls, and NumPy does it for many of its operations. So if your code spends most of its time in them, it may not be significantly restricted by the GIL.
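The same mechanism can be seen without any third-party packages. The sketch below uses `hashlib` as a stand-in (not ctypes or NumPy): its C implementation releases the GIL while hashing large buffers, so several threads can hash at once. The helper names `digest` and `digest_all_threaded` and the buffer sizes are assumptions for the example, and any actual speedup depends on the number of cores available.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(data):
    # The hashing runs in C; for large inputs hashlib releases the GIL
    # while it works, so other Python threads are free to run.
    return hashlib.sha256(data).hexdigest()


def digest_all_threaded(buffers, workers=4):
    # Plain threads are enough here because the heavy lifting happens
    # outside the interpreter.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(digest, buffers))


if __name__ == "__main__":
    buffers = [bytes([i]) * (8 * 1024 * 1024) for i in range(4)]
    for hex_digest in digest_all_threaded(buffers):
        print(hex_digest)
```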

Besides CPU-bound and I/O-bound tasks, there are still more use cases. For example, threads let you run tasks concurrently. A lot of GUI programming falls into this category. The main loop has to stay responsive to mouse events, so any time you have a task that takes a while and you don't want to freeze the UI, you do it on a separate thread. It is less about performance and more about concurrency.
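A toolkit-free sketch of that pattern is below. It assumes a hand-rolled polling loop in place of a real GUI event loop, and `slow_task` and `main_loop` are hypothetical names; a real toolkit would supply its own loop and its own way of handing results back to the UI thread.

```python
import queue
import threading
import time


def slow_task(results):
    # Stand-in for a long download or computation.
    time.sleep(0.3)
    results.put("done")


def main_loop(tick=0.05):
    results = queue.Queue()
    worker = threading.Thread(target=slow_task, args=(results,), daemon=True)
    worker.start()

    ticks = 0
    while True:
        try:
            # Check for a result without blocking the loop.
            outcome = results.get_nowait()
            break
        except queue.Empty:
            # A real GUI would handle mouse and keyboard events here.
            ticks += 1
            time.sleep(tick)

    worker.join()
    return outcome, ticks


if __name__ == "__main__":
    outcome, ticks = main_loop()
    print(f"worker finished with {outcome!r}; loop stayed live for {ticks} ticks")
```

The loop keeps ticking while the worker runs, which is the whole point: the total time is no shorter, but the "UI" never freezes.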

Licensed under: CC-BY-SA with attribution