Question

Coming from Java the whole Global Interpreter Lock (GIL) in Ruby and Python is kind of startling. I have read a bit into the problem and found in the Python documentation the following excerpt:

Can’t we get rid of the Global Interpreter Lock?

The global interpreter lock (GIL) is often seen as a hindrance to Python’s deployment on high-end multiprocessor server machines, because a multi-threaded Python program effectively only uses one CPU, due to the insistence that (almost) all Python code can only run while the GIL is held.

Back in the days of Python 1.5, Greg Stein actually implemented a comprehensive patch set (the “free threading” patches) that removed the GIL and replaced it with fine-grained locking. Unfortunately, even on Windows (where locks are very efficient) this ran ordinary Python code about twice as slow as the interpreter using the GIL. On Linux the performance loss was even worse because pthread locks aren’t as efficient.

What I did not find is an explanation of the performance impact. I have tried to find out what the technical reasons are, but could not find any good discussion that nails it down.

The situation is similar in Ruby, where I could find even less information. Are the reasons the same?


Solution

Simply put, locking and unlocking many locks is more expensive than locking and unlocking a single lock. This shouldn't be surprising: doing anything N times instead of once obviously takes more time (all other things being equal). And for this kind of thing, economies of scale don't really apply, because there's no big one-time cost to amortize over all locking operations.
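To make the "N times instead of once" point concrete, here is a minimal single-threaded sketch. It is only a crude analogy, not how CPython or the free-threading patches actually work: the function names `sum_fine_grained`, `sum_coarse` and `time_call` are made up for this illustration, and the single `threading.Lock` stands in for the many per-object locks a free-threaded interpreter would need. Even with no contention at all, every acquire/release pair costs something, and the fine-grained version pays that cost once per element.

```python
import threading
import time


def sum_fine_grained(values):
    """Add up values, taking the lock around every single update."""
    lock = threading.Lock()
    total = 0
    acquisitions = 0
    for v in values:
        with lock:  # one acquire/release pair per element
            total += v
            acquisitions += 1
    return total, acquisitions


def sum_coarse(values):
    """Add up values while holding the lock for the whole loop."""
    lock = threading.Lock()
    total = 0
    with lock:  # a single acquire/release pair
        for v in values:
            total += v
    return total, 1


def time_call(fn, values):
    """Return (result, elapsed seconds) for one call of fn(values)."""
    start = time.perf_counter()
    result = fn(values)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    data = list(range(1_000_000))
    for fn in (sum_fine_grained, sum_coarse):
        (total, acquisitions), elapsed = time_call(fn, data)
        print(f"{fn.__name__}: {acquisitions} lock acquisitions, {elapsed:.3f}s")
```

Both functions compute the same sum; only the number of lock operations differs. The exact timings depend on the machine and the Python version, so no numbers are claimed here.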

Edit: In principle, Java has the same problem, but due to the different focus of everyone involved, history, and perhaps other factors, Java gets by rather well with fine-grained locks. In short, single-threaded performance is not regarded as that important, and multi-threaded performance is probably better than that of a hypothetical free-threaded CPython.

Historically, I don't think there ever was a JVM with a GIL (though it started out with green threads running on a single OS thread, but this was long ago), so there are no historical reasons for keeping a GIL and no baseline single-threaded performance that makes people loathe locks. Instead, a lot of effort was put into making Java good at multi-threading, and this ability is widely used. In contrast, even if you solved the GIL issue with no performance cost for single-threaded Python or Ruby programs, most code out there wouldn't benefit from it and the libraries are... not awful, but not exactly on par with java.util.concurrent either.

Because Java has (now) a memory model which explicitly doesn't give a lot of guarantees, many common operations in Java programs don't need any kind of lock in general. The downside is, of course, that Java programmers have to add locks or other synchronization manually when it is needed. In addition, Java's locks have seen a lot of optimizations (some of which were original research and first introduced in the JVM) - thin locks, lock elision, etc. - which make uncontended locks very cheap.

Another factor may be that a Java program runs almost entirely Java code (which, as I've described above, needs very little synchronization unless it's explicitly requested), with only a few calls into a runtime library. As a consequence, a free-threaded JVM could even have a global lock (or only a few coarse locks) for the JIT, the classloader, etc. without affecting most Java programs too much. In contrast, a Python program will spend a large part of its time in C code, either in the built-in modules or in third-party extension modules.

OTHER TIPS

Most Python programs are single-threaded. Those that aren't have usually gone in the direction of multiprocessing.

For those that are unsuitable for multiprocessing, it is possible to use C extensions to release the GIL, but you do have to be very careful of course.
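On the C side, an extension releases the GIL by wrapping the blocking or long-running section in the `Py_BEGIN_ALLOW_THREADS` / `Py_END_ALLOW_THREADS` macros. The effect can be observed from pure Python with a call that already does this. The sketch below assumes `time.sleep` as the stand-in for such a C function (it releases the GIL while waiting); the helper `run_in_threads` and the worker `blocking_call` are names invented for this example.

```python
import threading
import time


def run_in_threads(worker, n_threads):
    """Run worker() in n_threads threads and return the elapsed wall time."""
    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


def blocking_call():
    # time.sleep is implemented in C and releases the GIL while it waits,
    # so other threads are free to run (or wait) at the same time.
    time.sleep(0.2)


if __name__ == "__main__":
    elapsed = run_in_threads(blocking_call, 4)
    # If the GIL were held during the sleep, four threads would need
    # roughly 4 * 0.2s; because it is released, the waits overlap.
    print(f"4 threads, each sleeping 0.2s: {elapsed:.2f}s total")
```

A C extension doing real computation can get the same overlap, but only for code that does not touch Python objects while the GIL is released, which is where the "be very careful" part comes in.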

Every attempt to remove the GIL has had a severe impact on the performance of all those single-threaded and multiprocessing apps. So the GIL stays in, everyone tries to leverage multiprocessing, and in most cases that is the best solution.

Licensed under: CC-BY-SA with attribution