Question

So I was reading this article about an attempt to remove the global interpreter lock (GIL) from the Python interpreter to improve multithreading performance and saw something interesting.

It turns out that one of the places where removing the GIL actually made things worse was in memory management:

With free-threading, reference counting operations lose their thread-safety. Thus, the patch introduces a global reference-counting mutex lock along with atomic operations for updating the count. On Unix, locking is implemented using a standard pthread_mutex_t lock (wrapped inside a PyMutex structure) and the following functions...

...On Unix, it must be emphasized that simple reference count manipulation has been replaced by no fewer than three function calls, plus the overhead of the actual locking. It's far more expensive...

...Clearly fine-grained locking of reference counts is the major culprit behind the poor performance, but even if you take away the locking, the reference counting performance is still very sensitive to any kind of extra overhead (e.g., function call, etc.). In this case, the performance is still about twice as slow as Python with the GIL.

and later:

Reference counting is a really lousy memory-management technique for free-threading. This was already widely known, but the performance numbers put a more concrete figure on it. This will definitely be the most challenging issue for anyone attempting a GIL removal patch.

So the question is, if reference counting is so lousy for threading, how does Objective-C do it? I've written multithreaded Objective-C apps, and haven't noticed much of an overhead for memory management. Are they doing something else? Like some kind of per object lock instead of a global one? Is Objective-C's reference counting actually technically unsafe with threads? I'm not enough of a concurrency expert to really speculate much, but I'd be interested in knowing.

Was it helpful?

Solution

There is overhead and it can be significant in rare cases (like, for example, micro-benchmarks ;), regardless of the optimizations that are in place (of which, there are many). The normal case, though, is optimized for un-contended manipulation of the reference count for the object.

So the question is, if reference counting is so lousy for threading, how does Objective-C do it?

There are multiple locks in play and, effectively, a retain/release on any given object selects a random lock (but always the same lock) for that object. Thus, reducing lock contention while not requiring one lock per object.

(And what Catfish_man said; some classes will implement their own reference counting scheme to use class-specific locking primitives to avoid contention and/or optimize for their specific needs.)

The implementation details are more complex.

Is Objectice-C's reference counting actually technically unsafe with threads?

Nope -- it is safe in regards to threads.

In reality, typical code will call retain and release quite infrequently, compared to other operations. Thus, even if there were significant overhead on those code paths, it would be amortized across all the other operations in the app (where, say, pushing pixels to the screen is really expensive, by comparison).

If an object is shared across threads (bad idea, in general), then the locking overhead protecting the data access and manipulation will generally be vastly greater than the retain/release overhead because of the infrequency of retaining/releasing.


As far as Python's GIL overhead is concerned, I would bet that it has more to do with how often the reference count is incremented and decremented as a part of normal interpreter operations.

OTHER TIPS

In addition to what bbum said, a lot of the most frequently thrown around objects in Cocoa override the normal reference counting mechanisms and store a refcount inline in the object, which they manipulate with atomic add and subtract instructions rather than locking.

(edit from the future: Objective-C now automatically does this optimization on modern Apple platforms, by mixing the refcount in with the 'isa' pointer)

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top