The consensus in computer science research on RC vs. tracing has long been that tracing has superior CPU throughput, at the cost of longer (maximum) pause times. (E.g. see here, here, and here.) Only very recently, in 2013, was a paper published (the last of those three links) presenting an RC-based system that performs as well as or slightly better than the best tracing GC tested, with regard to CPU throughput. Needless to say, it has no "real" implementations yet.
Here is a tiny benchmark I just ran on my iMac with a 3.1 GHz i5, in the iOS 7.1 64-bit simulator:
long tenmillion = 10000000;
NSTimeInterval t;

t = [NSDate timeIntervalSinceReferenceDate];
NSMutableArray *arr = [NSMutableArray arrayWithCapacity:tenmillion];
for (long i = 0; i < tenmillion; ++i)
    [arr addObject:[NSObject new]];
NSLog(@"%f seconds: Allocating ten million objects and putting them in an array.", [NSDate timeIntervalSinceReferenceDate] - t);

t = [NSDate timeIntervalSinceReferenceDate];
for (NSObject *obj in arr)
    [self doNothingWith:obj]; // Can't be optimized out because it's a method call.
NSLog(@"%f seconds: Calling a method on an object ten million times.", [NSDate timeIntervalSinceReferenceDate] - t);

t = [NSDate timeIntervalSinceReferenceDate];
NSObject *o;
for (NSObject *obj in arr)
    o = obj;
NSLog(@"%f seconds: Setting a pointer ten million times.", [NSDate timeIntervalSinceReferenceDate] - t);
With ARC disabled (-fno-objc-arc), this gives the following:
2.029345 seconds: Allocating ten million objects and putting them in an array.
0.047976 seconds: Calling a method on an object ten million times.
0.006162 seconds: Setting a pointer ten million times.
With ARC enabled, that becomes:
1.794860 seconds: Allocating ten million objects and putting them in an array.
0.067440 seconds: Calling a method on an object ten million times.
0.788266 seconds: Setting a pointer ten million times.
Apparently, allocating objects became somewhat cheaper, while the method-call loop became somewhat slower (0.048 s to 0.067 s). Assigning to an object pointer became more expensive by about two orders of magnitude (0.006 s to 0.788 s, roughly 128×), though don't forget that I didn't call -retain in the non-ARC example, and note that you can use __unsafe_unretained should you ever have a hotspot that assigns object pointers like crazy. Nevertheless, if you want to "forget about" memory management and let ARC insert retain/release calls wherever it wants, you will, in the general case, be wasting lots of CPU cycles, repeatedly and in all code paths that set pointers. A tracing GC, on the other hand, leaves your code itself alone and only kicks in at select moments (usually when allocating something), doing its thing in one fell swoop. (Of course, the details are a lot more complicated in truth, given generational GC, incremental GC, concurrent GC, etc.)
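To make the per-assignment cost a bit more concrete, here is a minimal sketch in plain C (which also compiles as Objective-C) of what a retaining store does compared to an unretained one. It is only a model: `Obj`, `obj_new`, `obj_retain`, `obj_release`, `store_strong`, `store_unretained`, and `churn_strong` are names I made up for illustration, and the real runtime's object layout and retain/release paths are considerably more involved.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical refcounted object; NOT the real Objective-C object layout. */
typedef struct {
    atomic_long refcount;
} Obj;

Obj *obj_new(void) {
    Obj *o = malloc(sizeof *o);
    if (o == NULL) abort();
    atomic_init(&o->refcount, 1);   /* the creator owns one reference */
    return o;
}

void obj_retain(Obj *o) {
    atomic_fetch_add_explicit(&o->refcount, 1, memory_order_relaxed);
}

void obj_release(Obj *o) {
    /* fetch_sub returns the previous value; 1 means the last reference is gone. */
    if (atomic_fetch_sub_explicit(&o->refcount, 1, memory_order_acq_rel) == 1)
        free(o);
}

/* Retaining store: retain the new value, swap it in, release the old one. */
void store_strong(Obj **slot, Obj *value) {
    Obj *old = *slot;
    if (old == value) return;
    if (value) obj_retain(value);
    *slot = value;
    if (old) obj_release(old);
}

/* Unretained store: a plain pointer write with no refcount traffic. */
void store_unretained(Obj **slot, Obj *value) {
    *slot = value;
}

/* Alternates a slot between two objects n times; returns a's final count. */
long churn_strong(long n) {
    Obj *a = obj_new(), *b = obj_new();
    Obj *slot = NULL;
    for (long i = 0; i < n; ++i)
        store_strong(&slot, (i & 1) ? b : a);  /* up to two atomic ops per store */
    store_strong(&slot, NULL);                 /* drop the slot's own reference */
    long count = atomic_load(&a->refcount);
    obj_release(a);
    obj_release(b);
    return count;
}
```

The point of the sketch is only the difference between the two store functions: the retaining one pays for atomic read-modify-write operations on every assignment, while the unretained one is a single pointer write, which is the trade you make (along with giving up safety) when you mark a variable __unsafe_unretained.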
So yes, since Objective-C's RC uses atomic retain/release, it is rather expensive, but Objective-C also has many more inefficiencies than those imposed by refcounting. (For instance, the fully dynamic/reflective nature of methods, which can be "swizzled" at any time at run time, prevents the compiler from doing many cross-method optimizations that would require data flow analysis and such. An objc_msgSend() is always a call to a "dynamically linked" black box from the point of view of the static analyzer, so to speak.) All in all, Objective-C as a language is not exactly the most efficient or most optimizable out there; people call it "C's type safety with Smalltalk's blazing speed" for a reason. ;-)
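As a rough illustration of why a swizzlable method is a black box to the compiler, here is a toy dispatch table in plain C. Everything in it (`MethodFn`, `MethodEntry`, `method_table`, `send_message`, `swizzle`, and the two implementations) is hypothetical and vastly simpler than the real runtime, which uses selectors, per-class method caches, and objc_msgSend; the sketch only shows that the call target is decided at the moment of the call.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical method type and table entry; not the runtime's real structures. */
typedef int (*MethodFn)(int arg);

typedef struct {
    const char *selector;
    MethodFn imp;
} MethodEntry;

int double_it(int x) { return 2 * x; }
int negate_it(int x) { return -x; }

/* A one-entry "class": the selector "transform" initially maps to double_it. */
MethodEntry method_table[] = {
    { "transform", double_it },
};

#define TABLE_LEN (sizeof method_table / sizeof method_table[0])

/* Looks the implementation up by name on every call, then jumps to it. */
int send_message(const char *selector, int arg) {
    for (size_t i = 0; i < TABLE_LEN; ++i)
        if (strcmp(method_table[i].selector, selector) == 0)
            return method_table[i].imp(arg);
    return 0;  /* unknown selector: this sketch simply returns 0 */
}

/* "Swizzle": swap in a new implementation at run time.
   Returns the previous implementation, or NULL if the selector is unknown. */
MethodFn swizzle(const char *selector, MethodFn replacement) {
    for (size_t i = 0; i < TABLE_LEN; ++i) {
        if (strcmp(method_table[i].selector, selector) == 0) {
            MethodFn old = method_table[i].imp;
            method_table[i].imp = replacement;
            return old;
        }
    }
    return NULL;
}
```

Because any code anywhere may call `swizzle` before the next `send_message`, the compiler cannot assume that "transform" still means `double_it`, so it cannot inline the call or reason about its effects across the call boundary.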
When writing Objective-C, one generally just orchestrates calls into well-implemented Apple libraries, which surely use C, C++, assembly, or whatever for their hotspots. Your own code barely ever needs to be efficient. When there is a hotspot, you can make it very efficient by dropping down to lower-level constructs, like pure C-style code within a single Objective-C method, but one rarely needs this. That's why Objective-C can afford the cost of ARC in the general case. I'm not yet convinced that tracing GC has any inherent problems in memory-constrained environments, and I think one could use a properly high-level language to drive said libraries just as well, but apparently RC sits better with Apple/iOS. One has to consider the whole of the framework they've built up so far, and all their legacy libraries, when asking why they didn't go with a tracing GC; for instance, I've heard that RC is rather deeply built into CoreFoundation.