mikeash.com: just this guy, you know?

Posted at 2010-07-16 20:18
Tags: fridayqna garbagecollection hack objectivec
Friday Q&A 2010-07-16: Zeroing Weak References in Objective-C
by Mike Ash  

It's that time of the biweek again. For this week's Friday Q&A, Mike Shields has suggested that I talk about weak references in Objective-C, and specifically zeroing weak references. I've gone a bit further and actually implemented a class that provides zeroing weak references in Objective-C using manual memory management.

Weak References
First, what is a weak reference? Simply put, a weak reference is a reference (pointer, in Objective-C land) to an object which does not participate in keeping that object alive. For example, using manual memory management, this setter creates a weak reference to the new object:

    - (void)setFoo: (id)newFoo
    {
        _foo = newFoo;
    }
Because the setter does not use retain, the reference does not keep the new object alive. It will stay alive as long as it's retained by other references, of course. But once those go away, the object will be deallocated even if _foo still points to it.

Weak references are common in Cocoa in order to deal with retain cycles. Delegates in Cocoa are almost always weak references for exactly this reason.
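
To make the retain cycle point concrete, here is a minimal sketch, assuming manual retain/release. SketchParent, SketchChild and the dealloc counter are hypothetical names invented for this illustration. The parent retains the child while the child only holds a weak pointer back, so dropping the last outside reference to the parent tears down both objects:

```objc
#import <Foundation/Foundation.h>

// Hypothetical classes for illustration only.
static int gSketchDeallocCount = 0; // counts deallocations so the effect is observable

@interface SketchChild : NSObject
{
    id _parent; // weak: assigned without retain
}
- (void)setParent: (id)newParent;
@end

@interface SketchParent : NSObject
{
    SketchChild *_child; // strong: retained
}
- (void)setChild: (SketchChild *)newChild;
@end

@implementation SketchChild
- (void)setParent: (id)newParent
{
    _parent = newParent; // no retain, so the child does not keep the parent alive
}
- (void)dealloc
{
    gSketchDeallocCount++;
    [super dealloc];
}
@end

@implementation SketchParent
- (void)setChild: (SketchChild *)newChild
{
    [newChild retain];
    [_child release];
    _child = newChild;
}
- (void)dealloc
{
    gSketchDeallocCount++;
    [_child release];
    [super dealloc];
}
@end

// Builds a parent/child pair, drops the caller's references, and returns
// how many of the two objects were deallocated.
static int SketchBuildAndReleasePair(void)
{
    gSketchDeallocCount = 0;
    SketchParent *parent = [[SketchParent alloc] init];
    SketchChild *child = [[SketchChild alloc] init];
    [parent setChild: child];
    [child setParent: parent];
    [child release];  // the parent still retains the child
    [parent release]; // last reference to the parent; both objects go away
    return gSketchDeallocCount;
}
```

If setParent: retained its argument instead, neither dealloc would ever run, because each object would be keeping the other alive.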

Zeroing Weak References
Weak references are useful for things like avoiding retain cycles, but their utility is limited due to their inherent danger. With a plain weak reference in Objective-C, when the target object is destroyed, you're left with a dangling pointer. If your code tries to use that pointer, it will crash or worse.

Zeroing weak references eliminate this danger. They work just like a regular weak reference, except that when the target object is destroyed, they automatically become nil. At any time you access an object through a zeroing weak reference, you're guaranteed to either access a valid, live object, or get nil. As long as your code can handle nil, then you're perfectly safe.

Because of this safety, a zeroing weak reference can be useful for much more than the unsafe kind. One example is an object cache. An object cache using weak references can refer to objects as long as they're alive, and then let them deallocate when no longer needed. If a client requests an object that's still alive, it can obtain it without having to create a new object. If the object has already been destroyed, the cache can safely create a new object.

They can be used for much more mundane purposes as well, for any case where you want to keep a reference to an object but don't want to keep that object in memory beyond its normal lifetime. For example, you might track a window but not want to keep it in memory after it's closed. You could deal with this by setting up a notification observer and seeing when the window goes away, but a zeroing weak reference is a much simpler way to do it. As another example, a zeroing weak reference to self used in a block can prevent a retain cycle while ensuring that your program doesn't crash if the block is called after self is deallocated. Even a standard delegate pointer is made better with a zeroing weak reference, as it eliminates rare but annoying bugs which can appear if the delegate is deallocated before the object that points to it.

If you're using garbage collection in Objective-C, then good news! The Objective-C garbage collector already supports zeroing weak references using the type modifier __weak. You can just declare any instance variable like so:

    __weak id _foo;
And it's automatically a zeroing weak reference. The compiler takes care of emitting the appropriate read/write barriers so that access is always safe.

What if you aren't using garbage collection, though? While it would be great if we all could, many of us can't for various reasons, one of the most common being that garbage collection simply isn't supported on iOS. Well, until now you've been out of luck when it comes to zeroing weak references with manual memory management in Objective-C.

Introducing MAZeroingWeakRef
Those of us who use manual memory management can now benefit from zeroing weak references! MAZeroingWeakRef implements the following interface:

    @interface MAZeroingWeakRef : NSObject
    {
        id _target;
        void (^_cleanupBlock)(id target);
    }
    
    + (id)refWithTarget: (id)target;
    
    - (id)initWithTarget: (id)target;
    
    - (void)setCleanupBlock: (void (^)(id target))block;
    
    - (id)target;
    
    @end
Usage is extremely simple. Initialize it with a target object. Retrieve the target object when you need to use it. The -target method will either return the target object (retained/autoreleased to guarantee that it will stay alive until you're done with it) or, if the target has already been destroyed, it will return nil.

The -setCleanupBlock: method exists for more advanced uses. Normally a zeroing weak reference is a passive object. You can query its target at any time, and it either gives you an object or nil. But sometimes you want to take some additional action when the reference is zeroed out, such as unregistering a notification observer. The block passed to -setCleanupBlock: runs when the reference is zeroed out, allowing you to set up additional actions like that.

As an example, here's how to write the standard delegate pattern using MAZeroingWeakRef:

    // instance variable
    MAZeroingWeakRef *_delegateRef;
    
    // setter
    - (void)setDelegate: (id)newDelegate
    {
        [_delegateRef release];
        _delegateRef = [[MAZeroingWeakRef alloc] initWithTarget: newDelegate];
    }
    
    - (void)doSomethingAndCallDelegate
    {
        [self _doSomething];
        
        id delegate = [_delegateRef target];
        if([delegate respondsToSelector: @selector(someDelegateMethod)])
            [delegate someDelegateMethod];
    }
This is only slightly harder than using normal, dangerous weak references, and provides complete safety. (If you use this pattern, remember that you must now release _delegateRef in -dealloc!)

MAZeroingWeakRef is completely thread safe, both in terms of accessing it from multiple threads, and in terms of having the target object be destroyed in one thread while the weak reference is accessed from another thread.

How Does it Work?
The concept of how a zeroing weak reference works is pretty straightforward. Track all such references to a target. When an object is destroyed, zero out all of those references before calling dealloc. Wrap everything in a lock so that it's thread safe.

The details of how to accomplish each step can get tricky, though.

Tracking all zeroing weak references to a target isn't too tough. A global CFMutableDictionary maps targets to CFMutableSet objects which hold the zeroing weak references to each target. I use the CF classes so that I can customize the memory management; I don't want the targets or weak references to be retained.
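
As a rough illustration of that customized memory management, here is a minimal sketch, assuming manual retain/release. The Sketch function names are made up for this example. Passing NULL for the key callbacks gives a dictionary that neither retains its keys nor compares them with anything other than pointer equality, and passing NULL callbacks to CFSetCreateMutable does the same for the set members:

```objc
#import <Foundation/Foundation.h>

// Creates a dictionary whose keys are plain pointers (NULL key callbacks, so
// keys are not retained and are compared by address) and whose values are
// ordinary retained CF objects.
static CFMutableDictionaryRef SketchCreateObjectToSetMap(void)
{
    return CFDictionaryCreateMutable(NULL, 0, NULL, &kCFTypeDictionaryValueCallBacks);
}

// Adds ref to the set stored under obj, creating the set on first use.
// NULL set callbacks mean the set members are not retained either.
static void SketchAddToMap(CFMutableDictionaryRef map, id obj, id ref)
{
    CFMutableSetRef set = (CFMutableSetRef)CFDictionaryGetValue(map, obj);
    if(!set)
    {
        set = CFSetCreateMutable(NULL, 0, NULL);
        CFDictionarySetValue(map, obj, set); // the dictionary retains the set
        CFRelease(set);                      // so our own reference can go
    }
    CFSetAddValue(set, ref);
}
```

After SketchAddToMap returns, neither obj nor ref has gained a retain, which is exactly what a table of weak references needs.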

Zeroing all of the weak references before calling dealloc gets a little trickier. How do you run code just before an arbitrary object's dealloc executes?

The answer to that is to use dynamic subclassing, as done in the implementation of Key-Value Observing. When an object is targeted by a zeroing weak reference, a new subclass of that object's class is created. The -dealloc method of the new subclass takes care of zeroing out all of the weak references and then calls through to super so that the normal chain of deallocations can occur. The new subclass also overrides -release to take a lock so that everything is thread safe. (Without that override, it would be possible for one thread to release an object with a retain count of 1 at the same time that another thread retrieved the object from a MAZeroingWeakRef. The retrieval would then try to resurrect the object after it had already been marked for destruction, which is illegal.)

Of course you don't want to make a new subclass for every single targeted object, but only one subclass is necessary per target class. A small table of overridden classes ensures that no more than one new subclass is created for each normal class.

As the final step, the class of the target object is set to be the new subclass, ensuring that the new methods take effect.
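
The steps above can be sketched in miniature with the Objective-C runtime functions. This is only an illustration, not the MAZeroingWeakRef code: it overrides the harmless -description instead of -release and -dealloc, the _SketchSubclass suffix is invented, and it reuses the subclass by looking it up by name rather than keeping a table:

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Replacement for -description installed on the dynamic subclass.
static NSString *SketchDescription(id self, SEL _cmd)
{
    return @"instance of a dynamic subclass";
}

// Finds or creates a dynamic subclass of obj's current class, switches obj
// over to it, and returns the subclass. Calling this twice on the same
// object would stack a second subclass on top; a real implementation
// checks for that first.
static Class SketchSwitchToDynamicSubclass(id obj)
{
    Class original = object_getClass(obj);
    NSString *name = [NSString stringWithFormat: @"%s_SketchSubclass", class_getName(original)];
    Class subclass = objc_getClass([name UTF8String]); // reuse one subclass per class
    if(!subclass)
    {
        subclass = objc_allocateClassPair(original, [name UTF8String], 0);
        Method description = class_getInstanceMethod(original, @selector(description));
        class_addMethod(subclass, @selector(description), (IMP)SketchDescription, method_getTypeEncoding(description));
        objc_registerClassPair(subclass);
    }
    object_setClass(obj, subclass); // the override takes effect from here on
    return subclass;
}
```

Only the object whose class was switched sees the new method. Other instances of the original class are unaffected.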

CoreFoundation Trickiness
The above strategy runs into a snag with toll-free bridged classes like NSCFString. Because of the way they're implemented, changing the class of such an object causes infinite recursion and a crash the moment that something tries to use them. The CoreFoundation code sees the changed class, assumes it's a pure Objective-C class, and calls through to the equivalent Objective-C method. The NSCF method then calls back to CoreFoundation. A crash rapidly ensues.

While I did figure out a solution to this problem, it is so hairy and complicated that I will save it for a separate article to be posted in two weeks.

Code
As usual, you can get the code for MAZeroingWeakRef from my public Subversion repository:

    svn co http://mikeash.com/svn/ZeroingWeakRef/
Or just click the link above to browse the code.

I will be walking through a somewhat abbreviated version of MAZeroingWeakRef. Due to the crazy nature of the CoreFoundation workaround I mentioned above, I will skip over those parts and only discuss the sane Objective-C bits this week. There is a macro called COREFOUNDATION_HACK_LEVEL which allows control over how much CoreFoundation hackery is enabled. At level 2 you get full-on hackery with full support for weak references to CoreFoundation objects. With level 1, some less important private symbols are referenced and used to reliably decide whether an object is bridged or not, and the code simply asserts if trying to create a weak reference to a bridged object. At level 0, the code asserts when trying to create a weak reference to a bridged object, and checks for bridging simply by looking for a prefix of NSCF in the class name. For this week, I will be discussing the code as if it were compiled with level 0.

Globals
MAZeroingWeakRef makes use of some global variables for various housekeeping uses. First off is a mutex:

    static pthread_mutex_t gMutex;
This is used to protect the other global data structures, as well as the table of zeroing weak references that's attached to each target object.

Next up, a CFMutableDictionary is needed to map the target objects to the weak references which target them:

    static CFMutableDictionaryRef gObjectWeakRefsMap; // maps (non-retained) objects to CFMutableSetRefs containing weak refs
Next, an NSMutableSet is used to track the dynamic subclasses that are created, and an NSMutableDictionary is used to map from normal classes to their dynamic subclasses:
    static NSMutableSet *gCustomSubclasses;
    static NSMutableDictionary *gCustomSubclassMap; // maps regular classes to their custom subclasses
Finally, implement +initialize to set up all of these variables. The only tricky business here is that it uses a recursive mutex rather than a regular one. There are cases where the critical section can be re-entered, such as creating a MAZeroingWeakRef pointing to another MAZeroingWeakRef, and using a recursive mutex allows that to function.
    + (void)initialize
    {
        if(self == [MAZeroingWeakRef class])
        {
            CFStringCreateMutable(NULL, 0);
            pthread_mutexattr_t mutexattr;
            pthread_mutexattr_init(&mutexattr);
            pthread_mutexattr_settype(&mutexattr, PTHREAD_MUTEX_RECURSIVE);
            pthread_mutex_init(&gMutex, &mutexattr);
            pthread_mutexattr_destroy(&mutexattr);
            
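            // keys (targets) are not retained; values (the CFMutableSets) are
            gObjectWeakRefsMap = CFDictionaryCreateMutable(NULL, 0, NULL, &kCFTypeDictionaryValueCallBacks);
            gCustomSubclasses = [[NSMutableSet alloc] init];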
            gCustomSubclassMap = [[NSMutableDictionary alloc] init];
        }
    }
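
To see why the mutex type matters, here is a small sketch. The Sketch helper names are invented for this example. It sets up a mutex of a given type and then tries to take it a second time on the thread that already holds it. It uses pthread_mutex_trylock for the nested attempt so that a non-recursive mutex reports failure instead of deadlocking:

```objc
#import <Foundation/Foundation.h>
#import <pthread.h>

// Initializes mutex with the given type (for example PTHREAD_MUTEX_RECURSIVE).
static void SketchInitMutex(pthread_mutex_t *mutex, int type)
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, type);
    pthread_mutex_init(mutex, &attr);
    pthread_mutexattr_destroy(&attr);
}

// Attempts to take the mutex a second time on the thread that already holds it.
// Returns 0 if the nested acquisition succeeded, or an error code otherwise.
static int SketchNestedLock(pthread_mutex_t *mutex)
{
    pthread_mutex_lock(mutex);
    int result = pthread_mutex_trylock(mutex); // second acquisition, same thread
    if(result == 0)
        pthread_mutex_unlock(mutex); // balance the nested lock
    pthread_mutex_unlock(mutex);
    return result;
}
```

With PTHREAD_MUTEX_RECURSIVE the nested attempt returns 0. With PTHREAD_MUTEX_NORMAL it fails, and a plain pthread_mutex_lock in its place would hang forever.
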
I also write a quick helper to execute a block of code while holding the lock:
    static void WhileLocked(void (^block)(void))
    {
        pthread_mutex_lock(&gMutex);
        block();
        pthread_mutex_unlock(&gMutex);
    }
And three more helpers to deal with adding a weak reference to an object's CFMutableSet, removing a weak reference from an object, and clearing out all weak references to an object:
    static void AddWeakRefToObject(id obj, MAZeroingWeakRef *ref)
    {
        CFMutableSetRef set = (void *)CFDictionaryGetValue(gObjectWeakRefsMap, obj);
        if(!set)
        {
            set = CFSetCreateMutable(NULL, 0, NULL);
            CFDictionarySetValue(gObjectWeakRefsMap, obj, set);
            CFRelease(set);
        }
        CFSetAddValue(set, ref);
    }
    
    static void RemoveWeakRefFromObject(id obj, MAZeroingWeakRef *ref)
    {
        CFMutableSetRef set = (void *)CFDictionaryGetValue(gObjectWeakRefsMap, obj);
        CFSetRemoveValue(set, ref);
    }
    
    static void ClearWeakRefsForObject(id obj)
    {
        CFMutableSetRef set = (void *)CFDictionaryGetValue(gObjectWeakRefsMap, obj);
        [(NSSet *)set makeObjectsPerformSelector: @selector(_zeroTarget)];
        CFDictionaryRemoveValue(gObjectWeakRefsMap, obj);
    }
Implementation of MAZeroingWeakRef
With those basics in place, I'll now take a top-down approach to the rest of the implementation.

First, the convenience constructor and initializer. Mostly straightforward:

    + (id)refWithTarget: (id)target
    {
        return [[[self alloc] initWithTarget: target] autorelease];
    }
    
    - (id)initWithTarget: (id)target
    {
        if((self = [self init]))
        {
            _target = target;
            RegisterRef(self, target);
        }
        return self;
    }
The only tricky bit is that call to RegisterRef. That's an internal utility function which takes care of connecting the weak reference object to the target object, subclassing the target's class if necessary, and changing the target's class to be the custom subclass.

The dealloc implementation similarly calls a utility function to remove the weak reference object:

    - (void)dealloc
    {
        UnregisterRef(self);
        [_cleanupBlock release];
        [super dealloc];
    }
Toss in a simple description method so we can see what's going on internally:
    - (NSString *)description
    {
        return [NSString stringWithFormat: @"<%@: %p -> %@>", [self class], self, [self target]];
    }
And a standard setter for setting the cleanup block:
    - (void)setCleanupBlock: (void (^)(id target))block
    {
        block = [block copy];
        [_cleanupBlock release];
        _cleanupBlock = block;
    }
The target method gets a little more complicated. Because the target can be destroyed at any time, it needs to fetch its value while holding the global weak reference lock. It also needs to retain the target while holding that lock, to ensure that, if the target is alive, it stays alive until the receiver is done using it. This is of course balanced with an autorelease afterwards:
    - (id)target
    {
        __block id ret;
        WhileLocked(^{
            ret = [_target retain];
        });
        return [ret autorelease];
    }
Finally there's a private method used to zero out the target, which is called by the internal machinery when the target object is deallocated. Since the global lock is already held by that machinery, there's no need to explicitly lock it here too. This method simply calls and releases the cleanup block if there is one, and clears out the target:
    - (void)_zeroTarget
    {
        if(_cleanupBlock)
        {
            _cleanupBlock(_target);
            [_cleanupBlock release];
            _cleanupBlock = nil;
        }
        _target = nil;
    }
And that's it! Easy, right? Of course, all the interesting bits are in those utility functions, the utility functions they call, and on and on....

Implementation of Utility Functions
The implementation of UnregisterRef is simple. Get the target out of the MAZeroingWeakRef, get the table of references to the target, and remove the given reference. Wrap it all in a lock to ensure that the target can't be deallocated in the middle of this operation:

    static void UnregisterRef(MAZeroingWeakRef *ref)
    {
        WhileLocked(^{
            id target = ref->_target;
            
            if(target)
                RemoveWeakRefFromObject(target, ref);
        });
    }
RegisterRef is similar. In addition to adding the reference to the table of references, it also calls EnsureCustomSubclass. That function will, if necessary, create a new custom subclass and set the class of the target object to that subclass.
    static void RegisterRef(MAZeroingWeakRef *ref, id target)
    {
        WhileLocked(^{
            EnsureCustomSubclass(target);
            AddWeakRefToObject(target, ref);
        });
    }
The implementation of EnsureCustomSubclass is broken into many pieces. First it checks to see if the object is already an instance of a custom subclass. If it is, then nothing has to be done. If it's not, it then looks up the custom subclass that corresponds to the object's current class, and sets the class of the target object accordingly. If no custom subclass has yet been created, it creates it.
    static void EnsureCustomSubclass(id obj)
    {
        if(!GetCustomSubclass(obj))
        {
            Class class = object_getClass(obj);
            Class subclass = [gCustomSubclassMap objectForKey: class];
            if(!subclass)
            {
                subclass = CreateCustomSubclass(class, obj);
                [gCustomSubclassMap setObject: subclass forKey: class];
                [gCustomSubclasses addObject: subclass];
            }
            object_setClass(obj, subclass);
        }
    }
The implementation of GetCustomSubclass is easy. Get the object's class, and check to see if it's in the gCustomSubclasses set. If not, get the superclass, and follow it up the chain until one is found. If none are found, then there is no custom subclass for this object. (The reason for following the chain is so that this code will still behave correctly even if some other code, such as Key-Value Observing, sets its own custom subclass after MAZeroingWeakRef set one.)
    static Class GetCustomSubclass(id obj)
    {
        Class class = object_getClass(obj);
        while(class && ![gCustomSubclasses containsObject: class])
            class = class_getSuperclass(class);
        return class;
    }
Again, not too hard. The real fun begins in CreateCustomSubclass. The first thing it does is check to see if the object is a CoreFoundation toll-free bridged object. As I discussed above, the subclassing approach breaks for those objects, so they need to be rejected:
    static Class CreateCustomSubclass(Class class, id obj)
    {
        if(IsTollFreeBridged(class, obj))
        {
            NSCAssert(0, @"Cannot create zeroing weak reference to object of type %@ with COREFOUNDATION_HACK_LEVEL set to %d", class, COREFOUNDATION_HACK_LEVEL);
            return class;
        }
        else
        {
(COREFOUNDATION_HACK_LEVEL is the #define which determines how much CoreFoundation hackery to enable. As I mentioned above, I'm going through the code as though it's not enabled.)

The implementation of IsTollFreeBridged simply checks to see if the class name starts with NSCF:

    static BOOL IsTollFreeBridged(Class class, id obj)
    {
        return [NSStringFromClass(class) hasPrefix: @"NSCF"];
    }
For the else branch, the first order of business is to create a name for the new class. Since Objective-C class names have to be unique, it constructs a new name based on the original name and a unique suffix:
            NSString *newName = [NSString stringWithFormat: @"%s_MAZeroingWeakRefSubclass", class_getName(class)];
            const char *newNameC = [newName UTF8String];
Next, call objc_allocateClassPair to create a new class pair. (In Objective-C, each class has a corresponding metaclass, which is related to how the runtime works. The objc_allocateClassPair function creates both in one shot.)
            Class subclass = objc_allocateClassPair(class, newNameC, 0);
The new class implements two methods, release and dealloc. The next step is then to add those two methods to the class, pointing them to the functions which implement them:
            Method release = class_getInstanceMethod(class, @selector(release));
            Method dealloc = class_getInstanceMethod(class, @selector(dealloc));
            class_addMethod(subclass, @selector(release), (IMP)CustomSubclassRelease, method_getTypeEncoding(release));
            class_addMethod(subclass, @selector(dealloc), (IMP)CustomSubclassDealloc, method_getTypeEncoding(dealloc));
Finally, call objc_registerClassPair to register the new class with the runtime, and return the newly created class:
            objc_registerClassPair(subclass);
            
            return subclass;
        }
    }
Next, CustomSubclassRelease. Conceptually, the implementation of this function is simple. Acquire the global weak reference lock, and call [super release] while it's acquired. The purpose of this is to ensure that the final release of an object and its deallocation happen atomically, so that the object can't be resurrected in between the two by a weak reference that hasn't yet been zeroed out.

The trouble is that simply writing [super release] won't work, because the compiler only allows that in a true, compile-time method implementation. In order to perform the equivalent action, it's necessary to figure out the superclass of the custom weak reference subclass. This is done using a simple helper function which calls GetCustomSubclass and returns the superclass of that class:

    static Class GetRealSuperclass(id obj)
    {
        Class class = GetCustomSubclass(obj);
        NSCAssert(class, @"Couldn't find ZeroingWeakRef subclass in hierarchy starting from %@, should never happen", object_getClass(obj));
        return class_getSuperclass(class);
    }
With that helper in place, the implementation of CustomSubclassRelease can use it to look up the superclass, use that to look up the superclass's implementation of release, and then call that with the lock held:
    static void CustomSubclassRelease(id self, SEL _cmd)
    {
        Class superclass = GetRealSuperclass(self);
        IMP superRelease = class_getMethodImplementation(superclass, @selector(release));
        WhileLocked(^{
            ((void (*)(id, SEL))superRelease)(self, _cmd);
        });
    }
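
The same call-through-to-super technique can be tried in isolation. The following is a minimal sketch with hypothetical names (SketchBase, SketchSubclassValue). It overrides an ordinary method rather than release, and it assumes that nothing else has layered another dynamic subclass on top, which is the case GetRealSuperclass exists to handle:

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical base class, used only for this sketch.
@interface SketchBase : NSObject
- (int)value;
@end

@implementation SketchBase
- (int)value
{
    return 41;
}
@end

// Body of -value for the dynamic subclass. A plain C function cannot say
// [super value], so it finds the superclass implementation and calls it directly.
static int SketchSubclassValue(id self, SEL _cmd)
{
    // Assumes self's class is exactly the dynamic subclass created below.
    Class superclass = class_getSuperclass(object_getClass(self));
    IMP superValue = class_getMethodImplementation(superclass, _cmd);
    return ((int (*)(id, SEL))superValue)(self, _cmd) + 1;
}

// Returns a new (owned) SketchBase instance whose class has been switched
// to a dynamic subclass that overrides -value.
static SketchBase *SketchCreateSubclassedInstance(void)
{
    Class subclass = objc_getClass("SketchBase_SketchSubclass");
    if(!subclass)
    {
        Method value = class_getInstanceMethod([SketchBase class], @selector(value));
        subclass = objc_allocateClassPair([SketchBase class], "SketchBase_SketchSubclass", 0);
        class_addMethod(subclass, @selector(value), (IMP)SketchSubclassValue, method_getTypeEncoding(value));
        objc_registerClassPair(subclass);
    }
    SketchBase *obj = [[SketchBase alloc] init];
    object_setClass(obj, subclass);
    return obj;
}
```

Sending -value to the returned object runs SketchSubclassValue, which calls the original implementation and adds one to its result. The cast of the IMP to the exact function pointer type is what makes the direct call safe.
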
Almost done! The one remaining function is CustomSubclassDealloc. It gets the table of weak references to the object and tells all of them to _zeroTarget. It then invokes the superclass implementation of dealloc using the same technique as CustomSubclassRelease uses.
    static void CustomSubclassDealloc(id self, SEL _cmd)
    {
        ClearWeakRefsForObject(self);
        Class superclass = GetRealSuperclass(self);
        IMP superDealloc = class_getMethodImplementation(superclass, @selector(dealloc));
        ((void (*)(id, SEL))superDealloc)(self, _cmd);
    }
That's it! You now have zeroing weak references to Objective-C objects (except to bridged CoreFoundation objects, which I'll get to in the next article).

Examples:
Basic usage of MAZeroingWeakRef is simple:

    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    NSObject *obj = [[NSObject alloc] init];
    MAZeroingWeakRef *ref = [[MAZeroingWeakRef alloc] initWithTarget: obj];
    
    NSLog(@"%@", [ref target]);
    [obj release];
    [pool release];
    
    NSLog(@"%@", [ref target]);
The first NSLog will print the object, and the second will print (null). The autorelease pool is used to ensure that the object is truly destroyed, because the use of target will put the object into the pool and otherwise it will stay alive longer.

Using a cleanup block is similarly simple:

    NSObject *obj = [[NSObject alloc] init];
    MAZeroingWeakRef *ref = [[MAZeroingWeakRef alloc] initWithTarget: obj];
    [ref setCleanupBlock: ^(id target) { NSLog(@"Cleaned object %p!", target); }];
    [obj release];
The log will print when [obj release] is called. Of course you can take more actions than simply printing. However, because the cleanup block is called while the global weak reference lock is held, you should try to keep your activities in there to a minimum. If you need to do a lot of work, set up a deferred call, using performSelectorOnMainThread:, GCD, NSOperationQueue, etc. and do the extra work there.

A simple way to turn a regular instance variable into a zeroing weak reference is to use MAZeroingWeakRef in your getter and setter, and then make sure to always use your getter in other code:

    // ivar
    MAZeroingWeakRef *_somethingWeakRef;
    
    // accessors
    - (void)setSomething: (Something *)newSomething
    {
        [_somethingWeakRef release];
        _somethingWeakRef = [[MAZeroingWeakRef alloc] initWithTarget: newSomething];
    }
    
    - (Something *)something
    {
        return [_somethingWeakRef target];
    }
    
    // use
    - (void)doThing
    {
        [[self something] doThingWithObject: self];
    }
And of course if you do that, you have to be sure to release your reference in -dealloc, just like any other object you allocate. Just don't release the target.

For a more advanced use, here's an addition to NSNotificationCenter that eliminates the need to manually remove an observer in dealloc:

    @implementation NSNotificationCenter (MAZeroingWeakRefAdditions)
    
    - (void)addWeakObserver: (id)observer selector: (SEL)selector name: (NSString *)name object: (id)object
    {
        [self addObserver: observer selector: selector name: name object: object];
        
        MAZeroingWeakRef *ref = [[MAZeroingWeakRef alloc] initWithTarget: observer];
        [ref setCleanupBlock: ^(id target) {
            [self removeObserver: target name: name object: object];
            [ref autorelease];
        }];
    }
    
    @end
Note the use of a cleanup block to remove the notification observer when the object is destroyed. All you have to do is call addWeakObserver: instead of addObserver: in notification observers, and you'll never again forget to remove an observer in dealloc.

Similarly, if you're tired of mysterious crashes caused by NSTableView data sources being deallocated before the views themselves, you can easily fix it:

    @implementation NSTableView (MAZeroingWeakRefAdditions)
    
    - (void)setWeakDataSource: (id <NSTableViewDataSource>)source
    {
        [self setDataSource: source];
        
        MAZeroingWeakRef *ref = [[MAZeroingWeakRef alloc] initWithTarget: source];
        [ref setCleanupBlock: ^(id target) {
            if([self dataSource] == target) // double check for safety
                [self setDataSource: nil];
            [ref autorelease];
        }];
    }
    
    @end
If you anticipate a scenario where you change the data source of a table view frequently, you'll want to write some more sophisticated code to clear out the old weak reference when adding a new one. However that is not a common scenario.

Essentially, any time you have a weak reference (an object reference that you don't retain or copy), you should use a MAZeroingWeakRef instead of a raw unretained pointer. It will save you trouble and pain and is extremely easy to use.

ZeroingCollections
The repository includes MAWeakArray and MAWeakDictionary, subclasses of NSMutableArray and NSMutableDictionary which use zeroing weak references to their contents. MAWeakDictionary uses strong keys to weak objects, which would be useful for many caching scenarios. I won't go through their code here, but they're simple, and you can look at the code in the repository if you're curious.

Although I didn't write them, it would be possible to create weak versions of NSMutableSet and NSMutableDictionary which use weak keys instead of, or in addition to, weak objects. These would be trickier due to hashing/equality issues with the weak references, but could certainly be done.

Conclusion
Zeroing weak references are an extremely useful construct present in many languages. Even Objective-C has them when running under garbage collection, but without GC, Objective-C code has been stuck using non-zeroing weak references, which are tricky and dangerous.

MAZeroingWeakRef brings zeroing weak references to manual memory managed Objective-C. Although it uses some trickery on the inside, the API is extremely simple to use. By automatically zeroing weak references, you avoid many potential crashers and data corruption. Zeroing weak references can also be used for things like object caches where non-zeroing weak references aren't very practical at all.

The code is made available under a BSD license.

For the next Friday Q&A in two weeks, I will discuss how MAZeroingWeakRef works around the problems with CoreFoundation objects. Until then, enjoy!


Comments:

I wonder if you've considered using objc_setAssociatedObject instead of class/method swizzling. That is, associate an object with the target object that, itself, gets released when the target object is deallocated. This may allow you to work with CF objects without needing any kind of hack or private-API calls
Associated objects make zeroing weak references really easy if you don't care about thread safety. Unfortunately they're completely useless for thread-safe zeroing weak references.

The reason is object resurrection. You have to zero out all weak references to an object before that object's dealloc (or CF finalize function) executes, and any queries to a weak reference which return an object must prevent dealloc or finalize from executing at all.

By the time an associated object is freed, it's far too late to prevent the object it's associated with from being destroyed, so you end up with horrible race conditions.

This is why the pure ObjC override hits both release and dealloc; simply overriding dealloc isn't enough, as you end up with a race condition between the final call to release and the call to dealloc.

The CF craziness gets around it because pure CF objects are allowed to resurrect themselves in their finalize function, thus I override that and check for the condition where a weak reference has caused resurrection before proceeding with object destruction.
I want to mention further: I don't think that non-thread-safe zeroing weak references are useful in any general sense. They may work for specific situations, but it's far too easy to lose control of your object's lifetime and end up having it be deallocated on another thread without meaning to. Just capture an object in a block and pass it to GCD and you're toast.

Such zeroing weak references are easy to construct, and have been done before. What MAZeroingWeakRef brings to the table is that it is, as far as I know, the first such implementation that is thread safe, and therefore safe to use in the general case.
Have you considered making MAZeroingWeakRef a proxy for its target?
Oh, and don't you have to be extraordinarily careful in the cleanup block? Almost any reference to the target is dangerous.

First, there's the problem of resurrecting the object. This is especially likely if you reference the target in a deferred way, like you suggest.

Second, there's the problem that the target is partially dealloc'd already, in the case that KVO (or something else) has subclassed after MAZeroingWeakRef.

It's not clear to me that even your NSNotificationCenter(MAZeroingWeakRefAdditions) category is safe.

I almost think that you shouldn't support passing the target to the cleanup block. Or perhaps it would be passed as a uintptr_t so it can't accidentally be used as an actual object pointer.
Regarding the proxy idea, I thought about it, but decided against it because it's too easy to forget that your target can disappear in the middle of your code if you use a proxy. For example:

if([weakRef condition]) // alive here
    [weakRef doSomethingImportantWithObject: self]; // gone here
else
    [self doSomethingEquallyImportant]; // never gets here

Whereas with an explicit reference, you have to work very hard to create that problem. Normally you'd write:

id obj = [weakRef target]; // obj is now in autorelease pool, stays alive through the end of the code
if([obj condition])
    [obj doSomethingImportantWithObject: self];
else
    [self doSomethingEquallyImportant];

And you don't have to structure your code to tolerate having the object disappear in mid-stream.

As for the cleanup block, referring to target there is no more dangerous (and actually slightly safer) than referring to self in dealloc, which we do all the time. The notification example is semantically the same as the extremely common pattern of removing a notification observer in dealloc.
This is incredibly useful. We are having to implement a related utility class that solves this problem from another perspective -- when we aren't the ones calling into the delegate. Specifically, we have found that UIWebViews continue to send messages to their delegate even after setting their delegate to nil on the main thread.

It appears that because of JavaScript running on a timer, or something to that effect, there are cases whereby an Invocation has been queued up to send from a background web thread while the main thread is busy unsetting the UIWebView's delegate (because that object is being dealloc'ed). By the time the invocation makes it to the main thread, it attempts to send a message to the deallocated object.

Attempting to solve it through another proxy-like object like this one.
That is nasty. An NSProxy subclass which bounces messages through a MAZeroingWeakRef should solve your problem, and is probably a nicer way to approach delegates/data sources for code you don't own in general. Attach the proxy to the UIWebView as an associated object (if you can require an OS version that has them) and you won't have to worry about memory management.
With all the talk about proxies, I went ahead and implemented one. MAZeroingWeakProxy is available in the subversion repository now. I still recommend using the explicit weak ref wherever it makes sense, but for things like faking out delegates for framework objects, the proxy is probably a better choice.
Thanks so much!
Sure, and I'd be very interested in hearing how it works out for you.
On looking at the source, it appears to be exactly what is called for. However, I fear that it may wind up having to leak the proxy in order to resolve the crash issue. The great thing is that it allows the real target to be released (of course), but to keep UIWebView from crashing I expect that I will need to keep the proxy object alive.

I can probably avoid a crash by releasing it after some time. I'd like to do something like this in my FooController:

// setup
webView = ...;
myProxy = [[MAZeroingWeakProxy alloc] initWithTarget:self];
webView.delegate = myProxy;

// dealloc
if (myProxy)
{
    [myProxy performSelector:@selector(description) withObject:nil afterDelay:60];
    [myProxy release];
    // Now, only NSObject has a strong reference.
}
webView.delegate = nil;

But of course, NSProxy doesn't support that method. I'd have to set up a non-repeating NSTimer, I think. Not the best, but I think it's the only way.
If your OS requirement is iOS 3.1+, then you can use objc_setAssociatedObject to associate the proxy with the web view. Then it won't get deallocated until the web view does.

Another possibility: can you subclass UIWebView? If so, you could add the proxy as an instance variable, and release it in dealloc.
The docs say:

    The UIWebView class should not be subclassed.

So I'll stick with objc_setAssociatedObject and implement some category methods on UIWebView. Thanks again!

Thanks a lot for this, I have already found it very useful over the past couple days.

In +initialize, the line CFStringCreateMutable(NULL, 0); creates a mutable string that is never referenced or used. Can we safely remove this, or is it somehow part of the hackery?
Good catch. That's just a leftover from my earlier experiments with CF workarounds. (I was having some trouble and thought that CF might need to be poked to initialize some values.) I've removed it from the repository.
I figured it may have been used for something like that.

The other thing that I've observed is a rare "set was mutated while being enumerated" exception in ClearWeakRefsForObject. This is what I think is happening: it occurs if the cleanup block releases another instance of MAZeroingWeakRef that has its target set to the object being deallocated. When that MAZeroingWeakRef is deallocated, RemoveWeakRefFromObject gets called, which mutates the set that is still being enumerated by makeObjectsPerformSelector:.

I observed this when I added another weak observer method to NSNotificationCenter to support blocks:


- (void)addWeakObserver:(id)observer forName:(NSString *)name object:(id)object onQueue:(NSOperationQueue *)queue usingBlock:(void (^)(NSNotification *))block
{
    id observerToken = [self addObserverForName:name object:object queue:queue usingBlock:block];
    
    MAZeroingWeakRef *ref = [[MAZeroingWeakRef alloc] initWithTarget:observer];
    [ref setCleanupBlock: ^(id target) {
        [self removeObserver:observerToken];
        [ref autorelease];
    }];
}


And then used it something like this:


MAZeroingWeakRef *selfRef = [MAZeroingWeakRef refWithTarget:self];
[[NSNotificationCenter defaultCenter] addWeakObserver:self forName:MyNotification object:nil onQueue:nil usingBlock:^(NSNotification *note) {
    [[selfRef target] doWhatever];
}];


When NSNotificationCenter releases the block, the captured MAZeroingWeakRef gets deallocated while its target is clearing its weak references.

I'm not sure the best way to handle a mutating set, but I presume that if the set could be enumerated while being mutated that it would solve this, or would there be some other side effect I've overlooked?
Nice catch. I think the answer is to enumerate over a copy of the set rather than the original. That will hurt performance slightly, but will allow cleanup code to mutate the set. I don't have time to make the modification right now, but it should be a simple matter of sticking a copy and release in the method in question. I will get to it at some point....
Replacing it with this code works:

static void ClearWeakRefsForObject(id obj)
{
    CFMutableSetRef set = (void *)CFDictionaryGetValue(gObjectWeakRefsMap, obj);

    NSSet *setCopy = [(NSSet *)set copy];
    [setCopy makeObjectsPerformSelector: @selector(retain)];
    [setCopy makeObjectsPerformSelector: @selector(_zeroTarget)];
    [setCopy makeObjectsPerformSelector: @selector(release)];
    [setCopy release];

    CFDictionaryRemoveValue(gObjectWeakRefsMap, obj);
}

Without the extra retain / release messages, the objects can be deallocated while the set is being enumerated to send _zeroTarget. But I don't completely understand why; shouldn't setCopy already retain them?
Sending the extra retain / release messages feels sloppy...
The original set is specially constructed not to retain its contents, and the copy probably keeps that attribute. You could fix it by replacing the copy with [[NSSet alloc] initWithSet: (NSSet *)set]. Then you wouldn't have to manually retain/release the contents.
On the toll-free bridging front, I would note that there are bridged classes that your check won't catch: NSFont/CTFont{Descriptor} comes to mind.
That's why the check is different (better, as far as I know 100% reliable) if COREFOUNDATION_HACK_LEVEL is set to 1 or 2.

If you or anyone else should happen to know of a better way to check for TFBness without using private CF API or doing a lame check for an NSCF prefix, I'd be very interested to know. I was not able to find any such way when I built this, but that doesn't mean it's not possible.
Thanks again for this. Fully integrated it, and it works beautifully. The proxy is deallocated with the UIWebView, and the zeroing ref's target was nil as intended.

The only catch was that I had to "insert" the contents WhileLocked, since I am targeting blockless iOS 3.2.
Duh, I forgot that WhileLocked would cause problems on blockless OSes. I should probably macroize it so it can work on older systems.
I've now fixed the code to build without blocks, by making WhileLocked a macro. I've also fixed the problem found by Steve Brambilla which could cause a mutation exception while zeroing out targets. These changes are in my repository.

Thanks very much to both of you for finding these problems, and Jason, it's very neat to hear how you've used this stuff to solve your problem.
Just read through the entry and all the comments. Brilliant work! Thank you!

A couple of questions though:

1. According to the thread-safety summary on Apple's Developer site (http://developer.apple.com/mac/library/documentation/Cocoa/Conceptual/Multithreading/ThreadSafetySummary/ThreadSafetySummary.html#//apple_ref/doc/uid/10000057i-CH12-SW1), "Object allocation and retain count functions" and "Zone and memory functions" are thread-safe. Why then do we need a mutex around the call to release and retain?

2. In ClearWeakRefsForObject, why make a copy of the set instead of taking a mutex around it like everywhere else?

Thanks!
The mutex around the release (I don't think there's one around retain) is to prevent a race condition that would appear otherwise. Imagine this scenario: thread A releases the last strong reference. Thread B uses a weak reference. Thread A destroys the object. Thread B accesses the destroyed object, crashing.

The mutex prevents this from happening. Essentially, Apple's code is thread safe but the code I add on to it is not unless I add a mutex to it as well. Thread safety is often tricky like that: X can be thread safe, and Y can be thread safe, but the combination of X+Y may not necessarily be thread safe without taking additional action.
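To make the race concrete, here is a minimal sketch in plain C with pthreads. It is not MAZeroingWeakRef's code: Object, WeakRef, object_release, and weakref_target are hypothetical stand-ins, and it assumes at most one weak reference per object. The point is only that "decrement, zero the weak reference, destroy" and "read the weak reference, retain" each run as one unit under the same lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for an object and a zeroing weak reference. */
typedef struct Object {
    int retain_count;   /* number of strong references */
    int destroyed;      /* set instead of calling free() so a demo can inspect it */
} Object;

typedef struct WeakRef {
    Object *target;     /* zeroed when the target is destroyed */
} WeakRef;

static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;

/* Drop one strong reference. The decrement, the zeroing of the weak
   reference, and the destruction all happen while holding the lock. */
static void object_release(Object *obj, WeakRef *ref)
{
    pthread_mutex_lock(&g_lock);
    if (--obj->retain_count == 0) {
        if (ref->target == obj)
            ref->target = NULL;
        obj->destroyed = 1;
    }
    pthread_mutex_unlock(&g_lock);
}

/* Read the weak reference and take a strong reference in one atomic step.
   Returns NULL if the target is already gone. */
static Object *weakref_target(WeakRef *ref)
{
    pthread_mutex_lock(&g_lock);
    Object *obj = ref->target;
    if (obj != NULL)
        obj->retain_count++;
    pthread_mutex_unlock(&g_lock);
    return obj;
}
```

With this arrangement thread B either gets the object with its count already bumped, or gets NULL. It can never observe a target that thread A is halfway through destroying.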

As for copying the set, it is possible for a cleanup block to mutate the set of weak references on the object, which then causes an enumeration mutation exception. (It is forbidden to mutate a collection while enumerating over it in Cocoa.) Thus I make a copy of the set to ensure that it doesn't get mutated during enumeration. See TestCleanupBlockReleasingZWR in main.m for an example of how that could happen.
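The copy-before-enumerating idea can be sketched in plain C as well. This is only an illustration: RefSet, refset_clear, and remove_partner are hypothetical names, and integer IDs stand in for weak reference objects. The callback mutates the live set while the loop walks a snapshot.

```c
#include <assert.h>
#include <string.h>

#define MAX_REFS 8

/* Hypothetical set of weak-reference IDs attached to one object. */
typedef struct RefSet {
    int ids[MAX_REFS];
    int count;
} RefSet;

/* Remove one ID from the set, if present (order is not preserved). */
static void refset_remove(RefSet *set, int id)
{
    for (int i = 0; i < set->count; i++) {
        if (set->ids[i] == id) {
            set->ids[i] = set->ids[set->count - 1];
            set->count--;
            return;
        }
    }
}

/* Cleanup callback type: it is allowed to mutate the live set. */
typedef void (*Cleanup)(RefSet *live, int id);

/* Walk a snapshot of the set so callbacks can mutate the original safely.
   Returns how many entries were visited. */
static int refset_clear(RefSet *live, Cleanup cleanup)
{
    RefSet snapshot;
    memcpy(&snapshot, live, sizeof snapshot);

    int visited = 0;
    for (int i = 0; i < snapshot.count; i++) {
        cleanup(live, snapshot.ids[i]);
        visited++;
    }
    live->count = 0;   /* the object is going away; drop whatever is left */
    return visited;
}

/* Example callback: cleaning up ref N also removes ref N+1 from the live
   set, the way a cleanup block might release another weak reference. */
static void remove_partner(RefSet *live, int id)
{
    refset_remove(live, id + 1);
}
```

Plain integers cannot be deallocated, so the sketch sidesteps one detail from the earlier discussion: with real objects the snapshot also has to keep its entries alive while it is being enumerated.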
Thanks for the clarification! I was referring to the mutex around the call to -retain in -target, but I see why that's necessary now. I also understand your argument: -release and -retain are thread-safe on their own, but you want to make sure that the operation of "release and nil out references if necessary" is atomic as a whole. Am I right?

Also, I'm guessing (correctly, I hope) that I don't need to duplicate the set for my implementation because I am not using a cleanup block, and so it should not be the case that something is enumerating over the set while something else is mutating it.
Right, I'm basically expanding the scope of what needs to be atomic, so I have to add my own mutex to the mix.

If you are leaving out cleanup blocks then you won't need to duplicate the set, as without them, there is nothing that _zeroTarget can do to cause the set to mutate.
Hmm, OK, another edge case: in ClearWeakRefsForObject, you make a copy of the set to ensure that you are not iterating over a set that's being mutated. Now, what if, as part of the cleanup block, someone goes and creates a weak reference to the object for which ClearWeakRefsForObject is being called? It looks like the new value would get added to the set, but the set would get removed from the dictionary at the end of ClearWeakRefsForObject, leaving a dangling pointer behind in this brand new weak ref...

Potential fixes: when you enter ClearWeakRefsForObject, poison the object that you are entering for and make sure no new refs get added for it.
If you create a new weak reference to the target in your cleanup block, I would consider that to be a bug.

However, in a program with complicated object graphs, it may not be easy to realize what you're doing. This is why you should keep your cleanup block minimal.

I'm starting to think that zeroing the target and calling the cleanup block should happen in two passes. Pass one, zero all targets. Pass two, execute cleanup blocks for anything that needs it. That will ensure that you can't accidentally make new references to the target (or accidentally resurrect the target through an existing weak reference), as the target parameter will be the only way you could possibly get to it.

I'll have to think about this....
After more thought, I decided that splitting weak ref cleanup into two stages makes a lot of sense. It makes it much easier to write safe cleanup blocks and eliminates the potential for some ugly bugs. I've gone ahead and made that change and committed it.
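A rough sketch of the two-stage idea in plain C, under stated assumptions: WeakRef, clear_weak_refs, and check_sibling are hypothetical names, and function pointers stand in for cleanup blocks. It is not the committed code, just the shape of it.

```c
#include <assert.h>
#include <stddef.h>

typedef struct WeakRef WeakRef;
typedef void (*CleanupFn)(WeakRef *self, void *dying_target);

/* Hypothetical weak reference with an optional cleanup callback. */
struct WeakRef {
    void *target;        /* zeroed in pass one */
    CleanupFn cleanup;   /* may be NULL */
    int saw_live_ref;    /* demo bookkeeping written by check_sibling */
    WeakRef *sibling;    /* another weak reference to the same object */
};

static void clear_weak_refs(WeakRef **refs, size_t count, void *dying)
{
    /* Pass one: zero every target before any user code runs. */
    for (size_t i = 0; i < count; i++)
        refs[i]->target = NULL;

    /* Pass two: run cleanup callbacks. The only route to the dying
       object is now the explicit parameter. */
    for (size_t i = 0; i < count; i++)
        if (refs[i]->cleanup != NULL)
            refs[i]->cleanup(refs[i], dying);
}

/* Example callback: try to reach the object through a sibling reference. */
static void check_sibling(WeakRef *self, void *dying_target)
{
    (void)dying_target;
    self->saw_live_ref = (self->sibling->target != NULL);
}
```

With a single combined pass, the callback on the first reference would still find a live target in its sibling. With two passes it always finds NULL.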
It eventually comes down to the same thing as "resurrection" in the GC world, wherein if you have snuck away a pointer to object A somewhere, you can try to make a new weak pointer to it during A's dealloc, inside your cleanup block. There are no clear safeguards against it, but at the least you can add a check for retainCount > 0, before you agree to make a weak ref out of an object.
You can't really check for retainCount > 0, because that never happens. Cocoa reference counting is optimized to not store an entry for objects with a retain count of 1, so any object without a retain count entry is assumed to have a count of 1. When you release an object with a count of 1, it is destroyed, but it never gets an entry to say that its count is 0, because, well, it's being destroyed.

I suppose I could maintain a collection of weak refs that are in the middle of destruction and assert that nobody tries to resurrect them, but I think that with the two-stage zeroing/cleanup process, it's not really needed.
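The bookkeeping described above can be modeled in a few lines of plain C. This is a toy model, not Apple's implementation: the `extra` field is a hypothetical stand-in for the side-table entry, and "no entry" is modeled as zero.

```c
#include <assert.h>

/* Toy model: only retains beyond the implicit first one are stored. */
typedef struct Object {
    unsigned extra;     /* 0 means "no entry", i.e. a retain count of 1 */
    int deallocated;    /* set instead of freeing so a demo can inspect it */
} Object;

static void obj_retain(Object *o)
{
    o->extra++;
}

static unsigned obj_retain_count(const Object *o)
{
    return 1 + o->extra;   /* can never report 0 */
}

static void obj_release(Object *o)
{
    if (o->extra == 0)
        o->deallocated = 1;   /* count was 1: destroy, never record a 0 */
    else
        o->extra--;
}
```

In this model an object in the middle of destruction still reports a count of 1, which is why a retainCount > 0 check cannot distinguish it from a live object.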
Hmm... you might be right. I've yet to actually check whether retainCount indeed returns 0 when the object is in -dealloc.

In the meantime though, I've found a more concerning issue. It looks like the custom subclasses that KVO makes do not like to be subclassed. I was seeing some random crashes in CFEqual recently and traced the problem down to the fact that the class information of one of my objects was getting corrupted when its class was changed to be a weak reference.

Here's briefly what I got in the debugger:

BEFORE object_setClass is called:
> po [object class]
AppDelegate
> po currClass
NSKVONotifying_AppDelegate
> po customSubclass
NSKVONotifying_AppDelegate_WeakReferenceable

And just AFTER object_setClass is called:
> po [object class]
0x0
> po object
<not an object or object does not respond to description method>

This is very disturbing and I'm not sure exactly what's going wrong. Have you seen anything like this? Any ideas?
That is really odd. I was afraid of KVO subclasses getting screwed up, but a quick test doesn't show any problems. The only way I can possibly see that your class could end up as 0x0 after object_setClass is if you're passing nil to that function to begin with.

Would it be possible for you to produce a small test case that shows the problem? My KVO test case is up on my svn and github if you want to compare with it.
I have now reproduced the problem, or at least a problem. It doesn't manifest until you actually call an observed setter, or at least it didn't for me. It appears that the problem lies in KVO assuming that self->isa is the KVO class, and calling object_getIndexedIvars on it. I'm investigating to see if this problem can somehow be mitigated in the general case. Overriding automaticallyNotifiesObserversForKey: to always return NO and doing manual notification should be a decent workaround though.
Well, I was easily able to reproduce the problem like this:

1. Create a new Xcode project. (The App Delegate should be automatically generated for you.)
2. Open up the default XIB file and add an NSTextField to it, bound to some property of the App Delegate.
3. Declare the appropriate property in the App Delegate so that the binding works.
4. In -applicationDidFinishLaunching:, query object_getClass(self) and ensure that it's the NSKVONotifying version. Then, do the custom subclassing and query the object, i.e., [self description] and [self class] to see that you get junk values.

Unfortunately, KVO is such a common case (bindings!) that I at least won't be able to use Weak Referencing until we can make it be friends with KVO. :-/
And fixed. I solved it by inserting the ZWR subclass above the KVO subclass. This is horrible, but it does the job.

The fix is up on github, but not yet in Subversion, as I have run into trouble getting them to communicate now.
Yeah, that's what I did as well... Sigh... Looking for a better (more fool-proof) solution though...
Thinking about it more, I believe a better solution would be to swizzle out the KVO subclass's implementation of -release and -dealloc and basically have a separate override path for them. This would prevent the KVO subclass from doing anything bad in release or dealloc. However, I can't think of a specific example of it doing so, and I believe this probably can't actually happen in practice.
Mike,

Again, thank you for this excellent pair of classes. It has come in handy in a number of situations.

Recently I have discovered a bug in MAZeroingWeakProxy, however. The usual program flow appears to go through "forwardingTargetForSelector:".

However, if the weak target has been deallocated, this will return nil, so the message goes through "forwardInvocation:". This works correctly, except for delegate-style messages returning void. I came across a bug where the method's "returnLength" was zero and NSInvocation's "setReturnValue:" crashed because of it (it calls setArgument:atIndex:, which then calls __NSI2, __NSI0, and crashes).

Simply wrapping those three lines in an "if (returnLength > 0)" block avoids the crash.
Thanks very much for the bug report. I've applied your recommended fix in the project on github:

http://github.com/mikeash/MAZeroingWeakRef

Due to the difficulty of maintaining both repositories, I am not currently maintaining the Subversion repository with fixes like this. If there's demand for it I'll get them synced up, but github seems to be a better place to keep this anyway.
What is everyone's opinion on this approach?

At startup, store NSObject's -release IMP somewhere and set it to your own function. In your function, check [self retainCount]. If retainCount is 1, then the object is doomed and will be deallocated upon calling NSObject's original IMP for -release.
One problem with that is that it won't work on classes which implement their own reference counting mechanism, which while not exactly common, is not entirely unheard of either. Inline reference counting can be a nice little performance boost for objects that get retained and released with high frequency.
I've found another problem with MAZeroingWeakRef and KVO, unfortunately. I've posted a gist with two tests, one which crashes, and another which passes. The crash occurs when a KVO object is released, and you subsequently try to use the target of the ref. If the KVO subclass was created first, the crash will occur when dereferencing _target. If the target is embedded in an MAZeroingWeakRef first, and then KVO observed, no crash occurs.

The gist is available here: https://gist.github.com/871905#file_ma_zeroing_weak_ref%20kvo%20crash%20test
Thanks a lot for that. I've fixed it. Your "NoCrash" test also revealed another KVO-related bug due to an ordering dependency in when subclasses are created. I assume you didn't encounter it because you ran your tests in a different order from mine. Fixes for both are now up on github.
No problem, thanks for the fix! If you don't mind me asking, how did you come up with the fix for the crash I reported? I kind of figured it was being caused by KVO not respecting the runtime class hierarchy, but after that I had no idea where to proceed in trying to fix it.
Here's a rough reconstruction of my steps.

1. Run the test, watch it crash. Notice that it's crashing because it's accessing a deallocated object.
2. Something must be going wrong with CustomSubclassDealloc to cause the weak reference not to be zeroed. Put a breakpoint there to see what happens.
3. Run the test again, breakpoint is not hit. Odd.
4. Add a stub dealloc method to KVOTarget to see if that is firing. Breakpoint on that to see what happens.
5. That breakpoint is hit. It is not being called from CustomSubclassDealloc, but rather from a private Foundation function called NSKVODeallocate.
6. Check out the annotated disassembly of that function from otx. Notice that it invokes my dealloc by fetching the IMP using class_getInstanceMethod. Immediately before that call is a call to object_getIndexedIvars.
7. Put a breakpoint on object_getIndexedIvars with the program stopped just before destroying target. Run, hit the breakpoint. Print the argument to object_getIndexedIvars, and find that it's the KVO subclass.
8. Guessing, I print *(id *)object_getIndexedIvars(KVO subclass) and discover that it's KVOTarget.
9. Write code to overwrite that with the intermediary class, run it, tests all pass, decide it's good.
Thanks. I got to step 5, and was lost thereafter. Thanks for the pointer to otx, and thanks for giving us so many useful classes!
otx is really useful. Its annotations are a huge help for bridging the gap between seeing assembly as gibberish and understanding it top to bottom.
Is there any way to forward class messages to the parent class when a dynamic subclass is created? I'm using resolveInstanceMethod: to support dynamic properties. I've hacked the code to call just that one, but there might be other class methods that no longer work properly once a dynamic subclass is created.
I'm not sure if I've understood you correctly, but any class methods that you don't override will go to the parent class's implementation, since inheritance works at the class level as well as the instance level.
Mike, this is some fascinating stuff.

Just so I'm sure I understand the interaction of MAZeroingWeakRef and blocks' implicit retaining behavior, is the following a valid way to create a block that:

1) References self or an ivar of self, and
2) Can be copied and stored inside self (or an object retained by self) without creating a retain cycle, and
3) No-ops if self is dealloc'ed before the block is invoked (say, via a copy stored in another object)

?

@interface Foo : NSObject
@property (copy) void (^block)();
@end

@implementation Foo

@synthesize block = _block;

- (void)foo {
   MAZeroingWeakRef* weakSelf = [[[MAZeroingWeakRef alloc] initWithTarget:self] autorelease];

  self.block = ^{
    [[weakSelf target] bar];
  };
}

- (void)dealloc {
  self.block = nil;
  [super dealloc];
}

@end

Yep, that code looks good. In fact, this is a fairly common use case for this library in my experience. It's really easy to make retain cycles with blocks, and hard to deal with non-zeroing weak references in their place. Zeroing weak references make everything much easier.
Probably a naive question: why use a recursive (global) mutex? Just so you can lock/release it from a thread that already owns a lock on it? Is that better than a mutex variant which makes a lock and the matching release noops when executed by a thread that already has the lock?
A recursive lock is functionally identical to a lock that ignores lock/unlock pairs on a thread that already has the lock.
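For anyone who hasn't used one, here is a small sketch of a recursive lock using POSIX threads. The pthread calls are the standard ones; g_mutex, init_recursive_mutex, outer, and inner are hypothetical names for the demo, and this is not a claim about how MAZeroingWeakRef sets up its own lock.

```c
#define _XOPEN_SOURCE 700   /* expose PTHREAD_MUTEX_RECURSIVE under strict -std=c99/c11 */
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t g_mutex;
static int g_depth_reached;

/* Create a mutex that the owning thread may lock again without deadlocking.
   Returns 0 on success or a pthread error code. */
static int init_recursive_mutex(void)
{
    pthread_mutexattr_t attr;
    int err = pthread_mutexattr_init(&attr);
    if (err != 0)
        return err;
    err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (err == 0)
        err = pthread_mutex_init(&g_mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    return err;
}

/* Takes the lock a second time on the same thread. */
static void inner(void)
{
    pthread_mutex_lock(&g_mutex);
    g_depth_reached = 2;
    pthread_mutex_unlock(&g_mutex);
}

/* Takes the lock, then calls code that takes it again. */
static void outer(void)
{
    pthread_mutex_lock(&g_mutex);
    inner();
    pthread_mutex_unlock(&g_mutex);
}
```

With a default (non-recursive) mutex, the second lock in inner() would deadlock or be undefined behavior. The recursive type lets helper functions take the lock unconditionally, whether or not their caller already holds it.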
Great article! However, instead of isa-swizzling, couldn't you have used method_exchangeImplementations() for the dealloc and release methods? You can call the original implementations from your implementations.
It seems much easier than creating a dynamic subclass, and may solve all of the KVO/CF problems.
The problem with swizzling dealloc is that it will affect every instance of that class, even ones that are not weakly referenced.

For a worst case scenario, consider allocating a plain NSObject instance, and then creating a weak reference to it. With that approach, you've now swizzled -[NSObject dealloc], meaning that every single object in the system is going through custom code, causing a potentially large performance impact, and increasing the chances of an unfortunate interaction with other code.
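The difference is easier to see in a toy model. The sketch below is plain C with a hand-rolled class structure. Class, Object, make_weak_referenceable, and the rest are hypothetical and only imitate what the Objective-C runtime does with object_setClass. Changing one object's isa leaves every other instance of the class on the original dealloc path.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Class Class;
typedef struct Object Object;

/* Toy class: a superclass pointer and a single "method". */
struct Class {
    const Class *superclass;
    void (*dealloc)(Object *self);
};

struct Object {
    const Class *isa;
    int log;   /* demo bookkeeping: records which dealloc paths ran */
};

static void plain_dealloc(Object *self)
{
    self->log += 1;
}

static const Class BaseClass = { NULL, plain_dealloc };

/* Custom dealloc: do the weak-reference bookkeeping, then call up.
   Going through isa->superclass is a simplification that only works
   because nothing subclasses WeakSubclass in this sketch. */
static void weak_dealloc(Object *self)
{
    self->log += 10;   /* stands in for zeroing weak references */
    self->isa->superclass->dealloc(self);
}

static const Class WeakSubclass = { &BaseClass, weak_dealloc };

/* "isa-swizzle" a single object; other instances are untouched. */
static void make_weak_referenceable(Object *obj)
{
    obj->isa = &WeakSubclass;
}

/* Dispatch dealloc through whatever class the object currently has. */
static void object_dealloc(Object *obj)
{
    obj->isa->dealloc(obj);
}
```

Swapping the function pointer inside BaseClass instead would be the analogue of method_exchangeImplementations: every object whose isa is BaseClass, weakly referenced or not, would start running the custom code.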
Can this be used for references in UIKit frameworks such as UIGestureRecognizer that may be internally referencing their delegate properties by iVars?
In CustomSubclassRelease, shouldn't WhileLocked surround the entire function? The call to GetRealSuperclass calls GetCustomSubclass which accesses the shared variable gCustomSubclasses.
The article states:

- (void)setFoo: (id)newFoo
{
    _foo = newFoo;
}
Because the setter does not use retain, the reference does not keep the new object alive. It will stay alive as long as it's retained by other references, of course. But once those go away, the object will be deallocated even if _foo still points to it.

Excerpt From: Mike Ash. “The Complete Friday Q&A: Volume 1.” Apple Books.

But in ARC, I think an instance variable holds a strong reference, which means that _foo keeps newFoo alive, right?
Could you please explain this?
