mikeash.com: just this guy, you know?

Posted at 2010-04-16 16:35 | RSS feed (Full text feed) | Blog Index
Next article: Mistakes and Chains of Events
Previous article: An Objective-C Continuations Library
Tags: enumeration fridayqna objectivec
Friday Q&A 2010-04-16: Implementing Fast Enumeration
by Mike Ash  

Last week I discussed the various options available in Objective-C for enumerating over a collection This week I'm going to finish up the discussion of enumeration with a guide on how to implement Fast Enumeration in your own program.

Basics
There are two benefits to implementing Fast Enumeration. One is that you can then use your object as the target in a for/in loop. The other is that, with a good implementation, such enumeration will be fast.

Implementing Fast Enumeration is accomplished by implementing the NSFastEnumeration protocol. This protocol contains only a single method:

    - (NSUInteger)countByEnumeratingWithState:(NSFastEnumerationState *)state objects:(id *)stackbuf count:(NSUInteger)len;
Easy enough, right? But what's that NSFastEnumerationState thing?
    typedef struct {
        unsigned long state;
        id *itemsPtr;
        unsigned long *mutationsPtr;
        unsigned long extra[5];
    } NSFastEnumerationState;
This starts to get a little bit complex....

Deciphering Fields and Parameters
To understand how to implement this method, you need to understand what all of the fields and parameters mean, as well as the return value. I'll take these out of order.

The objective of this method is to return a series of arrays of objects. Each call returns one array, which allows objects to be returned in bulk. For speed, this uses a C array, which means that it needs a pointer and a length.

The length is provided by the return value of the method. That's what the count refers to in the name of the method. The array is actually the itemsPtr field of the NSFastEnumerationState struct. These two values together define the array returned by the method.

NSFastEnumeration is designed to allow returning a pointer to internal storage. However, not all data structures fit well with that, so it's also designed to allow copying objects into an array provided by the caller. That caller-provided array is stackbuf, and its size is given by len.

NSFastEnumeration is also designed to detect when a collection is mutated while being enumerated, and throw an exception if this occurs. mutationsPtr is indended to be pointed to a value which changes if the collection is mutated.

That's just about everything. The only fields I haven't covered yet are the state and extra fields of NSFastEnumerationState. These are just freeform fields which the callee can use to store whatever values it finds useful.

Generated Loop Code
Now we know what all these things are for, but to really understand how this stuff works, it's best to understand what kind of code the compiler generates. You write this:

    for(id obj in collection)
    {
        // body
    }
What really goes on behind the scenes?

The compiler creates an NSFastEnumerationState on the stack, as well as a stack buffer. It creates two nested loops, one which repeatedly calls countByEnumeratingWithState:... and one which loops over the array it returns. It ends up being something like this:

    // declare all the local state needed
    NSFastEnumerationState state = { 0 };
    id stackbuf[16];
    BOOL firstLoop = YES;
    long mutationsPtrValue;
    
    // outer loop
    NSUInteger count;
    while((count = [collection countByEnumeratingWithState: &state objects: stackbuf count: 16]))
    {
        // check for mutation, but only after the first loop
        // (note that I'm not sure whether the real compiler puts this
        // in the inner loop or outer loop, and it could conceivably
        // change from one compiler version to the next)
        if(!firstLoop && mutationsPtrValue != *state.mutationsPtr)
            @throw ..mutation exception...
        firstLoop = NO;
        mutationsPtrValue = *state.mutationsPtr;
        
        // inner loop over the array returned by the NSFastEnumeration call
        id obj;
        for(NSUInteger index = 0; index < count; index++)
        {
            obj = state.itemsPtr[index];
            // body
        }
    }
Notice how this code never touches or examines the state and extra fields. As I mentioned before, these are provided purely for the use of the collection, and to facilitate that, their value is preserved between calls while within the same loop.

Returning One Object At a Time
A major point of NSFastEnumeration is to achieve speed through bulk enumeration. Returning one object at a time defeats that point. However, it's easy to implement, and still gets you the benefit of being able to use for/in syntax. In the spirit of avoiding premature optimization, if returning one object at a time is easy, then go for it.

As an example, imagine you have a linked list class:

    @implementation LinkedList : NSObject
    {
        struct Node *listHead;
    }
Now let's implement NSFastEnumeration for this class, in the simplest possible way, by returning one object at a time:
    - (NSUInteger)countByEnumeratingWithState:(NSFastEnumerationState *)state objects:(id *)stackbuf count:(NSUInteger)len
    {
        // plan of action: extra[0] will contain pointer to node
        // that contains next object to iterate
        // because extra[0] is a long, this involves ugly casting
        if(state->state == 0)
        {
            // state 0 means it's the first call, so get things set up
            // we won't try to detect mutations, so make mutationsPtr
            // point somewhere that's guaranteed not to change
            state->mutationsPtr = (unsigned long *)self;
            
            // set up extra[0] to point to the head to start in the right place
            state->extra[0] = (long)listHead;
            
            // and update state to indicate that enumeration has started
            state->state = 1;
        }
        
        // pull the node out of extra[0]
        struct Node *currentNode = (struct Node *)state->extra[0];
        
        // if it's NULL then we're done enumerating, return 0 to end
        if(!currentNode)
            return NULL
        
        // otherwise, point itemsPtr at the node's value
        state->itemsPtr = &currentNode->value
        
        // update extra[0]
        if(currentNode)
            state->extra[0] = (long)currentNode->next;
        
        // we're returning exactly one item
        return 1;
    }
This is really not bad at all. It gets a little ugly with the pointer/integer casting, but that's C for you....

Returning Copied Objects in Bulk
Let's say that it turns out the above method really is too slow and you want to make it faster. You can do this by returning objects in bulk. Because the objects in the linked list aren't stored contiguously, you have to do this by copying objects into the stackbuf. While no guarantee is given as to the size of stackbuf, we can assume that it's made large enough to justify this sort of thing. Here's how the code would look:

    - (NSUInteger)countByEnumeratingWithState:(NSFastEnumerationState *)state objects:(id *)stackbuf count:(NSUInteger)len
    {
        // plan of action: pretty much the same as before,
        // with extra[0] pointing to the next node to use
        // we just iterate over multiple nodes at once
        if(state->state == 0)
        {
            state->mutationsPtr = (unsigned long *)self;
            state->extra[0] = (long)listHead;
            state->state = 1;
        }
        
        // pull the node out of extra[0]
        struct Node *currentNode = (struct Node *)state->extra[0];
        
        // keep track of how many objects we iterated over so we can return
        // that value
        NSUInteger objCount = 0;
        
        // we'll be putting objects in stackbuf, so point itemsPtr to it
        state->itemsPtr = stackbuf;
        
        // loop through until either we fill up stackbuf or run out of nodes
        while(currentNode && objCount < len)
        {
            // fill current stackbuf location and move to the next
            *stackbuf++ = currentNode->value
            
            // move to next node
            currentNode = currentNode->next;
            
            // and keep our count
            objCount++;
        }
        
        // update extra[0]
        if(currentNode)
            state->extra[0] = (long)currentNode->next;
        
        return objCount;
    }
This is not too much harder, and will significantly reduce the number of message sends that occur in the for/in loop.

Returning a Bulk Interior Pointer
For best efficiency, you can return a pointer to contiguously stored objects. For example, say you have a simple array class like this:

    @interface Array : NSObject
    {
        id *pointer;
        NSUInteger count;
    }
Implementing NSFastEnumeration for this class is really easy. It can return a single interior pointer to all of the objects, and that's it
    - (NSUInteger)countByEnumeratingWithState:(NSFastEnumerationState *)state objects:(id *)stackbuf count:(NSUInteger)len
    {
        if(state->state == 0)
        {
            state->mutationsPtr = (unsigned long *)self;
            state->itemsPtr = pointer;
            state->state = 1;
            return count;
        }
        else
            return 0;
    }
That was easy! It'll also be really fast, because the enumeration loop will basically devolve into a straight C for loop.

This technique can also be used, with some care, for more complex data structures. If you have a series of contiguous object pointers, you can return pointers to each one in turn, which will result in efficient enumeration over all of the objects in sequence. You can make good use of the extra values to keep track of where you are in your internal data structure.

A Note on Temporary Objects
You may find it useful to store Objective-C objects in the extra values:

    state->extra[1] = (long)[NSArray arrayWith...];
But beware! This will break with this completely legal enumeration code:
    NSAutoreleasePool *pool = [NSAutoreleasePool new];
    for(id obj in collection)
    {
        // do stuff with obj
        [pool release];
        pool = [NSAutoreleasePool new];
    }
    [pool release];
When the autorelease pool goes away, it'll take your array with it, and the next time you try to access it, you'll explode. And you can't retain the array, either, because there's no guarantee the caller will loop all the way to the end to let you release it; they might break out of the loop early, and then you've leaked the object.

There's really no general way to solve this. (I've concocted a completely insane scheme which involves tracking the position of the stack pointer to know when it's safe to destroy temporary objects, but it's, well, completely insane.) If you can, try to avoid storing temporary Objective-C objects in extra like this. And if you must do it, just keep in mind that you have to be careful with autorelease pools in the for/in loops that you use with this object. Since you're likely to be the only client of your NSFastEnumeration implementation, this is a reasonable constraint to make, but it's something that you have to be aware of.

Conclusion
Implementing NSFastEnumeration allows you to use nice, simple syntax for enumerating over your custom objects which are conceptually collections of other objects. As a bonus, it will usually speed up that enumeration as well. And while NSFastEnumeration can look daunting at first glance, it's actually pretty easy to write an implementation of it, depending on just how hard-core you want to get and how complex your internal data structures are.

That's it for this week's Friday Q&A. Come back in another seven days for more gooey goodness. Friday Q&A is driven by user submissions, so in the meantime, if you have an idea for a topic that you'd like to see covered here, please send it in!

Did you enjoy this article? I'm selling whole books full of them! Volumes II and III are now out! They're available as ePub, PDF, print, and on iBooks and Kindle. Click here for more information.

Comments:

I think you mean:

firstLoop = NO;

in the while loop example.
It's probably worth noting that many of the examples aren't 64-bit clean -- extra[0] isn't going to hold a 64-bit pointer without some backup from extra[1]. (Not to imply that I think you don't realize that, just that folks who find the page by Googling may need the reminder).
Skirwan: actually, I believe longs are 64-bit on Mac OS X (when compiling for 64 bits), see LP64 on this web page:
http://developer.apple.com/macosx/64bit.html
Aaron: You are correct, fixed, and thanks.

Skirwan: It would be nice if people would check these claims before making them. Mac OS X uses LP64, which means that long is 64 bits. That is the precise reason why these fields are of type long.
That's what I get for running my mouth at a higher clock than my brain. Apologies.
In the "Returning One Object At a Time" example:


        // otherwise, point itemsPtr at the node's value
        state->itemsPtr = &currentNode-;>value


Believe there is a superfluous semi-colon there.

Thanks for the article and the entire series.
Thanks for pointing that out. HTMLification gone wrong. I've fixed it now.
I think you mean "return 0" here:

        if(!currentNode)
            return NULL
One question about the Apple doc:

stackbuf
A C array of objects over which the sender is to iterate.



itemsPtr
A C array of objects.


I think the doc is very confusing, or right down wrong. It tells you that it is stackbuf that the sender is to iterate but in fact it it state ->itemsPtr. In fact, you can completely ignore stackbuf in your implementation and use an internal data structure to back up state ->itemsPtr.
Yes, return 0; is what I meant to write there. NULL will work, but is ugly.

As for the documentation question, I agree, it looks wrong. stackbuf is just a convenient storage, and the sender won't iterate over it unless you tell it to.
For anyone interested, I've made a series of blocks-based categories that rely on NSFastEnumeration for their implementation. I used this article as a reference. I'd appreciate any comments anyone has.

https://github.com/hborders/HBCollections
hm, by some unknown to me reason compiler tries to read memory pointed by mutationsPtr. (i tried to store element count there).
ah, sorry, didn't read the line "mutationsPtrValue != *state.mutationsPtr" carefully
Mike, do you (or any of your readers) know what happens when the body in
for(id obj in collection)
    {
        // body
    }

contains a break statement? What does your suggested implementation look like with break support?

Specifically, does [NSFastEnumeration countByEnumeratingWithState: objects: count:] get called with a special state value to indicate the end of the enumeration?

Regarding your point about temporary objects, that problem could be solved if the function was always called one more time with a special state value at the end of the enumeration, to give the implementation a chance to clean itself up, after retaining its objects in the first place. One wonders whether Apple has thought of that, and if not, whether a bug should be filed!

Comments RSS feed for this page

Add your thoughts, post a comment:

Spam and off-topic posts will be deleted without notice. Culprits may be publicly humiliated at my sole discretion.

Name:
The Answer to the Ultimate Question of Life, the Universe, and Everything?
Comment:
Formatting: <i> <b> <blockquote> <code>.
NOTE: Due to an increase in spam, URLs are forbidden! Please provide search terms or fragment your URLs so they don't look like URLs.
Code syntax highlighting thanks to Pygments.
Hosted at DigitalOcean.