mikeash.com: just this guy, you know?

Posted at 2012-02-13 04:26 | RSS feed (Full text feed) | Blog Index
Next article: Friday Q&A 2012-02-17: Ring Buffers and Mirrored Memory: Part II
Previous article: Friday Q&A 2012-02-03: Ring Buffers and Mirrored Memory: Part I
Tags: evil objectivec threading
Deadlocks and Lock Ordering: a Vignette
by Mike Ash  

Every once in a while, when writing threaded code, you may find yourself wanting to acquire two different locks in a critical section. Normally one should resist such perversions, but sometimes they just end up being necessary, or too tempting. Holding multiple locks at the same time immediately raises the specter of deadlock: if two threads acquire the same locks in a different order, they can end up waiting on each other forever.

The solution to this conundrum is to enforce a lock order. Sometimes there's a natural ordering, like parent and child. But sometimes you just have two locks and you need to take both. Code like this can be dangerous:

    @synchronized(a)
    {
        @synchronized(b)
        {
            // do stuff
        }
    }

If two threads have opposite ideas of what a and b are, this carries the potential for deadlock. Some way needs to be found to ensure that every thread locks the same object first. In the absence of any other way to do it, an ordering can be imposed on these objects by comparing their addresses. If we always lock the one with the lower address first, we'll never deadlock, even with multiple threads grabbing the objects from different sources and not communicating with each other.

Fortunately, this comparison is easy, since C allows using the comparison operators on pointers of the same type:

    id min, max;
    if(a < b)
        min = a, max = b;
    else
        min = b, max = a;

    @synchronized(min)
    {
        @synchronized(max)
        {
            // do stuff, safely
        }
    }

This eliminates the risk of deadlock. Hooray!

You might notice that the code for finding the object with the lower address looks sort of like the code you might write for finding the smaller integer from two different int varibables. Sort of exactly like the equivalent code for integers, in fact. The only difference is the types.

Thus the punchline: Cocoa's MIN and MAX variables, being written in a completely type-generic fashion, work on object pointers just as well as they do integers and floats. They can be used to solve this problem a little more nicely:

    @synchronized(MIN(a, b))
    {
        @synchronized(MAX(a, b))
        {
            // do stuff, safely
        }
    }

This code is equivalent to the above, but much more compact without sacrificing any readability or safety. Except for the part where you use MIN and MAX on pointers, you might consider that to sacrifice readability. Depends on personal taste.

This works with other locking constructs as well. For example:

    NSLock *aLock = [[NSLock alloc] init];
    NSLock *bLock = [[NSLock alloc] init];

    ...

    [MIN(aLock, bLock) lock];
    [MAX(aLock, bLock) lock];
    // do stuff, safely
    [MAX(aLock, bLock) unlock];
    [MIN(aLock, bLock) unlock];

It's a little crazy to see a MIN macro as the target of an Objective-C message send expression, but it works just fine. Even pthread mutexes and spinlocks can use this, as long as you apply it to their pointers, and not to the values themselves.

If you ever find yourself needing to acquire two locks at once, first, think very hard to see if you can avoid doing so. But if you must, the MIN and MAX macros make for an easy way to ensure a consistent, deadlock-free lock ordering across all of your various threads.

Did you enjoy this article? I'm selling whole books full of them! Volumes II and III are now out! They're available as ePub, PDF, print, and on iBooks and Kindle. Click here for more information.

Comments:

Nice. I just posted an article on my blog (http://sutes.co.uk/2012/02/deadlocks.html) about a pattern that I use when you have two locks. It’s for the case where you usually want the child lock but depending on some condition you want the parent lock as well.
I don't believe that your pointer comparison is valid here.

§6.5.8 of the C99 standard defines the valid comparisons between pointer objects:

5. When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to. If two pointers to object or incomplete types both point to the same object, or both point one past the last element of the same array object, they compare equal. If the objects pointed to are members of the same aggregate object, pointers to structure members declared later compare greater than pointers to members declared earlier in the structure, and pointers to array elements with larger subscript values compare greater than pointers to elements of the same array with lower subscript values. All pointers to members of the same union object compare equal. If the expression P points to an element of an array object and the expression Q points to the last element of the same array object, the pointer expression Q+1 compares greater than P. In all other cases, the behavior is undefined.


In your example, a and b can point to arbitrary locations in memory. Thus the comparison between them results in undefined behavior.

Granted, all that matters is that the behavior is consistent, but that's not guaranteed.
I figure a summary of the subsequent discussion on Twitter might be interesting to anyone who finds this page.

My reading of the quoted paragraph is that, since the result of pointer comparison depends on the relative locations in the address space, the result is a consistent total ordering of all pointers and the comparison is safe. I reconcile this with "the behavior is undefined" by interpreting it as saying that, while the ordering will be consistent, it will not be predictable in advance. In other words, a < b will always produce the same result for the same inputs, but you can't rely on e.g. local variables' pointers being in a particular order, or heap allocations being in a particular order, or local variables being ordered before/after heap allocations.

I think this satisfies every condition listed in that paragraph, but reading standards is hard (let's go shopping!), this paragraph is definitely not very clear, and I could be wrong. However, I'm happy enough with it to rely on it working for any compiler/architecture combo Cocoa code will ever run on.
Kyle Sluders reading and interpretation of the standard is correct- for the type of comparison being discussed, the standard says that the behavior is undefined. The standard explicitly enumerates when such pointer relational comparisons are defined and valid (i.e., different members of a struct and elements of the same array).

Everything else is explicitly undefined. No where in the standard is there anything that remotely supports Mikes "consistent total ordering of all pointers" interpretation.

From Derek Jones "The New C Standard", on §6.5.8:

Commentary

The C relational operator model enables pointers to objects to be treated in the same way as indexes into array objects. Relational comparisons between indexes into two different array objects (that are not both subobjects of a larger object) rarely have any meaning and the standard does not define such support for pointers. Some applications do need to make use of information on the relative locations of different objects in storage. However, this usage was not considered to be of sufficient general utility for the Committee to specify a model defining the behavior.

[...]

Common Implementations

Most implementations perform no checks prior to any operation on values having pointer type. Most processors use the same instructions for performing relational comparisons involving pointer types as they use for arithmetic types. For processors that use a segmented memory architecture, a pointer value is often represented using two components, a segment number and an offset within that segment. A consequence of this representation is that there are many benefits in allocating storage for objects such that it fits within a single segment (i.e., storage for an object does not span a segment boundary). One benefit is an optimization involving the generated machine code for some of the relational operators, which only needs to check the segment offset component. This can lead to the situation where p >= q is false but p > q is true, when p and q point to different objects.


The above not withstanding, the pointer comparison in question is pretty safe for the set of compilers and architectures the code is meant for, and almost certainly true for any future architecture.

This is one of those undefined behaviors where if it's going to be a problem, you're very much aware that it's a problem- you're writing C code for a PIC microcontroller that's got 4096 bytes of memory accessed in 256 byte chunks via a segment register. :)
For a comparison to be undefined doesn't mean you will get back an illogical value, it means your program may crash, hang, initiate global thermonuclear war, start a game of Nethack or whatever the compiler wants.
Objection!

This is a very clever hack, and I like it, but LL is right. It's undefined. If you want to make this work, then just cast your pointers to intptr_t first, and back to a pointer when you're done:

@synchronized((void*) MIN((intptr_t) a, (intptr_t) b)) { ...

It's a little uglier, but it's totally valid and well-defined C! Now your clever multiprocessing code will run even on your 286 in protected-mode segments, if you want.

OK, I joke, but there does appear to be an actual problem with this code as originally written.

Interestingly, Xcode/Clang generates different object code for this, depending on whether you cast it properly or not. In fact, the version without the casts is longer, and seems to include 4 extra objc_retain calls, which are unmatched by any apparent objc_release calls.

Looking through the code for __NSMIN_IMPL__ (and ...MAX...), I can perhaps see why. Xcode's MIN() isn't the trivial #define they show you on the first day of C preprocessor class. It makes a little block, creates local variables of the same type as the arguments, and then compares those (so it doesn't have to evaluate the parameters twice). I would totally believe that ARC would get a little confused by that, and throw in -retain calls to each object used by the preprocessor blocks. (MIN(a,b) + MAX(a,b) = 4 objects passed = 4 retain calls.)

That suggests to me that calling MIN() or MAX() on NSObject pointers when using ARC can result in a memory leak. If you actually have two arbitrary id's you need to compare, cast to intptr_t and compare those, because that's what it's there for.

Comments RSS feed for this page

Add your thoughts, post a comment:

Spam and off-topic posts will be deleted without notice. Culprits may be publicly humiliated at my sole discretion.

Name:
The Answer to the Ultimate Question of Life, the Universe, and Everything?
Comment:
Formatting: <i> <b> <blockquote> <code>.
NOTE: Due to an increase in spam, URLs are forbidden! Please provide search terms or fragment your URLs so they don't look like URLs.
Code syntax highlighting thanks to Pygments.
Hosted at DigitalOcean.