mikeash.com: just this guy, you know?

Posted at 2019-10-11 12:09
Tags: objc
objc_msgSend's New Prototype
by Mike Ash  

Apple's new OSes are out. If you've looked through the documentation, you may have noticed that the prototype for objc_msgSend has changed. Previously, it was declared as a function that took id, SEL, and variadic arguments after that, and returned id. Now it's declared as a function that takes and returns void. Similar functions like objc_msgSendSuper also became void/void. Why the change?

The True Prototype
There's a big and surprisingly difficult question behind this: what is the true prototype of objc_msgSend? That is to say, what parameters does it actually take, and what does it actually return? This question doesn't have a straightforward answer.

You may have heard that objc_msgSend is implemented in assembly because it's so commonly called that it needs every bit of performance it can get. This is true, but it's not the whole story. It's not possible to implement it in C at all, at any speed.

The fast path of objc_msgSend does a few critical things:

  1. Load the class of the object.
  2. Look up the selector in that class's method cache.
  3. Jump to the method implementation found in the cache.

From the perspective of the method implementation, it looks like the caller invoked it directly. Because objc_msgSend jumps straight to the method implementation without making a function call, it effectively disappears once its job is done. The implementation is careful not to disturb any of the registers that can be used to pass arguments to a function. The caller calls objc_msgSend as if it was going to directly call the method implementation, passing all of the parameters in the same way it would for a direct function call. Once objc_msgSend looks up the implementation and jumps to it, those parameters are still exactly where the implementation expects them to be. When the implementation returns, it returns directly to the caller, and the return value is provided by the standard mechanism.

This answers the above question: the prototype of objc_msgSend is that of the method implementation it ends up calling.
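To make that concrete, here's a minimal sketch in plain C that models the lookup and the final call. Everything in it is a made-up stand-in (toy_imp, toy_sel, toy_class, toy_object, toy_lookup, toy_add, toy_send_add), not the real runtime's data structures. C can't express "jump to the implementation without making a call", so this sketch has to call through a function pointer instead, which is exactly the part the real objc_msgSend needs assembly for.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for the runtime's types. These are assumptions made for
   illustration, not the real runtime's definitions. */
typedef void (*toy_imp)(void);   /* untyped implementation pointer */
typedef const char *toy_sel;     /* selector modeled as a C string */

struct toy_cache_entry { toy_sel sel; toy_imp imp; };
struct toy_class { struct toy_cache_entry cache[4]; };
struct toy_object { const struct toy_class *isa; int value; };

/* Steps 1 and 2 of the fast path: load the class, then search its cache. */
static toy_imp toy_lookup(const struct toy_object *self, toy_sel sel) {
    const struct toy_class *cls = self->isa;
    for (size_t i = 0; i < 4; i++) {
        if (cls->cache[i].sel != NULL && strcmp(cls->cache[i].sel, sel) == 0)
            return cls->cache[i].imp;
    }
    return NULL; /* a real runtime would fall back to a slow path here */
}

/* A "method": self and the selector come first, then its own parameters. */
static int toy_add(struct toy_object *self, toy_sel _cmd, int amount) {
    (void)_cmd;
    return self->value + amount;
}

static const struct toy_class toy_counter_class = {
    .cache = { { "add:", (toy_imp)toy_add } }
};

/* Step 3, as close as C gets: cast the untyped pointer back to the
   implementation's real type, then call it with matching arguments. */
static int toy_send_add(struct toy_object *obj, int amount) {
    toy_imp imp = toy_lookup(obj, "add:");
    if (imp == NULL)
        return -1; /* toy error value for a cache miss */
    return ((int (*)(struct toy_object *, toy_sel, int))imp)(obj, "add:", amount);
}
```

The lookup itself never needs to know what toy_add takes or returns. Only the caller does, and it has to supply that knowledge in the cast.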

But wait, isn't the whole point of dynamic method lookup and message sending that you don't know what method implementation you'll be calling? This is true! However, you do know what type signature the implementation will have. The compiler can get this information from the declaration of the method in an @interface or @protocol block, and uses that to generate the appropriate parameter passing and return value fetching code. If you override a method, the compiler complains if you don't match the type signature. It's possible to work around this by hiding declarations or adding methods at runtime, and in that case you can end up with a type signature for a method implementation that doesn't match the call site. The behavior of such a call then depends on how those two type signatures match up at the ABI level, with anything from perfectly reasonable and correct behavior (if the ABIs match so all the parameters happen to line up) to complete nonsense (if they don't).

This hints at an answer to this article's question: the old prototype worked in some circumstances (when the ABIs matched) and failed strangely in others (when the ABIs didn't match). The new prototype never works unless you cast it to the appropriate type first. As long as you cast it to the correct type, it always works. The new way of doing things thus encourages doing things correctly and makes it harder to do things wrong.

The Minimal Prototype
Although the prototype of objc_msgSend depends on the method implementation that will be called, there are two things that are common across all method implementations: the first parameter is always id self, and the second parameter is always SEL _cmd. The number and type of any additional parameters is unknown, as is the return type, but those two parameters are known. objc_msgSend needs these two pieces of information to perform its method dispatch work, so they always have to be in the same place for it to be able to find them.

We could write an approximate generalized prototype for objc_msgSend to represent this:

    ??? objc_msgSend(id self, SEL _cmd, ???)

Where ??? means that we don't know, and it depends on the particular method implementation that will be called. Of course, C has no way to represent a wildcard like this.

For the return value, we can try to pick something common. Since Objective-C is all about objects, it would make sense to assume the return value is id:

    id objc_msgSend(id self, SEL _cmd, ???)

This not only covers cases where the return value is an object, but also cases where it's void and some other cases where it's a different type but the value isn't used.

How about the parameters? C actually does have a way to indicate an arbitrary number of parameters of arbitrary types, in the form of variadic function prototypes. An ellipsis at the end of the parameter list means that a variable number of arbitrarily typed values follows:

    id objc_msgSend(id self, SEL _cmd, ...)

This is exactly what the prototype used to be before the recent change.

ABI Mismatches
The pertinent question at runtime is whether the ABI at the call site matches the ABI of the method implementation. Which is to say, will the receiver retrieve the parameters from the same location and in the same format that the caller passes them? If the caller puts a parameter into $rdx then the implementation needs to retrieve that parameter from $rdx, otherwise havoc will ensue.

The minimal prototype may be able to express the concept of passing an arbitrary number of arbitrary types, but for it to actually work at runtime, it needs to use the same ABI as the method implementation. That implementation is almost certainly using a different prototype, and usually has a fixed number of arguments.

There is no guarantee that the ABI for a variadic function matches the ABI for a function with a fixed number of arguments. On some platforms, they match almost perfectly. On others, they don't match at all.

Intel ABI
Let's look at a concrete example. macOS uses the standard System V ABI for x86-64. There is a ton of detail in the ABI, but we'll focus on the basics.

Parameters are passed in registers. Integer parameters are passed in registers rdi, rsi, rdx, rcx, r8, and r9, in that order. Floating point parameters are passed in the SSE registers xmm0 through xmm7. When calling a variadic function, the register al is set to the number of SSE registers that were used to pass parameters. Integer return values are placed in rax and rdx, and floating-point return values are placed in xmm0 and xmm1.

The ABI for variadic functions is almost identical to the ABI for normal functions. The one exception is passing the number of SSE registers used in al. However, this is harmless when using the variadic ABI to call a normal function, as the normal function will ignore the contents of al.

The C language messes things up a bit. C specifies that certain types get promoted to wider types when passed as a variadic argument. Integers smaller than int (such as char and short) get promoted to int, and float gets promoted to double. If your method signature includes one of these types, it's not possible for a caller to pass a parameter as that exact type if it's using a variadic prototype.
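Here's a small plain C sketch of those promotions from the receiving side. The helper names sum_promoted and sum_small_ints are hypothetical, invented for this illustration. The point is that a variadic function can never read a float or a char from its argument list, because the caller never passes one.

```c
#include <stdarg.h>

/* Sums `count` trailing arguments that the caller wrote as floats. */
static double sum_promoted(int count, ...) {
    va_list args;
    double total = 0.0;
    va_start(args, count);
    for (int i = 0; i < count; i++) {
        /* Each float was promoted to double at the call site, so double is
           the only correct type to read. va_arg(args, float) would be
           undefined behavior. */
        total += va_arg(args, double);
    }
    va_end(args);
    return total;
}

/* Sums `count` trailing arguments that the caller wrote as char or short. */
static int sum_small_ints(int count, ...) {
    va_list args;
    int total = 0;
    va_start(args, count);
    for (int i = 0; i < count; i++) {
        /* Integers narrower than int were promoted to int. */
        total += va_arg(args, int);
    }
    va_end(args);
    return total;
}
```

A call like sum_promoted(2, 1.5f, 2.25f) is written with float arguments, but the function receives two doubles.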

For integers, this doesn't actually matter. The integer gets stored in the bottom bits of the appropriate register, and the bits end up in the same place either way. However, it's catastrophic for float. Converting a smaller integer to an int just requires padding it out with extra bits. Converting float to double involves converting the value to a different structure altogether. The bits in a float don't line up with the corresponding bits in a double. If you try to use a variadic prototype to call a non-variadic function that takes a float parameter, that function will receive garbage.

To illustrate this problem, here's a quick example:

    // Use the old variadic prototype for objc_msgSend.
    #define OBJC_OLD_DISPATCH_PROTOTYPES 1

    #import <Foundation/Foundation.h>
    #import <objc/message.h>

    @interface Foo : NSObject @end
    @implementation Foo
    - (void)log: (float)x {
        printf("%f\n", x);
    }
    @end

    int main(int argc, char **argv) {
        id obj = [Foo new];
        [obj log: (float)M_PI];
        objc_msgSend(obj, @selector(log:), (float)M_PI);
    }

It produces this output:

    3.141593
    3370280550400.000000

As you can see, the value came through correctly when written as a message send, but got completely mangled when passed through an explicit call to objc_msgSend.
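To see where a number like that can come from, here's a rough model in plain C. It assumes IEEE 754 float and double, and it assumes the callee ends up reading the low 32 bits of a register that the caller filled with a 64-bit double. The helpers low_bits_as_float and low_byte_of_promoted are hypothetical names for this sketch. Fed the double value of pi, the first one produces 3370280550400, the same figure as the mangled output above.

```c
#include <stdint.h>
#include <string.h>

/* Reinterprets the low 32 bits of a double's bit pattern as a float,
   modeling a callee that expects a float where a double was passed. */
static float low_bits_as_float(double d) {
    uint64_t wide;
    uint32_t low;
    float f;
    memcpy(&wide, &d, sizeof wide);       /* raw bits of the double */
    low = (uint32_t)(wide & 0xFFFFFFFFu); /* keep only the bottom half */
    memcpy(&f, &low, sizeof f);           /* view those bits as a float */
    return f;
}

/* Integer promotion is harmless by comparison: after widening to int,
   the original byte is still sitting in the low 8 bits. */
static uint8_t low_byte_of_promoted(signed char c) {
    int promoted = c; /* the same widening a variadic call performs */
    return (uint8_t)((unsigned)promoted & 0xFFu);
}
```

The double 3.141592653589793 has the bit pattern 0x400921FB54442D18. Its low half, 0x54442D18, read as a float is 2^41 times roughly 1.53, which has nothing to do with pi. The integer helper, on the other hand, gets back the byte it started with, even for negative values.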

This can be remedied by casting objc_msgSend to have the right signature. Recall that objc_msgSend's actual prototype is that of whatever method will end up being invoked, so the correct way to use it is to cast it to the corresponding function pointer type. This call works correctly:

    ((void (*)(id, SEL, float))objc_msgSend)(obj, @selector(log:), M_PI);

ARM64 ABI
Let's look at another relevant example. iOS uses a variation on the standard ABI for ARM64.

Integer parameters are passed in registers x0 through x7. Floating point parameters are passed in v0 through v7. Additional parameters are passed on the stack. Return values are placed in the same register or registers where they would be passed as parameters.

This is only true for normal parameters. Variadic parameters are never passed in registers. They are always passed on the stack, even when parameter registers are available.

There's no need for a careful analysis of how this will work out in practice. The ABIs are completely mismatched and a method called with an uncast objc_msgSend will receive garbage in its parameters.

The New Prototype
The new prototype is short and sweet:

    void objc_msgSend(void);

This isn't correct at all. However, neither was the old prototype. This one is much more obviously incorrect, and that's a good thing. The old prototype made it easy to use it without casting it, and worked often enough that you could easily end up thinking everything was OK. When you hit the problematic cases, the bugs were very unclear.

This prototype doesn't even allow you to pass the two required parameters of self and _cmd. You can call it with no parameters at all, but it'll immediately crash and it should be pretty obvious what went wrong. If you try to use it without casting, the compiler will complain, which is much better than weird broken parameter values.

Because it still has a function type, you can still cast it to a function pointer of the appropriate type and invoke it that way. This will work correctly as long as you get the types right.
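One way to keep the types right is to name the implementation's signature once and hide the cast in a small typed wrapper. Here's a sketch of that idiom in plain C. All of the names (demo_imp, demo_sel, demo_logger, demo_log_fn, demo_log, demo_entry, demo_send_log) are hypothetical, and a function pointer variable stands in for the objc_msgSend symbol, since plain C has no way to write the real thing.

```c
/* Toy stand-ins, not the real runtime types. */
typedef void (*demo_imp)(void);  /* untyped, like the new prototype */
typedef const char *demo_sel;    /* selector modeled as a C string */
struct demo_logger { float last_logged; };

/* The implementation's true signature, named in exactly one place. */
typedef void (*demo_log_fn)(struct demo_logger *, demo_sel, float);

/* The "method" that will actually run. */
static void demo_log(struct demo_logger *self, demo_sel _cmd, float x) {
    (void)_cmd;
    self->last_logged = x;
}

/* Untyped entry point. Calling it as-is with arguments would not compile. */
static const demo_imp demo_entry = (demo_imp)demo_log;

/* Typed wrapper: the only place the cast appears. The float is passed as
   a float, with no variadic promotion to double along the way. */
static inline void demo_send_log(struct demo_logger *obj, float x) {
    ((demo_log_fn)demo_entry)(obj, "log:", x);
}
```

With the cast confined to demo_send_log, every call site gets ordinary type checking, and a wrong signature only has to be fixed in one place.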


Comments:

For folks looking for a concrete example of how you might achieve Mike's recommendation to cast objc_msgSend and "get the types right", I wrote up a post that culminates in such an example: https://indiestack.com/2019/10/casting-objective-c-message-sends/
Hi Mike,

Was this change not made years ago, when 64-bit ARM was introduced?

Thanks
It’s been an option for a long time and you could set it in Xcode. Might have been on by default in some cases. Now it’s set that way in the header and the docs reflect it.
Besides being necessary now, does calling objc_msgSend through a function pointer affect the way the code gets compiled down to machine code? Is function pointer overhead relevant here?
@Ahmad Alhashemi: You don't have to call it through a function pointer.

You can cast it directly at the call site:

    ((id (*)(id, SEL, id)) objc_msgSend)(obj, sel, arg);

which still generates a direct function call:

    callq    _objc_msgSend

If the previous version of objc_msgSend's signature had an ABI mismatch problem, does this mean actual bugs (like the garbage data you mentioned) have existed in previous iOS/macOS versions? Or did the compilers for previous OSes use other techniques to circumvent this issue?
@ZetaSQ: Bugs have existed since the first runtime version, but only for people trying to use objc_msgSend directly without knowing how to use it properly.

For calls generated by the compiler, it has never been an issue.
And by the way, casting objc_msgSend may not be enough. For some types of arguments, it may still fail at runtime. That's why there are variants like objc_msgSend_stret, objc_msgSend_fpret, objc_msgSend_fp2ret, …

As long as you use basic types, it should be fine, but when you start using structs, long double and complex you need to be careful to use the right variant.
For instance, programs that incorrectly cast objc_msgSend before using it had a harder time with the Intel transition: I seem to remember a wide-eyed blog post (wasn't it from one of the Unsanity devs?) reading pretty much like "we were casting the return value of objc_msgSend, but turns out, this isn't sufficient, neither is it sufficient to cast all the parameters as well: it is somehow necessary to cast the objc_msgSend function pointer itself, and call through the resultant cast!"

Since then, I have considered objc_msgSend (and friends) as something akin to dyld_stub_binding_helper (except higher level and more dynamic, e.g. it is called each time instead of just once per symbol): an address to jump to, as a substitute to a static address for the function entry point, everything else about the function call being otherwise equal; the code at the address in question being obviously responsible for binding to the actual implementation. The new prototype better reflects that. I don't think it's possible to have a C type for which no value can possibly be provided*; this is too bad, as having the objc_msgSend prototype require such a type would even better reflect this reality.

Anyway, congratulations on picking up the Objective-C runtime esoterica public communication role from Greg Parker!

*I am sure GNU C has an extension for that.
