mikeash.com: Friday Q&A 2010-01-29: Method Replacement for Fun and Profit

Posted at 2010-01-29 19:00
Tags: evil fridayqna objectivec override swizzling

by Mike Ash

It's that time of the week again. For this week's Friday Q&A Mike Shields has suggested that I talk about method replacement and method swizzling in Objective-C.

Overriding Methods
Overriding methods is a common task in just about any object oriented language. Most of the time you do this by subclassing, a time-honored technique. You subclass, you implement the method in the subclass, you instantiate the subclass when necessary, and instances of the subclass use the overridden method. Everybody knows how to do this.

Sometimes, though, you need to override methods that are in objects whose instantiation you don't control. Subclassing doesn't suffice in that case, because you can't make that code instantiate your subclass. Your method override sits there, twiddling its thumbs, accomplishing nothing.

Posing
Posing is an interesting technique but, alas, is now obsolete, since Apple no longer supports it in the "new" (64-bit and iPhone) Objective-C runtime. With posing, you subclass, then pose the subclass as its superclass. The runtime does some magic and suddenly the subclass is used everywhere, and method overrides become useful again. Since this is no longer supported, I won't go into details.

Categories
Using a category, you can easily override a method in an existing class:

    @implementation NSView (MyOverride)
    
    - (void)drawRect: (NSRect)r
    {
        // this runs instead of the normal -[NSView drawRect:]
        [[NSColor blueColor] set];
        NSRectFill(r);
    }
    
    @end

However, this really only works if you want to override a method implemented in a superclass of the class you're targeting. When the method in question exists in the class where you want to override it, using a category to perform the override results in two problems:

It's impossible to call through to the original implementation of the method. The new implementation replaces the original, which is simply lost. Most overrides want to add functionality, not completely replace it, but it's not possible with a category.
The class in question could implement the method in question in a category too, and the runtime doesn't guarantee which implementation "wins" when two categories contain methods with the same name.

Swizzling
Using a technique called method swizzling, you can replace an existing method from a category without the uncertainty of who "wins", and while preserving the ability to call through to the old method. The secret is to give the override a different method name, then swap them using runtime functions.

First, you implement the override with a different name:

    @implementation NSView (MyOverride)
    
    - (void)override_drawRect: (NSRect)r
    {
        // call through to the original, really
        [self override_drawRect: r];
        
        [[NSColor blueColor] set];
        NSRectFill(r);
    }
    
    @end

Notice how calling through to the original is done by calling the same method, in what looks like a recursive call. This works because the method gets swapped with the original implementation. At runtime, the method called override_drawRect: is actually the original!

To swap the method, you need a bit of code to move the new implementation in and the old implementation out:

    void MethodSwizzle(Class c, SEL origSEL, SEL overrideSEL)
    {
        Method origMethod = class_getInstanceMethod(c, origSEL);
        Method overrideMethod = class_getInstanceMethod(c, overrideSEL);

To be completely general, this code has to handle two cases. The first case is when the method to be overridden is not implemented in the class in question, but rather in a superclass. The second case is when the method in question does exist in the class itself. These two cases need to be handled a bit differently.

For the case where the method only exists in a superclass, the first step is to add a new method to this class, using the override as the implementation. Once that's done, then the override method is replaced with the original one.

The step of adding the new method can also double as a check to see which case is actually present. The runtime function class_addMethod returns NO if the class itself already implements a method with that name (an implementation in a superclass doesn't count), and so can be used for the check:

        if(class_addMethod(c, origSEL, method_getImplementation(overrideMethod), method_getTypeEncoding(overrideMethod)))
        {

If the add succeeded, then replace the override method with the original, completing the (conceptual) swap:

            class_replaceMethod(c, overrideSEL, method_getImplementation(origMethod), method_getTypeEncoding(origMethod));
        }

If the add failed, then it's the second case; both methods exist in the class in question. For that case, the runtime provides a handy function called method_exchangeImplementations which just swaps the two methods in place:

        else
        {
            method_exchangeImplementations(origMethod, overrideMethod);
        }
    }

You'll notice that the method_exchangeImplementations call just uses the two methods that the code already fetched, and you might wonder why it can't just go straight to that and skip all of the annoying stuff in the middle.

The code needs the two cases because class_getInstanceMethod will actually return the Method for the superclass if that's where the implementation lies. Replacing that implementation will replace the method for the wrong class!

As a concrete example, imagine replacing -[NSView description]. If NSView doesn't implement -description (which is probable) then you'll get NSObject's Method instead. If you called method_exchangeImplementations on that Method, you'd replace the -description method on NSObject with your own code, which is not what you want to do!

(When that's the case, a simple category method would work just fine, so this code wouldn't be needed. The problem is that you can't know whether a class overrides a method from its superclass or not, and that could even change from one OS release to the next, so you have to assume that the class may implement the method itself, and write code that can handle that.)
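
The lookup behavior behind this is easy to check directly. The following is a minimal sketch using two hypothetical classes, Vehicle and Car (neither comes from a real framework), where Car inherits -summary without overriding it. It assumes a Foundation-based build with objc/runtime.h available.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical classes for illustration only.
@interface Vehicle : NSObject
- (NSString *)summary;
@end

@implementation Vehicle
- (NSString *)summary { return @"vehicle"; }
@end

// Car inherits -summary without overriding it.
@interface Car : Vehicle
@end

@implementation Car
@end

// Returns YES if looking up `sel` on `sub` hands back the very same Method
// that `base` owns, meaning `sub` has no implementation of its own.
BOOL LookupLandsOnSuperclass(Class sub, Class base, SEL sel)
{
    Method fromSub = class_getInstanceMethod(sub, sel);
    Method fromBase = class_getInstanceMethod(base, sel);
    return fromSub != NULL && fromSub == fromBase;
}
```

With these classes, LookupLandsOnSuperclass([Car class], [Vehicle class], @selector(summary)) should return YES, which is exactly the situation where calling method_exchangeImplementations would modify the superclass.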

Finally we just need to make sure that this code actually gets called when the program starts up. This is easily done by adding a +load method to the MyOverride category:

    + (void)load
    {
        MethodSwizzle(self, @selector(drawRect:), @selector(override_drawRect:));
    }
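
Putting the pieces together, here is a sketch of the whole technique against a small hypothetical class, Greeter, rather than NSView, so that it only needs Foundation. The class, its -greeting method, and the strings are assumptions made up for the example; MethodSwizzle is the function built up above.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Same logic as the MethodSwizzle function from the article.
static void MethodSwizzle(Class c, SEL origSEL, SEL overrideSEL)
{
    Method origMethod = class_getInstanceMethod(c, origSEL);
    Method overrideMethod = class_getInstanceMethod(c, overrideSEL);

    if(class_addMethod(c, origSEL, method_getImplementation(overrideMethod), method_getTypeEncoding(overrideMethod)))
        class_replaceMethod(c, overrideSEL, method_getImplementation(origMethod), method_getTypeEncoding(origMethod));
    else
        method_exchangeImplementations(origMethod, overrideMethod);
}

// Hypothetical class standing in for one whose instantiation you don't control.
@interface Greeter : NSObject
- (NSString *)greeting;
@end

@implementation Greeter
- (NSString *)greeting { return @"hello"; }
@end

@interface Greeter (MyOverride)
- (NSString *)override_greeting;
@end

@implementation Greeter (MyOverride)

+ (void)load
{
    MethodSwizzle(self, @selector(greeting), @selector(override_greeting));
}

- (NSString *)override_greeting
{
    // After the swap, this selector reaches the original implementation.
    NSString *original = [self override_greeting];
    return [original stringByAppendingString: @", world"];
}

@end
```

Once the program has loaded, sending -greeting to any Greeter instance runs the override, which calls through to the original and then appends to its result.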

Direct Override
This is a bit complicated, though. The swizzling concept is a little weird, and especially the way that you call through to the original implementation tends to bend the mind a bit. It's a pretty standard technique, but I want to propose a way that I believe is a little simpler, both in terms of being easier to understand and easier to implement.

It turns out that there's no need to preserve the method-ness of the original method. The dynamic dispatch involved in [self override_drawRect: r] is completely unnecessary. We know which implementation we want right from the start.

Instead of moving the original method into a new one, just move its implementation into a global function pointer:

    void (*gOrigDrawRect)(id, SEL, NSRect);

Then in +load you can fill that global with the original implementation:

    + (void)load
    {
        Method origMethod = class_getInstanceMethod(self, @selector(drawRect:));
        gOrigDrawRect = (void *)method_getImplementation(origMethod);

(I like to cast to void * for these things just because it's so much easier to type than long, weird function pointer types, and thanks to the magic of C, the void * gets implicitly converted to the right pointer type anyway.)

Next, replace the original. Like before, there are two cases to worry about, so I'll first add the method, then replace the existing one if it turns out that there is one:

        if(!class_addMethod(self, @selector(drawRect:), (IMP)OverrideDrawRect, method_getTypeEncoding(origMethod)))
            method_setImplementation(origMethod, (IMP)OverrideDrawRect);
    }

Finally, implement the override. Unlike before, it's now a function, not a method. (In a real source file it has to be declared or defined above +load, since +load refers to it.)

    static void OverrideDrawRect(NSView *self, SEL _cmd, NSRect r)
    {
        gOrigDrawRect(self, _cmd, r);
        [[NSColor blueColor] set];
        NSRectFill(r);
    }

A bit uglier, certainly, but I think it's simpler and easier to follow.
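
For reference, here is a sketch of the same approach assembled in file order, using a hypothetical Counter class with a -valueByAdding: method instead of NSView so it builds against Foundation alone. The names and the arithmetic are invented for the example, and the function pointer is cast to its full type rather than void *.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical class for illustration only.
@interface Counter : NSObject
- (int)valueByAdding: (int)n;
@end

@implementation Counter
- (int)valueByAdding: (int)n { return n + 1; }
@end

// Holds the original implementation, typed to match the method.
static int (*gOrigValueByAdding)(id, SEL, int);

// Defined before +load, which refers to it.
static int OverrideValueByAdding(Counter *self, SEL _cmd, int n)
{
    // Call through with the real selector, then adjust the result.
    return gOrigValueByAdding(self, _cmd, n) * 10;
}

@implementation Counter (DirectOverride)

+ (void)load
{
    SEL sel = @selector(valueByAdding:);
    Method origMethod = class_getInstanceMethod(self, sel);
    gOrigValueByAdding = (int (*)(id, SEL, int))method_getImplementation(origMethod);

    // Add the method if the class lacks it, otherwise replace it in place.
    if(!class_addMethod(self, sel, (IMP)OverrideValueByAdding, method_getTypeEncoding(origMethod)))
        method_setImplementation(origMethod, (IMP)OverrideValueByAdding);
}

@end
```

Here Counter implements the method itself, so class_addMethod returns NO and the method_setImplementation branch does the work.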

The Obligatory Warning
Overriding methods on classes you don't own is a dangerous business. Your override could cause problems by breaking the assumptions of the class in question. Avoid it if it's at all possible. If you must do it, code your override with extreme care.

Conclusion
That's it for this week. Now you know the full spectrum of method override possibilities in Objective-C, including one variation that I haven't seen discussed much elsewhere. Use this power for good, not for evil!

Come back in seven days for the next edition. Until then, keep sending in your suggestions for topics. Friday Q&A is powered by reader submissions, so if you have an idea for a topic to cover here, send it in!

Comments:

Joel Bernstein at 2010-01-29 19:43:25:

If you're going to use method 2, you could do a lot worse than using Jonathan Rentzsch's JRSwizzle class:

http://github.com/rentzsch/jrswizzle

It takes care of most of the mindless busywork and edge cases.

Remy Demarest (Psy|) at 2010-01-29 20:11:43:

There's no need for two separate code paths for when the method is implemented in the class and when it's only implemented in the superclass.
The runtime function class_replaceMethod() takes care of that: if the method is defined in the class, the replacement is done; if it's only defined in a superclass, the function adds the method to the class. So you simply need to retrieve the "old" implementation (which might be the superclass's implementation) and call it in your replacement function.

mikeash at 2010-01-29 20:36:08:

Oh nice, I completely missed that aspect of that function, and that it returns the old implementation. So the Direct Override's +load method can be cut down to just this:

    Method origMethod = class_getInstanceMethod(self, @selector(drawRect:));
    gOrigDrawRect = (void *)class_replaceMethod(self, @selector(drawRect:), (IMP)OverrideDrawRect, method_getTypeEncoding(origMethod));

Whitney Young at 2010-01-29 21:00:59:

One interesting place that I've seen method swizzling fail is with the following NSResponder methods:

    mouseEntered:
    mouseExited:
    mouseMoved:

If I remember correctly, these methods all use an IMP that actually determines what to do based on the _cmd argument. Therefore, when you swizzle, you end up passing override_mouseEntered: as _cmd instead of a value that it knows how to handle.

The direct override should not suffer from this problem since it's passing on _cmd correctly.
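
The _cmd effect described here can be observed with a small sketch. Probe is a hypothetical class whose method simply reports the selector it was invoked with; it does not reproduce the NSResponder behavior itself, only what a swizzled original sees.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical class for illustration only.
@interface Probe : NSObject
- (NSString *)selectorName;
@end

@implementation Probe
// Reports the selector this implementation believes it was invoked with.
- (NSString *)selectorName { return NSStringFromSelector(_cmd); }
@end

@interface Probe (Swizzled)
- (NSString *)override_selectorName;
@end

@implementation Probe (Swizzled)

+ (void)load
{
    // Both methods live in Probe itself, so a plain exchange is safe here.
    Method orig = class_getInstanceMethod(self, @selector(selectorName));
    Method repl = class_getInstanceMethod(self, @selector(override_selectorName));
    method_exchangeImplementations(orig, repl);
}

- (NSString *)override_selectorName
{
    // Reaches the original implementation, but under the override's selector.
    return [self override_selectorName];
}

@end
```

After the exchange, sending -selectorName to a Probe makes the original implementation run with _cmd set to override_selectorName, which is the mismatch that would confuse an implementation that branches on _cmd.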

mikeash at 2010-01-29 22:24:08:

I'd always pondered the potential trouble of giving the original method the wrong _cmd, but never came across a place where it mattered in practice. Interesting!

Cédric Luthi at 2010-02-01 09:21:11:

Actually, it is technically possible to call the original implementation of a method overridden in a category. Matt Gallagher explains how in his Supersequent implementation blog post: http://cocoawithlove.com/2008/03/supersequent-implementation.html

Warning: extreme hacking inside!

mikeash at 2010-02-01 18:40:11:

Extreme hacking, and not the good kind. The technique as a whole relies on the runtime keeping the old IMP in the class method list after adding a category. As far as I know, this behavior is not guaranteed. This also won't chain: if you have two or more methods all implemented in a category (one of these could be in Apple's code, beyond your control) then it's still not guaranteed who "wins". There's also a big problem with the macro itself, in that it assumes the method signature is compatible with vararg calling conventions, something that's not at all guaranteed. Gallagher doesn't mention any of these caveats, which worries me greatly. He says that this technique is mainly good for debugging because it's slow. I say that it's only good for debugging because you don't have any guarantees that it'll actually work.

Matt Gallagher at 2010-02-09 12:47:48:

'fraidy cat :-)

But yes, the supersequent method stuff is a hack for the same reason any "undocumented" stuff is: it can be gone at a moment's notice. It relies on the way that things just happen to be done.

As for multiple categories though... the way the ObjC 1.0/2.0 runtimes just happen to be written, all categories are preserved in the order that they are loaded. Technically load order is deterministic but it's fragile -- generally, the system libraries will be loaded before your code, but not always.

But don't ship code with it unless you want to get burned.

Matt Gallagher at 2010-02-09 13:01:14:

On the point of vararg method calling conventions though... the supersequent code doesn't actually use them. The macro is variable argument but is expanded to a non-vararg parameter list by the preprocessor. The ObjC compiler sees (and compiles) regular parameters only.

Sorry to have gotten in the way. I'll see myself out...

mikeash at 2010-02-09 17:46:01:

No need to leave, I welcome all reasonable discussion.

Your macro looks like this:

    #define invokeSupersequent(...) \
        ([self getImplementationOf:_cmd \
            after:impOfCallingMethod(self, _cmd)]) \
                (self, _cmd, ##__VA_ARGS__)

-getImplementationOf: is defined to return an IMP, which takes variable arguments after the self and _cmd parameters. This macro does not cast the IMP to a different function pointer type (and indeed could not, as it doesn't have enough information to do so). This means that the IMP is being called with variable argument calling conventions. Or did I miss some place where everything gets cast to the right function pointer type before calling?

mikeash at 2010-02-09 17:54:41:

By the way, lest this objection seem too theoretical, consider what happens if you use that macro for a method which returns a struct.

Matt Gallagher at 2010-02-10 00:06:59:

Okay, I was overlooking the fact that the declaration of IMP actually declares the third parameter to be a "...". You are correct; I retract my point about not using varargs.

However, this is the calling convention used by objc_msgSend, and by extension all methods except the objc_msgSend_(st/fp/fp2)ret methods. But yes, (st/fp/fp2)ret methods require a correct cast of the IMP or other special handling to work. Fortunately, the compiler is smart enough to give a hard error if you try to do this without a cast -- it doesn't slip through unnoticed.

I find the bigger problem is that without a signature, all regular parameters need to be correctly typed or they won't be passed correctly (since the compiler can infer the wrong register or stack size). This can cause problems without so much as a warning if you're not careful.

mikeash at 2010-02-10 01:13:19:

It gets worse. Because of C's type promotion rules, you can't pass float, short, or char (or the unsigned counterparts of the last two) through a vararg function, because they get promoted to double and int. This code illustrates the problem:

    #include <stdio.h>

    void Tester(int ign, float x, char y)
    {
        printf("float: %f  char: %d\n", x, y);
    }

    int main(int argc, char **argv)
    {
        float x = 42;
        float y = 42;
        Tester(0, x, y);

        void (*TesterAlt)(int, ...) = (void *)Tester;
        TesterAlt(0, x, y);

        return 0;
    }

On my computer, the second invocation prints float: 0.000000 char: 0.

objc_msgSend doesn't use vararg calling conventions. The convention it "uses" is any convention which is compatible with a pointer return value, and placing the first two arguments in a place where they can be expected. objc_msgSend completely ignores all remaining arguments, and lets them pass through unhindered. The caller and the eventual callee (after objc_msgSend looks it up and jumps to it) still have to agree on how those work, and if the caller thinks they're varargs and the callee doesn't, they won't get along.

You must cast all calls to objc_msgSend and its variants in order to have the compiler generate the correct code. Failing to do so will work for many cases, but only because you're getting lucky. The same goes for casting IMPs.
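
As an illustration of that casting, here is a sketch that calls a method taking float and char arguments through both an IMP and objc_msgSend, each cast to a function pointer type matching the method's real signature. Formatter and its method are hypothetical names invented for the example.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>
#import <objc/message.h>

// Hypothetical class for illustration only.
@interface Formatter : NSObject
- (NSString *)describeFloat: (float)x character: (char)y;
@end

@implementation Formatter
- (NSString *)describeFloat: (float)x character: (char)y
{
    return [NSString stringWithFormat: @"float: %.1f char: %d", x, y];
}
@end

// Matches the method's real signature, so float and char arguments are
// passed as float and char instead of being promoted to double and int.
typedef NSString *(*DescribeFn)(id, SEL, float, char);

NSString *CallThroughIMP(Formatter *obj, float x, char y)
{
    SEL sel = @selector(describeFloat:character:);
    DescribeFn fn = (DescribeFn)[obj methodForSelector: sel];
    return fn(obj, sel, x, y);
}

NSString *CallThroughMsgSend(Formatter *obj, float x, char y)
{
    SEL sel = @selector(describeFloat:character:);
    return ((DescribeFn)objc_msgSend)(obj, sel, x, y);
}
```

Because caller and callee agree on the parameter types, both functions should produce the same string as an ordinary message send would.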

mikeash at 2010-02-10 01:35:07:

My test program has a bug in it: in main, y should be of type char. Still fails as described with that change made.

I forgot to mention: even if none of your parameters are of the offending types, there's still nothing which guarantees that the calling conventions will match between a vararg call with certain argument types and a non-vararg receiver with those same types. It's far more likely to work (and I think the ABIs of the platforms that OS X runs on may guarantee it on those particular platforms) but still unsafe.

None at 2010-05-18 10:38:35:

Hi, I have some questions about swizzling a class method of an application's class under 64-bit.

In 32-bit, the code runs as I expected, but in 64-bit I get this error:

Error loading XXX: dlopen(XXX, 123): Symbol not found: _OBJC_CLASS_$_SomeClass
Referenced from: XXX
Expected in: flat namespace
in XXX

I googled for solutions and found some answers like:

http://groups.google.com/group/f-script/browse_thread/thread/09f07a3771032de4

I went through that code, but it does not fix the error.

I use JRSwizzle's + (BOOL)jr_swizzleClassMethod:(SEL)origSel_ withClassMethod:(SEL)altSel_ error:(NSError**)error_

Thanks :)

Ryan Petrich at 2010-08-14 05:23:26:

Your "Direct Override" method fails to handle the case where two hooks are applied to the same method: first on a descendant class that does not override the method, and again on the superclass.

Example: Application has classes Dog and Mammal. Mammal has a reproduceWith: method. MadScientist uses your Direct Override technique to hook -[Dog reproduceWith:] and inject his additional code. EvilGeneticist uses any other hook technique to hook -[Mammal reproduceWith:].
Later, -[Dog reproduceWith:] is called; only MadScientist's hook runs, instead of both MadScientist's and EvilGeneticist's hooks.

This is not a theoretical problem--the iPhone jailbreak community has encountered this issue numerous times and has standardized on two libraries: MobileSubstrate emits ARM bytecode at runtime to avoid it, and CaptainHook avoids it through macro trickery.
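
The scenario can be reproduced in a few lines. This is only a sketch of the failure, not of how MobileSubstrate or CaptainHook avoid it; it uses a no-argument -reproduce method in place of reproduceWith:, and installs both hooks from one +load purely to make the order explicit. All names besides the runtime functions are assumptions for the example.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical classes mirroring the example in the comment.
@interface Mammal : NSObject
- (NSString *)reproduce;
@end

@implementation Mammal
- (NSString *)reproduce { return @"mammal"; }
@end

// Dog inherits -reproduce without overriding it.
@interface Dog : Mammal
@end

@implementation Dog
@end

typedef NSString *(*ReproduceFn)(id, SEL);

static ReproduceFn gMadScientistOrig;
static ReproduceFn gEvilGeneticistOrig;

static NSString *MadScientistHook(id self, SEL _cmd)
{
    return [gMadScientistOrig(self, _cmd) stringByAppendingString: @"+mad"];
}

static NSString *EvilGeneticistHook(id self, SEL _cmd)
{
    return [gEvilGeneticistOrig(self, _cmd) stringByAppendingString: @"+evil"];
}

@implementation Dog (Hooks)

+ (void)load
{
    SEL sel = @selector(reproduce);

    // MadScientist: Direct Override on Dog. The saved IMP is Mammal's original.
    Method dogMethod = class_getInstanceMethod(self, sel);
    gMadScientistOrig = (ReproduceFn)method_getImplementation(dogMethod);
    if(!class_addMethod(self, sel, (IMP)MadScientistHook, method_getTypeEncoding(dogMethod)))
        method_setImplementation(dogMethod, (IMP)MadScientistHook);

    // EvilGeneticist: hooks Mammal afterwards, by replacing its implementation.
    Method mammalMethod = class_getInstanceMethod([Mammal class], sel);
    gEvilGeneticistOrig = (ReproduceFn)method_getImplementation(mammalMethod);
    method_setImplementation(mammalMethod, (IMP)EvilGeneticistHook);
}

@end
```

A Dog now answers -reproduce through MadScientistHook, which jumps straight to the implementation it saved earlier and so bypasses EvilGeneticistHook; a plain Mammal still goes through EvilGeneticistHook.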

mikeash at 2010-08-14 15:36:10:

That's an interesting case that I hadn't thought of. Thanks for pointing it out.

Ling Wang at 2011-05-19 05:37:49:

In MethodSwizzle, what if overrideSEL is not implemented in the class in question, but rather in a superclass?

mikeash at 2011-05-19 22:44:22:

class_getInstanceMethod will return the superclass's implementation if the class in question doesn't have one of its own, so everything still works as desired.

Bernhard at 2012-10-18 17:46:26:

Thank you, Mike, for this great article!

I think, though, that you should have mentioned that in order to use the class_...() and method_...() functions, one needs to include the libobjc.A.dylib library in the Xcode Target and then #import <objc/runtime.h> in the source file.

Dmytro at 2012-11-16 22:06:55:

Can I replace a class method (declared as + (void) methodname ...)? I can successfully replace an instance method (- (void) methodname ...), but not a class method...

Whitney Young at 2013-02-18 18:43:47:

I just came back to this, and was re-reading some of the comments. Mike, your comment from 2010-01-29 doesn't seem quite right: class_replaceMethod returns NULL if the method was added rather than replaced. I think the simplified version would be:

    Method origMethod = class_getInstanceMethod(self, @selector(drawRect:));
    gOrigDrawRect = (void *)method_getImplementation(origMethod);
    class_replaceMethod(self, @selector(drawRect:), (IMP)OverrideDrawRect, method_getTypeEncoding(origMethod));

chandan at 2013-08-20 10:33:03:

I am swizzling the copy: and paste: methods of UIResponder. I have to write the copied content to a private pasteboard.

    - (void)copyToPrivatePasteboard:(id)sender
    {
        UIPasteboard *privatePasteboard = [self getPrivatePasteboard];
        [privatePasteboard setString:@""]; // How to get the copied string to store in the pasteboard?
    }

How can I write the copied string to the pasteboard? The parameter I get is of type id. Converting it to NSString wouldn't be right, because it is the sender that is calling this method (UIMenuController).

mikeash at 2013-08-20 13:38:33:

If you call through to the original implementation first, you can then retrieve the copied string from the regular pasteboard.

martin at 2013-09-15 12:25:45:

You wrote "and thanks to the magic of C, the void * gets implicitly converted to the right pointer type anyway.)"

I'm sure you are aware that nothing gets "converted" here; in C a pointer is a pointer, 4 or 8 bytes are copied, that's all.

(Not to be confused with the real magic Objective-C can do converting types on the fly when setting them, for example setting a BOOL or float from an NSNumber using setValue:forKey:.)

mikeash at 2013-09-27 14:33:04:

Actually, in C, a pointer is definitely not a pointer. There is no guarantee that pointers of different types have the same internal representation. This is especially true for function pointers. What you say happens to be the case on popular architectures we use today, but it's not part of C.

In any case, the type gets converted, even if the value remains the same.

Steven at 2014-04-16 10:26:27:

Mike, class_replaceMethod does not return the imp of the superclass method. If the method was newly added to the class, gOrigDrawRect is 0.

In that case, I add an extra method_getImplementation, using the method I get from class_getInstanceMethod. That should return the nearest superclass IMP of the method.
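
The behavior discussed in these comments can be checked with a short sketch. Shape and Circle are hypothetical classes, with Circle inheriting -label without overriding it; the global gReplaceReturnedNULL exists only to record what class_replaceMethod returned.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical classes for illustration only.
@interface Shape : NSObject
- (NSString *)label;
@end

@implementation Shape
- (NSString *)label { return @"shape"; }
@end

// Circle inherits -label without overriding it.
@interface Circle : Shape
@end

@implementation Circle
@end

typedef NSString *(*LabelFn)(id, SEL);

static LabelFn gOrigLabel;
BOOL gReplaceReturnedNULL;

static NSString *OverrideLabel(id self, SEL _cmd)
{
    return [@"round " stringByAppendingString: gOrigLabel(self, _cmd)];
}

@implementation Circle (Override)

+ (void)load
{
    SEL sel = @selector(label);
    Method origMethod = class_getInstanceMethod(self, sel);

    // Fetch the old IMP first; this finds the superclass's when needed.
    gOrigLabel = (LabelFn)method_getImplementation(origMethod);

    // Circle has no -label of its own, so the method is added, not replaced.
    IMP previous = class_replaceMethod(self, sel, (IMP)OverrideLabel, method_getTypeEncoding(origMethod));
    gReplaceReturnedNULL = (previous == NULL);
}

@end
```

Since the method is added rather than replaced here, class_replaceMethod returns NULL, and the override only works because the original IMP was fetched beforehand.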

Geoff at 2015-07-07 17:57:59:

I tried using this technique on a class-level method instead of an instance method. In addition to replacing class_getInstanceMethod with class_getClassMethod, I needed to wrap "self" with object_getClass() in the class_replaceMethod() function.

    Method origMethod = class_getClassMethod(self, @selector(foo));
    class_replaceMethod(object_getClass(self), @selector(foo), (IMP)overrideFoo, method_getTypeEncoding(origMethod));

I'm assuming that is because class_getClassMethod is expecting to be working with Class level methods while class_replaceMethod could work with either. Is this correct?
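
That reading matches how the runtime is organized: class methods are stored on the metaclass, class_getClassMethod looks there on your behalf, and class_replaceMethod modifies whichever class object it is handed. A minimal sketch, using a hypothetical Config class with a +defaultName method:

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical class for illustration only.
@interface Config : NSObject
+ (NSString *)defaultName;
@end

@implementation Config
+ (NSString *)defaultName { return @"default"; }
@end

// For a class method, self is the class object.
static NSString *(*gOrigDefaultName)(id, SEL);

static NSString *OverrideDefaultName(id self, SEL _cmd)
{
    return [gOrigDefaultName(self, _cmd) stringByAppendingString: @" (hooked)"];
}

@implementation Config (ClassOverride)

+ (void)load
{
    SEL sel = @selector(defaultName);
    Method origMethod = class_getClassMethod(self, sel);
    gOrigDefaultName = (NSString *(*)(id, SEL))method_getImplementation(origMethod);

    // Class methods live on the metaclass, which object_getClass returns
    // when given a class object.
    Class meta = object_getClass(self);
    class_replaceMethod(meta, sel, (IMP)OverrideDefaultName, method_getTypeEncoding(origMethod));
}

@end
```

Passing self instead of the metaclass to class_replaceMethod would add an instance method named defaultName and leave the class method untouched.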
