Tags: cocoa fridayqna ipc performance
Welcome back to another Friday Q&A. This week I'm going to take Erik's (no last name given) suggestion from my interprocess communication post and expand a bit on Distributed Objects, what makes it so cool, and the problems that it has.
I'm not going to take a detailed look at the basics of Distributed Objects, as there are plenty of resources out there for that. But for those readers who are unfamiliar with Distributed Objects, I'll give a quick look at what it is.
Distributed Objects (DO) is a Cocoa API for interprocess communication built on automatic proxying of messages, so that remote objects look and act mostly like local objects. The primary interface to the DO system is the NSConnection class. Vending an object with DO is as easy as this:
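    // Create a connection, give it an object to vend, and publish it under a name.
    NSConnection *connection = [NSConnection connectionWithReceivePort:[NSPort port] sendPort:nil];
    [connection setRootObject:theObject];
    [connection registerName:@"com.example.whatever"];
Getting at the vended object from another process is just as easy:
    // Look up the connection by name and message its root object through a proxy.
    id theObject = (id)[NSConnection rootProxyForConnectionWithRegisteredName:@"com.example.whatever" host:nil];
    [theObject someMethod];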
I think this should be pretty clear from the above description, but let's quickly compare it to other IPC mechanisms:
- It's easy. Just a few lines of code to set up a fully functional connection.
- It's transparent. For the most part, remote objects can be passed around just like local objects. This means that little of your code needs to be DO-aware.
- It's flexible. DO can be used over mach ports or sockets. It can be used to communicate between threads or between processes. It's reasonably configurable.
- It's robust. Because DO works the same way as Objective-C messages, you don't have the same problems you might have trying to support two different protocol versions. Of course it's not always rosy, but if your changes involve implementing different methods, it's easy to check for what's going on using respondsToSelector: and the like, rather than having to give up.
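Here's a minimal sketch of that kind of version check. The StatusServer protocol, its two methods, and the OldServer class are all made up for illustration; OldServer is a local stand-in so the example runs without a connection, but the same function could be handed a distant object.

```objc
#import <Foundation/Foundation.h>

// Hypothetical protocol: fetchStatus has always existed, while
// fetchStatusWithDetail: was added in a later version of the server.
@protocol StatusServer <NSObject>
- (NSString *)fetchStatus;
@optional
- (NSString *)fetchStatusWithDetail:(BOOL)detail;
@end

// Local stand-in for an old server that only knows the original method.
@interface OldServer : NSObject <StatusServer>
@end

@implementation OldServer
- (NSString *)fetchStatus { return @"ok"; }
@end

// Prefer the newer method when the other side implements it,
// and fall back to the older one otherwise.
static NSString *StatusFromServer(id<StatusServer> server) {
    if ([server respondsToSelector:@selector(fetchStatusWithDetail:)]) {
        return [server fetchStatusWithDetail:YES];
    }
    return [server fetchStatus];
}
```

Calling StatusFromServer([OldServer new]) takes the fallback path. Keep in mind that with a distant object the check is one more message that has to be answered by the other process.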
So if DO is so great, why am I not a bigger fan? Part of it is simply because DO can't completely abstract away the fact that it's running over a transport layer and talking to a remote process, and part of it is because DO itself is just not as good as it could be.
One leaky abstraction is primitive types. DO needs to be able to either serialize (i.e. copy across the connection) or proxy everything that gets used as an argument or returned as a value with a "distant object". For objects, this is fine. Objects that want to be copied can implement NSCoding, and all other objects can use Objective-C's built-in message capturing facilities to proxy all requests across the connection.
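As a rough sketch of the "wants to be copied" case, here is a made-up Temperature class. It implements NSCoding and overrides replacementObjectForPortCoder: so that a copy is encoded when the method signature asks for one with bycopy. I'm assuming the port coder may only offer unkeyed coding, so the coding methods check allowsKeyedCoding and handle both styles.

```objc
#import <Foundation/Foundation.h>

// Hypothetical value class that should cross a connection as a copy.
@interface Temperature : NSObject <NSCoding>
@property (nonatomic) double degrees;
@end

@implementation Temperature

- (void)encodeWithCoder:(NSCoder *)coder {
    if ([coder allowsKeyedCoding]) {
        [coder encodeDouble:_degrees forKey:@"degrees"];
    } else {
        // Unkeyed coding: values must be decoded in the order they were written.
        [coder encodeValueOfObjCType:@encode(double) at:&_degrees];
    }
}

- (instancetype)initWithCoder:(NSCoder *)coder {
    if ((self = [super init])) {
        if ([coder allowsKeyedCoding]) {
            _degrees = [coder decodeDoubleForKey:@"degrees"];
        } else {
            [coder decodeValueOfObjCType:@encode(double) at:&_degrees];
        }
    }
    return self;
}

// The inherited implementation substitutes a proxy for the object.
// Returning self asks the port coder to encode the object itself.
- (id)replacementObjectForPortCoder:(NSPortCoder *)coder {
    if ([coder isBycopy]) {
        return self;
    }
    return [super replacementObjectForPortCoder:coder];
}

@end
```

Objects that don't do this simply go across as proxies, and every message to them becomes a round trip.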
For primitives, things get harder. For scalars and even structs, the Objective-C runtime provides enough type information that these can be copied across. But once you hit pointers, things fall apart. Imagine trying to proxy a call like this:
[array getObjects:objarray range:NSMakeRange(5, 13)];
To proxy this call correctly, DO would have to know that the size of the objarray parameter is determined by the length of the range being passed in, and copy only that much memory across the connection. It would also have to know that this is a return-by-reference only, and that it shouldn't be trying to serialize or proxy the contents of objarray across the connection (it could be filled with junk, and an attempt to proxy that junk would crash). Yes, DO could special-case this particular method, but it won't be able to deal with arbitrary such methods.
DO does have some interesting language-level facilities to help with this. You can specify a pointer parameter as being in, out, or inout so that it knows which way to serialize or proxy. But this only works with pointers to single objects. For arrays, it just can't cope.
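To make those qualifiers concrete, here is a small sketch of a protocol that uses them. The Thermostat protocol and its method names are invented for illustration; the qualifiers themselves (in, out, bycopy, oneway) are part of the language.

```objc
#import <Foundation/Foundation.h>

// Hypothetical protocol describing a vended object.
@protocol Thermostat

// bycopy: ask for a copy of the string rather than a proxy to it.
- (bycopy NSString *)roomName;

// out: the pointed-to value is only used to return information to the caller.
- (BOOL)getTemperature:(out double *)degrees;

// in: the pointed-to value is only passed to the receiver, not copied back.
- (void)setTargetTemperature:(in double *)degrees;

// oneway: the caller does not wait for the method to complete.
- (oneway void)logEvent:(in bycopy NSString *)event;

@end
```

On the client side, telling the proxy about the protocol with setProtocolForProxy: lets it encode these calls without first asking the remote side for each method's signature. Note that every pointer here refers to exactly one value, which is precisely the limitation described above.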
Another leaky abstraction is that the process that you're talking to could disappear at any moment. Objective-C just isn't set up to deal with this very well:
    id obj = [self method];
    [obj thing1];
    [obj thing2];
In ordinary code, obj is either broken from the start (in which case you'll crash) or it remains valid throughout the method. But if obj is a distant object, suddenly things are not so clear. The remote process could disappear (or freeze up and time out) in between the call to thing1 and the call to thing2.
When that happens, DO deals with it by throwing an exception. Surprise!
Most Objective-C code is not exception safe. Although there's no particular reason that exceptions can't be used more frequently the way they are in languages like Java, convention is that exceptions in Objective-C are only used to indicate a programming error. In order to really robustly use DO, you need to write your code such that it can handle an exception being thrown by any interaction with a distant object. Worse yet, this includes Cocoa code, meaning that you essentially cannot pass a distant object to any Cocoa code. (Ever wonder what happens when an NSSet sends -hash to your object, and -hash throws an exception? Odds are fair that it leaves the NSSet in an inconsistent state that will lead to a crash.)
This requirement for all code touching distant objects to be exception safe is tough, and greatly limits the places in which DO can be practically used. The promise is that remote objects look like local objects, and they mostly do, but this one (absolutely necessary) detail means that they can't be used like local objects at all.
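To show what that means at a call site, here is a minimal sketch. CallGuarded and ConfigureTimeouts are made-up helper names, and the five-second timeouts are arbitrary. The first wraps a single interaction in @try/@catch, and the second shortens the connection's timeouts so that a frozen peer turns into an exception reasonably quickly.

```objc
#import <Foundation/Foundation.h>

// Runs one interaction with a possibly-distant object and returns nil
// instead of letting an exception escape into code that can't handle it.
static id CallGuarded(id (^call)(void)) {
    @try {
        return call();
    } @catch (NSException *exception) {
        // With DO, a dead or unresponsive peer shows up here, for example
        // as NSPortTimeoutException or NSObjectInaccessibleException.
        NSLog(@"call failed: %@", [exception name]);
        return nil;
    }
}

// Bounds how long a message to the other side may block.
static void ConfigureTimeouts(NSConnection *connection) {
    [connection setRequestTimeout:5.0];
    [connection setReplyTimeout:5.0];
}
```

A call like [obj thing1] then becomes CallGuarded(^id{ return [obj thing1]; }), and a nil result can mean either a genuine nil or a failure. Having to do this at every single call site is exactly the loss of transparency described above.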
Lastly, DO is not very modular or extensible. Ideally, DO would be a fully modular system. You'd have the DO system which would sit on top of some kind of interchangeable transport class. Customizing the transport class (for example, to make it use encryption, or talk to a serial port, or use avian carriers) would simply be a matter of subclassing a public abstract class and implementing a documented set of primitive methods.
The reality is not so simple. The classes that DO uses internally are fairly tightly coupled, and there's a lot of legacy cruft. Implementing a custom NSPort subclass that works with NSConnection is so difficult that I'm only aware of one working example (Secure Distributed Objects, which appears to be a dead project now). This pretty much sinks the idea of using DO for any serious network communication, since DO doesn't encrypt the transport and it's not practical to add encryption to it.
Distributed Objects is a very cool system that has many uses. Unfortunately, due to both the constraints under which it works and some poor design decisions, it's not as useful as it could be. It can still be handy for doing IPC and it's a great tool to have in Cocoa, but it falls short of being a no-brainer way to talk to other processes.
That wraps up this week's Friday Q&A. Check back next week for another exciting installment. In the meantime, keep those ideas coming. Friday Q&A is powered by your submissions, so don't be shy. Post your topic ideas in the comments below or e-mail them directly to me. (Yes, I link my real e-mail address directly on the web! How can you refuse that!) If I use your suggestion then I will use your name unless you tell me otherwise.
Love Distributed Objects? Think your pet RPC mechanism is better? Fire away.
You forgot one thing, too. DO uses NSCoding to send objects, but AFAIK it doesn't support keyed coding. Which means you need to keep supporting the much more easy-to-break unkeyed system, which is all but legacy now.
I wish DO+GC resulted in dropped objects becoming nil, or that you could set something so that return values became nil when the connection disappears. Throwing an exception is one of the more inconvenient choices for signaling this error.
A more benign failure mode would be useful, certainly. That's actually something that you could add yourself with a little NSProxy subclass. The bad news is that it would require an explicit check and instantiation of your proxy class in all code that could obtain a distant object.
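Here's a rough sketch of what such a proxy might look like. SafeProxy is a made-up name, the code assumes ARC, and it assumes the method signature lookup itself succeeds (that lookup can also require a trip to the remote side, which this sketch does not guard).

```objc
#import <Foundation/Foundation.h>
#include <stdlib.h>

// Forwards every message to a target. If the target throws, the exception
// is swallowed and the caller sees a zeroed return value (nil, 0, NO).
@interface SafeProxy : NSProxy
+ (id)proxyWithTarget:(id)target;
@end

@implementation SafeProxy {
    id _target;
}

+ (id)proxyWithTarget:(id)target {
    SafeProxy *proxy = [self alloc]; // NSProxy has no -init to call
    proxy->_target = target;
    return proxy;
}

- (NSMethodSignature *)methodSignatureForSelector:(SEL)selector {
    return [_target methodSignatureForSelector:selector];
}

- (void)forwardInvocation:(NSInvocation *)invocation {
    @try {
        [invocation invokeWithTarget:_target];
    } @catch (NSException *exception) {
        // Replace whatever is in the return buffer with zeros.
        NSUInteger length = [[invocation methodSignature] methodReturnLength];
        if (length > 0) {
            void *zeros = calloc(1, length);
            [invocation setReturnValue:zeros];
            free(zeros);
        }
    }
}

@end
```

You can try it without DO at all: wrap an NSArray holding one element, and count still returns 1 while objectAtIndex:5 returns nil instead of raising a range exception. The catch is that this swallows every exception from the target, not just the ones DO raises when a connection dies.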
In my own Cocoa projects I have noticed that DO's transparency disappears pretty quickly in practice when one starts to build exception-catching, operation-retrying, non-blocking, timing-out proxies around objects vended over DO. However, it still seems like a good choice for IPC between two Cocoa processes.
    [self startTheThing];
    [self findBucket: [yourObject hash]];
    [self endTheThing];
If startTheThing puts the set into an inconsistent state, and endTheThing completes the work and puts the set back where it was, throwing an exception inside hash will cause problems.
There are a number of abstractions that seemed cool from an engineering standpoint (RPC, networked VM, publish-and-subscribe, DDE then OLE) but which basically never took off. In at least some of these cases there has been massive company pressure behind them, but no real traction among users or developers. These things seem to be what I'll call "bad abstractions". They're trying to solve a problem by reducing it to an already solved problem, but the gap between the reality and the metaphor is just too large to be straddled.
In the 90s there was this whole industry that grew up around the idea that computers would talk to servers via RPC and distributed objects, and it didn't happen. Not because talking to servers was a bad idea but, as far as I can tell, because talking to servers works better using network-aware constructs (with the consequence of making errors, specifically the errors particular to networks, more visible but also easier to handle appropriately). And so: HTTP.
Similarly with RPC, which is still done, of course, but with things like REST and SOAP, not classic Sun-style RPC.
We saw a similar thing with presentation where, rather than the "one GUI everywhere" idea of Java (and some competitors), the solution that won didn't pretend to be something it wasn't. HTML wasn't powerful, but it did what it did in a way that matched the task to be solved; it didn't create a UI layer that sorta pretended to be Mac or Windows while not actually working like either.
Point is: Apple have probably looked at this history and concluded that, regardless of what people might claim, distributed objects is a loser technology, and there's no point in trying to make it work better. If you want to do anything serious over the network, you're probably going to be a lot happier using network-appropriate primitives rather than trying to hide the existence of the network.