mikeash.com pyblog/friday-qa-2011-12-23-disassembling-the-assembly-part-2.html comments
http://www.mikeash.com/?page=pyblog/friday-qa-2011-12-23-disassembling-the-assembly-part-2.html#comments
mikeash.com Recent Comments
Thu, 28 Mar 2024 09:16:38 GMT

Owen Shepherd - 2012-01-01 13:30:20

That is most certainly not a far jump/call. In AT&T syntax, a far jump/call is encoded using the "ljmp" or "lcall" mnemonic, and is used for cross-segment calls and jumps. Now, since cross-segment calls and jumps are nigh-on obsolete, you'll only very rarely see them.

Chris Suter - 2011-12-31 02:04:57

> If you peek at the generated machine code with a disassembler, it turns out to not be a mov instruction at all, but rather a lea!

Are you sure? I don’t see this.

> GOTPCREL is a directive which allows the rip-relative address of a function to be inserted at link time so a direct call can be made

I don’t think that’s quite right. See below.

> "far jump" (a branch over a long distance of code, which, by necessity, is much slower).

Again, I don’t think a “far jump” is necessarily much slower because it is over a long distance of code. It might be marginally slower because you might have to use more bytes to encode the instruction (but that won’t be *much* slower). It will obviously be slower if the address you’re jumping to causes a page fault, or a bit slower if it isn’t in the cache, but it’s not slower simply because it’s a branch over a long distance of code.
Things might be a bit slower in your example because an indirect jump is being used (but you’d need to check the processor documentation to see by how much, and I wouldn’t be surprised to find that if it’s cached, it’s nil).

> Note: I'm not 100% sure of my facts on this one; I'd appreciate any insight anyone has on the specifics of @GOTPCREL

@GOTPCREL allows you to load an address from a global offset table in a single instruction (using instruction-relative addressing). The Global Offset Table (GOT) stores the address of objc_msgSend and any other global addresses that might be required. It is fixed up by dyld at runtime. The GOT is always at a fixed offset relative to the code, so you can use instruction-relative addressing to load an address from it.

All of the above said, I’m no expert on this, so please don’t take my word for it; I might be wrong.

Gwynne Raskind - 2011-12-23 23:10:20

Jens: I spent something like an hour trying to Google and otherwise look up that particular function attribute! I guess my Google-fu needs some work :).
Thanks for the code listing!

Jens Ayton - 2011-12-23 22:55:08

> it turns out to be extremely difficult to get Clang to actually emit such assembly under optimizing compilation without just inlining the function, and the unoptimized version is different.

    __attribute__((noinline)) float MyFPFunction(float parameter)

The call and print sequence is:

    movss       LCPI1_0(%rip), %xmm0 # Load argument
    callq       _MyFPFunction
    cvtss2sd    %xmm0, %xmm0 # Promote to double
    leaq        L__unnamed_cfstring_(%rip), %rdi
    movb        $1, %al
    callq       _NSLog