mikeash.com pyblog/friday-qa-2017-11-10-observing-the-a11s-heterogenous-cores.html comments
http://www.mikeash.com/?page=pyblog/friday-qa-2017-11-10-observing-the-a11s-heterogenous-cores.html#comments
mikeash.com Recent Comments
Fri, 29 Mar 2024 10:30:37 GMT

mikeash - 2017-11-16 18:12:50
<b>Jacob Bandes-Storch</b>: That's an interesting question. I looked around and didn't find much info on why there would be this difference. "To avoid race conditions" might just mean that it's hard to correctly write code that doesn't have a lock on the signal side. For example, if you signal after another thread checks the predicate but before that thread begins to wait, that thread will wait even though the predicate is now true, which isn't what you want. Locking on the sender side ensures that the condition can't be signaled during that time. <br /> <br /><b>Mndgs:</b> Sure, here it is: <br /> <br /><a href="https://gist.github.com/mikeash/75e99bbeebdb909d45b30c4d6753213e">https://gist.github.com/mikeash/75e99bbeebdb909d45b30c4d6753213e</a> <br /> <br />Just toss it into a project and call <code>Test()</code>.
Thu, 16 Nov 2017 18:12:50 GMT

Mndgs - 2017-11-14 08:21:18
Any chance for you to share full code?
Tue, 14 Nov 2017 08:21:18 GMT

Jacob Bandes-Storch - 2017-11-13 00:53:21
Straying off-topic here, but something caught my eye in your test harness code: <br /><code> <br />&nbsp;&nbsp;&nbsp;&nbsp;cond.signal() <br />&nbsp;&nbsp;&nbsp;&nbsp;cond.unlock()</code> <br /> <br />Sure enough, the documentation for <code>NSCondition.signal()</code> states: <a 
href="https://developer.apple.com/documentation/foundation/nscondition/1415724-signal">https://developer.apple.com/documentation/foundation/nscondition/1415724-signal</a> <br /> <br /><div class="blogcommentquote"><div class="blogcommentquoteinner">To avoid race conditions, you should invoke this method only while the receiver is locked.</div></div> <br />This struck me as odd because it’s contrary to the C++ STL’s recommendation for <code>condition_variable::notify_one()</code>: <a href="http://en.cppreference.com/w/cpp/thread/condition_variable/notify_one">http://en.cppreference.com/w/cpp/thread/condition_variable/notify_one</a> <br /> <br /><div class="blogcommentquote"><div class="blogcommentquoteinner">The notifying thread does not need to hold the lock on the same mutex as the one held by the waiting thread(s); in fact doing so is a pessimization, since the notified thread would immediately block again, waiting for the notifying thread to release the lock. However, some implementations (in particular many implementations of pthreads) recognize this situation and avoid this “hurry up and wait” scenario by transferring the waiting thread from the condition variable’s queue directly to the queue of the mutex within the notify call, without waking it up</div></div> <br />Any ideas why there’s a difference here? What is meant by the “to avoid race conditions…” comment?
Mon, 13 Nov 2017 00:53:21 GMT

mikeash - 2017-11-10 20:28:53
They definitely did not run exclusively on the same core for the entire run. At one point I had my test code log thread IDs along with the sampled running times, and it was clear that individual threads were bouncing between fast and slow cores. 
The test I did here doesn't need the threads to stay on the same core; it just needs the sampled computation to be fast enough that it's unlikely to be moved to a different core during the sampling period.
Fri, 10 Nov 2017 20:28:53 GMT

Thomas Tempelmann - 2017-11-10 20:13:11
I am a bit skeptical that you actually managed to get the six threads to each run on their own core. If I were writing a thread scheduler for CPUs of different speeds, and all those threads appeared to have the same speed requirements, I’d assign them to the different cores in a round-robin manner so that they all get both the fast and the slow cores equally. The two clusters you get may be due to the fact that their timing is close to the scheduler’s core switching frequency. To test my hypothesis, more tests would have to be run with longer thread run times.
Fri, 10 Nov 2017 20:13:11 GMT

mikeash - 2017-11-10 17:26:33
<b>gmurthy</b>: You're correct. Thanks for pointing that out. My brain must have taken a brief vacation while writing that part. I've fixed it now.
Fri, 10 Nov 2017 17:26:33 GMT

gmurthy - 2017-11-10 17:00:55
<div class="blogcommentquote"><div class="blogcommentquoteinner"> There's one narrow cluster centered around ~6.7 microseconds, and another narrow cluster centered around ~9 nanoseconds, and nothing in between. <br /></div></div> <br /> <br />If I'm reading this right, you probably meant ~7 microseconds and ~9 microseconds?
Fri, 10 Nov 2017 17:00:55 GMT