mikeash.com pyblog/friday-qa-2017-11-10-observing-the-a11s-heterogenous-cores.html comments

mikeash - 2017-11-16 18:12:50

Thu, 16 Nov 2017 18:12:50 GMT

Jacob Bandes-Storch: That's an interesting question. I looked around and didn't find much info on why there would be this difference. "To avoid race conditions" might just mean that it's hard to correctly write code that doesn't have a lock on the signal side. For example, if you signal after another thread checks the predicate but before that thread begins to wait, that thread will wait even though the predicate is now true, which isn't what you want. Locking on the sender side ensures that the condition can't be signaled during that time.
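To make that window concrete, here is a minimal sketch of the pattern the documentation seems to be steering people toward. The `Flag` class, its `isSet` predicate, and `runDemo()` are hypothetical names invented for illustration and are not taken from the test harness. The sketch assumes the predicate is an ordinary variable guarded by the same NSCondition, and it holds the lock across both the predicate change and the signal.

```swift
import Foundation

// Hypothetical one-shot flag guarded by an NSCondition (illustration only).
// Marked @unchecked Sendable because all access to isSet goes through cond.
final class Flag: @unchecked Sendable {
    private let cond = NSCondition()
    private var isSet = false   // the predicate

    // Waiter: check the predicate and wait while holding the lock, so no
    // signal can arrive between the check and the wait.
    func waitUntilSet() {
        cond.lock()
        while !isSet {          // loop also covers spurious wakeups
            cond.wait()         // releases the lock while blocked, reacquires on wake
        }
        cond.unlock()
    }

    // Sender: change the predicate and signal while holding the same lock.
    func set() {
        cond.lock()
        isSet = true
        cond.signal()
        cond.unlock()
    }
}

// Returns true if the waiting thread woke up within the timeout.
func runDemo() -> Bool {
    let flag = Flag()
    let done = DispatchSemaphore(value: 0)
    Thread.detachNewThread {
        flag.waitUntilSet()
        done.signal()
    }
    flag.set()
    return done.wait(timeout: .now() + 5) == .success
}
```

Because the waiter re-checks `isSet` under the lock, it does not matter whether `set()` runs before or after the waiter gets going: either the waiter sees the flag already set and never waits, or it is already waiting when the signal arrives. Whether the signal itself strictly needs to be inside the locked region is the open question in this thread; the sketch simply follows the NSCondition documentation.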

Mndgs: Sure, here it is:

https://gist.github.com/mikeash/75e99bbeebdb909d45b30c4d6753213e

Just toss it into a project and call Test().

Mndgs - 2017-11-14 08:21:18

Tue, 14 Nov 2017 08:21:18 GMT

Any chance for you to share full code?

Jacob Bandes-Storch - 2017-11-13 00:53:21

Mon, 13 Nov 2017 00:53:21 GMT

Straying off-topic here, but something caught my eye in your test harness code:



    cond.signal()
    cond.unlock()

Sure enough, the documentation for NSCondition.signal() states: https://developer.apple.com/documentation/foundation/nscondition/1415724-signal

"To avoid race conditions, you should invoke this method only while the receiver is locked."

This struck me as odd because it’s contrary to the C++ STL’s recommendation for condition_variable::notify_one(): http://en.cppreference.com/w/cpp/thread/condition_variable/notify_one

"The notifying thread does not need to hold the lock on the same mutex as the one held by the waiting thread(s); in fact doing so is a pessimization, since the notified thread would immediately block again, waiting for the notifying thread to release the lock. However, some implementations (in particular many implementations of pthreads) recognize this situation and avoid this “hurry up and wait” scenario by transferring the waiting thread from the condition variable’s queue directly to the queue of the mutex within the notify call, without waking it up."

Any ideas why there’s a difference here? What is meant by the “to avoid race conditions…” comment?

mikeash - 2017-11-10 20:28:53

Fri, 10 Nov 2017 20:28:53 GMT

They definitely did not run exclusively on the same core for the entire run. At one point I had my test code log thread IDs along with the sampled running times, and it was clear that individual threads were bouncing between fast and slow cores. The test I did here doesn't need the threads to stay on the same core; it just needs the sampled computation to be short enough that it's unlikely to be moved to a different core during the sampling period.
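As a rough illustration of that sampling idea (not the actual test code, which is in the gist), a sketch might time many very short runs of a fixed workload and keep every duration. `smallWorkload` and `sampleDurations` are hypothetical names, and the arithmetic in the loop is an arbitrary stand-in for the real computation. An optimizing compiler may also simplify a loop like this, so treat the absolute numbers with suspicion.

```swift
import Foundation

// Hypothetical workload: a short, fixed amount of integer arithmetic.
func smallWorkload(iterations: Int) -> UInt64 {
    var acc: UInt64 = 0
    for i in 0..<iterations {
        acc = acc &* 31 &+ UInt64(i)   // wrapping ops so overflow can't trap
    }
    return acc
}

// Times `samples` back-to-back runs of the workload and returns each
// duration in nanoseconds. Each run is meant to be short enough that a
// migration to another core in the middle of it is unlikely.
func sampleDurations(samples: Int, iterations: Int) -> [UInt64] {
    var durations: [UInt64] = []
    durations.reserveCapacity(samples)
    var sink: UInt64 = 0
    for _ in 0..<samples {
        let start = DispatchTime.now().uptimeNanoseconds
        sink ^= smallWorkload(iterations: iterations)
        let end = DispatchTime.now().uptimeNanoseconds
        durations.append(end >= start ? end - start : 0)
    }
    // Use the accumulated result so the workload is not trivially discarded.
    if sink == UInt64.max { print("unexpected sink value") }
    return durations
}
```

Running `sampleDurations` on several threads at once and plotting a histogram of the combined results is one way to look for separate clusters, since each individual sample only reflects whichever core the thread happened to be on at that moment.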

Thomas Tempelmann - 2017-11-10 20:13:11

Fri, 10 Nov 2017 20:13:11 GMT

I am a bit skeptical that you actually managed to get each of the six threads to run on its own core. If I were writing a thread scheduler for CPUs of different speeds, and all those threads appeared to have the same speed requirements, I’d assign them to the different cores in a round-robin manner so that they all get equal time on both the fast and the slow cores. The two clusters you get may be due to the fact that their timing is close to the scheduler’s core-switching frequency. To test my hypothesis, more tests would have to be run with longer thread run times.

mikeash - 2017-11-10 17:26:33

Fri, 10 Nov 2017 17:26:33 GMT

gmurthy: You're correct. Thanks for pointing that out. My brain must have taken a brief vacation while writing that part. I've fixed it now.

gmurthy - 2017-11-10 17:00:55

Fri, 10 Nov 2017 17:00:55 GMT

"There's one narrow cluster centered around ~6.7 microseconds, and another narrow cluster centered around ~9 nanoseconds, and nothing in between."

If I'm reading this right, you probably meant ~7 microseconds and ~9 microseconds?