mikeash.com pyblog/friday-qa-2009-02-13-operations-based-parallelization.html comments
http://www.mikeash.com/?page=pyblog/friday-qa-2009-02-13-operations-based-parallelization.html#comments

mikeash - 2009-02-15 03:41:06

Good points, let me address them one by one.

Image scaling is certainly something that could be multithreaded. But this is a kind of advanced technique that I think goes beyond the level I was looking at in this article. "Scale an image" is conceptually a single unit, usually performed by the frameworks, and breaking it up and manually running it in parallel is a lot of work. If you do it a lot then it could pay off a bunch, of course, but it's not just a matter of designing your code in terms of discrete operations that can run in parallel.

And yes, saving could benefit from being another operation here exactly as you say, and having multiple threads saving simultaneously is probably a bad idea due to disk contention.

Putting the entire contents of that loop in a single operation and then running multiple such operations in parallel is still going to be helpful. Assuming you have a lot of images, you'll gain scaling performance just from running more than one in parallel. Saving is not going to cause a lot of problems because the threads will spend most of their time doing things other than saving, so that should be fine. But for optimal performance, you would probably want to pass saving off to a separate module which saves them serially. With NSOperationQueue, you could do this by having a separate queue with setMaxConcurrentOperationCount:1, then enqueue save operations onto it when the other stuff is finished.

Your point about memory usage is also extremely smart.
If each operation is very memory-intensive then it's easy to blow out your memory and start swapping, utterly destroying any speed gains you might have. You can see this with Xcode on a memory-starved Mac Pro. Tell Xcode to run multiple instances of gcc (each of which takes several hundred MB of memory) and you can easily get it to run a build much more slowly than if you tell it to just run one. Of course, if you have memory to spare, this reverses, and it becomes hugely beneficial to run multiple instances of gcc.

mvo - 2009-02-15 01:15:36

        for(NSImage *image in images) {
            [self rotate:image];
            [self scale:image];
            [self save:image];
        }

    There are no dependencies between those images, so each one can go into a separate operation. On the other hand, each individual method call within the loop depends on the previous one, so those can't be split out beneficially.

Just raising some thoughts on this article, rather than actually answering them...

Scaling one image can be split up over several threads. I don't have any empirical data or experimental code that could prove whether it would be more beneficial than your threading strategy, but let's say for example you have a 4000Wx4000H image and you split it across 4 threads on a 4-core system, as 4000Wx1000H image sections. You would see some benefit. Also, saving is an IO-bound operation, like you mention, so you could benefit from splitting that out into a separate thread/operation, because while that thread is "sleeping" waiting for IO to finish, another thread can do some other scaling...

Also, I'm wondering whether it is a good design to have multiple threads that save to disk. Let's say you have 4 images in your array and they all start saving at the same time. What if you designed a threaded system with only one thread dedicated to saving? That might mean less disk movement.

Also, having 4 threads each load an image means you would use more memory than having one image loaded at a time with 4 threads working on scaling it. More memory usage could push you into virtual memory (swapping), etc...

Rotation is a different story, because if you rotate an image by, say, 90 degrees, you lose the benefit of working on the same rows. Also, the rotation needs to finish completely before the scaling can start.

Just some thoughts that came up...
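
For reference, the separate serial save queue suggested in mikeash's reply could be sketched roughly like this. It is a minimal sketch under stated assumptions, not code from the article: it uses the block-based addOperationWithBlock: API (which arrived in Mac OS X 10.6, after this thread was written), assumes ARC, and the ProcessedName function and the file names are hypothetical stand-ins for the rotate/scale steps and the real images. Saving is simulated by appending to an array.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for the CPU-bound rotate + scale steps.
static NSString *ProcessedName(NSString *name) {
    return [name stringByAppendingString:@"-scaled"];
}

int main(void) {
    @autoreleasepool {
        // Hypothetical inputs; a real program would hold NSImage objects.
        NSArray *names = @[@"a.png", @"b.png", @"c.png", @"d.png"];

        // Concurrent queue for the per-image work.
        NSOperationQueue *workQueue = [[NSOperationQueue alloc] init];

        // Serial queue for saving: at most one save runs at a time.
        NSOperationQueue *saveQueue = [[NSOperationQueue alloc] init];
        [saveQueue setMaxConcurrentOperationCount:1];

        // Mutated only on saveQueue, so no lock is needed.
        NSMutableArray *saved = [NSMutableArray array];

        for (NSString *name in names) {
            [workQueue addOperationWithBlock:^{
                NSString *result = ProcessedName(name);
                // Hand the finished result to the serial save queue.
                [saveQueue addOperationWithBlock:^{
                    [saved addObject:result]; // stand-in for writing to disk
                }];
            }];
        }

        // Each work operation enqueues its save before it finishes, so
        // waiting in this order covers every save.
        [workQueue waitUntilAllOperationsAreFinished];
        [saveQueue waitUntilAllOperationsAreFinished];

        NSLog(@"saved %lu of %lu images",
              (unsigned long)[saved count], (unsigned long)[names count]);
    }
    return 0;
}
```

On a Mac this should build with something like clang -fobjc-arc -framework Foundation main.m. Capping the work queue's maxConcurrentOperationCount as well would be one way to address the memory concern raised above, since it limits how many images are in flight at once.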