
What's the overhead of the different forms of parallelism in Julia v0.5?

As the title states, what is the overhead of the different forms of parallelism, at least in the current implementation of Julia (v0.5, in case the implementation changes drastically in the future)? I am looking for some practical measures: general heuristics or ballpark figures to keep in my head for when each form can be useful. For example, it's pretty obvious that multiprocessing won't give you gains in a loop like:

addprocs(4)
@parallel (+) for i=1:4
  rand()
end

That is because each process only generates one random number. But is there a general heuristic for knowing when it will be worthwhile? Also, what about a heuristic for threading? It surely has a lower overhead than multiprocessing, but, for example, with 4 threads, for what N is it a good idea to multithread:

A = rand(4)
Base.@threads (+) for i = 1:N
  A[i%4+1] 
end

(I know there isn't a threaded reduction right now, but let's act like there is, or edit with a better example). Sure, I can benchmark every example, but some good rules to keep in mind would go a long way.
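Since `Base.@threads` has no reduction form, one way to make the example above concrete is to give each thread its own chunk of the range and combine the partial sums afterwards. The following is only a minimal sketch under that assumption; the name `threaded_sum` is hypothetical and not part of Julia.

```julia
# Sketch of a hand-rolled threaded reduction (not a built-in feature).
# Each loop iteration t owns one contiguous chunk of 1:N and writes
# only to partials[t], so no two iterations touch the same slot.
function threaded_sum(A, N)
    nt = Threads.nthreads()
    partials = zeros(eltype(A), nt)
    Threads.@threads for t in 1:nt
        lo = div((t - 1) * N, nt) + 1   # first index of chunk t
        hi = div(t * N, nt)             # last index of chunk t
        s = zero(eltype(A))
        for i in lo:hi
            s += A[i % 4 + 1]
        end
        partials[t] = s
    end
    return sum(partials)                # serial combine of nt values
end

A = rand(4)
total = threaded_sum(A, 10^6)
```

The number of threads is fixed at startup, for example via the `JULIA_NUM_THREADS` environment variable. Timing `threaded_sum` against a plain serial loop for increasing N is one way to find the threshold on a given machine.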

In more concrete terms: what are some good rules of thumb?

  • How many numbers do you need to be adding/multiplying before threading gives performance enhancements, or before multiprocessing gives performance enhancements?
  • How much does this depend on Julia's current implementation?
  • How much does it depend on the number of threads/processes?
  • How much does this depend on the architecture? Are there good rules for knowing when the threshold should be higher/lower on a particular system?
  • What kinds of applications violate these heuristics?

Again, I'm not looking for hard rules, just general guidelines to guide development.

Chris Rackauckas asked Aug 07 '16 19:08




1 Answer

A few caveats: (1) I'm speaking from experience with version 0.4.6 and earlier and haven't played with 0.5 yet (but, as I hope my answer below demonstrates, I don't think this is essential to the response I give); (2) this isn't a fully comprehensive answer.

Nevertheless, in my experience, the overhead of multiple processes is itself very small, provided that you aren't dealing with data movement issues. In other words, any time you find yourself wishing something were faster than a single process on your CPU can manage, you're well past the point where parallelism becomes beneficial. For instance, in the sum of random numbers example that you gave, I found through testing just now that the break-even point was somewhere around 10,000 random numbers. Anything more and parallelism was the clear winner. Generating 10,000 random numbers is trivial for modern computers, taking a tiny fraction of a second, and is well below the threshold where I'd start getting frustrated by the slowness of my scripts and want parallelism to speed them up.
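A measurement of this kind can be reproduced with a sketch like the one below. It assumes a recent Julia, where the multiprocessing functions live in the `Distributed` standard library and the reduction macro is `@distributed`; on v0.5 the same loop is written with `@parallel` and needs no `using` line. The function names `serial_sum` and `parallel_sum` and the choice of 4 workers are illustrative only, and the break-even point will differ from machine to machine.

```julia
using Distributed   # assumption: Julia 0.7 or later; not needed on v0.5

# Start 4 worker processes unless some are already running.
nprocs() == 1 && addprocs(4)

# Serial baseline: sum n random numbers on the master process.
function serial_sum(n)
    s = 0.0
    for i in 1:n
        s += rand()
    end
    return s
end

# Same sum, split across the workers and reduced with (+).
function parallel_sum(n)
    return @distributed (+) for i in 1:n
        rand()
    end
end

# Warm up once so that compilation time is not measured.
serial_sum(10)
parallel_sum(10)

for n in (10^2, 10^4, 10^6)
    ts = @elapsed serial_sum(n)
    tp = @elapsed parallel_sum(n)
    println("n = $n: serial $(ts) s, parallel $(tp) s")
end
```

The smallest n for which the parallel time drops below the serial time is the break-even point for this particular workload and machine.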

Thus, I at least am of the opinion that, although there are probably even more wonderful things the Julia developers could do to cut down on the overhead further, at this point nothing specific to Julia is going to be your main limiting factor, at least in terms of the computational aspects of parallelism. I think there are still improvements to be made in the ease and efficiency of parallel data movement (I like the package that you've started on that topic as a good step; you and I would probably both agree there's still a way to go). But the big limiting factors will be:

  1. How much data do you need to be moving around between processes?
  2. How much read/write to your memory do you need to be doing during your computations? (e.g. flops per read/write)

Aspect 1 might at times weigh against using parallelism. Aspect 2 is more likely just to mean that you won't get as much benefit from it. And, at least as I interpret "overhead," neither of these really falls directly under that consideration. Both are, I believe, going to be determined far more heavily by your system hardware than by Julia.
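To see how a fixed startup cost and data movement (aspect 1) interact, a rough back-of-the-envelope model can help. The sketch below is purely illustrative: the function names, the linear cost model, and every number passed to it are assumptions, not measurements, and it does not capture memory-bound effects (aspect 2).

```julia
# Hypothetical cost model: fixed overhead, perfectly divided compute,
# plus the time to move data between processes.
function parallel_time(t_serial, p, t_overhead, bytes_moved, bandwidth)
    return t_overhead + t_serial / p + bytes_moved / bandwidth
end

# Parallelism pays off only if the modelled time beats the serial time.
function worthwhile(t_serial, p, t_overhead, bytes_moved, bandwidth)
    return parallel_time(t_serial, p, t_overhead, bytes_moved, bandwidth) < t_serial
end

# Made-up inputs: seconds, process count, seconds, bytes, bytes per second.
long_job  = worthwhile(1.0,  4, 0.01, 0.0, 1e9)   # long compute, no data moved
tiny_job  = worthwhile(1e-4, 4, 1e-3, 0.0, 1e9)   # compute shorter than overhead
heavy_job = worthwhile(1.0,  4, 0.01, 8e9, 1e9)   # dominated by data movement
```

With these made-up inputs only the first case favors parallelism, which mirrors the point above that data movement, rather than Julia's own overhead, tends to be what tips the balance.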

Michael Ohlrogge answered Nov 15 '22 22:11