
Why are closures suddenly useful for optimizing programs to run on multiple cores?

I read an article that claims that closures (or "blocks") are a useful weapon in the "War on Multicores", because

[...] they allow you to create units of work, which each have their own copy of the stack, and don’t step on each others toes as a result. What’s more, you can pass these units around like they are values, when in actual fact they contain a whole stack of values (pun intended), and executable code to perform some operation.

Now, I am not debating the usefulness of closures in general, and possibly also for concurrent programming in a shared-memory model, but what is the difference compared with a thread that only acts on local data (or a process, an actor, ...)?

Isn't a closure on its own as useful for concurrent programming as a thread without a scheduler?

What about closures that have non-local side effects?

asked Oct 18 '09 by eljenso


3 Answers

The argument is that having closures in your programming language makes it easier to have some work done in another thread. I think the author should have mentioned the importance of higher-order functions in that argument.

My favorite introduction to higher-order functions is "Why Functional Programming Matters"; I won't try to present a bad replica here.

So using closures doesn't give you parallelism for free if you're going to execute closures in for loops, e.g.

for (int i = 0; i < numElements; i++) {
  result[i] = closure(inputs[i], i);
}

because the language can't tell if closure(a, b) somehow changes other values in the result or inputs arrays. But languages with higher-order functions like map specify that the function passed to map shouldn't look at or change other values in the inputs, and prevent it from affecting other results. So, code like the following, which is common in functional languages, can be parallelized for you, without you needing to create a pool of worker threads and hand off the closure to them:

results = map(closure, inputs, [0..numElements-1]);
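The same contract can be sketched in Python (a hedged illustration, not anyone's actual code from this thread): because map promises each call sees only its own element, an executor is free to spread the calls across workers without you managing a thread pool by hand.

```python
from concurrent.futures import ThreadPoolExecutor

def closure(x, i):
    # A pure function: it reads only its own arguments and writes no
    # shared state, so every call is independent and safe to run concurrently.
    return x * 2 + i

inputs = [10, 20, 30, 40]

# Executor.map has the same contract as a functional-language map:
# one result per input, order preserved, no cross-element interference.
with ThreadPoolExecutor() as pool:
    results = list(pool.map(closure, inputs, range(len(inputs))))

print(results)  # [20, 41, 62, 83]
```

The key point is the contract, not the threads: because nothing in `closure` touches other slots of `inputs` or `results`, the runtime may execute the calls in any order or in parallel.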

In these languages, closures take away the pain of declaring a new function somewhere for short pieces of code. That makes it more fun to use higher-order functions.

The following Haskell code defines a function f that takes a list of numbers and returns a list where each input i is replaced with 2i+1. By saving you the hassle of declaring a named function to compute 2i+1, this is one line of code instead of two.

f nums = map (\i -> 2*i+1) nums

Again, see "Why functional programming matters" for strong arguments as to how this scales up to real code bases.

answered Nov 02 '22 by Harold L


Here is a nice definition of closures:

A "closure" is an expression (typically a function) that can have free variables together with an environment that binds those variables (that "closes" the expression).

I think you are confusing definitions: in JavaScript, for example, my closures often have non-local side effects, as I am changing the DOM.
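A non-local side effect doesn't need a DOM; a minimal Python sketch of a closure mutating state outside its own call frame:

```python
def make_counter():
    count = 0  # free variable, captured by the closure below

    def increment():
        nonlocal count  # the closure writes to its enclosing scope:
        count += 1      # a non-local side effect
        return count

    return increment

tick = make_counter()
tick()
tick()
print(tick())  # 3
```

The closure is still "closed over" its environment per the definition above; it just happens to mutate that environment rather than only read it.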

Closures are very useful, which is why C# added them to the language.

Functional languages such as Erlang tend not to create OS threads, which carry a context-switching cost, but lightweight processes instead. The framework, or compiler, controls what gets created to ensure that the processor is best utilized.

Whether you write with closures is less important than if you use immutable data.

For example, if I have an application that has no global data, and every thread uses its own local copy, then it is up to the OS and the scheduler to determine which cores my application will use. Unfortunately, C/C++ compilers don't seem to know how to do that well, so by moving to FP we can use frameworks, such as Erlang's, that have been dealing with distributed processing for a long time, and leverage their experience.

Actors, in something like Erlang, will have less overhead than a C/C++ thread, as the switching seems to be faster with actors.
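A very rough sketch of the actor idea in Python (none of Erlang's per-process heaps, preemptive scheduling, or fault isolation, just the message-passing shape): each actor owns its state and touches it only from its own thread, so no locks are needed.

```python
import queue
import threading

class Actor:
    """A toy actor: a private mailbox plus a thread that processes one
    message at a time, so the actor's state is never shared."""

    def __init__(self):
        self.mailbox = queue.Queue()
        self.total = 0  # state owned exclusively by this actor's thread
        self.done = threading.Event()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            msg = self.mailbox.get()
            if msg is None:      # poison pill: stop processing
                self.done.set()
                return
            self.total += msg    # only ever touched by this thread

    def send(self, msg):
        self.mailbox.put(msg)

actor = Actor()
for n in [1, 2, 3]:
    actor.send(n)
actor.send(None)
actor.done.wait()
print(actor.total)  # 6
```

Because all mutation happens inside the actor's own loop, callers communicate only by sending messages, which is the isolation property the answer is describing.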

answered Nov 02 '22 by James Black


This quote from the article packs quite a few misunderstandings into one sentence fragment:

[...] they allow you to create units of work, which each have their own copy of the stack, and don’t step on each others toes as a result.

This is not generally true of languages with closures.

Firstly, more often they have references to the stack, not copies, for the sake of efficiency. And in the majority of languages, you can modify things through a reference. So in those languages, this feature does not provide a way of isolating units of work at all. If anything, it can make it more confusing.

Secondly, in most (sane) languages you can refer to anything that is lexically enclosing a local function. You cannot refer to just anything anywhere on the stack. For example, you cannot dig into the local variables of the function that called the function that called the function... etc. You can only access the variables/parameters of functions declared locally, whose text encloses the text in which the usage occurs. Also local variables and parameters ("on the stack") are not the only things that may be lexically enclosing a function. So "the stack" is the wrong concept to invoke here.
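Python makes the distinction concrete: a closure can read the variables of functions whose text encloses it, but not the locals of whichever function happens to be on the call stack beneath it.

```python
def outer():
    x = "enclosing"

    def inner():
        # `x` is visible because outer() lexically encloses inner()
        return x

    return inner()

def caller():
    y = "caller's local"  # on the stack while inner() runs, but NOT
    return outer()        # lexically enclosing it, so invisible to inner()

print(caller())  # enclosing
```

There is no way for `inner` to reach `y`: scoping follows the program text, not the runtime stack, which is why "the stack" is the wrong mental model.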

Java is one example of a language that can only take copies of local variables and parameters in its "anonymous inner class" closures. However, it will still take a reference in the case of the this reference of the outer class.

And in the case of closing over this, the inner class will now store an implicit reference to the outer class - nothing to do with the stack, really.

The situation is similar in C#, except local variables and parameters are captured by reference instead of being copied.

var counter = 0;
Repeat(10, () => counter++);

Suppose Repeat is a library function that starts another thread and calls the Action lambda passed to it every 10 milliseconds. You can hopefully see how this is a very succinct way to create race conditions!
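Python closures also capture local variables by reference, so the same hazard is easy to reproduce; a sketch (no Repeat helper here, just two threads hammering a counter captured by a closure):

```python
import threading

def demo():
    counter = 0  # captured by reference by the closure below

    def worker():
        nonlocal counter
        for _ in range(100_000):
            # Read-modify-write is not atomic: another thread can run
            # between reading `counter` and writing the new value back.
            tmp = counter
            counter = tmp + 1

    threads = [threading.Thread(target=worker) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

result = demo()
print(result)  # typically less than 200000: updates were lost to the race
```

Nothing about the closure syntax hints that `counter` has become shared mutable state, which is exactly the answer's point.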

The only kind of language that would avoid this problem would be a pure functional language such as Haskell, but that would not be due to closures - clearly - but instead due to the fact that you can never modify any shared state. (And Haskell still wouldn't avoid the problem entirely; most real software has to interact with shared state external to the program at some point, and Haskell's standard library has a way of doing that).

answered Nov 02 '22 by Daniel Earwicker