Chapel reductions currently ignore the initial values of variables. That means this code
var x: int;
for i in 1..3 {
  forall j in 1..10 with (+ reduce x) {
    x += 1;
  }
}
writeln(x);
prints 10 and not 30, as this user naively thought. This behavior is fine (and it is documented in the notes on reduction clauses; I just didn't think hard about it), but it turns out that if I want to get 30 (by accumulating across both loops), I need to do the sum by hand. I think it would be quite elegant and symmetric for serial for loops to also accept a reduce intent, i.e. I'd like to write
var x: int;
for i in 1..3 with (+ reduce x) {
  forall j in 1..10 with (+ reduce x) {
    x += 1;
  }
}
writeln(x);
Note that even in the case of summing numbers, I need to introduce a temporary variable. For max/min like operations, one needs to be even more careful.
Is there a reason not to support reduce intents inside for loops? Alternately, is there a more idiomatic (Chapel-rrific) way to do this?
UPDATE: The more I think about this, it's not obvious that my proposed code would work in the case that the outer for was replaced by a forall. I think the issue is that the variables are task-local and not iteration-local, so that the reduction would only occur over tasks. So one would still need a separate internal reduction step. What this would remove is the need for a temporary variable.
I think the more overarching question is what the correct way to do these sorts of nested reductions is...
It seems to me that this is an oversight in the design of Chapel's reduce intent. Specifically, while I think it is appropriate that each task ignores the original variable's value in initializing its personal copy of the reduction variable to the identity (as you note is currently done), I believe the tasks' contributions should be combined back into the original variable's value at the end of the parallel loop rather than simply overwriting that original value as they are combined with one another. This would make your original attempt work as you had expected, and would also follow what OpenMP does, as suggested by the following C example which gets 35 as its result:
#include <stdio.h>
#include <omp.h>
int main(int argc, char* argv[]) {
  int tot = 5;
  for (int i=0; i<3; i++) {
    #pragma omp parallel for reduction(+:tot)
    for (int j=0; j<10; j++) {
      tot += 1;
    }
  }
  printf("tot is: %d\n", tot);
}
I would recommend filing a bug / feature request advocating for this behavior on the Chapel GitHub issues page.
As of Chapel 1.15.0, one way to work around this would be to do the reduction manually within the serial loop, as follows:
config var tot: int = 5;
for i in 1..3 {
  var subtot: int;
  forall j in 1..10 with (+ reduce subtot) do
    subtot += 1;
  tot += subtot;
}
writeln("tot is: ", tot);