
Is it safe to share an array between threads?

Is it safe to share an array between promises, as I did in the following code?

#!/usr/bin/env perl6
use v6;

sub my_sub ( $string, $len ) {
    my ( $s, $l );
    if $string.chars > $len {
        $s = $string.substr( 0, $len );
        $l = $len;
    }
    else {
        $s = $string;
        $l = $s.chars;
    }
    return $s, $l;
}

my @orig = <length substring character subroutine control elements now promise>;
my $len = 7;
my @copy;
my @length;
my $cores = 4;
my $p = @orig.elems div $cores;
my @vb = ( 0..^$cores ).map: { [ $p * $_, $p * ( $_ + 1 ) ] };
@vb[@vb.end][1] = @orig.elems;

my @promise;
for @vb -> $r {
    @promise.push: start {
        for $r[0]..^$r[1] -> $i {
            ( @copy[$i], @length[$i] ) = my_sub( @orig[$i], $len );
        }
    };
}
await @promise;
sid_com asked May 04 '17 12:05


3 Answers

It depends how you define "array" and "share". So far as array goes, there are two cases that need to be considered separately:

  • Fixed size arrays (declared my @a[$size]); this includes multi-dimensional arrays with fixed dimensions (such as my @a[$xs, $ys]). These have the interesting property that the memory backing them never has to be resized.
  • Dynamic arrays (declared my @a), which grow on demand. These are, under the hood, actually using a number of chunks of memory over time as they grow.

So far as sharing goes, there are three cases:

  • The case where multiple threads touch the array over its lifetime, but only one can ever be touching it at a time, due to some concurrency control mechanism or the overall program structure. In this case the arrays are never shared in the sense of "concurrent operations using the arrays", so there's no possibility to have a data race.
  • The read-only, non-lazy case. This is where multiple concurrent operations access a non-lazy array, but only to read it.
  • The read/write case (including when reads actually cause a write because the array has been assigned something that demands lazy evaluation; note this can never happen for fixed size arrays, as they are never lazy).

Then we can summarize the safety as follows:

                     | Fixed size     | Variable size |
---------------------+----------------+---------------+
Read-only, non-lazy  | Safe           | Safe          |
Read/write or lazy   | Safe *         | Not safe      |

The * indicates the caveat that, while it's safe from Perl 6's point of view, you of course have to make sure you're not doing conflicting things with the same indices.

So in summary, you can safely share fixed size arrays and assign to their elements from different threads with "no problem" (but beware false sharing, which might make you pay a heavy performance penalty for doing so). Dynamic arrays are only safe if they are only read from during the period they are being shared, and even then only if they're not lazy (though given array assignment is mostly eager, you're not likely to hit that situation by accident). Writing, even to different elements, risks data loss, crashes, or other bad behavior due to the growing operation.

So, considering the original example, we see my @copy; and my @length; are dynamic arrays, so we must not write to them in concurrent operations. However, that is exactly what happens, so the code is not safe.
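
Going by the table above, one way to stay in the "Safe *" cell is to declare the two result arrays with a fixed size. The following is a minimal sketch of that idea, not code from the question: the shortened my_sub and the names $tasks, $per, $from and $to are assumptions made for illustration, and each index is written by exactly one task.

```perl6
use v6;

# Shortened stand-in for the question's my_sub (an assumption for this sketch).
sub my_sub ( $string, $len ) {
    my $s = $string.chars > $len ?? $string.substr( 0, $len ) !! $string;
    return $s, $s.chars;
}

my @orig = <length substring character subroutine control elements now promise>;
my $len  = 7;
my $n    = @orig.elems;

# Fixed size arrays: the memory backing them never has to be resized.
my @copy[$n];
my @length[$n];

# Hypothetical partitioning: one contiguous, non-overlapping range per task.
my $tasks   = 4;
my $per     = $n div $tasks;
my @promise = ( ^$tasks ).map: -> $t {
    my $from = $per * $t;
    my $to   = $t == $tasks - 1 ?? $n !! $per * ( $t + 1 );
    start {
        for $from ..^ $to -> $i {
            my ( $s, $l ) = my_sub( @orig[$i], $len );
            @copy[$i]   = $s;    # no other task writes to index $i
            @length[$i] = $l;
        }
    }
};
await @promise;

say @copy;
say @length;
```

Neighbouring elements written by different threads can still suffer from the false sharing mentioned above, so this trades safety for a possible performance cost.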

The other posts already here do a decent job of pointing in better directions, but none nailed the gory details.

Jonathan Worthington answered Nov 27 '22 22:11


Just have the code that is marked with the start statement prefix return the values so that Perl 6 can handle the synchronization for you, which is the whole point of that feature.
Then you can wait for all of the Promises, and get all of the results using an await statement.

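my @promise = do for @vb -> $r {
    start
      do  # to have the 「for」 block return its values
        for $r[0]..^$r[1] -> $i {
            # slip in my_sub's two return values, so each item is ( $i, $s, $l )
            ( $i, |my_sub( @orig[$i], $len ) )
        }
}

# one list of ( index, substring, length ) triples per Promise
my @results = await @promise;

for @results -> @chunk {
    for @chunk -> ( $i, $s, $l ) {
        @copy[$i]   = $s;
        @length[$i] = $l;
    }
}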

The start statement prefix is only tangentially related to parallelism.
When you use it you are saying, “I don't need these results right now, but probably will later”.

That is the reason it returns a Promise (asynchrony), and not a Thread (concurrency).

The runtime is allowed to delay actually running that code until you finally ask for the results, and even then it could just do all of them sequentially in the same thread.

If the implementation actually did that, it could result in something like a deadlock if you instead poll the Promise by continually calling its .status method, waiting for it to change from Planned to Kept or Broken, and only then ask for its result.
This is part of the reason the default scheduler will start to work on any Promise code if it has any spare threads.
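
As a small illustration of the point that start hands back a Promise rather than a thread, here is a minimal sketch (the variable names $p and $answer are made up for the example). It asks for the result with await instead of polling .status:

```perl6
use v6;

my $p = start { 6 * 7 };    # returns immediately with a Promise
say $p.^name;               # Promise

my $answer = await $p;      # wait here until the result is available
say $p.status;              # Kept
say $answer;                # 42
```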


I recommend watching jnthn's talk “Parallelism, Concurrency, and Asynchrony in Perl 6” (slides).

Brad Gilbert answered Nov 27 '22 21:11


This answer reflects my understanding of the situation on MoarVM; I'm not sure what the state of the art is on the JVM backend (or the JavaScript backend, for that matter).

  • Reading a scalar from several threads can be done safely.
  • Modifying a scalar from several threads can be done without having to fear for a segfault, but you may miss updates:

$ perl6 -e 'my $i = 0; await do for ^10 { start { $i++ for ^10000 } }; say $i'
46785

The same applies to more complex data structures like arrays (e.g. missing values being pushed) and hashes (missing keys being added).

So, if you don't mind missing updates, changing shared data structures from several threads should work. If you do mind missing updates, which I think is what you generally want, you should look at setting up your algorithm in a different way, as suggested by @Zoffix Znet and @raiph.
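
If restructuring is not an option, one common alternative (not something this answer itself proposes) is to guard the shared variable with a Lock. A minimal sketch of the counter example above, with $lock as an assumed name:

```perl6
use v6;

my $i    = 0;
my $lock = Lock.new;

await do for ^10 {
    start {
        for ^10000 {
            # only one thread at a time may run the protected block
            $lock.protect: { $i++ };
        }
    }
};

say $i;    # 100000
```

Every increment now waits its turn, so no updates are lost, but the threads spend most of their time contending for the lock. That is why returning values from the start blocks is usually the better design.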

Elizabeth Mattijsen answered Nov 27 '22 23:11