Ruby's Parallel gem seems very powerful, but I'm having trouble using it to build a collection.
Take the following example with processes set to 0:
[174] pry(main)> @array = []
=> []
[175] pry(main)> Parallel.each(1..10, :in_processes=>0) {|x| @array.push(Random.rand(10))}
=> 1..10
[176] pry(main)> @array
=> [7, 3, 5, 6, 1, 5, 4, 4, 5, 1]
But when we set the processes to 2:
[177] pry(main)> @array = []
=> []
[178] pry(main)> Parallel.each(1..10, :in_processes=>2) {|x| @array.push(Random.rand(10))}
=> 1..10
[179] pry(main)> @array
=> []
Obviously this isn't even close to the best way to build an array of random values; what I'm trying to get at is that values appended to @array aren't there after the loop finishes when there are multiple processes. Is this a scope issue, or am I misunderstanding how forks work?
Parallel's default mode works under the hood by forking your process and doing work in sub-processes (this is, IMO, a gigantic hack). Child processes aren't going to have write access to the parent's memory; changes made in a child will not persist to the parent.
You will only be able to communicate with the parent process through the gem's facilities, which capture return values from the child. Parallel.map provides such a mechanism: each input item is marshaled on the parent's side and unmarshaled in the child, the block runs there, and its return value is marshaled, passed back to the parent, and collected into a results array. Anything else is "thrown away" when the forked child dies. For your example, that means collecting the block's return values instead of pushing to @array: @array = Parallel.map(1..10, :in_processes=>2) {|x| Random.rand(10)}
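To make the round trip concrete, here is a rough stdlib-only sketch of the same idea using fork, IO.pipe and Marshal. It is not how the gem is actually implemented: run_in_child is a hypothetical helper, it runs one child at a time rather than a pool of workers, and it assumes a platform where fork is available (not Windows).

```ruby
# Hypothetical helper (not part of the Parallel gem): run the given block
# in a forked child and send its return value back through a pipe.
def run_in_child(item)
  reader, writer = IO.pipe
  reader.binmode
  writer.binmode

  pid = fork do
    reader.close
    result = yield(item)                 # runs against the child's copy of memory
    writer.write(Marshal.dump(result))   # only the return value is sent back
    writer.close
    exit!(0)                             # leave without running inherited at_exit hooks
  end

  writer.close
  data = reader.read                     # blocks until the child closes its end
  reader.close
  Process.wait(pid)
  Marshal.load(data)
end

shared = []
results = (1..5).map do |x|
  run_in_child(x) do |n|
    shared.push(n)   # changes the child's copy only
    n * 2            # the return value is what reaches the parent
  end
end

puts shared.inspect    # prints []
puts results.inspect   # prints [2, 4, 6, 8, 10]
```

The push inside the block is lost when each child exits, while the return values survive because they were explicitly serialized and sent to the parent.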
Consider using threads instead (and properly synchronize access to your shared variables). If you need multi-core concurrency (i.e., you're doing parallel work that isn't blocked on IO), you should consider JRuby, which doesn't have a GIL and can natively execute multiple Ruby threads in parallel.
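A minimal sketch of the threaded approach, using only Thread and Mutex from the standard library. The variable names are illustrative. The gem also has an :in_threads option, but the shared array would still need the same kind of locking.

```ruby
array = []
lock  = Mutex.new

threads = (1..10).map do
  Thread.new do
    value = Random.rand(10)
    # Threads share the parent's memory, so guard the shared array with a lock.
    lock.synchronize { array.push(value) }
  end
end

threads.each(&:join)   # wait for every thread to finish

puts array.length      # prints 10
```

Because threads live in the same process, the pushes are visible once the threads have been joined. On MRI the work still runs on one core at a time.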