
Ruby Parallel each loop

I have the following code:

FTP ... do |ftp|
  files.each do |file|
    ...
    ftp.put(file)
    sleep 1
  end
end

I'd like to process each file in a separate thread or in some other parallel way. What's the correct way to do this? Would the following be right?

Here's my attempt with the parallel gem:

FTP ... do |ftp|
  Parallel.map(files) do |file|
    ...
    ftp.put(file)
    sleep 1
  end
end

The issue with Parallel is that puts output from different workers can be written at the same time, so the lines run together, like so:

as = [1,2,3,4,5,6,7,8]
results = Parallel.map(as) do |a|
  puts a
end

How can I force each puts to print on its own line, as it normally would?

Sten Kin asked Apr 01 '14 21:04


1 Answer

The whole point of parallelization is that things run at the same time. But if there's some part of the code that you'd like to run sequentially, you could use a mutex like:

require 'parallel'

semaphore = Mutex.new
as = [1,2,3,4,5,6,7,8]
results = Parallel.map(as, in_threads: 3) do |a|
  # Parallel stuff
  sleep rand
  semaphore.synchronize {
    # Sequential stuff
    puts a
  }
  # Parallel stuff
  sleep rand
end

You'll see that each value is printed on its own line, though not necessarily in the original order. I used in_threads instead of in_processes (the default) because a Mutex doesn't work across processes. See the references below for an alternative if you do need processes.
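
If the output also has to appear in the original order, one option is to return a value from each block and print only after all the work has finished. The sketch below uses plain standard-library threads rather than the parallel gem so that it runs without extra dependencies; process_in_threads is a hypothetical helper name and the sleep stands in for the real work.

```ruby
# Sketch using stdlib threads (not the parallel gem).
# process_in_threads is a hypothetical helper name.
def process_in_threads(items)
  threads = items.map do |item|
    Thread.new do
      sleep rand * 0.01   # stand-in for the real parallel work
      "done #{item}"      # the block's last value becomes Thread#value
    end
  end
  # Thread#value waits for each thread to finish; mapping over the
  # thread array keeps the results in input order.
  threads.map(&:value)
end

lines = process_in_threads([1, 2, 3, 4, 5, 6, 7, 8])
puts lines  # printed from a single thread, one line per item
```

Because all printing happens in one place after the threads have finished, no mutex is needed for the output itself.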

References:

  • http://ruby-doc.org/core-2.2.0/Mutex.html
  • http://dev.housetrip.com/2014/01/28/efficient-cross-processing-locking-in-ruby/
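
For the process case, one standard-library option is an advisory file lock taken with File#flock. The following is a minimal sketch under assumptions, not necessarily the approach described in the linked article: with_file_lock and run_workers are hypothetical helpers, the file paths are arbitrary, and fork is assumed to be available (it is not on Windows).

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: run the block while holding an exclusive
# advisory lock on the file at path.
def with_file_lock(path)
  File.open(path, File::RDWR | File::CREAT, 0o644) do |f|
    f.flock(File::LOCK_EX)   # blocks until no other process holds the lock
    begin
      yield
    ensure
      f.flock(File::LOCK_UN)
    end
  end
end

# Hypothetical helper: fork one child per item; each child appends its
# item to out_path while holding the lock, so lines never interleave.
def run_workers(items, lock_path, out_path)
  pids = items.map do |a|
    fork do
      sleep rand * 0.01      # stand-in for the parallel work
      with_file_lock(lock_path) do
        File.open(out_path, "a") { |out| out.puts a }
      end
      exit!(0)               # leave the child without running at_exit handlers
    end
  end
  pids.each { |pid| Process.wait(pid) }
  File.readlines(out_path, chomp: true)
end

dir = Dir.mktmpdir
begin
  puts run_workers([1, 2, 3, 4], File.join(dir, "lock"), File.join(dir, "out.txt"))
ensure
  FileUtils.remove_entry(dir)
end
```

As with the mutex version, each item ends up on its own line, but the order depends on which process gets the lock first.
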
César Izurieta answered Oct 07 '22 17:10