Following up on the question "Ruby thread limit - Also for any language",
I am trying to understand why my threads are not working. Some answers were pretty clear, like:
"..creating 4 subprocesses with fork will utilize your 4 cores", which will be my final approach, since threads don't seem to work in my case.
And also this:
"..Ruby MRI threading will not by itself fully utilise a multi-core CPU with running Ruby code. But whether that's a problem for you depends on what the threads are doing. If they are making long-running I/O calls to other processes on the same machine, you will see the benefit without needing separate processes. Threading and multi-processing as subjects can get quite complex doing even simple things. Most languages will make some compromises on what is easy and what is difficult out of the box..."
Taking into consideration the second one, I have removed any processing from my code and just left I/O in it.
Here it is:
    # img_processor and frames_dir are defined elsewhere in my program.

    # Sequential version
    beginning_time = Time.now
    img_processor.load_image(frames_dir + "/frame_0001.png")
    img_processor.load_image(frames_dir + "/frame_0002.png")
    end_time = Time.now
    puts "Time elapsed #{(end_time - beginning_time) * 1000} milliseconds"

    # Threaded version
    greyscale_frames_threads = []
    beginning_time = Time.now
    for frame_index in 1..2
      greyscale_frames_threads << Thread.new(frame_index) { |frame_number|
        puts "Loading Image #{frame_number}"
        img_processor.load_image(frames_dir + "/frame_%04d.png" % frame_number)
      }
    end
    puts "Joining Threads"
    greyscale_frames_threads.each { |thread| thread.join } # this blocks the main thread
    end_time = Time.now
    puts "Time elapsed #{(end_time - beginning_time) * 1000} milliseconds"
And what I am getting is this...
For the first non-threaded case:
Time elapsed 15561.358 milliseconds
For the second threaded case:
Time elapsed 15442.401 milliseconds
OK, where is the performance increase? Am I missing something? Is the HDD blocking? Do I really need to spawn processes to see real parallelism in Ruby?
"Do I really need to spawn processes to see real parallelism in Ruby?"
Yes, I think so:
    require 'digest'
    require 'benchmark'

    # CPU-bound work: hash a string of 100 million bytes
    def do_stuff
      Digest::SHA256.new.digest "a" * 100_000_000
    end

    N = 10

    Benchmark.bm(10) do |x|
      # one call after another in a single process
      x.report("sequential") do
        N.times do
          do_stuff
        end
      end

      # one child process per call
      x.report("subprocess") do
        N.times do
          fork { do_stuff }
        end
        Process.waitall
      end

      # one thread per call, all in the same process
      x.report("thread") do
        threads = []
        N.times do
          threads << Thread.new { do_stuff }
        end
        threads.each(&:join)
      end
    end
Results for MRI 2.0.0:
                     user     system      total        real
    sequential   3.200000   0.180000   3.380000 (  3.383322)
    subprocess   0.000000   0.000000   6.600000 (  1.068517)
    thread       3.290000   0.210000   3.500000 (  3.496207)
The first block (sequential) runs do_stuff N = 10 times, one after another; the second block (subprocess) spreads those calls over 4 cores, whereas the third block (thread) runs on 1 core.
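Note that the subprocess block discards whatever do_stuff returns, because each forked child runs in its own address space. If the parent needs the results, one common approach is to send them back over a pipe. The following is only a minimal sketch under that assumption: map_in_processes is a hypothetical helper name (not part of the benchmark above), the inputs are made up, and it relies on fork, so it will not run on platforms without it (such as Windows).

```ruby
require 'digest'

# Hypothetical helper: run the given block in one child process per input
# and return the results to the parent, in input order, via pipes.
def map_in_processes(inputs)
  children = inputs.map do |input|
    reader, writer = IO.pipe
    [reader, writer].each(&:binmode)      # Marshal data is binary
    pid = fork do
      reader.close                        # the child only writes
      writer.write(Marshal.dump(yield(input)))
      writer.close
    end
    writer.close                          # the parent only reads
    [pid, reader]
  end

  children.map do |pid, reader|
    result = Marshal.load(reader.read)    # read until the child closes its end
    reader.close
    Process.wait(pid)
    result
  end
end

# Illustrative input: hash four different strings, one child per string.
digests = map_in_processes(%w[a b c d]) do |char|
  Digest::SHA256.hexdigest(char * 1_000_000)
end
puts digests
```

The results must be something Marshal can serialize, which is one of the compromises of using processes instead of threads.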
If you change do_stuff to:

    def do_stuff
      sleep(1)
    end

The result is different:
                     user     system      total        real
    sequential   0.000000   0.000000   0.000000 ( 10.021893)
    subprocess   0.000000   0.010000   0.080000 (  1.013693)
    thread       0.000000   0.000000   0.000000 (  1.003463)
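To find out which of the two cases a particular call falls into (for example the load_image call from the question), a rough check is to time it both ways. This is a minimal sketch: compare_threads is a hypothetical helper name, and sleep only stands in for the real work.

```ruby
require 'benchmark'

# Hypothetical helper: run the same block n times in a row and then in
# n threads, and return both wall-clock times in seconds.
def compare_threads(n, &work)
  sequential = Benchmark.realtime { n.times { work.call } }
  threaded = Benchmark.realtime do
    Array.new(n) { Thread.new { work.call } }.each(&:join)
  end
  { sequential: sequential, threaded: threaded }
end

# sleep is a placeholder for the call you actually want to test.
timings = compare_threads(4) { sleep(0.2) }
printf("sequential: %.2fs, threaded: %.2fs\n",
       timings[:sequential], timings[:threaded])
```

If the threaded time comes out close to the sequential one, as in the question, the call is probably not giving other threads a chance to run while it works, and separate processes are the more promising route.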