When I first discovered threads, I tried checking that they actually worked as expected by calling sleep in many threads, versus calling sleep normally. It worked, and I was very happy.
But then a friend of mine told me that these threads weren't really parallel, and that sleep must be faking it.
So now I wrote this test to do some real processing:
class Test
  ITERATIONS = 1000

  def run_threads
    start = Time.now
    t1 = Thread.new do
      do_iterations
    end
    t2 = Thread.new do
      do_iterations
    end
    t3 = Thread.new do
      do_iterations
    end
    t4 = Thread.new do
      do_iterations
    end
    t1.join
    t2.join
    t3.join
    t4.join
    puts Time.now - start
  end

  def run_normal
    start = Time.now
    do_iterations
    do_iterations
    do_iterations
    do_iterations
    puts Time.now - start
  end

  def do_iterations
    1.upto ITERATIONS do |i|
      999.downto(1).inject(:*) # 999!
    end
  end
end
And now I'm very sad, because run_threads() not only didn't perform better than run_normal, it was even slower!
Then why should I complicate my application with threads, if they aren't really parallel?
** UPDATE **
@fl00r said that I could take advantage of threads if I used them for IO tasks, so I wrote two more variations of do_iterations:
def do_iterations
  # filesystem IO
  1.upto ITERATIONS do |i|
    5.times do
      # create file
      content = "some content #{i}"
      file_name = "#{Rails.root}/tmp/do-iterations-#{UUIDTools::UUID.timestamp_create.hexdigest}"
      file = ::File.new file_name, 'w'
      file.write content
      file.close
      # read and delete file
      file = ::File.new file_name, 'r'
      content = file.read
      file.close
      ::File.delete file_name
    end
  end
end

def do_iterations
  # MongoDB IO (through MongoID)
  1.upto ITERATIONS do |i|
    TestModel.create! :name => "some-name-#{i}"
  end
  TestModel.delete_all
end
The performance results are still the same: the normal version is faster than the threaded one.
But now I'm not sure if my VM is able to use all the cores. Will be back when I have tested that.
Why No Parallelism in Ruby? Today, there is no way of achieving parallelism within a single Ruby process using the default Ruby implementation (generally called MRI or CRuby). The Ruby VM enforces a lock (the GVL, or Global VM Lock) that prevents multiple threads from running Ruby code at the same time.
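As a rough illustration of why the original sleep test looked parallel: a thread blocked in sleep gives up the lock, so the other threads can proceed while it waits. The sketch below is only a minimal timing comparison; the helper names timed_sleep_threads and timed_sleep_sequential are made up for this example, and the exact timings will vary by machine.

```ruby
require "benchmark"

# Hypothetical helper: start `count` threads that each block in sleep,
# wait for all of them, and return the elapsed wall-clock time.
def timed_sleep_threads(count, seconds)
  Benchmark.realtime do
    threads = Array.new(count) { Thread.new { sleep seconds } }
    threads.each(&:join)
  end
end

# Hypothetical helper: the same amount of sleeping, one call after another.
def timed_sleep_sequential(count, seconds)
  Benchmark.realtime do
    count.times { sleep seconds }
  end
end

threaded   = timed_sleep_threads(4, 0.2)
sequential = timed_sleep_sequential(4, 0.2)
puts format("threads: %.2fs, sequential: %.2fs", threaded, sequential)
```

The sleeping threads overlap because they are waiting, not running Ruby code, so the lock is free for the others.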
Multi-threading lets two or more parts of a program run concurrently to make better use of the CPU. Each such part is called a thread; in other words, threads are lightweight units of execution within a process.
The Ruby interpreter is effectively single-threaded, which is to say that several of its methods are not thread-safe. In the Rails world, this single-threadedness has mostly been pushed to the server.
A single thread executes on one CPU core, so a single-threaded Ruby program runs on only one core; on a quad-core CPU, the other three cores are not used to execute it. Threading lets a Ruby program achieve concurrency at the cost of more memory and CPU, although under MRI's lock this does not by itself make CPU-bound code faster.
Threads can be faster only if you have some slow IO.
Ruby (MRI) has a Global Interpreter Lock, so only one thread can run Ruby code at a time. Ruby therefore spends extra time deciding which thread should run at any given moment (thread scheduling). So in your case, where there is no IO, the threaded version will be slower!
You can use Rubinius or JRuby to use real Threads.
Example with IO:
module Test
  extend self

  def run_threads(method)
    start = Time.now
    threads = []
    4.times do
      threads << Thread.new { send(method) }
    end
    threads.each(&:join)
    puts Time.now - start
  end

  def run_forks(method)
    start = Time.now
    4.times do
      fork do
        send(method)
      end
    end
    Process.waitall
    puts Time.now - start
  end

  def run_normal(method)
    start = Time.now
    4.times { send(method) }
    puts Time.now - start
  end

  def do_io
    system "sleep 1"
  end

  def do_non_io
    1000.times do |i|
      999.downto(1).inject(:*) # 999!
    end
  end
end

Test.run_threads(:do_io)
#=> ~ 1 sec
Test.run_forks(:do_io)
#=> ~ 1 sec
Test.run_normal(:do_io)
#=> ~ 4 sec
Test.run_threads(:do_non_io)
#=> ~ 7.6 sec
Test.run_forks(:do_non_io)
#=> ~ 3.5 sec
Test.run_normal(:do_non_io)
#=> ~ 7.2 sec
IO jobs are about 4 times faster with threads and with processes, while non-IO jobs are about twice as fast with processes as with threads or the synchronous version.
Ruby also has Fibers (lightweight "coroutines") and the awesome em-synchrony gem for handling asynchronous processing.
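To give an idea of what a Fiber looks like, here is a minimal sketch using only the core Fiber class (no gems). A fiber runs only when it is resumed and hands control back with Fiber.yield, so the switching is cooperative rather than scheduled by the VM. The variable names are just for illustration.

```ruby
# A fiber that pauses itself after producing each value.
producer = Fiber.new do
  3.times do |i|
    Fiber.yield "step #{i}" # hand a value back to whoever called resume
  end
  :done # the block's return value is what the final resume returns
end

results = []
4.times { results << producer.resume }
p results
# => ["step 0", "step 1", "step 2", :done]
```

Nothing here runs in parallel: exactly one of the caller and the fiber is executing at any time.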
fl00r is right: the global interpreter lock prevents multiple threads from running at the same time in Ruby, except while they are waiting on IO.
The parallel library is a very simple library that is useful for truly parallel operations. Install it with gem install parallel. Here is your example rewritten to use it:
require 'parallel'

class Test
  ITERATIONS = 1000

  def run_parallel
    start = Time.now
    results = Parallel.map([1, 2, 3, 4]) do |_val|
      do_iterations
    end
    # do what you want with the results ...
    puts Time.now - start
  end

  def run_normal
    start = Time.now
    do_iterations
    do_iterations
    do_iterations
    do_iterations
    puts Time.now - start
  end

  def do_iterations
    1.upto ITERATIONS do |i|
      999.downto(1).inject(:*) # 999!
    end
  end
end
On my computer (4 cpus), Test.new.run_normal takes 4.6 seconds, while Test.new.run_parallel takes 1.65 seconds.
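If you would rather not add a gem, the same idea can be sketched by hand with fork and pipes. This is not how the parallel gem is implemented; it is only a rough, Unix-only illustration, and fork_map and factorial are hypothetical names invented for this example. Each child computes one result and sends it back to the parent through a pipe using Marshal.

```ruby
# Hypothetical helper: run the block for each item in its own forked
# child process and return the results in the original order.
def fork_map(items)
  children = items.map do |item|
    reader, writer = IO.pipe
    reader.binmode
    writer.binmode
    pid = fork do
      reader.close
      writer.write Marshal.dump(yield(item)) # serialize the child's result
      writer.close
      exit!(0) # leave the child without running at_exit handlers
    end
    writer.close # the parent only reads from this pipe
    [pid, reader]
  end

  children.map do |pid, reader|
    result = Marshal.load(reader.read) # read until the child closes its end
    reader.close
    Process.wait(pid)
    result
  end
end

def factorial(n)
  n.downto(1).inject(:*)
end

results = fork_map([5, 10, 20]) { |n| factorial(n) }
p results
```

Because every child is a separate process with its own interpreter and its own lock, CPU-bound work can be spread across cores. The price is that results have to be serialized to cross the process boundary.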
The behavior of threads is defined by the implementation. JRuby, for example, implements threads with JVM threads, which in turn use native OS threads.
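If a script needs to know which implementation it is running on, the RUBY_ENGINE constant can be checked. The sketch below is only a rough guide: threading_note is a made-up helper, and its descriptions just restate what is said in this thread.

```ruby
# Hypothetical helper: describe the threading behavior discussed here
# for the given engine name (defaults to the running interpreter).
def threading_note(engine = RUBY_ENGINE)
  case engine
  when "ruby"
    "MRI/CRuby: a global lock lets only one thread run Ruby code at a time"
  when "jruby"
    "JRuby: Ruby threads are JVM threads and can run in parallel"
  else
    "#{engine}: check that implementation's documentation"
  end
end

puts threading_note
```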
The Global Interpreter Lock is only there for historic reasons. If Ruby 1.9 had simply introduced real threads out of nowhere, backwards compatibility would have been broken, and it would have slowed down its adoption even more.
This answer by Jörg W Mittag provides an excellent comparison between the threading models of various Ruby implementations. Choose one which is appropriate for your needs.
With that said, threads can be used to wait for a child process to finish:
pid = Process.spawn 'program'
thread = Process.detach pid
# Later...
status = thread.value.exitstatus
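Here is a self-contained version of the same pattern that you can actually run. As an assumption for the sake of the example, the child is just another Ruby interpreter (located through RbConfig.ruby) that exits with a chosen status; spawn_child is a hypothetical helper name.

```ruby
require "rbconfig"

# Hypothetical helper: start a child Ruby process that exits immediately
# with the given status code, and return its pid.
def spawn_child(exit_code)
  Process.spawn(RbConfig.ruby, "-e", "exit #{exit_code}")
end

pid = spawn_child(3)
watcher = Process.detach(pid) # a thread that waits for the child to exit

# ... the main thread is free to do other work here ...

status = watcher.value # blocks until the child exits; a Process::Status
puts status.exitstatus
```

The waiting thread spends its time blocked rather than running Ruby code, which is the kind of job threads do well even with the global lock.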