
Multiple threads calling the same function

Suppose we have multiple threads all calling the same function:

def foo 
  # do stuff ...
end

100.times do |i|
  Thread.new do
    foo
  end
end

If two or more threads are currently inside of foo, do they each share the same local variables within foo?

This relates to my second question. Do threads have individual stack frames, or do they share stack frames within a single process? Specifically, when multiple threads each invoke foo and before foo returns, are there multiple copies of foo on the stack, each with their own local variables, or is there only one copy of foo on the stack?

Dustin Biser asked May 06 '12 00:05



1 Answer

Threads share every variable that is visible in the scope they were started from, along with any instance or global state, but local variables defined inside foo are private to each call, as the examples below show. Sharing is a key element of threads and is fine in a read-only context, but if threads write to a shared variable, you need a Mutex to synchronize them, so that only one thread can change the variable at any given time. Sometimes a thread invokes a method that changes data indirectly, so you need to know the system well before you decide whether you need to synchronize.

As for your second question, if I understand what you're asking: each thread has its own stack, and therefore its own stack frames, but all threads still share the same objects in memory.

To clarify: in the following example, the local variable zip is shared by multiple threads, because it was defined in the enclosing scope and the thread's block closes over it (threads don't change scope; they just start a separate, parallel thread of execution in the current scope).

zip = 42

t = Thread.new do
  zip += 1
end

t.join

puts zip # => 43

The join here saves me, but there is obviously no point in using a thread at all if I join it immediately. It would be dangerous to do the following:

zip = 42

t = Thread.new do
  zip += 1
end

zip += 1

puts zip # => either 43 or 44, who knows?

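That is because two threads are both trying to modify zip at the same time. zip += 1 is not atomic: it reads zip, adds one, and writes the result back, so one thread's update can overwrite the other's, and the main thread may also reach puts before the new thread has run at all. This kind of problem becomes noticeable when you are accessing network resources, incrementing numbers as above, and so on.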
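To make the lost update easier to see, here is a minimal sketch that splits the increment into an explicit read and write. The names counter and current and the sleep call are illustrative assumptions, not part of the question; the sleep only widens the window in which the threads can interleave.

```ruby
counter = 0

threads = 10.times.map do
  Thread.new do
    current = counter      # read the shared variable
    sleep 0.01             # artificial pause so other threads read the same value
    counter = current + 1  # write back, possibly overwriting another thread's update
  end
end

threads.each(&:join)

puts counter # usually far less than 10, because updates were lost
```

The exact result depends on scheduling, which is the point: without synchronization, the final value cannot be relied upon.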

In the following example, however, the local variable zip is created inside an entirely new scope each time foo is called, so the two threads aren't actually writing to the same variable at the same time:

def foo
  zip = 42
  zip += 1 # => 43, in both threads
end

Thread.new do
  foo
end

foo

There are two parallel stacks being managed, each with their own local variables inside the foo method.
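As a small check of this, the sketch below runs a method in several threads at once and collects each call's result. The method name double_it and its argument are hypothetical; Thread#value waits for the thread to finish and returns the value of its block.

```ruby
def double_it(n)
  local = n * 2  # a fresh local variable for every call, on the calling thread's stack
  sleep 0.01     # give the other threads time to enter the method too
  local          # still holds this call's own value
end

threads = 5.times.map do |i|
  Thread.new { double_it(i) }
end

results = threads.map(&:value)

p results # => [0, 2, 4, 6, 8]
```

Even though all five threads are inside double_it at the same time, none of them sees another thread's local.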

The following code, however, is dangerous:

@zip = 42 # somewhere else

def foo
  @zip += 1
end

Thread.new do
  foo
end

foo

puts @zip # => either 43 or 44, who knows?

That's because the instance variable @zip is accessible outside of the scope of the foo function, so both threads may be accessing it at the same time.

These problems of 'two threads changing the same data at the same time' are resolved by using carefully placed Mutexes (locks) around the sections of the code that change the variable. The Mutex must be created before the threads are created, because in the case of a Mutex, it is (by design) vital that both threads access the same Mutex, in order to know if it's locked or not.

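# somewhere else...
@mutex = Mutex.new
@zip   = 42

def foo
  # only one thread at a time can be inside this block
  @mutex.synchronize do
    @zip += 1
  end
end

t = Thread.new do
  foo
end

foo

t.join # wait for the thread to finish before reading the result

puts @zip # => 44, for sure!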

When the flow of execution reaches the Mutex#synchronize line, the thread tries to lock the mutex. If successful, it enters the block and continues executing. Once the block finishes, the mutex is unlocked again. If the mutex is already locked, the thread waits until it becomes free again. Effectively, it's like a door that only one person can walk through at a time.
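The same idea scales to many threads. The following is a rough sketch, with the thread and iteration counts chosen arbitrarily for illustration: every increment happens inside synchronize, and the main thread joins all workers before reading the total.

```ruby
mutex   = Mutex.new
counter = 0

threads = 10.times.map do
  Thread.new do
    100.times do
      # the read-modify-write happens while holding the lock
      mutex.synchronize { counter += 1 }
    end
  end
end

threads.each(&:join) # wait for every thread before reading the result

puts counter # => 1000
```

Keeping the synchronized block as small as possible matters here, since any thread waiting on the mutex is doing no useful work.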

I hope this clears things up.

d11wtq answered Oct 16 '22 03:10