Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is it Safe to use 'Unsafe' Thread Functions?

Please pardon my slightly humorous title. I use two different definitions of the word 'safe' in it (obviously).

I am rather new to threading (well, I have used threading for many years, but only very simple forms of it). Now I am faced with the challange of writing parallal implementations of some algorithms, and the threads need to work on the same data. Consider the following newbie mistake:

const
  N = 2;

var
  value: integer = 0;    

function ThreadFunc(Parameter: Pointer): integer;
var
  i: Integer;
begin
  for i := 1 to 10000000 do
    inc(value);
  result := 0;
end;

procedure TForm1.FormCreate(Sender: TObject);
var
  threads: array[0..N - 1] of THandle;
  i: Integer;
  dummy: cardinal;
begin

  for i := 0 to N - 1 do
    threads[i] := BeginThread(nil, 0, @ThreadFunc, nil, 0, dummy);

  if WaitForMultipleObjects(N, @threads[0], true, INFINITE) = WAIT_FAILED then
    RaiseLastOSError;

  ShowMessage(IntToStr(value));

end;

A beginner might expect the code above to display the message 20000000. Indeed, first value is equal to 0, and then we inc it 20000000 times. However, since the inc procedure is not 'atomic', the two threads will conflict (I guess that inc does three things: it reads, it increments, and it saves), and so a lot of the incs will be effectively 'lost'. A typical value I get from the code above is 10030423.

The simplest workaround is to use InterlockedIncrement instead of Inc (which will be much slower in this silly example, but that's not the point). Another workaround is to place the inc inside a critical section (yes, that will also be very slow in this silly example).

Now, in most real algorithms, conflicts are not this common. In fact, they might be very uncommon. One of my algorithms creates DLA fractals, and one of the variables that I inc every now and then is the number of adsorbed particles. Conflicts here are very rare, and, more importantly, I really don't care if the variable sums up to 20000000, 20000008, 20000319, or 19999496. Thus, it is tempting not to use InterlockedIncrement or critical sections, since they just bloat the code and makes it (marginally) slower to no (as far as I can see) benefit.

However, my question is: can there be more severe consequences of conflicts than a slightly 'incorrect' value of the incrementing variable? Can the program crash, for instance?

Admittedly, this question might seem silly, for, after all, the cost of using InterlockedIncrement instead of inc is rather low (in many cases, but not all!), and so it is (perhaps) stupid not to play safe. But I also feel that it would be good to know how this really works on a theoretical level, so I still think that this question is very interesting.

like image 702
Andreas Rejbrand Avatar asked Jan 07 '12 13:01

Andreas Rejbrand


People also ask

Is a function thread-safe?

A threadsafe function protects shared resources from concurrent access by locks. Thread safety concerns only the implementation of a function and does not affect its external interface. The use of global data is thread-unsafe.

What's the difference between a thread-safe and a thread unsafe function?

Thread SafetyA procedure is thread safe when it is logically correct when executed simultaneously by several threads. At a practical level, it is convenient to recognize three levels of safety. An unsafe procedure can be made thread safe and serializable by surrounding it with statements to lock and unlock a mutex.

When should I worry about thread-safety?

Thread safety becomes a concern if there is at least a single entry point which can be accessed by multiple threads. If a piece of code is accessed by multiple threads and is calling other method/class/etc., then all this code tree becomes vulnerable.

What happens if code is not thread-safe?

Conditionally safe: Different threads can access different objects simultaneously, and access to shared data is protected from race conditions. Not thread safe: Data structures should not be accessed simultaneously by different threads.


1 Answers

Your program won't ever crash due to a race on the increment of an integer that is only used as a count. All that can go wrong is that you don't get the correct answer. Obviously if you were using the integer as an index into an array, or perhaps it was a pointer then you could have problems.

Unless you are incrementing this value incredibly frequently, it's hard to imagine that an interlocked increment would be expensive enough for you to notice the performance difference.

What's more the most efficient approach is to get each thread to maintain its own private count. Then sum all the individual thread counts when you join the threads at the end of the calculation. That way you get the best of both worlds. No contention on the incrementing, and the correct answer. Of course, you need to take measures to ensure that you don't get caught out by false sharing.

like image 79
David Heffernan Avatar answered Sep 21 '22 21:09

David Heffernan