Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Compiler detection of race conditions between Tasks

Tags:

ada

My day job is working with safety-critical embedded systems. I also do some teaching/consultancy with customers on the topic of writing safe embedded code. The question of programming languages always comes up and we compare C, D, Ada, Erlang, Rust, etc.

There is one exercise that I often use for demonstration purposes. It's a simple, two-threaded program, with each thread picking up a global variable (initialised to 0), adding 1 to it and replacing it ten times. We then speculate about the maximum value that the variable can have at the end (20) and its minimum value (we normally decide on 10 before we use a formal proof to shew that it can be 2).

One thing I demonstrate is that the C version of the program compiles (dangerous), but the Rust version doesn't (good!). Today I have written the Ada version and have had two surprises on which I would ask for comments. First, my program:

with Ada.Text_IO; use Ada.Text_IO;

procedure Main is
   task AddTenA;
   task AddTenB;

   --  Global variable

   x : Natural := 0;

   finished : array (0 .. 1) of Natural := (0, 0);

   --  Make sure that the compiler doesn't remove
   --  all the addition.

   pragma Volatile (x);

   task body AddTenA is
      y : Integer;
   begin
      for I in 1 .. 10 loop
         y := x + 1;
         x := y;
      end loop;
      finished (0) := 1;
   end AddTenA;
   
   task body AddTenB is
      y : Integer;
   begin
      for I in 1 .. 10 loop
         y := x + 1;
         x := y;
      end loop;
      finished (1) := 1;
   end AddTenB;
   
begin

   while finished (0) + finished (1) < 2 loop
      delay 0.001;
   end loop;

   Put_Line (Integer'Image (x));

end Main;

And yes, I am familiar with Protected Objects and with Task rendezvous, but that wasn't the point of the program.

My two surprises:

  1. Even with a complete alphabet of compiler flags (-fstack-check, -gnata, -gnato13, -gnatf, -gnatwa, -g, -gnatVa,-gnaty3abcdefhiklmnoOprstux, -gnatwe, -gnat2012, -Wall,-O2) I do not get a compiler warning. Rust tells me that the global variable has no unique owner and therefore it's not going to compile the Rust version of the program for me. I understand that SPARK doesn't handle Tasks, and so Ada will generate no warning that I have a potentially dangerous race condition in the code. This surprises me in a language such as Ada. Have I missed a clever compiler or runtime option?

  2. When I execute the equivalent C program, the most common output is 20, but, when I run it many times, I get a scattering of values, typically from about 8 to 20. I have run the Ada program (above) 500,000 times and have only ever got values of 10 and 20 (with no values in between, and with 99.9% of the outputs being 20). This would indicate that there is some fundamental difference between C's pthreading and Ada's Tasking. What is that? Are Ada Tasks not mapped to pthreads? Is there an implicit round-robin scheduling in the Ada version?

Going around the addition loop 10 times presumably doesn't take long, so I have tried increasing the loop count to 100 to see whether the Task could be interrupted more often. Then I get just 200 and 100.

like image 592
Chris Hobbs Avatar asked Nov 26 '20 19:11

Chris Hobbs


People also ask

How do you determine race conditions?

Programmers use dynamic and static analysis tools to identify race conditions. Static testing tools scan a program without running it. However, they produce many false reports. Dynamic analysis tools have fewer false reports, but they may not catch race conditions that aren't executed directly within the program.

Which tool will help us detecting possible race conditions in a program?

RacerX This flow-sensitive static analysis tool is used for detecting races and deadlocks.

What is race condition in multithreading and how can we solve it?

When race conditions occur. A race condition occurs when two threads access a shared variable at the same time. The first thread reads the variable, and the second thread reads the same value from the variable.

What is race detection?

Race detection is the procedure of identifying race conditions in parallel programs and program executions. Unintended race conditions are a common source of error in parallel programs and hence race detection methods play an important role in debugging such errors.


2 Answers

Using GNAT Community 2020 I get the following diagnostics using SPARK:

package threads with
   SPARK_Mode
is
   X : Natural := 0;
   pragma Volatile (X);

   task type AddTen with
   Global => (in_out => X);
end threads;

pragma Ada_2012;
package body threads with
SPARK_Mode
is

   ------------
   -- AddTen --
   ------------

   task body AddTen is
      Y : Integer;
   begin
      for I in 1 .. 10 loop
         Y := X + 1;
         X := Y;
      end loop;
   end AddTen;

end threads;

with Ada.Text_IO; use Ada.Text_IO;
with threads; use threads;

procedure Main with SPARK_Mode is
begin
   declare
      A : AddTen;
      B : AddTen;
   begin
      null;
   end;
   Put_Line(X'Image);
end Main;

I get the following messages from SPARK when examining all sources:

gnatprove -PD:\Ada\Stack_Overflow\Race\race.gpr -j0 --mode=flow --ide-progress-bar -U Phase 1 of 2: generation of Global contracts ... threads.adb:14:15: volatile object cannot appear in this context (SPARK RM 7.1.3(12)) main.adb:12:13: volatile object cannot appear in this context (SPARK RM 7.1.3(11)) gnatprove: error during generation of Global contracts

It appears to me that SPARK does indeed identify this use of a volatile object as being improper.

When I simplify the program, change volatile to atomic and eliminate the use of SPARK as below:

with Ada.Text_IO; use Ada.Text_IO;

procedure Main is
   X : Natural := 0;
   pragma Atomic (X);

   task type AddTen;
   task body AddTen is
      Y : Integer;
   begin
      for I in 1 .. 100 loop
         Y := X + 1;
         X := Y;
      end loop;
   end AddTen;
begin
   declare
      A, B : AddTen;
   begin
      null;
   end;
  
   Put_Line(X'Image);
end Main;

I consistently get a result of 200.

Note that running the tasks within an inner block causes outer block of the main procedure to wait until the inner block completes, and the inner block only completes when both tasks complete.

When I force longer execution by changing the upper loop range to 10000 I get a mixture of numbers such as 15509, 16318, 15283, 14555.

like image 192
Jim Rogers Avatar answered Nov 17 '22 11:11

Jim Rogers


Absent round-robin scheduling, which isn’t the default (it’s usually FIFO within priorities), why would the tasks swap? There are no scheduling points there. I should add, I’m more used to scheduling with Ravenscar RTSs on single-processor MCUs, so I’ve probably got the wrong idea here!

That would result in 200 every time, clearly not the case (by the way, very similar results with a loop count of 10_000_000! but I always see a result of 10000000). I suppose there may be some scheduling impact of the delay in the main program.

I tried using just

pragma Task_Dispatching_Policy 
  (Round_Robin_Within_Priorities);

with no effect.

Then

Ada.Dispatching.Round_Robin.Set_Quantum 
  (System.Default_Priority,
   Ada.Real_Time.Microseconds (2));

which gives the error

Round_Robin is not supported in this configuration

(the configuration being macOS/GCC 10.1.0).

Then

pragma Atomic (x);

which does indeed make the difference you were expecting.

Explanations welcome!

like image 28
Simon Wright Avatar answered Nov 17 '22 11:11

Simon Wright