Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Illegal instruction when running a minimal OpenMP program

This minimal OpenMP program

#include <omp.h>
int main() 
{
  #pragma omp parallel sections
  {
    #pragma omp section
    {
      while(1) {}
    }

    #pragma omp section
    {  
      while(1) {}
    }
  }
}

will produce this error when compiled and run with gcc test.c -fopenmp:

Illegal instruction (core dumped)

When I change either one of the loops with

  int i=1;
  while(i++) {}

or any other condition it compiles and runs without error. It seems, that 1 as a loop condition in different threads cause some strange behaviour. Why?

edit: I am using gcc 4.6.3

edit: This is a bug in gcc and was submitted as Bug 54017 to the gcc developers.

like image 849
steffen Avatar asked Jul 18 '12 13:07

steffen


2 Answers

This is apparently a bug in GCC. GCC implements OpenMP sections using the GOMP_sections_start() routine from libgomp that returns a 1-based section ID that the calling thread should execute or 0 if all work items have been distributed. Basically the transformed code should look like:

main._omp_fn.0 (void * .omp_data_i)
{
   unsigned int .section.1;

   .section.1 = GOMP_sections_start(2);
L0:
   switch (.section.1)
   {
      case 0:
         // No more sections to run, exit
         goto L2;
      case 1:
         // Do section 1
         while (1) {}
         goto L1;
      case 2:
         // Do section 2
         while (1) {}
         goto L1;
      default:
         // Impossible section value, possible error in libgomp
         __builtin_trap();
   }
L1:
   .section.1 = GOMP_sections_next();
   goto L0;
L2:
   GOMP_sections_end_nowait();
   return;
}

What happens is that in your case the both the default and the 0 case lead to __builtin_trap(). __builtin_trap() is a GCC built-in that is supposed to terminate your program abnormally and on x86 it emits the ud2 instruction that makes the CPU to bark with an illegal opcode exception. It is usually put in places where code should never execute, e.g. all possible correct return values from GOMP_sections_start() and GOMP_sections_next() should be covered by the cases in the switch and if the default is reached (signalling a possible bug in libgomp) it should fail and you will complain to the developers :)

Edit: This is definitely not expected OpenMP behaviour and it does not happen with icc or suncc. I have submitted Bug 54017 to the GCC Bugzilla.

Edit 2: I updated the text to more closely reflect what GCC should produce. It looks like GCC is getting wrong impression of the control flow in the parallel region and does some "optimisations" that further spoil code generation.

like image 175
Hristo Iliev Avatar answered Nov 02 '22 06:11

Hristo Iliev


SIGILL generated, because there is an illegal instruction, ud2/ud2a. According to http://asm.inightmare.org/opcodelst/index.php?op=UD2:

This instruction caused #UD. Intel guaranteed that in future Intel's CPUs this instruction will caused #UD. Of course all previous CPUs (186+) caused #UD on this opcode. This instruction used by software writers for testing #UD exception servise routine.

Let's look inside:

$ gcc-4.6.2 -fopenmp omp.c -o omp
$ gdb ./omp
...

(gdb) r
Program received signal SIGILL, Illegal instruction.
...
0x08048544 in main._omp_fn.0 ()
(gdb) x/i $pc
0x8048544 <main._omp_fn.0+28>:  ud2a

(gdb) disassemble
Dump of assembler code for function main._omp_fn.0:
0x08048528 <main._omp_fn.0+0>:  push   %ebp
0x08048529 <main._omp_fn.0+1>:  mov    %esp,%ebp
0x0804852b <main._omp_fn.0+3>:  sub    $0x18,%esp
0x0804852e <main._omp_fn.0+6>:  movl   $0x2,(%esp)
0x08048535 <main._omp_fn.0+13>: call   0x80483f0 <GOMP_sections_start@plt>
0x0804853a <main._omp_fn.0+18>: cmp    $0x1,%eax
0x0804853d <main._omp_fn.0+21>: je     0x8048548 <main._omp_fn.0+32>
0x0804853f <main._omp_fn.0+23>: cmp    $0x2,%eax
0x08048542 <main._omp_fn.0+26>: je     0x8048546 <main._omp_fn.0+30>
0x08048544 <main._omp_fn.0+28>: ud2a
0x08048546 <main._omp_fn.0+30>: jmp    0x8048546 <main._omp_fn.0+30>
0x08048548 <main._omp_fn.0+32>: jmp    0x8048548 <main._omp_fn.0+32>
End of assembler dump.

There is ud2a in assembler file already:

$ gcc-4.6.2 -fopenmp omp.c -o omp.S -S; cat omp.S

main._omp_fn.0:
.LFB1:
        pushl   %ebp
.LCFI4:
        movl    %esp, %ebp
.LCFI5:
        subl    $24, %esp
.LCFI6:
        movl    $2, (%esp)
        call    GOMP_sections_start
        cmpl    $1, %eax
        je      .L4
        cmpl    $2, %eax
        je      .L5
                .value  0x0b0f

.value 0xb0f is code of ud2a

After verifying that ud2a was inserted by intention of gcc (at early openmp phases), I tried to understand the code. The function main._omp_fn.0 is the body of parallel code; it will call _GOMP_sections_start and parse its return code. If code equal to 1 then we will jump to one infinite loop; if it is 2, jump to second infinite loop. But in other case ud2a will be executed. (Don't know why, but according to Hristo Iliev this is a GCC Bug 54017.)

I think, this test is good to check how much CPU cores there are. By default GCC's openmp library (libgomp) will start a thread for every CPU core in your system (in my case there were 4 threads). And sections will be selected in order: first section for first thread, second section - 2nd thread and so on.

There is no SIGILL, if I run the program on 1 or 2 CPUs (option of taskset is the cpu mask in hex):

 $ taskset 3 ./omp
 ... running on cpu0 and cpu1 ...
 $ taskset 1 ./omp
 ... running first loop on cpu0; then run second loop on cpu0...
like image 4
osgx Avatar answered Nov 02 '22 06:11

osgx