This minimal OpenMP program
#include <omp.h>
int main()
{
#pragma omp parallel sections
{
#pragma omp section
{
while(1) {}
}
#pragma omp section
{
while(1) {}
}
}
}
will produce this error when compiled and run with gcc test.c -fopenmp
:
Illegal instruction (core dumped)
When I change either one of the loops with
int i=1;
while(i++) {}
or any other condition it compiles and runs without error. It seems, that 1
as a loop condition in different threads cause some strange behaviour. Why?
edit: I am using gcc 4.6.3
edit: This is a bug in gcc and was submitted as Bug 54017 to the gcc developers.
This is apparently a bug in GCC. GCC implements OpenMP sections using the GOMP_sections_start()
routine from libgomp
that returns a 1
-based section ID that the calling thread should execute or 0
if all work items have been distributed. Basically the transformed code should look like:
main._omp_fn.0 (void * .omp_data_i)
{
unsigned int .section.1;
.section.1 = GOMP_sections_start(2);
L0:
switch (.section.1)
{
case 0:
// No more sections to run, exit
goto L2;
case 1:
// Do section 1
while (1) {}
goto L1;
case 2:
// Do section 2
while (1) {}
goto L1;
default:
// Impossible section value, possible error in libgomp
__builtin_trap();
}
L1:
.section.1 = GOMP_sections_next();
goto L0;
L2:
GOMP_sections_end_nowait();
return;
}
What happens is that in your case the both the default
and the 0
case lead to __builtin_trap()
. __builtin_trap()
is a GCC built-in that is supposed to terminate your program abnormally and on x86 it emits the ud2
instruction that makes the CPU to bark with an illegal opcode exception. It is usually put in places where code should never execute, e.g. all possible correct return values from GOMP_sections_start()
and GOMP_sections_next()
should be covered by the cases in the switch and if the default is reached (signalling a possible bug in libgomp
) it should fail and you will complain to the developers :)
Edit: This is definitely not expected OpenMP behaviour and it does not happen with icc
or suncc
. I have submitted Bug 54017 to the GCC Bugzilla.
Edit 2: I updated the text to more closely reflect what GCC should produce. It looks like GCC is getting wrong impression of the control flow in the parallel region and does some "optimisations" that further spoil code generation.
SIGILL generated, because there is an illegal instruction, ud2/ud2a. According to http://asm.inightmare.org/opcodelst/index.php?op=UD2:
This instruction caused #UD. Intel guaranteed that in future Intel's CPUs this instruction will caused #UD. Of course all previous CPUs (186+) caused #UD on this opcode. This instruction used by software writers for testing #UD exception servise routine.
Let's look inside:
$ gcc-4.6.2 -fopenmp omp.c -o omp
$ gdb ./omp
...
(gdb) r
Program received signal SIGILL, Illegal instruction.
...
0x08048544 in main._omp_fn.0 ()
(gdb) x/i $pc
0x8048544 <main._omp_fn.0+28>: ud2a
(gdb) disassemble
Dump of assembler code for function main._omp_fn.0:
0x08048528 <main._omp_fn.0+0>: push %ebp
0x08048529 <main._omp_fn.0+1>: mov %esp,%ebp
0x0804852b <main._omp_fn.0+3>: sub $0x18,%esp
0x0804852e <main._omp_fn.0+6>: movl $0x2,(%esp)
0x08048535 <main._omp_fn.0+13>: call 0x80483f0 <GOMP_sections_start@plt>
0x0804853a <main._omp_fn.0+18>: cmp $0x1,%eax
0x0804853d <main._omp_fn.0+21>: je 0x8048548 <main._omp_fn.0+32>
0x0804853f <main._omp_fn.0+23>: cmp $0x2,%eax
0x08048542 <main._omp_fn.0+26>: je 0x8048546 <main._omp_fn.0+30>
0x08048544 <main._omp_fn.0+28>: ud2a
0x08048546 <main._omp_fn.0+30>: jmp 0x8048546 <main._omp_fn.0+30>
0x08048548 <main._omp_fn.0+32>: jmp 0x8048548 <main._omp_fn.0+32>
End of assembler dump.
There is ud2a in assembler file already:
$ gcc-4.6.2 -fopenmp omp.c -o omp.S -S; cat omp.S
main._omp_fn.0:
.LFB1:
pushl %ebp
.LCFI4:
movl %esp, %ebp
.LCFI5:
subl $24, %esp
.LCFI6:
movl $2, (%esp)
call GOMP_sections_start
cmpl $1, %eax
je .L4
cmpl $2, %eax
je .L5
.value 0x0b0f
.value 0xb0f
is code of ud2a
After verifying that ud2a was inserted by intention of gcc (at early openmp phases), I tried to understand the code. The function main._omp_fn.0
is the body of parallel code; it will call _GOMP_sections_start
and parse its return code. If code equal to 1 then we will jump to one infinite loop; if it is 2, jump to second infinite loop. But in other case ud2a will be executed. (Don't know why, but according to Hristo Iliev this is a GCC Bug 54017.)
I think, this test is good to check how much CPU cores there are. By default GCC's openmp library (libgomp) will start a thread for every CPU core in your system (in my case there were 4 threads). And sections will be selected in order: first section for first thread, second section - 2nd thread and so on.
There is no SIGILL, if I run the program on 1 or 2 CPUs (option of taskset is the cpu mask in hex):
$ taskset 3 ./omp
... running on cpu0 and cpu1 ...
$ taskset 1 ./omp
... running first loop on cpu0; then run second loop on cpu0...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With