
Intel x86 0x2E/0x3E Prefix Branch Prediction actually used?

The latest Intel software developer's manual describes two opcode prefixes:

Group 2 > Branch Hints
    0x2E: Branch Not Taken
    0x3E: Branch Taken

These allow for explicit branch prediction of conditional jump instructions (opcodes like Jxx).

I remember reading a couple of years ago that on x86 explicit branch prediction was essentially a no-op in the context of gcc's branch prediction intrinsics.

I am now unclear if these x86 branch hints are a new feature or whether they are essentially no-ops in practice.

Can anyone clear this up?

(That is: do gcc's branch prediction functions generate these x86 branch hints? Do current Intel CPUs honor them rather than ignore them? And if so, when did this change?)

Update:

I created a quick test program:

    int main(int argc, char** argv) {
        if (__builtin_expect(argc, 0))
            return 1;

        if (__builtin_expect(argc == 2, 1))
            return 2;

        return 3;
    }

Disassembles to the following:

    00000000004004cc <main>:
      4004cc:   55                      push   %rbp
      4004cd:   48 89 e5                mov    %rsp,%rbp
      4004d0:   89 7d fc                mov    %edi,-0x4(%rbp)
      4004d3:   48 89 75 f0             mov    %rsi,-0x10(%rbp)
      4004d7:   8b 45 fc                mov    -0x4(%rbp),%eax
      4004da:   48 98                   cltq
      4004dc:   48 85 c0                test   %rax,%rax
      4004df:   74 07                   je     4004e8 <main+0x1c>
      4004e1:   b8 01 00 00 00          mov    $0x1,%eax
      4004e6:   eb 1b                   jmp    400503 <main+0x37>
      4004e8:   83 7d fc 02             cmpl   $0x2,-0x4(%rbp)
      4004ec:   0f 94 c0                sete   %al
      4004ef:   0f b6 c0                movzbl %al,%eax
      4004f2:   48 85 c0                test   %rax,%rax
      4004f5:   74 07                   je     4004fe <main+0x32>
      4004f7:   b8 02 00 00 00          mov    $0x2,%eax
      4004fc:   eb 05                   jmp    400503 <main+0x37>
      4004fe:   b8 03 00 00 00          mov    $0x3,%eax
      400503:   5d                      pop    %rbp
      400504:   c3                      retq
      400505:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
      40050c:   00 00 00
      40050f:   90                      nop

I don't see 2E or 3E on either conditional jump. Maybe gcc has elided them for some reason?
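One way to make the check on the disassembly mechanical is sketched below. This is only an illustration: `is_hinted_jcc` is a hypothetical helper, not part of any tool, and it assumes it is handed the first byte of an instruction. It reports whether a 0x2E or 0x3E prefix sits directly in front of a conditional jump opcode (short form 0x70..0x7F, or near form 0x0F 0x80..0x8F). It does not handle other prefixes placed between the hint and the opcode.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the instruction starting at code[0]
 * is a conditional jump carrying a 0x2E/0x3E branch-hint prefix, else 0.
 * Assumes code[0] is the first byte of an instruction. */
int is_hinted_jcc(const uint8_t *code, size_t len)
{
    /* The hint must be the leading byte and an opcode must follow it. */
    if (len < 2 || (code[0] != 0x2E && code[0] != 0x3E))
        return 0;

    /* Short conditional jump: opcode 0x70..0x7F followed by rel8. */
    if (code[1] >= 0x70 && code[1] <= 0x7F)
        return 1;

    /* Near conditional jump: 0x0F 0x80..0x8F followed by rel32. */
    if (len >= 3 && code[1] == 0x0F && code[2] >= 0x80 && code[2] <= 0x8F)
        return 1;

    return 0;
}
```

Applied to the dump above, both `74 07` encodings are plain `je` instructions with no prefix. The only 2E byte in the listing is in the `nopw` padding at 400505, where it precedes `0f 1f` (a multi-byte NOP) and acts as a CS segment override, not as a branch hint.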

Andrew Tomazos asked Jan 15 '13 07:01





2 Answers

These instruction prefixes have no effect on modern processors (anything newer than Pentium 4). They just cost one byte of code space each, so not generating them is the right thing to do.

For details, see Agner Fog's optimization manuals, in particular 3. Microarchitecture: http://www.agner.org/optimize/

The "Intel® 64 and IA-32 Architectures Optimization Reference Manual" no longer mentions them in the section about optimizing branches (section 3.4.1): http://www.intel.de/content/dam/doc/manual/64-ia-32-architectures-optimization-manual.pdf

These prefixes are a (harmless) relic of the NetBurst architecture. In all-out optimization you can use them as padding to align code, but that's all they're good for nowadays.

Chris answered Sep 20 '22 15:09



gcc is right not to generate the prefixes, as they have had no effect on any processor since the Pentium 4.

But __builtin_expect has other effects, such as moving an unexpected code path away from the cache-hot locations in the code, or influencing inlining decisions, so it is still useful.
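As a rough illustration of how __builtin_expect is typically used for this, here is a minimal sketch for gcc or clang. The `likely`/`unlikely` macros are a common convention rather than something gcc provides, and `sum_checked` is a made-up function for the example. Whether the compiler actually moves the cold path depends on the optimization level and compiler version.

```c
#include <stddef.h>

/* Conventional wrappers (not built into gcc). The !!(x) normalizes any
 * nonzero value to 1, because __builtin_expect compares against the
 * exact expected value. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical example: sums the values, treating a negative entry as a
 * rare error. On error, sets *ok to 0 and returns -1. */
long sum_checked(const int *values, size_t n, int *ok)
{
    long total = 0;
    *ok = 1;
    for (size_t i = 0; i < n; i++) {
        /* Hint: the error branch is expected to be cold, so the compiler
         * may place it out of the hot loop body. */
        if (unlikely(values[i] < 0)) {
            *ok = 0;
            return -1;
        }
        total += values[i];
    }
    return total;
}
```

The hint changes only how the compiler lays out and optimizes the code. The function returns the same results with or without it.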

Gunther Piez answered Sep 23 '22 15:09
