I was playing with the Compiler Explorer and I stumbled upon an interesting behavior with the ternary operator when using something like this:
std::string get_string(bool b) { return b ? "Hello" : "Stack-overflow"; }
The compiler generated code for this (clang trunk, with -O3) is this:
get_string[abi:cxx11](bool): # @get_string[abi:cxx11](bool) push r15 push r14 push rbx mov rbx, rdi mov ecx, offset .L.str mov eax, offset .L.str.1 test esi, esi cmovne rax, rcx add rdi, 16 #< Why is the compiler storing the length of the string mov qword ptr [rbx], rdi xor sil, 1 movzx ecx, sil lea r15, [rcx + 8*rcx] lea r14, [rcx + 8*rcx] add r14, 5 #< I also think this is the length of "Hello" (but not sure) mov rsi, rax mov rdx, r14 call memcpy #< Why is there a call to memcpy mov qword ptr [rbx + 8], r14 mov byte ptr [rbx + r15 + 21], 0 mov rax, rbx pop rbx pop r14 pop r15 ret .L.str: .asciz "Hello" .L.str.1: .asciz "Stack-Overflow"
However, the compiler generated code for the following snippet is considerably smaller and with no calls to memcpy
, and does not care about knowing the length of both strings at the same time. There are 2 different labels that it jumps to
std::string better_string(bool b) { if (b) { return "Hello"; } else { return "Stack-Overflow"; } }
The compiler generated code for the above snippet (clang trunk with -O3) is this:
better_string[abi:cxx11](bool): # @better_string[abi:cxx11](bool) mov rax, rdi lea rcx, [rdi + 16] mov qword ptr [rdi], rcx test sil, sil je .LBB0_2 mov dword ptr [rcx], 1819043144 mov word ptr [rcx + 4], 111 mov ecx, 5 mov qword ptr [rax + 8], rcx ret .LBB0_2: movabs rdx, 8606216600190023247 mov qword ptr [rcx + 6], rdx movabs rdx, 8525082558887720019 mov qword ptr [rcx], rdx mov byte ptr [rax + 30], 0 mov ecx, 14 mov qword ptr [rax + 8], rcx ret
The same result is when I use the ternary operator with:
std::string get_string(bool b) { return b ? std::string("Hello") : std::string("Stack-Overflow"); }
I would like to know why the ternary operator in the first example generates that compiler code. I believe that the culprit lies within the const char[]
.
P.S: GCC does calls to strlen
in the first example but Clang doesn't.
Link to the Compiler Explorer example: https://godbolt.org/z/Exqs6G
Thank you for your time!
sorry for the wall of code
Ternary operator values The values part of the ternary operator in the above example is this: “This is an even number!” : “This is an odd number!”; In the example above, if the condition evaluates to true then the ternary operator will return the string value “This is an even number!”.
The only "advantage" is that you can use the ternary operator in an expression (eg. function arguments), making for terser code. using an if , you'd duplicate the full expression.
A ternary operator lets you assign one value to the variable if the condition is true, and another value if the condition is false. The if else block example from above could now be written as shown in the example below. var num = 4, msg = ""; msg = (num === 4) ?
It is not faster. There is one difference when you can initialize a constant variable depending on some expression: const int x = (a<b) ?
The overarching difference here is that the first version is branchless.
16 isn’t the length of any string here (the longer one, with NUL, is only 15 bytes long); it’s an offset into the return object (whose address is passed in RDI to support RVO), used to indicate that the small-string optimization is in use (note the lack of allocation). The lengths are 5 or 5+1+8 stored in R14, which is stored in the std::string
as well as passed to memcpy
(along with a pointer chosen by CMOVNE) to load the actual string bytes.
The other version has an obvious branch (although part of the std::string
construction has been hoisted above it) and actually does have 5 and 14 explicitly, but is obfuscated by the fact that the string bytes have been included as immediate values (expressed as integers) of various sizes.
As for why these three equivalent functions produce two different versions of the generated code, all I can offer is that optimizers are iterative and heuristic algorithms; they don’t reliably find the same “best” assembly independently of their starting point.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With