With std::string_view, ranges::for_each yields exactly the same assembly whether a const char[N] or a const char* is passed to the std::string_view constructor.
In other words, this code
auto str = "the quick brown fox is jumping on a lazy dog\nthe quick brown fox is jumping on a lazy dog\n";
ranges::for_each(std::string_view{str}, std::putchar);
and
auto& str = "the quick brown fox is jumping on a lazy dog\nthe quick brown fox is jumping on a lazy dog\n";
ranges::for_each(std::string_view{str}, std::putchar);
both yield the assembly below:
main: # @main
pushq %rbx
movq $-90, %rbx
.LBB0_1: # =>This Inner Loop Header: Depth=1
movsbl .L.str+90(%rbx), %edi
movq stdout(%rip), %rsi
callq _IO_putc
addq $1, %rbx
jne .LBB0_1
xorl %eax, %eax
popq %rbx
retq
.L.str:
.asciz "the quick brown fox is jumping on a lazy dog\nthe quick brown fox is jumping on a lazy dog\n"
Moreover, if we pass a C string as const char[N] to ranges::view::c_str(),
auto& str = "the quick brown fox is jumping on a lazy dog\nthe quick brown fox is jumping on a lazy dog\n";
ranges::for_each(ranges::view::c_str(str), std::putchar);
this yields exactly the same assembly as the std::string_view versions above.
On the other hand, if we pass a C string as const char* to ranges::view::c_str(),
auto str = "the quick brown fox is jumping on a lazy dog\nthe quick brown fox is jumping on a lazy dog\n";
ranges::for_each(ranges::view::c_str(str), std::putchar);
this time it yields different assembly, shown below:
main: # @main
pushq %rbx
movb $116, %al
movq $-90, %rbx
.LBB0_1: # =>This Inner Loop Header: Depth=1
movsbl %al, %edi
movq stdout(%rip), %rsi
callq _IO_putc
movzbl .L.str+91(%rbx), %eax
incq %rbx
jne .LBB0_1
xorl %eax, %eax
popq %rbx
retq
.L.str:
.asciz "the quick brown fox is jumping on a lazy dog\nthe quick brown fox is jumping on a lazy dog\n"
Which assembly wins?
Why does std::string_view yield the same binary in both cases?
Could view::c_str() yield a single, faster assembly for both const char* and const char[N]?
godbolt.org/g/wcQyY1
Both std::string_view versions call the same constructor, which takes a const char* (the const char[N] array decays to a pointer at the call) and then uses std::char_traits::length (which is basically strlen) to find the length. The compiler optimizes away the strlen because the string literal is visible to it, so the length is known. Both forms use the exact same constructor and both optimize away the strlen, so both generate the same code.
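As a small illustration of the point above, the following sketch builds a std::string_view from both a pointer and an array reference and compares the result with what std::char_traits<char>::length reports. The function names (size_via_pointer, size_via_array, size_via_traits) and the shortened string are made up for this example, and it assumes a C++17 compiler.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// `auto` deduces const char* from a string literal.
std::size_t size_via_pointer() {
    auto str = "the quick brown fox\n";
    return std::string_view{str}.size();
}

// `auto&` deduces a reference to const char[21]; the array decays to
// const char* when it is passed to the string_view constructor.
std::size_t size_via_array() {
    auto& str = "the quick brown fox\n";
    return std::string_view{str}.size();
}

// The length computation the const char* constructor relies on.
std::size_t size_via_traits() {
    return std::char_traits<char>::length("the quick brown fox\n");
}
```

All three functions go through the same strlen-like computation, which is why the compiler has only one thing to optimize in the std::string_view case.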
The view::c_str version uses different overloads depending on whether it's given a pointer or an array, see https://github.com/ericniebler/range-v3/blob/1f4a96e9240786801e95a6c70afebf27f04cffeb/include/range/v3/view/c_str.hpp#L68
When given a pointer it has to find the length similarly to using strlen, but when given an array of size N it uses N-1 for the length. Even when the compiler optimizes away the strlen-like code to a fixed compile-time value, it's still compiling something different, so it's not entirely surprising that the generated code is not identical.
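To make the pointer-versus-array distinction concrete, here is a minimal sketch of the idea. It is not range-v3's actual code: c_string_length is a hypothetical helper, and it uses a single C++17 function template with if constexpr instead of two overloads, purely to keep the example short.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical helper, not part of range-v3.
// Array argument: the length comes from the type (N - 1, dropping the '\0').
// Pointer argument: the length has to be found with a strlen-like scan.
template <typename T>
constexpr std::size_t c_string_length(const T& s) {
    if constexpr (std::is_array_v<T>) {
        return std::extent_v<T> - 1;               // known from the type
    } else {
        return std::char_traits<char>::length(s);  // scans for '\0'
    }
}
```

For an ordinary literal both paths give the same number, for example c_string_length("abc") and c_string_length(p) with const char* p = "abc" both return 3. They only disagree when the array contains an embedded '\0': the array path still reports N-1, while the pointer path stops at the first terminator. Either way the compiler is given two different pieces of code to optimize, which fits the differing assembly in the question.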