
Why does passing `const char[N]` and `const char*` to view::c_str() yield different binaries, while string_view produces the same?

With std::string_view, ranges::for_each yields identical assembly whether a const char[N] or a const char* is passed to the std::string_view constructor.

In other words, this code

auto str = "the quick brown fox is jumping on a lazy dog\nthe quick brown fox is jumping on a lazy dog\n";
ranges::for_each(std::string_view{str}, std::putchar);

and

auto& str = "the quick brown fox is jumping on a lazy dog\nthe quick brown fox is jumping on a lazy dog\n";
ranges::for_each(std::string_view{str}, std::putchar);

both yield the assembly below:

main:                                   # @main
        pushq   %rbx
        movq    $-90, %rbx
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        movsbl  .L.str+90(%rbx), %edi
        movq    stdout(%rip), %rsi
        callq   _IO_putc
        addq    $1, %rbx
        jne     .LBB0_1
        xorl    %eax, %eax
        popq    %rbx
        retq
.L.str:
        .asciz  "the quick brown fox is jumping on a lazy dog\nthe quick brown fox is jumping on a lazy dog\n"

Moreover, if we pass a C string as const char[N] to ranges::view::c_str(),

auto& str = "the quick brown fox is jumping on a lazy dog\nthe quick brown fox is jumping on a lazy dog\n";
ranges::for_each(ranges::view::c_str(str), std::putchar);

this yields exactly the same assembly as the std::string_view version above.


On the other hand, if we pass a C string as const char* to ranges::view::c_str(),

auto str = "the quick brown fox is jumping on a lazy dog\nthe quick brown fox is jumping on a lazy dog\n";
ranges::for_each(ranges::view::c_str(str), std::putchar);

this time it yields different assembly, shown below:

main:                                   # @main
        pushq   %rbx
        movb    $116, %al
        movq    $-90, %rbx
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        movsbl  %al, %edi
        movq    stdout(%rip), %rsi
        callq   _IO_putc
        movzbl  .L.str+91(%rbx), %eax
        incq    %rbx
        jne     .LBB0_1
        xorl    %eax, %eax
        popq    %rbx
        retq
.L.str:
        .asciz  "the quick brown fox is jumping on a lazy dog\nthe quick brown fox is jumping on a lazy dog\n"

Which assembly is faster?

Why does std::string_view yield the same binary in both cases?

Could view::c_str() produce the same, faster assembly for both const char* and const char[N]?

godbolt.org/g/wcQyY1

asked Mar 06 '18 15:03 by sandthorn



1 Answer

Both std::string_view versions call the same constructor, which takes a const char* and then uses std::char_traits<char>::length (which is basically strlen) to find the length. The array form decays to a pointer before the call, so there is no separate array code path. The compiler optimizes away the strlen because the string literal is visible to it, so its length is known. Since both forms use the exact same constructor and both optimize away the strlen, both generate the same code.
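A minimal sketch of that point, assuming C++17. The helper names length_via_pointer and length_via_array are made up for this illustration, and the literal is shortened. Both helpers go through the same const char* constructor, and the length is computed at compile time:

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helpers, named for this illustration only.

// auto deduces const char*, so the const char* constructor is called directly.
constexpr std::size_t length_via_pointer() {
    auto str = "the quick brown fox\n";
    return std::string_view{str}.size();
}

// auto& deduces const char (&)[21]; the array decays to const char*
// and the very same constructor is selected.
constexpr std::size_t length_via_array() {
    auto& str = "the quick brown fox\n";
    return std::string_view{str}.size();
}

// The strlen-like search is evaluated at compile time in both cases.
static_assert(length_via_pointer() == length_via_array());
```

Because both initializations reach the same code path, the optimizer has nothing to treat differently between them.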

The view::c_str version uses different overloads depending on whether it's given a pointer or an array, see https://github.com/ericniebler/range-v3/blob/1f4a96e9240786801e95a6c70afebf27f04cffeb/include/range/v3/view/c_str.hpp#L68

When given a pointer it has to find the length similarly to using strlen, but when given an array of size N it uses N-1 for the length. Even when the compiler optimizes away the strlen-like code to a fixed compile-time value, it's still compiling something different, so it's not entirely surprising that the generated code is not identical.
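To illustrate the pointer/array split, here is a rough sketch, assuming C++17. The function c_str_length is a hypothetical stand-in and is not range-v3's actual code; it only shows how one overload can search for the terminator while the other takes the length from the array type:

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-ins, not range-v3's implementation.

// Pointer overload: the length is unknown, so walk to the terminating '\0'.
// Taking const T& and constraining T keeps arrays out of this overload,
// because T is then deduced as an array type rather than a pointer.
template <class T, std::enable_if_t<std::is_same_v<T, const char*>, int> = 0>
constexpr std::size_t c_str_length(const T& p) {
    std::size_t n = 0;
    while (p[n] != '\0') {
        ++n;
    }
    return n;
}

// Array overload: the size N is part of the type, so the length is N - 1
// (everything except the terminating '\0') and no search is needed.
template <std::size_t N>
constexpr std::size_t c_str_length(const char (&)[N]) {
    return N - 1;
}

// A string literal is an array, so the array overload is used here.
static_assert(c_str_length("fox") == 3);

// A pointer variable selects the searching overload.
constexpr const char* as_pointer = "fox";
static_assert(c_str_length(as_pointer) == 3);
```

Both overloads agree on ordinary literals, but they get there through different code, which is the kind of difference that can survive optimization and show up in the generated assembly.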

answered Nov 13 '22 21:11 by Jonathan Wakely
