Comparing against string literal not resolved at compile time

I recently found something akin to the following lines:

#include <string>

// test if the extension is either .bar or .foo
bool test_extension(const std::string& ext) {
    return ext == ".bar" || ".foo";
    // it obviously should be
    // return ext == ".bar" || ext == ".foo";
}

The function obviously does not do what the comment suggests. But that's not the point here. Please note that this is not a duplicate of "Can you use 2 or more OR conditions in an if statement?", since I'm fully aware of how you would write the function properly!
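For readers less familiar with the pitfall, here is a minimal sketch of why the expression is always true. The names always_true_extension and check_always_true are hypothetical and only used for illustration. Since == binds tighter than ||, the right operand of || is the bare literal ".foo", which decays to a non-null const char* and therefore converts to true.

```cpp
#include <cassert>
#include <string>

// Hypothetical name; the body is the same expression as in the snippet above.
// It parses as (ext == ".bar") || (".foo"): the right operand is a non-null
// pointer, which converts to true regardless of ext.
bool always_true_extension(const std::string& ext) {
    return ext == ".bar" || ".foo";
}

// Sanity check: the result does not depend on the argument.
void check_always_true() {
    assert(always_true_extension(".bar"));
    assert(always_true_extension(".txt"));
    assert(always_true_extension(""));
}
```

Calling check_always_true() should pass for any input, which is why "return true;" is the result one would expect from the optimizer.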


I started to wonder how a compiler might treat this snippet. My first intuition was that it would basically be compiled to return true;. Plugging the example into godbolt showed that neither GCC 9.2 nor clang 9 makes this optimization at -O2.

However, changing the code to [1]

#include <string>

using namespace std::string_literals;

bool test_extension(const std::string& ext) {
    return ext == ".bar"s || ".foo";
}

seems to do the trick since the assembly is now in essence:

mov     eax, 1
ret

So my core question is: Is there something I missed that does not allow a compiler to make the same optimization on the first snippet?


[1] With ".foo"s this would not even compile, since the compiler does not want to convert a std::string to bool ;-)
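The difference between the two kinds of operand can be checked with standard type traits. This is only a sketch of the conversion rules involved, not something from the original snippets: a string literal decays to const char*, which converts to bool, while std::string offers no conversion to bool at all.

```cpp
#include <string>
#include <type_traits>

// A string literal decays to const char*, and pointers convert to bool.
static_assert(std::is_convertible<const char*, bool>::value,
              "pointer-to-bool conversion exists");

// std::string has no implicit conversion to bool ...
static_assert(!std::is_convertible<std::string, bool>::value,
              "std::string is not implicitly convertible to bool");

// ... and no explicit (contextual) one either, so it cannot be an
// operand of the built-in || operator.
static_assert(!std::is_constructible<bool, std::string>::value,
              "std::string is not explicitly convertible to bool");
```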


Edit

The following piece of code also gets "properly" optimized to return true;:

#include <string>

bool test_extension(const std::string& ext) {
    return ".foo" || ext == ".bar";
}
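One plausible reason this variant is easier for the compiler is short-circuit evaluation: with the literal on the left, the comparison is never evaluated, so there is no call left to reason about. The sketch below makes that visible with a counter; comparisons_run, counted_equals, literal_first, literal_last and check_short_circuit are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical counter to observe whether the comparison actually runs.
static int comparisons_run = 0;

bool counted_equals(const std::string& lhs, const char* rhs) {
    ++comparisons_run;
    return lhs == rhs;
}

// Literal first: || short-circuits, so counted_equals is never called.
bool literal_first(const std::string& ext) {
    return ".foo" || counted_equals(ext, ".bar");
}

// Comparison first: counted_equals has to run before the literal is looked at.
bool literal_last(const std::string& ext) {
    return counted_equals(ext, ".bar") || ".foo";
}

void check_short_circuit() {
    comparisons_run = 0;
    assert(literal_first(".txt"));
    assert(comparisons_run == 0);  // right operand skipped
    assert(literal_last(".txt"));
    assert(comparisons_run == 1);  // left operand evaluated once
}
```

In the comparison-first form the compiler must prove that the call has no observable side effects before it can drop it, which is the question the rest of this thread is about.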
AlexV asked Dec 03 '19 13:12


1 Answer

This will boggle your mind even more: what happens if we create a custom char type MyCharT and use it to make our own custom std::basic_string?

#include <string>

struct MyCharT {
    char c;
    bool operator==(const MyCharT& rhs) const {
        return c == rhs.c;
    }
    bool operator<(const MyCharT& rhs) const {
        return c < rhs.c;
    }
};
typedef std::basic_string<MyCharT> my_string;

bool test_extension_custom(const my_string& ext) {
    const MyCharT c[] = {'.','b','a','r', '\0'};
    return ext == c || ".foo";
}

// Here's a similar implementation using regular
// std::string, for comparison
bool test_extension(const std::string& ext) {
    const char c[] = ".bar";
    return ext == c || ".foo";
}

Certainly, a custom type cannot be optimized more easily than a plain char, right?

Here's the resulting assembly:

test_extension_custom(std::__cxx11::basic_string<MyCharT, std::char_traits<MyCharT>, std::allocator<MyCharT> > const&):
        mov     eax, 1
        ret
test_extension(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&):
        sub     rsp, 24
        lea     rsi, [rsp+11]
        mov     DWORD PTR [rsp+11], 1918984750
        mov     BYTE PTR [rsp+15], 0
        call    std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::compare(char const*) const
        mov     eax, 1
        add     rsp, 24
        ret

See it live!


Mindblown!

So, what's the difference between my "custom" string type and std::string?

Small String Optimization

At least on GCC, the std::string code, Small String Optimization included, is actually compiled into the libstdc++ binary. This means that, while compiling your function, the compiler has no access to that implementation and therefore cannot know whether the call has any side effects. Because of this, it cannot optimize away the call to compare(char const*). Our "custom" class does not have this problem because that precompiled code exists only for the standard string types such as plain std::string; my_string is instantiated from the header, where the compiler can see the whole implementation.

BTW, if you compile with -std=c++2a, the compiler does optimize it away. Unfortunately, I'm not savvy enough on C++20 yet to know which changes made this possible.

Cássio Renan answered Nov 04 '22 11:11