Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Possible undefined behavior in primitive static_vector implementation

tl;dr: I think my static_vector has undefined behavior, but I can't find it.

This problem is on Microsoft Visual C++ 17. I have this simple and unfinished static_vector implementation, i.e. a vector with a fixed capacity that can be stack allocated. This is a C++17 program, using std::aligned_storage and std::launder. I've tried to boil it down below to the parts that I think are relevant to the issue:

template <typename T, size_t NCapacity>
class static_vector
{
public:
    typedef typename std::remove_cv<T>::type value_type;
    typedef size_t size_type;
    typedef T* pointer;
    typedef const T* const_pointer;
    typedef T& reference;
    typedef const T& const_reference;

    static_vector() noexcept
        : count()
    {
    }

    ~static_vector()
    {
        clear();
    }

    template <typename TIterator, typename = std::enable_if_t<
        is_iterator<TIterator>::value
    >>
    static_vector(TIterator in_begin, const TIterator in_end)
        : count()
    {
        for (; in_begin != in_end; ++in_begin)
        {
            push_back(*in_begin);
        }
    }

    static_vector(const static_vector& in_copy)
        : count(in_copy.count)
    {
        for (size_type i = 0; i < count; ++i)
        {
            new(std::addressof(storage[i])) value_type(in_copy[i]);
        }
    }

    static_vector& operator=(const static_vector& in_copy)
    {
        // destruct existing contents
        clear();

        count = in_copy.count;
        for (size_type i = 0; i < count; ++i)
        {
            new(std::addressof(storage[i])) value_type(in_copy[i]);
        }

        return *this;
    }

    static_vector(static_vector&& in_move)
        : count(in_move.count)
    {
        for (size_type i = 0; i < count; ++i)
        {
            new(std::addressof(storage[i])) value_type(move(in_move[i]));
        }
        in_move.clear();
    }

    static_vector& operator=(static_vector&& in_move)
    {
        // destruct existing contents
        clear();

        count = in_move.count;
        for (size_type i = 0; i < count; ++i)
        {
            new(std::addressof(storage[i])) value_type(move(in_move[i]));
        }

        in_move.clear();

        return *this;
    }

    constexpr pointer data() noexcept { return std::launder(reinterpret_cast<T*>(std::addressof(storage[0]))); }
    constexpr const_pointer data() const noexcept { return std::launder(reinterpret_cast<const T*>(std::addressof(storage[0]))); }
    constexpr size_type size() const noexcept { return count; }
    static constexpr size_type capacity() { return NCapacity; }
    constexpr bool empty() const noexcept { return count == 0; }

    constexpr reference operator[](size_type n) { return *std::launder(reinterpret_cast<T*>(std::addressof(storage[n]))); }
    constexpr const_reference operator[](size_type n) const { return *std::launder(reinterpret_cast<const T*>(std::addressof(storage[n]))); }

    void push_back(const value_type& in_value)
    {
        if (count >= capacity()) throw std::out_of_range("exceeded capacity of static_vector");
        new(std::addressof(storage[count])) value_type(in_value);
        count++;
    }

    void push_back(value_type&& in_moveValue)
    {
        if (count >= capacity()) throw std::out_of_range("exceeded capacity of static_vector");
        new(std::addressof(storage[count])) value_type(move(in_moveValue));
        count++;
    }

    template <typename... Arg>
    void emplace_back(Arg&&... in_args)
    {
        if (count >= capacity()) throw std::out_of_range("exceeded capacity of static_vector");
        new(std::addressof(storage[count])) value_type(forward<Arg>(in_args)...);
        count++;
    }

    void pop_back()
    {
        if (count == 0) throw std::out_of_range("popped empty static_vector");
        std::destroy_at(std::addressof((*this)[count - 1]));
        count--;
    }

    void resize(size_type in_newSize)
    {
        if (in_newSize > capacity()) throw std::out_of_range("exceeded capacity of static_vector");

        if (in_newSize < count)
        {
            for (size_type i = in_newSize; i < count; ++i)
            {
                std::destroy_at(std::addressof((*this)[i]));
            }
            count = in_newSize;
        }
        else if (in_newSize > count)
        {
            for (size_type i = count; i < in_newSize; ++i)
            {
                new(std::addressof(storage[i])) value_type();
            }
            count = in_newSize;
        }
    }

    void clear()
    {
        resize(0);
    }

private:
    typename std::aligned_storage<sizeof(T), alignof(T)>::type storage[NCapacity];
    size_type count;
};

This appeared to work fine for a while. Then, at one point, I was doing something very similar to this - the actual code is longer, but this gets to the gist of it:

struct Foobar
{
    uint32_t Member1;
    uint16_t Member2;
    uint8_t Member3;
    uint8_t Member4;
}

void Bazbar(const std::vector<Foobar>& in_source)
{
    static_vector<Foobar, 8> valuesOnTheStack { in_source.begin(), in_source.end() };

    auto x = std::pair<static_vector<Foobar, 8>, uint64_t> { valuesOnTheStack, 0 };
}

In other words, we first copy 8-byte Foobar structs into a static_vector on the stack, then we make a std::pair of a static_vector of 8-byte structs as the first member, and a uint64_t as the second. I can verify that valuesOnTheStack contains the right values immediately before the pair is constructed. And... this segfaults with optimization enabled inside static_vector's copy constructor (which has been inlined into the calling function) when constructing the pair.

Long story short, I inspected the disassembly. This where things get a bit weird; the generated asm around the inlined copy constructor is shown below - note that this is from the actual code, not the sample above, which is pretty close but has some more stuff above the pair construction:

00621E45  mov         eax,dword ptr [ebp-20h]  
00621E48  xor         edx,edx  
00621E4A  mov         dword ptr [ebp-70h],eax  
00621E4D  test        eax,eax  
00621E4F  je          <this function>+29Ah (0621E6Ah)  
00621E51  mov         eax,dword ptr [ecx]  
00621E53  mov         dword ptr [ebp+edx*8-0B0h],eax  
00621E5A  mov         eax,dword ptr [ecx+4]  
00621E5D  mov         dword ptr [ebp+edx*8-0ACh],eax  
00621E64  inc         edx  
00621E65  cmp         edx,dword ptr [ebp-70h]  
00621E68  jb          <this function>+281h (0621E51h)  

Okay, so first we have two mov instructions copying the count member from the source to the destination; so far so good. edx is zeroed because it's the loop variable. We then have a quick check if count is zero; it isn't zero, so we proceed to the for loop where we copy the 8-byte struct using two 32-bit mov operations first from memory to register, then from register to memory. But there's something fishy - where we would expect a mov from something like [ebp+edx*8+] to read from the source object, there is instead just... [ecx]. That doesn't sound right. What's the value of ecx?

Turns out, ecx just contains a garbage address, the same one that we're segfaulting on. Where did it get this value from? Here's the asm immediately above:

00621E1C  mov         eax,dword ptr [this]  
00621E22  push        ecx  
00621E23  push        0  
00621E25  lea         ecx,[<unrelated local variable on the stack, not the static_vector>]  
00621E2B  mov         eax,dword ptr [eax]  
00621E2D  push        ecx  
00621E2E  push        dword ptr [eax+4]  
00621E31  call        dword ptr [<external function>@16 (06AD6A0h)]  

This looks like a regular old cdecl function call. Indeed, the function has a call to an external C function just above. But note what's happening: ecx is being used as a temporary register to push arguments on the stack, the function is invoked, and... then ecx is never touched again until it is erroneously used below to read from the source static_vector.

In practice, the contents of ecx get overwritten by the function called here, which it is of course allowed to do. But even if it didn't, there is no way ecx is ever going to contain an address to the correct thing here - at best, it would point to a local stack member that is not the static_vector. It seems as if the compiler has emitted some bogus assembly. This function could never produce the correct output.

So that's where I am now. Weird assembly when optimizations are enabled while playing around in std::launder land smells to me like undefined behavior. But I can't see where that could be coming from. As supplemental but marginally useful information, clang with the right flags produces similar assembly to this, except it correctly uses ebp+edx instead of ecx to read values.

like image 877
pjohansson Avatar asked Mar 10 '20 07:03

pjohansson


1 Answers

I think you have a compiler bug. Adding __declspec( noinline ) to operator[] seems to fix the crash:

__declspec( noinline ) constexpr const_reference operator[]( size_type n ) const { return *std::launder( reinterpret_cast<const T*>( std::addressof( storage[ n ] ) ) ); }

You can try reporting the bug to Microsoft but the bug seems to already be fixed in Visual Studio 2019.

Removing std::launder also seems to fix the crash:

constexpr const_reference operator[]( size_type n ) const { return *reinterpret_cast<const T*>( std::addressof( storage[ n ] ) ); }
like image 194
Alan Birtles Avatar answered Oct 31 '22 21:10

Alan Birtles