Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is the clang static analyzer confused by popping the front from a list of unique_ptrs?

The following C++11 code is a minimal example of what I believe triggers a false positive in clang:

#include <iostream>
#include <list>
#include <memory>

class ElementType {};

int main(int argc, const char * argv[]) {
    std::list<std::unique_ptr<ElementType>> theList(5);

    theList.pop_front();

    for (const auto &element: theList) { // (*)
        std::cout << "This should be fine." << std::endl;
    }

    return 0;
}

On the line marked by an asterisk (*), the clang analyzer claims

...filePath.../main.cpp:21:29: Use of memory after it is freed (within a call to 'begin')

As far as I interpret it, this code is harmless, but clang misses the point that std::list<T>::pop_front() not only calls its elements' destructor, but that it also moves the location of std::list<T>::begin(). Replacing the call to pop_front by pop_back makes the analyzer warning disappear, and even replacing it by erase(theList.begin()) makes it come out warning-free.

Am I missing something or did I actually stumble upon a missed case in clang?

For reference: These results come from XCode 5.1.1 (5B1008) on Mac OS X 10.9.2,

$ clang --version
Apple LLVM version 5.1 (clang-503.0.40) (based on LLVM 3.4svn)
Target: x86_64-apple-darwin13.1.0
Thread model: posix
like image 999
Thierry Avatar asked May 02 '14 15:05

Thierry


2 Answers

It has been acknowledged to be a bug by the LLVM team.

In a comment on Revision 211832 it is stated that since

[t]he analyzer cannot reason about the internal invariances of [containers such as std::vector and std::list] which leads to false positives

the analyzer should

just not inline the methods of the containers and allow objects to escape whenever such methods are called.

The issue is indeed no longer reproducible on XCode 6.4 (6E35b) with

$ clang --version
Apple LLVM version 6.1.0 (clang-602.0.53) (based on LLVM 3.6.0svn)
Target: x86_64-apple-darwin14.4.0
Thread model: posix
like image 30
Thierry Avatar answered Nov 02 '22 06:11

Thierry


The code, as it stands, looks fine.

I check the code from libc++ (relevant parts) and I believe that it just confuses the static analyzer.

In more details:

template <class _Tp, class _Alloc>
void list<_Tp, _Alloc>::pop_front()
{
    _LIBCPP_ASSERT(!empty(), "list::pop_front() called with empty list");
    __node_allocator& __na = base::__node_alloc();
    __node_pointer __n = base::__end_.__next_;
    base::__unlink_nodes(__n, __n);
    --base::__sz();
    __node_alloc_traits::destroy(__na, _VSTD::addressof(__n->__value_));
    __node_alloc_traits::deallocate(__na, __n, 1);
}

list is implemented as a circular list, based at __end_ (which is the end-pointer), so to get to the first element, the code goes to __end_.__next_.

The implementation of __unlink_nodes is:

// Unlink nodes [__f, __l]
template <class _Tp, class _Alloc>
inline void __list_imp<_Tp, _Alloc>::__unlink_nodes(__node_pointer __f,
                                                    __node_pointer __l) noexcept
{
    __f->__prev_->__next_ = __l->__next_;
    __l->__next_->__prev_ = __f->__prev_;
}

We can easily understand it with some simple ASCII art:

       Z             A             B             C
  +---------+   +---------+   +---------+   +---------+
--| __prev_ |<--| __prev_ |<--| __prev_ |<--| __prev_ |<-
->| __next_ |-->| __next_ |-->| __next_ |-->| __next_ |--
  +---------+   +---------+   +---------+   +---------+

To remove the range A-B from this list:

  • Z.__next_ has to point to C
  • C.__prev_ has to point to Z

Thus the call __unlink_nodes(A, B) will:

  • take A.__prev_.__next_ (ie, Z.__next_) and make it point to B.__next_ (ie, C)
  • take B.__next_.__prev_ (ie, C.__prev_) and make it point to A.__prev_ (ie, Z)

This is simple, and works even when invoked with a single element-range (the case here).

Now, however, note that if the list were to be empty, this would not work at all! The default constructor of __list_node_base is:

__list_node_base()
    : __prev_(static_cast<pointer>(pointer_traits<__base_pointer>::pointer_to(*this))),
      __next_(static_cast<pointer>(pointer_traits<__base_pointer>::pointer_to(*this)))
      {}

That is, it refers to itself. In this case, __unlink_nodes is invoked with &__end_ (twice), and would not change it __end_.__prev_.__next_ = __end_.__next_ is idempotent (because __end_.prev is __end_ itself).

It may be that:

  • the analyzer takes into account the case of an empty list (_LIBCPP_ASSERT being compiled out)
  • and concludes that in this case, the __end_.__next_ (used by begin()) is left dangling by the deallocate() call in pop_front()

Or maybe it's something else in the pointer dance... hopefully the Clang team will be able to patch things up.

like image 81
Matthieu M. Avatar answered Nov 02 '22 06:11

Matthieu M.