Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why aren't functions like std::is_permutation() intrinsically unsafe?

C and C++ programmers have taken a beating over the last decade or so for often failing to perform proper bounds checking, especially on strings. These failures have often resulted in deep security problems in major software products. Since buffer overflow insecurities became well-understood, the drive to institute proper bounds checking has pushed many programmers away from traditional buffer- and string-manipulation functions like strcpy() and sprintf() at least in part because of the tendency of these functions to invite buffer overflow problems by making assumptions about the size of the target buffer. One of the advantages of STL types like std::string and std::vector is their strong control of buffer access.

But one thing puzzles me. Many of the most widely-used functions in the <algorithms> C++ header seem to positively beg for overflow abuse: specifically, those functions that take a begin iterator (especially an InputIterator) without a matching end iterator. For example:

template <class InputIterator, class OutputIterator>
  OutputIterator copy (InputIterator first, InputIterator last, OutputIterator result);

template <class InputIterator, class OutputIterator, class UnaryOperation>
  OutputIterator transform (InputIterator first1, InputIterator last1,
                            OutputIterator result, UnaryOperation op);

template <class ForwardIterator1, class ForwardIterator2>
   bool is_permutation (ForwardIterator1 first1, ForwardIterator1 last1,
                        ForwardIterator2 first2);

The last example—is_permutation() is especially instructive. copy() and transform() are well-understood, so that C++ programmers should know either to manually check the bounds of the output container before calling these functions or to use something like a back_inserter that ensures the output container grows as needed. Therefore, the case can be made that although copy() and transform() can be misused, anything can, and programmers are easily educated on best practices with functions like these.

is_permutation() is a trickier case. Just looking at the function declaration above, what would you assume about the size of the second range (the one that begins with first2)? Does the second range need to be the same size as the first one, or no smaller, or no larger? I'm betting that easy answers to these questions are not springing to your mind. The concept of a "permutation" is not as comfortable and familiar to most programmers as the concept of copying. It therefore seems relatively easy to get is_permutation() wrong and to overflow a buffer one way or the other.

"Look it up!" I hear you say. Well yes, of course. But if programmers remembered everything they should and looked up everything else then we wouldn't have bugs and security holes, would we?

Why, then, don't is_permutation() and similar functions (i.e. functions taking all input iterators but not a full begin-end iterator pair for every range) require a full begin-end pair for all input ranges? (Note that lexicographical_compare(), for example, does fulfill this requirement.) Are functions like is_permutation() actually not as unsafe as I'm imagining them to be?

like image 310
OldPeculier Avatar asked Jul 10 '13 23:07

OldPeculier


2 Answers

Most of the language is inherently unsafe, it is up to the programmer to use it correctly. The programmer must know before calling the function whether the arguments being used are correct or not.

Additionally, in some cases like copy it enables the use of forward iterators on open ranges. For example:

std::copy(v.begin(), v.end(), std::ostream_iterator<int>(std::cout," "));

There is no corresponding iterator to mark the end of the stream, and the stream really has no end, you can keep adding to it continuously.

like image 198
David Rodríguez - dribeas Avatar answered Sep 28 '22 15:09

David Rodríguez - dribeas


In C++14, there are four iterator versions of equal, is_permutation and mismatch to address precisely this point.

like image 20
Marshall Clow Avatar answered Sep 28 '22 15:09

Marshall Clow