
Vector vs string

Tags:

c++

string

vector

What is the fundamental difference, if any, between a C++ std::vector and std::basic_string?

Yttrill asked Dec 29 '10 19:12



4 Answers

  • basic_string doesn't construct or destroy its elements through allocator_traits::construct and allocator_traits::destroy. vector does.

  • Swapping basic_strings invalidates iterators (which is what permits the small string optimization); swapping vectors doesn't.

  • basic_string storage is not required to be contiguous in C++03. vector storage always is. C++0x (C++11) removes this difference in [string.require]:

    The char-like objects in a basic_string object shall be stored contiguously

  • basic_string has an interface for string operations. vector doesn't.

  • basic_string may use a copy-on-write strategy (before C++11). vector can't.

Relevant quotes for non-believers:

[basic.string]:

The class template basic_string conforms to the requirements for a Sequence Container (23.2.3), for a Reversible Container (23.2), and for an Allocator-aware container (Table 99), except that basic_string does not construct or destroy its elements using allocator_traits::construct and allocator_traits::destroy and that swap() for basic_string invalidates iterators. The iterators supported by basic_string are random access iterators (24.2.7).
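The element-lifetime point in the first bullet can be observed with a small sketch. The `Tracked` type and the `live_while_vector_exists` helper are hypothetical names introduced only for this illustration; the counter records how many element objects are alive while a vector holds them.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical element type that counts how many instances are alive.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    Tracked(const Tracked&) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Returns the number of live Tracked objects while a vector of n elements exists.
int live_while_vector_exists(std::size_t n) {
    std::vector<Tracked> v(n);   // vector constructs n elements
    return Tracked::live;        // read before v goes out of scope
}                                // vector destroys all n elements here
```

After the call returns, `Tracked::live` is back to zero because the vector ran every element's destructor. The same experiment cannot be written with basic_string, which is specified only for char-like (non-array POD) element types.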

Yakov Galka answered Oct 05 '22 14:10


basic_string gives compiler and standard library implementations a few freedoms over vector:

  1. The "small string optimization" is valid on strings, which allows implementations to store the actual string, rather than a pointer to the string, in the string object itself when the string is short. Something along the lines of:

    class string
    {
        size_t length;
        union
        {
            char * usedWhenStringIsLong;                 // points to heap storage
            char usedWhenStringIsShort[sizeof(char*)];   // characters stored in place
        };
    };
    
  2. In C++03, the underlying array need not be contiguous. Implementing basic_string in terms of something like a "rope" would be possible under C++03. (Though nobody does this because that would make the members std::basic_string::c_str() and std::basic_string::data() too expensive to implement.)
    C++11 bans this, though: it requires contiguous storage.

  3. In C++03, basic_string allows the compiler/library vendor to use copy-on-write for the data (which can save on copies), which is not allowed for std::vector. In practice, this used to be a lot more common, but it's less common nowadays because of the impact it has upon multithreading. Either way though, your code cannot rely on whether or not std::basic_string is implemented using COW.
    C++11 bans this behavior as well.

There are a few helper methods tacked on to basic_string as well, but most are simple and of course could easily be implemented on top of vector.
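Whether a given implementation applies the small string optimization can be probed with a sketch like the following. The helper name `stored_inline` is hypothetical, and the address comparison assumes `std::uintptr_t` is available (it is optional in the standard but present on mainstream platforms).

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: true when the character buffer lies inside the
// std::string object itself, which is what an SSO implementation does
// for short strings.
bool stored_inline(const std::string& s) {
    const auto obj = reinterpret_cast<std::uintptr_t>(&s);
    const auto buf = reinterpret_cast<std::uintptr_t>(s.data());
    return buf >= obj && buf < obj + sizeof(std::string);
}
```

A string much longer than `sizeof(std::string)` cannot fit in the object, so `stored_inline` reports false for it. For a short string such as `"hi"` the result depends on the library; the standard permits the optimization but does not require it.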

Billy ONeal answered Oct 05 '22 14:10


The key difference is that std::vector is required to keep its data in contiguous memory, whereas in C++03 std::basic_string is not. As a result:

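#include <iostream>
#include <string>
#include <vector>

int main() {
    std::vector<char> v(3, 'a');  // count first, then value: three 'a' characters
    char* x = &v[0];              // valid: vector storage is always contiguous

    std::basic_string<char> s("aaa");
    char* x2 = &s[0];             // C++03: not guaranteed to point into a contiguous buffer
    // So C++03 does not guarantee the behavior of the next line;
    // since C++11 the storage is contiguous and it is well defined.
    std::cout << *(x2 + 1);
    const char* x3 = s.c_str();   // valid in every standard
}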

In practice this difference is not so important, and C++11 removes it by requiring contiguous storage for basic_string.
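Under C++11 and later, both containers can be handed to code that expects a raw contiguous buffer. The following is a minimal sketch under that assumption; `fill_through_pointer` is a hypothetical helper name, not a standard facility.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: overwrites every element of a contiguous char
// container through a raw pointer to its first element.
template <class Container>
void fill_through_pointer(Container& c, char value) {
    if (c.empty()) return;                 // &c[0] on an empty vector is undefined
    std::memset(&c[0], value, c.size());   // relies on contiguous storage
}
```

The same template works for `std::string` and `std::vector<char>` because, from C++11 on, `&c[0] + i` addresses element `i` in both.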

Kirill V. Lyadvinsky answered Oct 05 '22 16:10


TLDR: strings are optimized to contain only character primitives; vectors can contain primitives or objects.

The preeminent difference between vector and string is that vector can correctly contain objects, while string works only on primitives. So vector provides these methods that would be useless for a string working with primitives:

  1. vector::emplace
  2. vector::emplace_back
  3. vector::~vector (which runs each element's destructor)

Even extending string will not allow it to correctly handle objects, because it never runs destructors for its elements. This should not be viewed as a drawback; it allows significant optimization over vector, in that string can:

  1. Do short string optimization, potentially avoiding heap allocation, with little to no increased storage overhead
  2. Use char_traits, one of string's template arguments, to define how operations should be implemented on the contained primitives (of which only char, wchar_t, char16_t, and char32_t are implemented: http://en.cppreference.com/w/cpp/string/char_traits)

Particularly relevant are char_traits::copy, char_traits::move, and char_traits::assign, which imply that direct assignment, rather than construction or destruction, will be used; this is, again, preferable for primitives. All this specialization brings additional drawbacks for string:

  1. Only char, wchar_t, char16_t, or char32_t primitive types will be used. Primitives of sizes up to 32 bits could use their equivalently sized char_type (https://stackoverflow.com/a/35555016/2642059), but for primitives such as long long a new specialization of char_traits would need to be written, and specializing char_traits::eof and char_traits::not_eof instead of just using vector<long long> doesn't seem like the best use of time.
  2. String iterators are invalidated by every operation that would invalidate a vector iterator and, because of short string optimization, additionally by string::swap and string::operator=
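To make the role of char_traits concrete, here is a sketch of the well-known case-insensitive traits idea. The names `ci_char_traits` and `ci_string` are hypothetical; the class inherits everything from `std::char_traits<char>` and overrides only the comparison and search operations.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical traits class: compares characters without regard to case.
struct ci_char_traits : std::char_traits<char> {
    static char fold(char c) {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    static bool eq(char a, char b) { return fold(a) == fold(b); }
    static bool lt(char a, char b) { return fold(a) < fold(b); }
    static int compare(const char* a, const char* b, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            if (lt(a[i], b[i])) return -1;
            if (lt(b[i], a[i])) return 1;
        }
        return 0;
    }
    static const char* find(const char* s, std::size_t n, char c) {
        for (std::size_t i = 0; i < n; ++i)
            if (eq(s[i], c)) return s + i;
        return nullptr;
    }
};

// A string type whose comparisons and searches ignore case.
using ci_string = std::basic_string<char, ci_char_traits>;
```

With this in place, `ci_string("Hello") == ci_string("hELLO")` holds, and member functions such as find use the overridden traits. vector has no comparable customization point; the equivalent behavior would have to be passed to each algorithm as a predicate.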

Additional differences in the interfaces of vector and string:

  1. In C++11 there is no mutable string::data (see "Why Doesn't std::string.data() provide a mutable char*?"); a non-const overload was added in C++17
  2. string provides functionality for working with text that is unavailable in vector: the members string::c_str, string::length, string::append, string::operator+=, string::compare, string::replace, string::substr, string::copy, string::find, string::rfind, string::find_first_of, string::find_first_not_of, string::find_last_of, string::find_last_not_of, and the non-member functions operator+, operator>>, operator<<, std::stoi, std::stol, std::stoll, std::stoul, std::stoull, std::stof, std::stod, std::stold, std::to_string, std::to_wstring
  3. Finally, everywhere vector accepts another vector as an argument, string accepts a string or a char*

Note this answer is written against C++11, so strings are required to be allocated contiguously.
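The interface gap shows up in something as simple as a substring search. The sketch below contrasts the two; `find_in_string` and `find_in_vector` are hypothetical helper names used only for the comparison.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Substring search on a string: one member call that accepts a char*.
std::size_t find_in_string(const std::string& haystack, const char* needle) {
    return haystack.find(needle);   // std::string::npos when absent
}

// The same search on vector<char> goes through <algorithm> and iterators.
std::size_t find_in_vector(const std::vector<char>& haystack,
                           const std::string& needle) {
    if (needle.empty()) return 0;   // match string::find for an empty needle
    auto it = std::search(haystack.begin(), haystack.end(),
                          needle.begin(), needle.end());
    return it == haystack.end()
               ? std::string::npos
               : static_cast<std::size_t>(it - haystack.begin());
}
```

Both functions return the same index for the same data, but the vector version has to spell out the iterator ranges and the not-found convention itself.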

Jonathan Mee answered Oct 05 '22 16:10