
Why is the value of std::string::max_size "strange"?

Tags: c++

I was looking at std::string::max_size and noticed the example:

#include <iostream>
#include <string>

int main ()
{
  std::string str ("Test string");
  std::cout << "max_size: " << str.max_size() << "\n";
  return 0;
}

with the output:

max_size: 4294967291

However, I always thought this limit came from the maximum value of an unsigned integer / size_t, so I expected it to be 2^32 - 1, which would be 4294967295. Why does the max size in this example not use the full range of those 4 bytes?

I also ran the sample code, and on that machine the result was 2^62, which confused me again: why wouldn't it be 2^64 - 1 instead?

In general I am wondering, for what reasons would an implementation not use all the space?

Julius asked Feb 05 '19 14:02



1 Answer

One of the indices, the largest representable one to be specific, is reserved for the std::string::npos value, which represents a "not found" result in some string functions. Furthermore, strings are internally null-terminated, so one position must be reserved for the null terminator.

This brings us to a theoretical maximum of radix^bits - 3 that the standard library could provide (unless those reserved positions could share the same value; I'm not 100% sure that would be impossible). Presumably the implementation has chosen to reserve two more indices for internal use (or I've missed some necessarily reserved position). One potential use for such a reserved index that I could imagine is an overflow trap that detects out-of-bounds accesses.

From a practical point of view: std::string::size_type is usually the same width as the address space, and under that assumption it is not practically possible to use the entire address space for a single string anyway. As such, the number reported by the library is usually not achievable; it is just an upper bound set by the standard library implementation, and the actual size limit of a string is subject to limitations from other sources, most often the amount of available RAM.

eerorika answered Oct 25 '22 03:10