
Difference between std::string and std::u16string (or u32string)

I read the posts below before asking here:

std::string, wstring, u16/32string clarification
std::u16string, std::u32string, std::string, length(), size(), codepoints and characters

But they don't answer my question. Look at the simple code below:

#include<iostream>
#include<string>
using namespace std;

int main ()
{
  char16_t x[] = { 'a', 'b', 'c', 0 };
  u16string arr = x;

  cout << "arr.length = " << arr.length() << endl;
  for(auto i : arr)
    cout << i << "\n";
}

The output is:

arr.length = 3  // a + b + c
97
98
99

Given that std::u16string consists of char16_t and not char, shouldn't the output be:

arr.length = 2  // ab + c(\0)
<combining 'a' and 'b'>
99

Please excuse the novice question. I want to get a clear understanding of the new C++11 string types.

Edit:

From @Jonathan's answer, I see the flaw in my question. What I really want to know is how to initialize the char16_t array so that the length of arr becomes 2 (i.e. ab, c\0).
FYI, the code below gives a different result:

  char x[] = { 'a', 'b', 'c', 0 };
  u16string arr = (char16_t*)x;  // probably undefined behavior

Output:

arr.length = 3
25185
99
32767
iammilind asked Sep 04 '25 16:09


1 Answer

No. You have created an array of four elements: the first element is 'a' converted to char16_t, the second is 'b' converted to char16_t, and so on.

Then you create a u16string from that array (converted to a pointer), which reads each element up to the null terminator.

Jonathan Wakely answered Sep 07 '25 07:09