Cannot read char8_t from basic_stringstream<char8_t>

Tags: c++, gcc, c++20, gcc9

I'm simply trying to use a stringstream with UTF-8 (char8_t):

#include<iostream>
#include<string>
#include<sstream>
int main()
{
    std::basic_stringstream<char8_t> ss(u8"hello");
    char8_t c;
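    // Print the goodbit, badbit, failbit and eofbit masks of the stream state.
    // Note: goodbit is defined as zero, so the first column is always 0.
    std::cout << (ss.rdstate() & std::ios_base::goodbit) << " " << (ss.rdstate() & std::ios_base::badbit) << " "
            << (ss.rdstate() & std::ios_base::failbit) << " " << (ss.rdstate() & std::ios_base::eofbit) << "\n";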
    ss >> c;
    std::cout << (ss.rdstate() & std::ios_base::goodbit) << " " << (ss.rdstate() & std::ios_base::badbit) << " "
            << (ss.rdstate() & std::ios_base::failbit) << " " << (ss.rdstate() & std::ios_base::eofbit) << "\n";
    std::cout << c;
    return 0;
}

Compile using:

g++-9 -std=c++2a -g -o bin/test test/test.cpp

The result on screen is:

0 0 0 0
0 1 4 0
0

It seems that something goes wrong when reading c, but I don't know how to correct it. Please help me!

asked Aug 08 '19 06:08 by 陈浩南

1 Answer

This is actually an old issue that is not specific to char8_t support. The same issue occurs with char16_t and char32_t in C++11 and newer. The following gcc bug report has a similar test case:

  • https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88508

The issue is also discussed at the following:

  • GCC 4.8 and char16_t streams - bug?
  • Why does `std::basic_ifstream<char16_t>` not work in c++11?
  • http://gcc.1065356.n8.nabble.com/UTF-16-streams-td1117792.html

The issue is that gcc does not implicitly imbue the global locale with facets for ctype<char8_t>, ctype<char16_t>, or ctype<char32_t>. When an operation requires one of these facets, std::__check_facet throws a std::bad_cast exception. The IOS sentry object created for the character extraction operator silently swallows that exception and then sets badbit and failbit, which is why the output shows 1 and 4 after the read.
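To check this diagnosis on a given toolchain, a minimal sketch along these lines may help. It uses char16_t instead of char8_t so that it also builds before C++20 (my assumption, based on the statement above that both behave the same way). The helper names global_locale_has_ctype16 and formatted_read_state are made up for illustration; std::has_facet, std::ctype and rdstate() are standard.

```cpp
#include <ios>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: reports whether the global locale carries a
// std::ctype facet for char16_t.
bool global_locale_has_ctype16()
{
    return std::has_facet<std::ctype<char16_t>>(std::locale());
}

// Hypothetical helper: attempts one formatted extraction (operator>>),
// which skips leading whitespace and therefore needs the ctype facet,
// and returns the resulting stream state.
std::ios_base::iostate formatted_read_state(const std::u16string& text)
{
    std::basic_istringstream<char16_t> in(text);
    char16_t c = 0;
    in >> c;
    return in.rdstate();
}
```

With the gcc setup from the question, I would expect the first helper to return false and the second to return a state with badbit and failbit set. On a library that does provide the facet, the read should simply succeed.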

The C++ standard only requires that ctype<char> and ctype<wchar_t> be provided. See [locale.category]p2.
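Since those facets are not required, one possible way around the failure is unformatted input. This is only a sketch, and it assumes that whitespace skipping is not needed: get() constructs its sentry without skipping whitespace, so it should not need the ctype facet at all. first_code_unit is a hypothetical helper name, and char16_t again stands in for char8_t so the sketch also builds before C++20.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns the first code unit of `text` using
// unformatted input, or 0 if nothing could be read.
char16_t first_code_unit(const std::u16string& text)
{
    std::basic_istringstream<char16_t> in(text);
    char16_t c = 0;
    in.get(c);  // unformatted read: no whitespace skipping
    return in.fail() ? u'\0' : c;
}
```

For example, first_code_unit(u"hello") should give u'h'. Note that, unlike operator>>, this returns leading whitespace as-is instead of skipping it.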

answered Oct 16 '22 14:10 by Tom Honermann