Does std::ctype always classify characters by the "C" locale?

Question

cppreference says std::ctype provides character classification based on the classic "C" locale. Is this true even when we create a locale like this:

std::locale loc(std::locale("en_US.UTF8"), new std::ctype<char>);

Will the facet of loc still classify characters based on the "C" locale or the Unicode one? If it classifies by the former, why do we even specify the locale name as "en_US.UTF8"?

Cubbi · Accepted Answer

The standard requires a default-constructed std::ctype<char> to match the minimal "C" locale, via §22.4.1.3.3[facet.ctype.char.statics]/1:

static const mask* classic_table() noexcept;

Returns: A pointer to the initial element of an array of size table_size which represents the classifications of characters in the "C" locale.

The classification member function is() is defined in terms of table(), which in turn is defined in terms of classic_table() unless another table was provided to the ctype<char> constructor.

I've updated cppreference to match these requirements more precisely (it was saying "C" for std::ctype<wchar_t> too).

To answer your second question: the locale constructed with std::locale loc(std::locale("en_US.UTF8"), new std::ctype<char>); will use the ctype facet you specified (and, therefore, "C") to classify narrow characters. That is redundant, though, because the narrow character classification of a plain std::locale("en_US.UTF8") is exactly the same, at least in the GNU implementation:

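#include <cassert>
#include <cstddef>
#include <locale>

int main()
{
    // Narrow-character table of the named locale. The constructor throws
    // std::runtime_error if "en_US.UTF8" is not available on the system.
    std::locale loc1("en_US.UTF8");
    const std::ctype_base::mask* tbl1 =
         std::use_facet<std::ctype<char>>(loc1).table();

    // Same locale, but with ctype<char> replaced by a default-constructed
    // facet, which uses the classic "C" table.
    std::locale loc2(std::locale("en_US.UTF8"), new std::ctype<char>);
    const std::ctype_base::mask* tbl2 =
         std::use_facet<std::ctype<char>>(loc2).table();

    // Compare every entry of the two classification tables.
    for (std::size_t n = 0; n < std::ctype<char>::table_size; ++n)
        assert(tbl1[n] == tbl2[n]);
}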
