In https://en.cppreference.com/w/cpp/regex/regex_traits/transform_primary the following example snippet is proposed:
#include <iostream>
#include <regex>

int main()
{
    std::locale::global(std::locale("en_US.UTF-8"));
    std::wstring str = L"AÀÁÂÃÄÅaàáâãäå";
    std::wregex re(L"[[=a=]]*", std::regex::basic);
    std::cout << std::boolalpha << std::regex_match(str, re) << '\n';
}
It is also said that it should output true. However, trying it with GCC 8 and Clang 7 on Debian, and with the Clang that ships with macOS High Sierra, always gave false (you can test this directly with the "Run" button on the cppreference page).
One might say that the cppreference page is wrong, which is certainly possible; however, reading the documentation, it also seems to me that true is the right output: all the characters in the string str are, as I understand it, in the primary collating class of a.
So the question is: who is right? The compilers or cppreference? And why?
Here's what the g++/libstdc++-9 implementation of transform_primary looks like:
template<typename _Fwd_iter>
  string_type
  transform_primary(_Fwd_iter __first, _Fwd_iter __last) const
  {
    // TODO : this is not entirely correct.
    // This function requires extra support from the platform.
    //
    // Read http://gcc.gnu.org/ml/libstdc++/2013-09/msg00117.html and
    // http://www.open-std.org/Jtc1/sc22/wg21/docs/papers/2003/n1429.htm
    // for details.
    typedef std::ctype<char_type> __ctype_type;
    const __ctype_type& __fctyp(use_facet<__ctype_type>(_M_locale));
    std::vector<char_type> __s(__first, __last);
    __fctyp.tolower(__s.data(), __s.data() + __s.size());
    return this->transform(__s.data(), __s.data() + __s.size());
  }
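The comment says "this is not entirely correct"; in my humble opinion, that is an understatement. It should have said "this is totally wrong", because it is. Lowercasing the input and then calling transform still produces a full sort key in which accents are significant, so a and à never end up with the same key. It simply doesn't work.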
The comment at the top of libc++-8 says:
// transform_primary is very FreeBSD-specific
Indeed, it doesn't work on Linux at all (it returns an empty string for all characters). It might work on macOS, which is something of a FreeBSD variant, but I don't have one nearby to check. There could be a different bug lurking inside.
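If you want to see what your own standard library does, you can compare the keys directly instead of going through regex_match. The following is only a minimal sketch, not code taken from libstdc++ or libc++: primary_key and same_primary_key are hypothetical helper names, and treating an empty key as "no primary key available" is my assumption.

```cpp
#include <locale>
#include <regex>
#include <string>

// Hypothetical helper: the key that regex_traits reports for one character.
std::wstring primary_key(wchar_t c,
                         const std::locale& loc = std::locale::classic())
{
    std::regex_traits<wchar_t> traits;
    traits.imbue(loc);  // use the given locale instead of the global one
    return traits.transform_primary(&c, &c + 1);
}

// Hypothetical helper: true only if both keys are non-empty and equal.
// Assumption: an empty key means the library has no primary key to offer.
bool same_primary_key(wchar_t a, wchar_t b,
                      const std::locale& loc = std::locale::classic())
{
    const std::wstring ka = primary_key(a, loc);
    const std::wstring kb = primary_key(b, loc);
    return !ka.empty() && ka == kb;
}
```

For [[=a=]] to match à, a call such as same_primary_key(L'a', L'\u00E0', std::locale("en_US.UTF-8")) would have to return true. Note that the std::locale constructor throws std::runtime_error if that locale is not installed.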
So the answer is, at least some of the compilers are wrong at least some of the time.
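If you just need the match to work today, one possible workaround is to skip the equivalence class and list the intended characters explicitly. This is only a sketch under that assumption: matches_a_variants is a hypothetical helper name, the character list simply mirrors the test string from the cppreference example, and it uses the default ECMAScript grammar rather than basic.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if every character of s is one of the "a"
// variants from the example string, spelled out explicitly.
bool matches_a_variants(const std::wstring& s)
{
    // A, a, then À Á Â Ã Ä Å and à á â ã ä å, written as universal
    // character names so the source file encoding does not matter.
    static const std::wregex re(
        L"[Aa\u00C0\u00C1\u00C2\u00C3\u00C4\u00C5"
        L"\u00E0\u00E1\u00E2\u00E3\u00E4\u00E5]*");
    return std::regex_match(s, re);
}
```

This should not depend on what transform_primary returns, at the cost of maintaining the character list by hand.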