I have trouble transforming a string to lowercase with the tolower() function in C++. With normal strings, it works as expected, however special characters are not converted successfully.
How I use my function:
string NotLowerCase = "Grüßen";
string LowerCase = "";
for (unsigned int i = 0; i < NotLowerCase.length(); i++) {
LowerCase += tolower(NotLowerCase[i]);
}
For example:
3 and 4 are not working as expected as you can see
How can I fix this issue? I have to keep the special chars, but as lowercase.
The sample code (below) from tolower
shows how you fix this; you have to use something other than the default "C" locale.
#include <iostream>
#include <cctype>
#include <clocale>
int main()
{
unsigned char c = '\xb4'; // the character Ž in ISO-8859-15
// but ´ (acute accent) in ISO-8859-1
std::setlocale(LC_ALL, "en_US.iso88591");
std::cout << std::hex << std::showbase;
std::cout << "in iso8859-1, tolower('0xb4') gives "
<< std::tolower(c) << '\n';
std::setlocale(LC_ALL, "en_US.iso885915");
std::cout << "in iso8859-15, tolower('0xb4') gives "
<< std::tolower(c) << '\n';
}
You might also change std::string
to std::wstring
which is Unicode on many C++ implementations.
wstring NotLowerCase = L"Grüßen";
wstring LowerCase;
for (auto&& ch : NotLowerCase) {
LowerCase += towlower(ch);
}
Guidance from Microsoft is to "Normalize strings to uppercase", so you might use toupper
or towupper
instead.
Keep in mind that a character-by-character transformation might not work well for some languages. For example, using German as spoken in Germany, making Grüßen all upper-case turns it into GRÜESSEN (although there is now a capital ẞ). There are numerous other "problems" such a combining characters; if you're doing real "production" work with strings, you really want a completely different approach.
Finally, C++ has more sophisticated support for managing locales, see <locale>
for details.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With