Converting UTF-8 Characters to Upper/Lower case C++

Question

I have a string that contains UTF-8 characters, and a method that is supposed to convert every character to upper or lower case. This is easy for characters that overlap with ASCII, and obviously some characters, such as Chinese characters, have no case to convert. However, is there a good way to detect and convert other characters that do have upper and lower case, e.g. the Greek letters? Please note that I need to be able to do this on both Windows and Linux.

Thank you,

Alexandre C. · Accepted Answer

Have a look at ICU.

Note that case-conversion functions are locale-dependent. Consider Turkish: the uppercase ASCII letter I lowercases to the dotless ı, and the lowercase ASCII letter i uppercases to İ (capital I with a dot above).

tidwall · Answer

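Assuming that you have access to wctype.h, convert your text to a wide-character (wchar_t) string, apply towupper() to each character, and then convert the result back to UTF-8. Keep in mind that wchar_t is 2 bytes on Windows but 4 bytes on Linux, and that towupper() follows the current C locale.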
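A minimal sketch of that decode, convert, re-encode round trip, using only the standard library. The helper names (decode_utf8, append_utf8, to_upper_utf8) are made up for illustration, and the sketch assumes that wide-character values are Unicode code points, which holds on Windows (for the Basic Multilingual Plane) and on glibc. The decoder is deliberately simple: malformed bytes become U+FFFD, and overlong forms are not rejected. Only one-to-one mappings are handled, so something like ß to SS is out of reach for towupper().

```cpp
#include <cstddef>
#include <cwctype>
#include <string>

// Decode UTF-8 into code points. Malformed or truncated sequences become U+FFFD.
std::u32string decode_utf8(const std::string& in) {
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char b = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else { out.push_back(U'\uFFFD'); ++i; continue; }

        bool ok = (i + extra < in.size());
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) ok = false;
            else cp = (cp << 6) | (c & 0x3F);
        }
        if (!ok) { out.push_back(U'\uFFFD'); ++i; continue; }
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}

// Append one code point to a string as UTF-8.
void append_utf8(std::string& out, char32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Uppercase a UTF-8 string one code point at a time with towupper().
std::string to_upper_utf8(const std::string& in) {
    std::string out;
    for (char32_t cp : decode_utf8(in)) {
        // Where wchar_t is 16 bits (Windows), code points above U+FFFF
        // do not fit in one wide character, so they pass through unchanged.
        if (sizeof(wchar_t) >= 4 || cp <= 0xFFFF) {
            cp = static_cast<char32_t>(
                std::towupper(static_cast<std::wint_t>(cp)));
        }
        append_utf8(out, cp);
    }
    return out;
}
```

Calling to_upper_utf8("abc") gives "ABC" in any locale. Whether Greek or other non-ASCII letters are converted depends on the C locale in effect: in the default "C" locale some implementations map only ASCII, so the program would typically call std::setlocale(LC_ALL, "") or select a UTF-8 locale first. A lowercase variant would be the same loop with std::towlower().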

Tags:

c++

linux

windows

unicode

cross-platform
