Retrieve Unicode code points > U+FFFF from QChar

Question

I have an application that is supposed to deal with all kinds of characters and at some point display information about them. I use Qt and its inherent Unicode support in QChar, QString etc.

Now I need the code point of a QChar in order to look up some data in http://unicode.org/Public/UNIDATA/UnicodeData.txt, but QChar's unicode() method only returns a ushort (unsigned short), which usually is a number from 0 to 65535 (or 0xFFFF). There are characters with code points > 0xFFFF, so how do I get these? Is there some trick I am missing or is this currently not supported by Qt/QChar?

Delan Azabani · Accepted Answer

Each QChar is a single UTF-16 code unit, not a complete Unicode code point. Non-BMP characters (those above U+FFFF) are therefore represented by two QChars that together form a surrogate pair.
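To recover the full code point, the two code units of a pair have to be combined. Below is a minimal sketch of the standard UTF-16 decoding arithmetic in plain C++. The helper name `codePointAt` and the use of `std::u16string` in place of QString are assumptions made only to keep the example self-contained. In Qt itself, `QChar::isHighSurrogate()`, `QChar::isLowSurrogate()` and the static `QChar::surrogateToUcs4()` cover the same checks and combination, and `QString::toUcs4()` converts a whole string at once.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the code point that starts at index i of a
// UTF-16 string. A valid surrogate pair is combined into one code point;
// any other code unit (including an unpaired surrogate) is returned as-is.
char32_t codePointAt(const std::u16string &s, std::size_t i)
{
    const char16_t hi = s[i];
    const bool isHigh = hi >= 0xD800 && hi <= 0xDBFF;
    if (isHigh && i + 1 < s.size()) {
        const char16_t lo = s[i + 1];
        const bool isLow = lo >= 0xDC00 && lo <= 0xDFFF;
        if (isLow) {
            // Each surrogate carries 10 bits; the pair encodes (cp - 0x10000).
            return 0x10000 + ((static_cast<char32_t>(hi) - 0xD800) << 10)
                           + (static_cast<char32_t>(lo) - 0xDC00);
        }
    }
    return hi;
}
```

When iterating over a string this way, a caller would advance by two code units after a combined pair and by one otherwise.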

A. Penner · Answer

The solution appears to lie in code that is documented but not seen much on the Web. Start with the code point as a plain integer. Then call QChar::requiresSurrogates() to determine whether a single QChar is large enough to hold it. In this case it is not, so you need to create two QChars.

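#include <QChar>
#include <QString>

uint32_t cp = 155222; // U+25E56, a supplementary-plane CJK character (4 bytes in UTF-8)
QString str;
if (QChar::requiresSurrogates(cp))
{
    // Above U+FFFF: encode the code point as a high/low surrogate pair.
    QChar charArray[2];
    charArray[0] = QChar::highSurrogate(cp);
    charArray[1] = QChar::lowSurrogate(cp);
    str = QString(charArray, 2);
}
else
{
    // Fits in a single UTF-16 code unit.
    str = QString(QChar(static_cast<ushort>(cp)));
}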

The resulting QString will contain the correct surrogate pair to display your supplementary-plane character.
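For reference, the arithmetic behind splitting a code point into surrogates can be sketched in plain C++ as follows. The three free functions are hypothetical stand-ins named after the QChar helpers in the answer's snippet, and `toUtf16` with `std::u16string` is likewise an assumption that replaces QString so the example compiles without Qt. Within Qt, `QString::fromUcs4()` should give the same result in one call.

```cpp
#include <string>

// Hypothetical stand-ins that mirror the standard UTF-16 encoding rules.
bool requiresSurrogates(char32_t cp)
{
    return cp >= 0x10000;
}

char16_t highSurrogate(char32_t cp)
{
    // Top 10 bits of (cp - 0x10000), offset into the range D800..DBFF.
    return static_cast<char16_t>(0xD800 + ((cp - 0x10000) >> 10));
}

char16_t lowSurrogate(char32_t cp)
{
    // Bottom 10 bits of (cp - 0x10000), offset into the range DC00..DFFF.
    return static_cast<char16_t>(0xDC00 + ((cp - 0x10000) & 0x3FF));
}

// Encodes one code point as one or two UTF-16 code units.
std::u16string toUtf16(char32_t cp)
{
    std::u16string out;
    if (requiresSurrogates(cp)) {
        out.push_back(highSurrogate(cp));
        out.push_back(lowSurrogate(cp));
    } else {
        out.push_back(static_cast<char16_t>(cp));
    }
    return out;
}
```

For the code point 155222 (U+25E56) from the answer, this yields the two code units 0xD857 and 0xDE56.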

Tags:

unicode

qt

codepoint

astral-plane

qchar

Sebastian Negraszus
