
How to convert a unicode string to its unicode escapes?

Tags:

c++

unicode

qt

Say I have the text "Բարև Hello Здравствуй". (I store this text in a QString, but if you know another way to store it in C++ code, you're welcome to suggest it.) How can I convert this text to Unicode escapes like this: "\u1330\u1377\u1408\u1415 Hello \u1047\u1076\u1088\u1072\u1074\u1089\u1090\u1074\u1091\u1081" (see here)?

Asked by Narek on Dec 23 '22 01:12


2 Answers

#include <cstdio>

#include <QtCore/QString>
#include <QtCore/QTextStream>

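int main() {
  // The wide literal relies on the compiler reading this source file in the right encoding.
  QString str = QString::fromWCharArray(L"Բարև Hello Здравствуй");
  QString escaped;
  // Worst case: every code unit expands to a 6-character \uXXXX escape.
  escaped.reserve(6 * str.size());
  for (QString::const_iterator it = str.begin(); it != str.end(); ++it) {
    QChar ch = *it;
    ushort code = ch.unicode();
    if (code < 0x80) {
      // ASCII characters are copied through unchanged.
      escaped += ch;
    } else {
      // Everything else becomes \u followed by four zero-padded hex digits.
      escaped += "\\u";
      escaped += QString::number(code, 16).rightJustified(4, '0');
    }
  }
  QTextStream stream(stdout);
  stream << escaped << '\n';
}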

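Note that this loops over UTF-16 code units, not actual code points. A character outside the Basic Multilingual Plane is stored as a surrogate pair, so it comes out as two \u escapes rather than one.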
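To make the difference between code units and code points concrete, here is a minimal sketch in standard C++ (no Qt) that combines surrogate pairs back into code points. The function name `to_code_points` is made up for illustration, and passing unpaired surrogates through unchanged is an assumption; a stricter program might reject them.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of the answer above): decode UTF-16 code
// units into Unicode code points. Unpaired surrogates are passed through
// unchanged, which is an assumption made for this sketch.
std::vector<char32_t> to_code_points(const std::u16string& units) {
    std::vector<char32_t> points;
    for (std::size_t i = 0; i < units.size(); ++i) {
        char32_t cp = units[i];
        const bool is_high = cp >= 0xD800 && cp <= 0xDBFF;
        if (is_high && i + 1 < units.size()) {
            const char32_t low = units[i + 1];
            if (low >= 0xDC00 && low <= 0xDFFF) {
                // Combine the pair: 10 bits from each half, offset by 0x10000.
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                ++i;  // the low surrogate has been consumed
            }
        }
        points.push_back(cp);
    }
    return points;
}
```

For example, the string u"A\U0001F600" holds three code units but decodes to two code points, U+0041 and U+1F600. Whether you want one escape per code unit or per code point depends on the target language: JavaScript and JSON \uXXXX escapes work on code units, so the loop above is already suitable for them.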

Answered by Philipp on Jan 09 '23 06:01


I assume you're doing code-generation (of JavaScript, maybe?)

A QString is essentially a sequence of QChar values. Loop through the contents, and on each QChar call the unicode() method to get its ushort (16-bit integer) value.

Then format each value with a format string like "\\u%04X", i.e. \u followed by the value as four zero-padded hex digits.

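NB. Formatting the ushort as an integer does not depend on the platform's byte order, so no byte swapping is needed here. Byte order only matters if you read the string's raw UTF-16 memory one byte at a time, where each code unit's two bytes (four hex digits) may appear in either order.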
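The steps described in this answer can be sketched in standard C++ without Qt, using std::u16string in place of QString. The function name `escape_non_ascii` is hypothetical, and copying ASCII characters through unchanged is an assumption taken from the output the question asks for.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: escape each non-ASCII UTF-16 code unit as \uXXXX.
// ASCII is copied through unchanged (an assumption based on the question's
// desired output, where "Hello" is left as-is).
std::string escape_non_ascii(const std::u16string& text) {
    std::string out;
    out.reserve(6 * text.size());  // worst case: every unit becomes 6 chars
    for (char16_t unit : text) {
        if (unit < 0x80) {
            out += static_cast<char>(unit);
        } else {
            char buf[7];  // backslash, 'u', 4 hex digits, terminating NUL
            // The unit is formatted as a numeric value, not as raw bytes.
            std::snprintf(buf, sizeof buf, "\\u%04X", static_cast<unsigned>(unit));
            out += buf;
        }
    }
    return out;
}
```

Calling escape_non_ascii(u"\u0532 Hi") returns the text \u0532 Hi, with the Armenian letter escaped and the ASCII part untouched.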

Answered by Daniel Earwicker on Jan 09 '23 06:01