
How to print Unicode character in C++?

I am trying to print a Russian "ф" (U+0444 CYRILLIC SMALL LETTER EF) character, which is given a code of decimal 1092. Using C++, how can I print out this character? I would have thought something along the lines of the following would work, yet...

    int main (){
        wchar_t f = '1060';
        cout << f << endl;
    }
James Raitsev asked Aug 18 '12 03:08





2 Answers

To represent the character you can use Universal Character Names (UCNs). The character 'ф' has the Unicode value U+0444 and so in C++ you could write it '\u0444' or '\U00000444'. Also if the source code encoding supports this character then you can just write it literally in your source code.

    // both of these assume that the character can be represented with
    // a single char in the execution encoding
    char b = '\u0444';
    char a = 'ф'; // this line additionally assumes that the source character encoding supports this character

Printing such characters out depends on what you're printing to. If you're printing to a Unix terminal emulator, the terminal emulator is using an encoding that supports this character, and that encoding matches the compiler's execution encoding, then you can do the following:

    #include <iostream>

    int main() {
        std::cout << "Hello, ф or \u0444!\n";
    }

This program does not require that 'ф' can be represented in a single char. On OS X and most any modern Linux install this will work just fine, because the source, execution, and console encodings will all be UTF-8 (which supports all Unicode characters).

Things are harder with Windows and there are different possibilities with different tradeoffs.

Probably the best, if you don't need portable code (you'll be using wchar_t, which should really be avoided on every other platform), is to set the mode of the output file handle to take only UTF-16 data.

    #include <iostream>
    #include <io.h>
    #include <fcntl.h>

    int main() {
        _setmode(_fileno(stdout), _O_U16TEXT);
        std::wcout << L"Hello, \u0444!\n";
    }

Portable code is more difficult.

bames53 answered Sep 21 '22 09:09



When compiling with -std=c++11, one can simply write:

    const char *s = u8"\u0444";
    cout << s << endl;
James Raitsev answered Sep 20 '22 09:09
