Output unicode strings in Windows console app

I was trying to output a Unicode string to the console with iostreams and failed.

I found the question "Using unicode font in c++ console app", and this snippet from it works:

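    SetConsoleOutputCP(CP_UTF8);
    wchar_t s[] = L"èéøÞǽлљΣæča";
    // First call: ask for the required buffer size (includes the terminating null).
    int bufferSize = WideCharToMultiByte(CP_UTF8, 0, s, -1, NULL, 0, NULL, NULL);
    char* m = new char[bufferSize];
    // Second call: perform the actual UTF-16 to UTF-8 conversion.
    WideCharToMultiByte(CP_UTF8, 0, s, -1, m, bufferSize, NULL, NULL);
    wprintf(L"%S", m);  // in the Microsoft CRT, %S in wprintf takes a narrow string
    delete[] m;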

However, I did not find any way to output unicode correctly with iostreams. Any suggestions?

This does not work:

    SetConsoleOutputCP(CP_UTF8);
    utf8_locale = locale(old_locale, new boost::program_options::detail::utf8_codecvt_facet());
    wcout.imbue(utf8_locale);
    wcout << L"¡Hola!" << endl;

EDIT: I could not find any solution other than wrapping this snippet in a stream operator. I hope somebody has a better idea.

    // Unicode output for a Windows console
    ostream &operator-(ostream &stream, const wchar_t *s)
    {
        int bufSize = WideCharToMultiByte(CP_UTF8, 0, s, -1, NULL, 0, NULL, NULL);
        char *buf = new char[bufSize];
        WideCharToMultiByte(CP_UTF8, 0, s, -1, buf, bufSize, NULL, NULL);
        wprintf(L"%S", buf);
        delete[] buf;
        return stream;
    }

    ostream &operator-(ostream &stream, const wstring &s)
    {
        stream - s.c_str();
        return stream;
    }
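The manual buffer handling in the operator above can be moved into a small helper that returns a std::string, so nothing has to be freed by hand. This is only a minimal sketch under assumptions: the name to_utf8 is hypothetical, WideCharToMultiByte exists only on Windows (on other platforms this sketch simply returns an empty string), and whether the console renders the resulting bytes still depends on SetConsoleOutputCP(CP_UTF8) and on the console font.

```cpp
#include <cstddef>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (not part of the question): converts UTF-16 text to
// UTF-8. Returns an empty string on failure or when not built for Windows.
std::string to_utf8(const std::wstring& wide)
{
    if (wide.empty()) {
        return std::string();
    }
#ifdef _WIN32
    const int wideLen = static_cast<int>(wide.size());
    // First call: query the number of bytes needed. Because an explicit
    // length is passed, no terminating null is counted or written.
    const int size = WideCharToMultiByte(CP_UTF8, 0, wide.data(), wideLen,
                                         nullptr, 0, nullptr, nullptr);
    if (size <= 0) {
        return std::string();
    }
    std::string utf8(static_cast<std::size_t>(size), '\0');
    // Second call: convert into the string's own buffer.
    WideCharToMultiByte(CP_UTF8, 0, wide.data(), wideLen,
                        &utf8[0], size, nullptr, nullptr);
    return utf8;
#else
    // Assumption: only Windows is in scope for this sketch.
    return std::string();
#endif
}
```

With the console code page set to UTF-8, something like std::cout << to_utf8(L"¡Hola!") << std::endl; should then send the UTF-8 bytes through the narrow stream, although older consoles and runtimes may still mangle multi-byte sequences.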
asked Mar 22 '10 12:03 by Andrew


2 Answers

I have verified a solution using Visual Studio 2010, based on an MSDN article and an MSDN blog post. The trick is an obscure call to _setmode(..., _O_U16TEXT).

Solution:

    #include <iostream>
    #include <io.h>
    #include <fcntl.h>

    int wmain(int argc, wchar_t* argv[])
    {
        _setmode(_fileno(stdout), _O_U16TEXT);
        std::wcout << L"Testing unicode -- English -- Ελληνικά -- Español." << std::endl;
    }

[Screenshot: Unicode in console]

answered Sep 20 '22 03:09 by DuckMaestro


Unicode Hello World in Chinese

Here is a Hello World in Chinese. Actually, it is just "Hello". I tested this on Windows 10, but I think it should work on Windows Vista and later. Before Windows Vista it will be hard if you want a programmatic solution rather than configuring the console, the registry, etc. If you really need to do this on Windows 7, maybe have a look at: Change console Font Windows 7

I don't want to claim this is the only solution, but this is what worked for me.

Outline

  1. Unicode project setup
  2. Set the console codepage to unicode
  3. Find and use a font that supports the characters you want to display
  4. Use the locale of the language you want to display
  5. Use the wide character output i.e. std::wcout

1 Project Setup

I am using Visual Studio 2017 CE. I created a blank console app, and the default settings are fine. If you experience problems or use a different IDE, you might want to check these:

In your project properties find configuration properties -> General -> Project Defaults -> Character Set. It should be "Use Unicode Character Set" not "Multi-Byte". This will define _UNICODE and UNICODE preprocessor macros for you.

    int wmain(int argc, wchar_t* argv[])

Also, I think we should use the wmain function instead of main. Both work, but in a Unicode environment wmain may be more convenient.

Also my source files are UTF-16-LE encoded, which seems to be the default in Visual Studio 2017.

2 Console Codepage

This is quite obvious: we need a Unicode codepage in the console. To check your default codepage, open a console and type chcp without any arguments. We have to change it to 65001, which is the UTF-8 codepage (see: Windows Codepage Identifiers). There is a preprocessor macro for that codepage: CP_UTF8. I needed to set both the input and the output codepage. When I omitted either one, the output was incorrect.

    SetConsoleOutputCP(CP_UTF8);
    SetConsoleCP(CP_UTF8);

You might also want to check the boolean return values of those functions.
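Checking those return values could look roughly like the sketch below. Both functions return a nonzero value on success, and GetLastError gives the error code otherwise. The helper name use_utf8_console is hypothetical, and the non-Windows branch is only there so the sketch compiles everywhere.

```cpp
#include <cstdio>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: switches the console's output and input code pages
// to UTF-8 and reports the first failure on stderr.
bool use_utf8_console()
{
#ifdef _WIN32
    if (!SetConsoleOutputCP(CP_UTF8)) {
        std::fprintf(stderr, "SetConsoleOutputCP failed, error %lu\n",
                     static_cast<unsigned long>(GetLastError()));
        return false;
    }
    if (!SetConsoleCP(CP_UTF8)) {
        std::fprintf(stderr, "SetConsoleCP failed, error %lu\n",
                     static_cast<unsigned long>(GetLastError()));
        return false;
    }
    return true;
#else
    // Assumption: nothing to configure outside Windows.
    return true;
#endif
}
```

Calling it once at the start of wmain, before any output, and bailing out or falling back to plain ASCII when it returns false keeps a codepage failure from showing up later as garbled text.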

3 Choose a Font

So far I have not found a console font that supports every character, so I had to choose one. If the characters you want to output are split across fonts, some available only in one font and some only in another, then I believe there is no solution, unless there is a font out there that supports every character. I also did not look into how to install a font.

I think it is not possible to use two different fonts in the same console window at the same time.

How do you find a compatible font? Open your console and go to the properties of the console window by clicking on the icon in the upper left of the window. Go to the fonts tab, choose a font, and click OK. Then try to enter your characters in the console window. Repeat this until you find a font you can work with, then note down its name.

You can also change the size of the font in the properties window. Once you have found a size you are happy with, note down the size values displayed in the "selected font" section of the properties window. It shows the width and height in pixels.

To actually set the font programmatically you use:

    CONSOLE_FONT_INFOEX fontInfo;
    // ... configure fontInfo
    SetCurrentConsoleFontEx(hConsole, false, &fontInfo);

See my example at the end of this answer for details, or look it up in the fine manual: SetCurrentConsoleFontEx. This function has only existed since Windows Vista.

4 Set the locale

You will need to set the locale to that of the language whose characters you want to print.

    char* a = setlocale(LC_ALL, "chinese");

The return value is interesting. It contains a string describing exactly which locale was chosen. Just give it a try :-) I tested with chinese and german. More info: setlocale
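To inspect that return value safely, keep in mind that setlocale returns a null pointer when the requested locale is not available. A minimal sketch, where select_locale is a hypothetical helper name and the placeholder text is my own choice:

```cpp
#include <clocale>
#include <string>

// Hypothetical helper: tries to select the named locale for all categories
// and returns the runtime's description of what was actually chosen.
std::string select_locale(const char* name)
{
    const char* chosen = std::setlocale(LC_ALL, name);
    if (chosen == nullptr) {
        // setlocale returns a null pointer if the request cannot be honored;
        // the current locale is left unchanged in that case.
        return "(locale not available)";
    }
    // Copy right away: later setlocale calls may overwrite the buffer.
    return std::string(chosen);
}
```

Printing select_locale("chinese") shows the full locale name the runtime picked. Which short names such as "chinese" or "german" are accepted depends on the C runtime, so the same call may fail on other platforms.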

5 Use wide character output

Not much to say here. If you want to output wide characters, use this, for example:

    std::wcout << L"你好" << std::endl;

Oh, and don't forget the L prefix for wide characters! And if you type literal Unicode characters like this in the source file, the source file must be Unicode encoded, like the UTF-16-LE default in Visual Studio. Or maybe use Notepad++ and set the encoding to UCS-2 LE BOM.

Example

Finally I put it all together as an example:

    #include <Windows.h>
    #include <algorithm>
    #include <iostream>
    #include <io.h>
    #include <fcntl.h>
    #include <locale.h>
    #include <wincon.h>

    int wmain(int argc, wchar_t* argv[])
    {
        SetConsoleTitle(L"My Console Window - 你好");
        HANDLE hConsole = GetStdHandle(STD_OUTPUT_HANDLE);

        // The return value names the locale that was actually chosen.
        char* a = setlocale(LC_ALL, "chinese");
        SetConsoleOutputCP(CP_UTF8);
        SetConsoleCP(CP_UTF8);

        // Zero-initialize so no field is left with an indeterminate value.
        CONSOLE_FONT_INFOEX fontInfo = {};
        fontInfo.cbSize = sizeof(fontInfo);
        fontInfo.FontFamily = 54;
        fontInfo.FontWeight = 400;
        fontInfo.nFont = 0;
        const wchar_t myFont[] = L"KaiTi";
        fontInfo.dwFontSize = { 18, 41 };
        // Copy the face name including its terminating null.
        std::copy(myFont, myFont + (sizeof(myFont) / sizeof(wchar_t)), fontInfo.FaceName);

        SetCurrentConsoleFontEx(hConsole, false, &fontInfo);

        std::wcout << L"Hello World!" << std::endl;
        std::wcout << L"你好!" << std::endl;
        return 0;
    }

Cheers !

Edit on 2021-11-20

Maybe you can also try the new Windows Terminal. It seems to print Unicode out of the box. You will still need to configure a font that supports your characters in the settings. It is developed by Microsoft as open source on GitHub, and you can also install it from the Microsoft Store. I successfully tried this on Windows 10.

answered Sep 22 '22 03:09 by David