
C++: output contents of a Unicode file to console in Windows

I've read a bunch of articles and forum posts discussing this problem, but all of the solutions seem way too complicated for such a simple task.

Here's some sample code, adapted from cplusplus.com:

// reading a text file
#include <iostream>
#include <fstream>
#include <string>
using namespace std;

int main () {
  string line;
  ifstream myfile ("example.txt");
  if (myfile.is_open())
  {
    // test getline itself, so a failed read at end of file is not printed
    while ( getline (myfile,line) )
    {
      cout << line << endl;
    }
    myfile.close();
  }

  else cout << "Unable to open file"; 

  return 0;
}

It works fine as long as example.txt has only ASCII characters. Things get messy if I try to add, say, something in Russian.

In GNU/Linux it's as simple as saving the file as UTF-8.

In Windows, that doesn't work. Converting the file into UCS-2 Little Endian (what Windows seems to use by default) and changing all the functions into their wchar_t counterparts doesn't do the trick either.

Isn't there some kind of a "correct" way to get this done without doing all kinds of magic encoding conversions?

Nikolai asked Feb 05 '11 19:02


2 Answers

The Windows console supports Unicode, sort of. It does not support right-to-left text or "complex scripts". To print a UTF-16 file with Visual C++, use the following:

   _setmode(_fileno(stdout), _O_U16TEXT);   

And use wcout instead of cout.
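As a rough illustration of the approach above, here is a minimal sketch that reads a UTF-16 Little Endian file and prints it through wcout. The helper names decode_utf16le and print_utf16_file are made up for this example. It assumes the file really is UTF-16LE (optionally starting with a byte order mark) and that wchar_t is 16 bits wide, as it is with Visual C++. The _setmode call is guarded so the rest still compiles on other platforms.

```cpp
// Sketch: print a UTF-16LE file to the Windows console (Visual C++).
#include <cstddef>
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>

#ifdef _WIN32
#include <fcntl.h>  // _O_U16TEXT
#include <io.h>     // _setmode
#include <stdio.h>  // _fileno
#endif

// Hypothetical helper: turns raw UTF-16LE bytes into UTF-16 code units,
// skipping a leading byte order mark (FF FE) if there is one.
std::wstring decode_utf16le(const std::string& bytes) {
    std::size_t i = 0;
    if (bytes.size() >= 2 &&
        static_cast<unsigned char>(bytes[0]) == 0xFF &&
        static_cast<unsigned char>(bytes[1]) == 0xFE) {
        i = 2;
    }
    std::wstring text;
    for (; i + 1 < bytes.size(); i += 2) {
        const unsigned lo = static_cast<unsigned char>(bytes[i]);
        const unsigned hi = static_cast<unsigned char>(bytes[i + 1]);
        text.push_back(static_cast<wchar_t>(lo | (hi << 8)));
    }
    return text;
}

// Hypothetical helper: reads the whole file in binary mode and writes it
// to stdout. Returns false if the file cannot be opened.
bool print_utf16_file(const char* path) {
    std::ifstream file(path, std::ios::binary);
    if (!file) return false;
    const std::string bytes((std::istreambuf_iterator<char>(file)),
                            std::istreambuf_iterator<char>());
#ifdef _WIN32
    // After this call, use only wide output (wcout, wprintf) on stdout.
    _setmode(_fileno(stdout), _O_U16TEXT);
#endif
    std::wcout << decode_utf16le(bytes);
    return true;
}
```

Calling print_utf16_file("example.txt") from main should be enough to try it. The file is read in binary mode on purpose, so the runtime does not attempt any translation of its own before the bytes are decoded.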

There is no support for a "UTF8" code page, so for UTF-8 you will have to use MultiByteToWideChar to convert the text to UTF-16 first.
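A minimal sketch of that conversion might look like the following. The names strip_utf8_bom and utf8_to_wide are invented for this example, and it assumes the input bytes are valid UTF-8. The usual pattern is to call MultiByteToWideChar twice: once to get the required length, and once to fill the buffer. The BOM helper is there because some Windows editors put a byte order mark at the start of UTF-8 files.

```cpp
// Sketch: convert UTF-8 file contents to UTF-16 for wide output.
#include <cstddef>
#include <string>

#ifdef _WIN32
#include <windows.h>  // MultiByteToWideChar, CP_UTF8
#endif

// Hypothetical helper: drops a leading UTF-8 byte order mark (EF BB BF).
std::string strip_utf8_bom(const std::string& bytes) {
    if (bytes.compare(0, 3, "\xEF\xBB\xBF") == 0) {
        return bytes.substr(3);
    }
    return bytes;
}

#ifdef _WIN32
// Hypothetical helper: converts UTF-8 bytes to a UTF-16 wide string.
// Returns an empty string if the conversion fails.
std::wstring utf8_to_wide(const std::string& utf8) {
    if (utf8.empty()) return std::wstring();
    const int len = static_cast<int>(utf8.size());
    // First call: ask how many wide characters the result needs.
    const int needed = MultiByteToWideChar(CP_UTF8, 0, utf8.data(), len,
                                           nullptr, 0);
    if (needed <= 0) return std::wstring();
    std::wstring wide(static_cast<std::size_t>(needed), L'\0');
    // Second call: perform the conversion into the buffer.
    MultiByteToWideChar(CP_UTF8, 0, utf8.data(), len, &wide[0], needed);
    return wide;
}
#endif
```

On Windows, each line read from the file could then be printed with something like wcout << utf8_to_wide(line), after the _setmode call shown above, with strip_utf8_bom applied to the first line only.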

More on console support for Unicode can be found in this blog.

John answered Sep 26 '22 01:09


The right way to output to a console on Windows using cout is to first call GetConsoleOutputCP, and then convert the input you have into the console code page. Alternatively, use WriteConsoleW, passing a wchar_t*.
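Both options could be sketched roughly as below. This is only an illustration: the function names are hypothetical, and it assumes the input is UTF-8 (the answer does not say which encoding you start from). Outside Windows the conversion is skipped, on the assumption that the terminal accepts UTF-8 as is, which matches what the question reports for GNU/Linux.

```cpp
// Sketch: re-encode UTF-8 text for the console's current output code page.
#include <cstddef>
#include <iostream>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: returns `utf8` re-encoded for the console.
// Characters the code page cannot represent are replaced by Windows
// with a default character.
std::string to_console_encoding(const std::string& utf8) {
#ifdef _WIN32
    if (utf8.empty()) return std::string();
    const int len = static_cast<int>(utf8.size());

    // Step 1: UTF-8 -> UTF-16.
    const int wide_len = MultiByteToWideChar(CP_UTF8, 0, utf8.data(), len,
                                             nullptr, 0);
    if (wide_len <= 0) return std::string();
    std::wstring wide(static_cast<std::size_t>(wide_len), L'\0');
    MultiByteToWideChar(CP_UTF8, 0, utf8.data(), len, &wide[0], wide_len);

    // Step 2: UTF-16 -> the console's output code page.
    const UINT code_page = GetConsoleOutputCP();
    const int out_len = WideCharToMultiByte(code_page, 0, wide.data(), wide_len,
                                            nullptr, 0, nullptr, nullptr);
    if (out_len <= 0) return std::string();
    std::string out(static_cast<std::size_t>(out_len), '\0');
    WideCharToMultiByte(code_page, 0, wide.data(), wide_len,
                        &out[0], out_len, nullptr, nullptr);
    return out;
#else
    // Assumption: elsewhere the terminal takes UTF-8 unchanged.
    return utf8;
#endif
}

// Hypothetical helper: prints one line of UTF-8 text through cout.
void print_utf8_line(const std::string& utf8) {
    std::cout << to_console_encoding(utf8) << '\n';
}

#ifdef _WIN32
// Alternative: pass UTF-16 text straight to the console. This fails
// when stdout is redirected to a file or pipe.
bool write_wide_to_console(const std::wstring& text) {
    DWORD written = 0;
    return WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE), text.c_str(),
                         static_cast<DWORD>(text.size()), &written,
                         nullptr) != 0;
}
#endif
```

The code page route keeps cout usable but can only show characters the active code page contains. The WriteConsoleW route avoids that limit, at the cost of not working with redirected output.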

Martin v. Löwis answered Sep 27 '22 01:09