
Writing Unicode to a file in C++

Tags: c++, unicode, ofstream, writetofile

I have a problem writing Unicode to a file in C++. I want to write a few smiley faces, the character you get by typing ALT+NUMPAD(2), to a file with my own extension. I can display one in CMD by assigning the value '\2' to a char, which shows a smiley face, but writing it to a file does not produce the same character.

Here is a snippet of code for my program:

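ofstream myfile;
// Backslashes in the path must be escaped; "\U" and "\t" are otherwise read as escape sequences.
myfile.open("C:\\Users\\My Username\\test.exampleCodeFile");
myfile << "\2"; // writes the single byte 02h
myfile.close();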

It writes to the file, but the file doesn't display what I want. I would show you what it displays, but Stack Overflow won't let me post the character. Thanks in advance.


asked Apr 09 '13 19:04

Garrett Ratliff

1 Answer

You have to use Unicode to specify the characters you want to display. The console renders byte 02h using code page 437 (cp437), which maps it to the Unicode character U+263B (☻). Saving the source file as UTF-8 with BOM makes Unicode easier to use, because you can paste or type the characters you want directly instead of resorting to Unicode escape codes.
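To make the byte-to-character relationship concrete, here is a minimal sketch that maps the first few cp437 glyph bytes to their Unicode code points and encodes a code point as UTF-8. The helper names `cp437_glyph_to_unicode` and `encode_utf8` are hypothetical, chosen for illustration; they are not part of the standard library, and the table covers only bytes 01h to 06h.

```cpp
#include <string>

// Hypothetical helper: Unicode code point for the cp437 glyphs at bytes 01h-06h.
// Returns 0 for any other byte (this sketch covers only those six glyphs).
char32_t cp437_glyph_to_unicode(unsigned char byte)
{
    switch (byte) {
        case 0x01: return 0x263A; // white smiling face
        case 0x02: return 0x263B; // black smiling face (ALT+NUMPAD 2)
        case 0x03: return 0x2665; // heart
        case 0x04: return 0x2666; // diamond
        case 0x05: return 0x2663; // club
        case 0x06: return 0x2660; // spade
        default:   return 0;
    }
}

// Hypothetical helper: encode one code point as UTF-8 bytes.
// Assumes a valid code point; surrogates and out-of-range values are not checked.
std::string encode_utf8(char32_t cp)
{
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For example, `encode_utf8(cp437_glyph_to_unicode(0x02))` yields the three bytes E2 98 BB, which is how U+263B appears in a UTF-8 file. The single byte 02h on its own is just a control character to any program that does not interpret it through cp437.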

A file stream needs to be configured for UTF-8. There are various ways to do this, depending on the compiler; the following works with Visual Studio 2012 and a source file saved as UTF-8 with BOM:

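#include <locale>
#include <codecvt>
#include <fstream>
#include <iostream>
#include <cstdio>   // stdout, _fileno
#include <io.h>     // _setmode (Microsoft-specific)
#include <fcntl.h>  // _O_U16TEXT (Microsoft-specific)
using namespace std;

int main()
{
    // Locale whose codecvt facet converts wchar_t text to UTF-8 on output.
    const std::locale utf8_locale = std::locale(std::locale(), new std::codecvt_utf8<wchar_t>());
    wofstream f(L"sample.txt");
    f.imbue(utf8_locale);
    f << L"\u263b我是美国人。我叫马克。" << endl;

    // Switch stdout to UTF-16 mode so wcout can print the same text to the console.
    _setmode(_fileno(stdout), _O_U16TEXT);
    wcout << L"\u263b我是美国人。我叫马克。" << endl;
}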

Content of sample.txt as viewed in Notepad:

☻我是美国人。我叫马克。

Hex dump (correct UTF-8):

E298BBE68891E698AFE7BE8EE59BBDE4BABAE38082E68891E58FABE9A9ACE5858BE380820D0A

Console output, copied and pasted here. Without a suitable console font, each Chinese character was displayed as a placeholder glyph (�), but the characters display correctly when pasted into Stack Overflow or Notepad.

☻我是美国人。我叫马克。
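As one of the "various ways" mentioned above, a different technique from the locale-based approach is to write already-encoded UTF-8 bytes through an ordinary narrow stream opened in binary mode. The sketch below assumes the text is already UTF-8 (for example, spelled with byte escapes); the function names `write_utf8_file` and `read_bytes` are hypothetical and exist only for this illustration.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: write already-encoded UTF-8 bytes to a file unchanged.
// Returns false if the file could not be opened or written.
bool write_utf8_file(const std::string& path, const std::string& utf8_text)
{
    // Binary mode: bytes are written as-is, with no newline translation.
    std::ofstream out(path, std::ios::binary);
    if (!out) {
        return false;
    }
    out.write(utf8_text.data(), static_cast<std::streamsize>(utf8_text.size()));
    return static_cast<bool>(out);
}

// Hypothetical helper: read a file back as raw bytes, to check what was written.
std::string read_bytes(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Calling `write_utf8_file("sample.txt", "\xE2\x98\xBB")` would store the three bytes E2 98 BB, the UTF-8 encoding of U+263B. This avoids the wide-stream and locale machinery, at the cost of the program being responsible for producing valid UTF-8 itself.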

answered Sep 28 '22 09:09

Mark Tolonen
