
C++: how to write/read with ofstream in Unicode / UTF-8

I have a UTF-8 text file that I'm reading with a simple:

ifstream in("test.txt");

Now I'd like to create a new file encoded as UTF-8 (or another Unicode encoding). How can I do this with ofstream or something else? The following creates an ANSI-encoded file:

ofstream out(fileName.c_str(), ios::out | ios::app | ios::binary);
user63898 asked Feb 17 '11 08:02




1 Answer

OK, about the portable variant. It is easy if you can use the C++11 standard, because it added headers such as <codecvt> (with facets like std::codecvt_utf8) that handle the conversion for you. Note that <codecvt> has been deprecated since C++17.
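
For the C++11 route, a minimal sketch might look like the following. It assumes the "utf8" include mentioned above refers to <codecvt> and its std::codecvt_utf8 facet; write_utf8 and read_utf8 are hypothetical helper names, not part of any library. Since <codecvt> is deprecated as of C++17, newer compilers may emit warnings.

```cpp
#include <codecvt>   // std::codecvt_utf8 (C++11, deprecated since C++17)
#include <fstream>
#include <iterator>
#include <locale>
#include <string>

// Hypothetical helper: writes wide text to `path` encoded as UTF-8 (no BOM).
bool write_utf8(const std::string& path, const std::wstring& text) {
    std::wofstream out;
    // Imbue before opening so the facet is active from the first write.
    out.imbue(std::locale(std::locale::classic(),
                          new std::codecvt_utf8<wchar_t>));
    out.open(path, std::ios::out | std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out << text;
    return static_cast<bool>(out.flush());
}

// Hypothetical helper: reads a UTF-8 file back into wide text.
std::wstring read_utf8(const std::string& path) {
    std::wifstream in;
    in.imbue(std::locale(std::locale::classic(),
                         new std::codecvt_utf8<wchar_t>));
    in.open(path, std::ios::in | std::ios::binary);
    return std::wstring(std::istreambuf_iterator<wchar_t>(in),
                        std::istreambuf_iterator<wchar_t>());
}
```

Calling write_utf8("out.txt", L"h\u00e9llo") should produce a file in which the accented letter occupies two bytes. On platforms where wchar_t is 16 bits (such as Windows), this facet only covers the Basic Multilingual Plane, so treat the sketch as a starting point.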

But if you need multi-platform code that targets older standards, you can write through streams like this:

  1. Read the article about the UTF converter for streams
  2. Add stxutif.h from that article's sources to your project
  3. Open the file as a plain byte stream (binary mode) and write the UTF-8 BOM at the start of the file, like this:

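    #include <fstream>

    std::ofstream fs;
    fs.open(filepath, std::ios::out | std::ios::binary);

    // UTF-8 byte order mark (BOM): EF BB BF
    const unsigned char smarker[3] = {0xEF, 0xBB, 0xBF};

    // write() emits exactly 3 bytes; operator<< would treat the array as a
    // null-terminated string and read past its end.
    fs.write(reinterpret_cast<const char*>(smarker), sizeof smarker);
    fs.close();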
    
  4. Then reopen the file in append mode as a wide stream with the UTF-8 conversion facet and write your content there:

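    #include <fstream>
    #include <locale>
    #include "stxutif.h"  // provides utf8cvt (see step 2)

    std::wofstream fs;
    // Imbue before opening so the facet is active from the first write.
    std::locale utf8_locale(std::locale(), new utf8cvt<false>);
    fs.imbue(utf8_locale);
    fs.open(filepath, std::ios::out | std::ios::app);

    fs << L"Your text here\n";  // Write anything you want...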
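
If the text is already held as UTF-8 bytes, a plain std::ofstream opened in binary mode writes those bytes unchanged, so no conversion facet is needed. The sketch below illustrates that alternative, which is not the method from the steps above. It assumes you assemble the bytes yourself, and append_utf8 and write_utf8_with_bom are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>

// Hypothetical helper: appends the UTF-8 encoding of one code point to `out`.
void append_utf8(std::string& out, std::uint32_t cp) {
    // Valid Unicode scalar values only (no surrogates, max U+10FFFF).
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: writes the BOM followed by already-encoded UTF-8 bytes.
bool write_utf8_with_bom(const std::string& path, const std::string& utf8) {
    std::ofstream out(path, std::ios::out | std::ios::binary | std::ios::trunc);
    if (!out) return false;
    static const char bom[] = {'\xEF', '\xBB', '\xBF'};
    out.write(bom, sizeof bom);
    out.write(utf8.data(), static_cast<std::streamsize>(utf8.size()));
    return static_cast<bool>(out.flush());
}
```

The BOM is optional for UTF-8; some Windows editors use it to detect the encoding, while many Unix tools expect files without it, so you may prefer to skip that write.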
    
Yarkov Anton answered Sep 23 '22 06:09
