I'm trying to use the new Unicode character types in C++0x, so I wrote this sample code:
#include <fstream>
#include <string>
int main()
{
    std::u32string str = U"Hello World";
    std::basic_ofstream<char32_t> fout("output.txt");
    fout << str;
    return 0;
}
But after running this program I get an empty output.txt file. Why doesn't it print Hello World?
Also, is there something like cout and cin already defined for these types, or do stdin and stdout not support Unicode?
Edit: I'm using g++ and Linux.
EDIT: ATTENTION. I have discovered that the standards committee removed Unicode streams from C++0x, so the previously accepted answer is no longer correct. For more information, see my answer!
UTF-8 can represent all 1,114,112 Unicode code points. Most C code that deals with strings on a byte-by-byte basis still works, since UTF-8 is fully compatible with 7-bit ASCII. Characters usually require fewer than four bytes. Sorting UTF-8 strings byte by byte gives the same order as sorting by code point.
As far as I know, the standard C char data type is 1 byte (at least 8 bits). The C standard does not require ASCII, although most platforms use an ASCII-compatible encoding.
Unicode defines three encoding forms: UTF-8, UTF-16, and UTF-32, which use 8-bit, 16-bit, and 32-bit code units respectively. In UTF-16, each character in the Basic Multilingual Plane is one 16-bit code unit (2 bytes) wide, and every other character takes two code units. Code points are usually written as U+hhhh, where hhhh is the hexadecimal value of the code point.
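To make the code unit sizes concrete, here is a rough sketch that counts how many code units a single code point occupies in UTF-8, UTF-16, and UTF-32. The function names are made up for illustration (they are not standard library functions), and the sketch assumes the argument is a valid code point (at most U+10FFFF and not a surrogate).

```cpp
#include <cstddef>

// Hypothetical helpers, not part of the standard library.
// Assumption: cp is a valid Unicode code point (<= 0x10FFFF, not a surrogate).

// Number of 8-bit code units (bytes) needed in UTF-8.
constexpr std::size_t utf8_units(char32_t cp) {
    return cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
}

// Number of 16-bit code units needed in UTF-16
// (code points above U+FFFF need a surrogate pair).
constexpr std::size_t utf16_units(char32_t cp) {
    return cp < 0x10000 ? 1 : 2;
}

// UTF-32 always uses exactly one 32-bit code unit per code point.
constexpr std::size_t utf32_units(char32_t) {
    return 1;
}
```

For example, U+0041 (A) takes one unit in every form, while U+1F600 takes four bytes in UTF-8, two code units in UTF-16, and one in UTF-32.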
C#, Java, and Python 3, as far as I know, all use Unicode for their native string types.
Support for Unicode string literals began in GCC 4.5. Maybe that's the problem.
[edit]
After some digging I've found that streams for these new Unicode literals are described in N2035, which was included in a draft of the standard. According to this document you need u32ofstream to output your string, but this class is absent from the GCC 4.5 C++0x library.
As a workaround you can use ordinary fstream:
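std::ofstream fout2("output2.txt", std::ios::out | std::ios::binary);
// Write the raw char32_t code units; the byte order is that of the host machine.
fout2.write(reinterpret_cast<const char *>(str.data()), str.size() * sizeof(char32_t));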
This way I've output your string in UTF-32LE on my Intel machine (which is little-endian).
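Because the raw write depends on the byte order of the machine, a variant that always produces UTF-32LE may be useful. The following is only a sketch under that assumption; to_utf32le and write_utf32le are hypothetical helper names, not library functions.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: serialize each code point as four bytes,
// least significant byte first (UTF-32LE), whatever the host byte order.
std::string to_utf32le(const std::u32string& text) {
    std::string bytes;
    bytes.reserve(text.size() * 4);
    for (char32_t cp : text) {
        for (int shift = 0; shift < 32; shift += 8) {
            bytes.push_back(static_cast<char>((cp >> shift) & 0xFF));
        }
    }
    return bytes;
}

// Hypothetical helper: write the UTF-32LE bytes to a file in binary mode.
// Returns false if the stream reports a failure.
bool write_utf32le(const std::string& path, const std::u32string& text) {
    std::ofstream out(path, std::ios::out | std::ios::binary);
    const std::string bytes = to_utf32le(text);
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    return static_cast<bool>(out);
}
```

Calling write_utf32le("output2.txt", str) should then give the same file on little-endian and big-endian machines, at the cost of building an intermediate byte string.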
[edit]
I was a little bit wrong about the status of u32ofstream: according to the latest draft on the C++ Standards Committee's web site, you have to use std::basic_ofstream<char32_t>, as you did. This class would use the codecvt<char32_t,char,typename traits::state_type> class (see the end of §27.9.1.1), which has to be implemented in the standard library (search for codecvt<char32_t in the document), but it's not available in GCC 4.5.
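Until such a codecvt facet is available, one possible way to get a readable text file is to encode the string as UTF-8 by hand and write it through an ordinary std::ofstream. This is a minimal sketch, not the approach the draft standard describes; append_utf8, to_utf8, and write_utf8_file are hypothetical helper names, and the code assumes every element of the string is a valid code point.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: append the UTF-8 encoding of one code point.
// Precondition (assumption): cp is not a surrogate and is <= U+10FFFF.
void append_utf8(std::string& out, char32_t cp) {
    assert(cp <= 0x10FFFF && (cp < 0xD800 || cp > 0xDFFF));
    if (cp < 0x80) {
        // 7-bit ASCII is stored unchanged.
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
}

// Hypothetical helper: convert a whole UTF-32 string to UTF-8 bytes.
std::string to_utf8(const std::u32string& text) {
    std::string out;
    for (char32_t cp : text) {
        append_utf8(out, cp);
    }
    return out;
}

// Hypothetical helper: write the string to a file as UTF-8.
// Returns false if the stream reports a failure.
bool write_utf8_file(const std::string& path, const std::u32string& text) {
    std::ofstream out(path, std::ios::out | std::ios::binary);
    out << to_utf8(text);
    return static_cast<bool>(out);
}
```

With these helpers, write_utf8_file("output.txt", str) should produce a file containing Hello World as plain ASCII-compatible text, since all of its characters are below U+0080.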