How to write a non-English string to a file and read from that file with C++?

I want to write a std::wstring to a file and read that content back as a std::wstring. This works as expected when the string is L"<any English letters>", but it fails when the string contains non-English characters such as Bengali, Kannada, or Japanese. I have tried several options:

  1. Converting the std::wstring to a std::string and writing that to the file, then reading it back as a std::string and converting it to a std::wstring
    • The write succeeds (I can see the text in an editor), but reading it back yields the wrong characters
  2. Writing the std::wstring to a std::wofstream; this also does not work for non-English text such as std::wstring data = L"হ্যালো ওয়ার্ল্ড";

The platforms are macOS and Linux; the language is C++.

Code:

bool
write_file(
    const char*         path,
    const std::wstring  data
) {
    bool status = false;
    try {
        std::wofstream file(path, std::ios::out|std::ios::trunc|std::ios::binary);
        if (file.is_open()) {
            //std::string data_str = convert_wstring_to_string(data);
            file.write(data.c_str(), (std::streamsize)data.size());
            file.close();
            status = true;
        }
    } catch (...) {
        std::cout<<"exception !"<<std::endl;
    }
    return status;
}


// Read Method

std::wstring
read_file(
    const char*  filename
) {
    std::wifstream fhandle(filename, std::ios::in | std::ios::binary);
    if (fhandle) {
        std::wstring contents;
        fhandle.seekg(0, std::ios::end);
        contents.resize((int)fhandle.tellg());
        fhandle.seekg(0, std::ios::beg);
        fhandle.read(&contents[0], contents.size());
        fhandle.close();
        return(contents);
    }
    else {
        return L"";
    }
}

// Main

int main()
{
  const char* file_path_1 = "./file_content_1.txt";
  const char* file_path_2 = "./file_content_2.txt";

  //std::wstring data = L"Text message to write onto the file\n";  // This works as expected
  std::wstring data = L"হ্যালো ওয়ার্ল্ড";  // This does not work as expected

  // Let's write some data
  write_file(file_path_1, data);
  // Let's read the file
  std::wstring out = read_file(file_path_1);

  std::wcout<<L"File Content: "<<out<<std::endl;
  // Let's write that same data to a different file
  write_file(file_path_2, out);
  return 0;
}
Abhrajyoti Kirtania asked Aug 02 '13 08:08

1 Answer

How a wchar_t is output depends on the stream's locale. The default locale ("C") generally accepts only ASCII (Unicode code points 0x20 to 0x7E, plus a few control characters).
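
To see this for yourself, here is a small probe (a sketch only; the function name writes_under_classic_locale is made up for illustration). It pushes a wide string through a std::wofstream that explicitly uses the classic "C" locale and reports whether the stream survived. On typical Linux and macOS toolchains I would expect it to succeed for plain ASCII text and to fail for the Bengali string from the question, but the exact behaviour is implementation-dependent.

```cpp
#include <cassert>
#include <fstream>
#include <locale>
#include <string>

// Hypothetical probe: writes `text` through a wide file stream that uses the
// classic "C" locale and returns true only if no stream error occurred.
inline bool writes_under_classic_locale(const char* path, const std::wstring& text) {
    assert(path != nullptr);
    std::wofstream file;
    file.imbue(std::locale::classic());  // same as the startup default
    file.open(path, std::ios::out | std::ios::trunc | std::ios::binary);
    if (!file.is_open()) {
        return false;
    }
    file << text;
    file.close();  // flushes the buffer; a conversion failure leaves the stream failed
    return !file.fail();
}
```

A false return for non-ASCII input means the locale's character conversion rejected the text, not that the file could not be created.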

Any time a program handles text, the very first statement in main should be:

std::locale::global( std::locale( "" ) );

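If the program uses any of the standard stream objects (std::wcout, std::wcin, and so on), the code should also imbue them with the global locale before any input or output, since they do not pick up a later change to the global locale on their own.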

James Kanze answered Oct 16 '22 04:10
