Convert wstring to string encoded in UTF-8

I need to convert between wstring and string. I figured out that using the codecvt facet should do the trick, but it doesn't seem to work for a UTF-8 locale.

My understanding is that when I read a UTF-8 encoded file into chars, a single non-ASCII character such as č is read into two chars (UTF-8 is a multi-byte encoding that uses one to four bytes per character). I'd like to create such a UTF-8 string from the wstring representation, for a library I use in my code.

Does anybody know how to do it?

I already tried this:

    locale mylocale("cs_CZ.utf-8");
    mbstate_t mystate;

    wstring mywstring = L"čřžýáí";

    const codecvt<wchar_t,char,mbstate_t>& myfacet =
      use_facet<codecvt<wchar_t,char,mbstate_t> >(mylocale);

    codecvt<wchar_t,char,mbstate_t>::result myresult;

    size_t length = mywstring.length();
    char* pstr = new char[length+1];

    const wchar_t* pwc;
    char* pc;

    // translate characters:
    myresult = myfacet.out(mystate,
        mywstring.c_str(), mywstring.c_str()+length+1, pwc,
        pstr, pstr+length+1, pc);

    if (myresult == codecvt<wchar_t,char,mbstate_t>::ok)
      cout << "Translation successful: " << pstr << endl;
    else
      cout << "failed" << endl;
    return 0;

This prints 'failed' for the cs_CZ.utf-8 locale, but works correctly for the cs_CZ.iso8859-2 locale.

Trakhan asked Dec 05 '10 12:12



1 Answer

The code below might help you :)

    #include <codecvt>
    #include <locale>   // std::wstring_convert is declared here
    #include <string>

    // convert UTF-8 string to wstring
    std::wstring utf8_to_wstring (const std::string& str)
    {
        std::wstring_convert<std::codecvt_utf8<wchar_t>> myconv;
        return myconv.from_bytes(str);
    }

    // convert wstring to UTF-8 string
    std::string wstring_to_utf8 (const std::wstring& str)
    {
        std::wstring_convert<std::codecvt_utf8<wchar_t>> myconv;
        return myconv.to_bytes(str);
    }
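As a quick sanity check of these helpers, here is a minimal sketch that round-trips the string from the question. The function name `demo_round_trip` is hypothetical and introduced only for illustration, and the two helpers are repeated just so the snippet compiles on its own. Keep in mind that `std::wstring_convert` and `<codecvt>` are deprecated since C++17, so your compiler may emit warnings.

```cpp
#include <cassert>
#include <codecvt>
#include <locale>
#include <string>

// Same helpers as in the answer, repeated so this sketch is self-contained.
std::string wstring_to_utf8(const std::wstring& str) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>> conv;
    return conv.to_bytes(str);
}

std::wstring utf8_to_wstring(const std::string& str) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>> conv;
    return conv.from_bytes(str);
}

// Hypothetical check, not part of the original answer.
void demo_round_trip() {
    // "čřžýáí" spelled with universal character names, so the result does
    // not depend on the encoding of the source file.
    const std::wstring wide = L"\u010D\u0159\u017E\u00FD\u00E1\u00ED";
    const std::string utf8 = wstring_to_utf8(wide);

    assert(wide.size() == 6);
    assert(utf8.size() == 12);                    // two bytes per character here
    assert(utf8.compare(0, 2, "\xC4\x8D") == 0);  // U+010D is encoded as C4 8D
    assert(utf8_to_wstring(utf8) == wide);        // conversion round-trips
}
```

The six wide characters become twelve bytes, so any hand-allocated output buffer for UTF-8 needs room for more than one char per wide character.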
skyde answered Oct 06 '22 08:10
