I need to convert between wstring and string. I figured that a codecvt facet should do the trick, but it doesn't seem to work for a UTF-8 locale.
My understanding is that when I read a UTF-8 encoded file into chars, a single non-ASCII character such as č ends up as two chars, because UTF-8 encodes it as two bytes. I'd like to create such a UTF-8 string from a wstring, for a library I use in my code.
Does anybody know how to do it?
I already tried this:
    locale mylocale("cs_CZ.utf-8");
    mbstate_t mystate;
    wstring mywstring = L"čřžýáí";

    const codecvt<wchar_t,char,mbstate_t>& myfacet =
        use_facet<codecvt<wchar_t,char,mbstate_t> >(mylocale);
    codecvt<wchar_t,char,mbstate_t>::result myresult;

    size_t length = mywstring.length();
    char* pstr = new char[length+1];

    const wchar_t* pwc;
    char* pc;

    // translate characters:
    myresult = myfacet.out(mystate,
        mywstring.c_str(), mywstring.c_str()+length+1, pwc,
        pstr, pstr+length+1, pc);

    if (myresult == codecvt<wchar_t,char,mbstate_t>::ok)
        cout << "Translation successful: " << pstr << endl;
    else
        cout << "failed" << endl;
    return 0;
This prints 'failed' for the cs_CZ.utf-8 locale but works correctly for the cs_CZ.iso8859-2 locale.
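The code below might help you :) It uses std::wstring_convert with std::codecvt_utf8 (available since C++11, deprecated since C++17), which converts to and from UTF-8 without depending on any named locale: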
    #include <codecvt>
    #include <locale>   // std::wstring_convert
    #include <string>

    // convert UTF-8 string to wstring
    std::wstring utf8_to_wstring(const std::string& str)
    {
        std::wstring_convert<std::codecvt_utf8<wchar_t>> myconv;
        return myconv.from_bytes(str);
    }

    // convert wstring to UTF-8 string
    std::string wstring_to_utf8(const std::wstring& str)
    {
        std::wstring_convert<std::codecvt_utf8<wchar_t>> myconv;
        return myconv.to_bytes(str);
    }
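As for why the facet-based attempt in the question probably fails: the output buffer holds only length+1 chars, which is enough for ISO 8859-2 (one byte per letter) but not for UTF-8, where each of those Czech letters needs two bytes, so out() most likely runs out of space and reports partial instead of ok. The mbstate_t is also left uninitialized. Below is a minimal sketch of the same approach with both points addressed. The function name narrow_with_facet is hypothetical, and the sketch assumes a stateless target encoding such as UTF-8 (it does not call unshift()).

```cpp
#include <cstddef>
#include <cwchar>      // std::mbstate_t
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the original post): converts a wide string
// to the narrow encoding of `loc` using its codecvt facet.
std::string narrow_with_facet(const std::wstring& in, const std::locale& loc)
{
    typedef std::codecvt<wchar_t, char, std::mbstate_t> facet_type;
    const facet_type& facet = std::use_facet<facet_type>(loc);

    if (in.empty())
        return std::string();

    // Value-initialize the conversion state instead of leaving it indeterminate.
    std::mbstate_t state = std::mbstate_t();

    // Worst case: every wide character needs max_length() bytes.
    std::string out(in.size() * static_cast<std::size_t>(facet.max_length()), '\0');

    const wchar_t* from_next = nullptr;
    char* to_next = nullptr;
    facet_type::result res = facet.out(
        state,
        in.data(), in.data() + in.size(), from_next,
        &out[0], &out[0] + out.size(), to_next);

    if (res != facet_type::ok)
        throw std::runtime_error("codecvt::out did not complete the conversion");

    // Trim the buffer to the number of bytes actually written.
    out.resize(static_cast<std::size_t>(to_next - &out[0]));
    return out;
}
```

Calling narrow_with_facet(mywstring, std::locale("cs_CZ.utf-8")) should then yield the UTF-8 bytes, assuming that locale is installed on the system; std::locale's constructor throws std::runtime_error if it is not.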