
How to convert a utf16 ushort array to a utf8 std::string?

I'm currently writing a plugin that is just a wrapper around an existing library. The plugin's host passes me a UTF-16 string whose character type is defined as follows:

typedef unsigned short PA_Unichar;

The wrapped library accepts only UTF-8 strings, as a const char* or a std::string. I tried writing a conversion function like this:

std::string toUtf8(const PA_Unichar* data)
{
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>,char16_t> convert;
    return std::string(convert.to_bytes(static_cast<const char16_t*>(data)));
}

But this doesn't compile. The error is "static_cast from 'const pointer' (aka 'const unsigned short*') to 'const char16_t *' is not allowed".

So what's the most elegant/correct way to do it?

Thank you in advance.

asked Dec 15 '12 09:12 by Robotex


1 Answer

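You could convert the PA_Unichar string to a string of char16_t using the basic_string(Iterator, Iterator) constructor, which copies element by element so no pointer cast is needed, then use the std::codecvt_utf8_utf16 facet as you attempted:

#include <codecvt>  // std::codecvt_utf8_utf16
#include <cstddef>  // std::size_t
#include <locale>   // std::wstring_convert
#include <string>

std::string conv(const PA_Unichar* str, std::size_t len)
{
  // Each unsigned short converts implicitly to char16_t during the copy.
  std::u16string s(str, str+len);
  std::wstring_convert<std::codecvt_utf8_utf16<char16_t>,char16_t> convert;
  return convert.to_bytes(s);
}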

I think that's right. Unfortunately I can't test this, as my implementation doesn't support it yet. I have an implementation of wstring_convert which I plan to include in GCC 4.9, but I don't have an implementation of codecvt_utf8_utf16 to test it with.
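If std::wstring_convert or codecvt_utf8_utf16 is unavailable (or unwanted, since both are deprecated as of C++17), the conversion can also be written by hand. The following is only a sketch under some assumptions: the helper names unicharLength and appendUtf8 are made up for illustration, the one-argument overload assumes the host's string is zero-terminated (the question doesn't say), and malformed surrogates are reported by throwing std::range_error, which is a design choice rather than anything the host or library requires.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

typedef unsigned short PA_Unichar;  // as defined in the question

// Hypothetical helper: length of a zero-terminated PA_Unichar string.
// Assumes the host terminates its strings with a 0 code unit.
std::size_t unicharLength(const PA_Unichar* s)
{
    std::size_t n = 0;
    while (s[n] != 0) ++n;
    return n;
}

// Hypothetical helper: append one Unicode code point to out as UTF-8.
void appendUtf8(std::string& out, unsigned long cp)
{
    if (cp < 0x80) {                       // 1 byte
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Convert len UTF-16 code units to UTF-8, combining surrogate pairs.
std::string toUtf8(const PA_Unichar* data, std::size_t len)
{
    std::string out;
    for (std::size_t i = 0; i < len; ++i) {
        unsigned long cp = data[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {           // high surrogate
            if (i + 1 >= len)
                throw std::range_error("truncated surrogate pair");
            unsigned long low = data[++i];
            if (low < 0xDC00 || low > 0xDFFF)
                throw std::range_error("invalid low surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {    // stray low surrogate
            throw std::range_error("unpaired low surrogate");
        }
        appendUtf8(out, cp);
    }
    return out;
}

// Overload matching the signature from the question (zero-terminated input).
std::string toUtf8(const PA_Unichar* data)
{
    return toUtf8(data, unicharLength(data));
}
```

If the host does report a length, the two-argument overload avoids the scan for the terminator and also handles strings with embedded zero code units.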

answered Oct 24 '22 18:10 by Jonathan Wakely