How do I convert a string in UTF-16 to UTF-8 in C++?

Consider:

STDMETHODIMP CFileSystemAPI::setRRConfig( BSTR config_str, VARIANT* ret )
{
mReportReaderFactory.reset( new sbis::report_reader::ReportReaderFactory() );

USES_CONVERSION;
std::string configuration_str = W2A( config_str );

But in config_str I get a string in UTF-16. How can I convert it to UTF-8 in this piece of code?

user3252635 asked Jan 30 '14 12:01

1 Answer

You can do something like this, using the Win32 function WideCharToMultiByte with the CP_UTF8 code page:

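#include <windows.h>

#include <sstream>
#include <stdexcept>
#include <string>

// Converts a UTF-16 std::wstring to a UTF-8 encoded std::string.
std::string WstrToUtf8Str(const std::wstring& wstr)
{
  std::string retStr;
  if (!wstr.empty())
  {
    // Pass the explicit length instead of -1 so that the terminating null
    // is not counted and embedded nulls are preserved.
    const int srcLen = static_cast<int>(wstr.size());

    // First call: query the required buffer size in bytes.
    int sizeRequired = WideCharToMultiByte(CP_UTF8, 0, wstr.c_str(), srcLen,
                                           NULL, 0, NULL, NULL);

    // Second call: convert directly into the result string.
    int bytesConverted = 0;
    if (sizeRequired > 0)
    {
      retStr.resize(sizeRequired);
      bytesConverted = WideCharToMultiByte(CP_UTF8, 0, wstr.c_str(), srcLen,
                                           &retStr[0], sizeRequired, NULL, NULL);
    }

    if (bytesConverted == 0)
    {
      // Read the error code first, before anything else can overwrite it.
      const DWORD lastError = GetLastError();

      // A wide string cannot be streamed into a narrow stream, so the
      // message reports the Win32 error code rather than the input text.
      std::stringstream err;
      err << __FUNCTION__
          << " failed to convert wstring, GetLastError() = " << lastError;
      throw std::runtime_error( err.str() );
    }
  }
  return retStr;
}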

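You can pass your BSTR to the function as a std::wstring, for example WstrToUtf8Str(std::wstring(config_str, SysStringLen(config_str))). A BSTR may be NULL, which represents an empty string, so check for that before constructing the std::wstring.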
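To see what the conversion involves, or to run it somewhere the Win32 API is not available, here is a minimal portable sketch. It is not a replacement for WideCharToMultiByte. The name Utf16ToUtf8 and the use of std::u16string are assumptions made for illustration, and unpaired surrogates are replaced with U+FFFD as one possible policy.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of the answer above): portable UTF-16 -> UTF-8.
// Assumption of this sketch: unpaired surrogates become U+FFFD.
std::string Utf16ToUtf8(const std::u16string& in)
{
  std::string out;
  out.reserve(in.size());
  for (std::size_t i = 0; i < in.size(); ++i)
  {
    std::uint32_t cp = in[i];
    if (cp >= 0xD800 && cp <= 0xDBFF)
    {
      // High surrogate: combine with the following low surrogate, if any.
      if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF)
      {
        cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
        ++i;
      }
      else
      {
        cp = 0xFFFD;
      }
    }
    else if (cp >= 0xDC00 && cp <= 0xDFFF)
    {
      // Low surrogate without a preceding high surrogate.
      cp = 0xFFFD;
    }

    // Emit the code point as one to four UTF-8 bytes.
    if (cp < 0x80)
    {
      out += static_cast<char>(cp);
    }
    else if (cp < 0x800)
    {
      out += static_cast<char>(0xC0 | (cp >> 6));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else if (cp < 0x10000)
    {
      out += static_cast<char>(0xE0 | (cp >> 12));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else
    {
      out += static_cast<char>(0xF0 | (cp >> 18));
      out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
  }
  return out;
}
```

On Windows, wchar_t is 16 bits wide, so the code units of a BSTR could be copied into a std::u16string before calling this helper. In production code on Windows the API-based function above is usually the simpler choice.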

AndersK answered Oct 12 '22 11:10