From char* to wstring : char* str = "hello worlddd"; wstring wstr (str, str+strlen(str)); From string to wstring : string str = "hello worlddd"; wstring wstr (str.
String overview std::string is used for standard ascii and utf-8 strings. std::wstring is used for wide-character/unicode (utf-16) strings. There is no built-in class for utf-32 strings (though you should be able to extend your own from basic_string if you need one).
The wchar_t type is an implementation-defined wide character type. In the Microsoft compiler, it represents a 16-bit wide character used to store Unicode encoded as UTF-16LE, the native character type on Windows operating systems.
This LPCWSTR is Microsoft defined. So to use them we have to include Windows. h header file into our program. To convert std::wstring to wide character array type string, we can use the function called c_str() to make it C like string and point to wide character string.
Assuming that the input string in your example (おはよう) is a UTF-8 encoded (which it isn't, by the looks of it, but let's assume it is for the sake of this explanation :-)) representation of a Unicode string of your interest, then your problem can be fully solved with the standard library (C++11 and newer) alone.
The TL;DR version:
#include <locale>
#include <codecvt>
#include <string>
std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> converter;
std::string narrow = converter.to_bytes(wide_utf16_source_string);
std::wstring wide = converter.from_bytes(narrow_utf8_source_string);
Longer online compilable and runnable example:
(They all show the same example. There are just many for redundancy...)
Note (old):
As pointed out in the comments and explained in https://stackoverflow.com/a/17106065/6345 there are cases when using the standard library to convert between UTF-8 and UTF-16 might give unexpected differences in the results on different platforms. For a better conversion, consider std::codecvt_utf8
as described on http://en.cppreference.com/w/cpp/locale/codecvt_utf8
Note (new):
Since the codecvt
header is deprecated in C++17, some worry about the solution presented in this answer were raised. However, the C++ standards committee added an important statement in http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0618r0.html saying
this library component should be retired to Annex D, along side , until a suitable replacement is standardized.
So in the foreseeable future, the codecvt
solution in this answer is safe and portable.
int StringToWString(std::wstring &ws, const std::string &s)
{
std::wstring wsTmp(s.begin(), s.end());
ws = wsTmp;
return 0;
}
Your question is underspecified. Strictly, that example is a syntax error. However, mbstowcs
is probably what you're looking for.
It is a C-library function and operates on buffers, but here's an easy-to-use idiom, courtesy of Mooing Duck:
std::wstring ws(s.size(), L' '); // Overestimate number of code points.
ws.resize(::mbstowcs_s(&ws[0], ws.size(), s.c_str(), s.size())); // Shrink to fit.
If you are using Windows/Visual Studio and need to convert a string to wstring you could use:
#include <AtlBase.h>
#include <atlconv.h>
...
string s = "some string";
CA2W ca2w(s.c_str());
wstring w = ca2w;
printf("%s = %ls", s.c_str(), w.c_str());
Same procedure for converting a wstring to string (sometimes you will need to specify a codepage):
#include <AtlBase.h>
#include <atlconv.h>
...
wstring w = L"some wstring";
CW2A cw2a(w.c_str());
string s = cw2a;
printf("%s = %ls", s.c_str(), w.c_str());
You could specify a codepage and even UTF8 (that's pretty nice when working with JNI/Java). A standard way of converting a std::wstring to utf8 std::string is showed in this answer.
//
// using ATL
CA2W ca2w(str, CP_UTF8);
//
// or the standard way taken from the answer above
#include <codecvt>
#include <string>
// convert UTF-8 string to wstring
std::wstring utf8_to_wstring (const std::string& str) {
std::wstring_convert<std::codecvt_utf8<wchar_t>> myconv;
return myconv.from_bytes(str);
}
// convert wstring to UTF-8 string
std::string wstring_to_utf8 (const std::wstring& str) {
std::wstring_convert<std::codecvt_utf8<wchar_t>> myconv;
return myconv.to_bytes(str);
}
If you want to know more about codepages there is an interesting article on Joel on Software: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets.
These CA2W (Convert Ansi to Wide=unicode) macros are part of ATL and MFC String Conversion Macros, samples included.
Sometimes you will need to disable the security warning #4995', I don't know of other workaround (to me it happen when I compiled for WindowsXp in VS2012).
#pragma warning(push)
#pragma warning(disable: 4995)
#include <AtlBase.h>
#include <atlconv.h>
#pragma warning(pop)
Edit: Well, according to this article the article by Joel appears to be: "while entertaining, it is pretty light on actual technical details". Article: What Every Programmer Absolutely, Positively Needs To Know About Encoding And Character Sets To Work With Text.
Windows API only, pre C++11 implementation, in case someone needs it:
#include <stdexcept>
#include <vector>
#include <windows.h>
using std::runtime_error;
using std::string;
using std::vector;
using std::wstring;
wstring utf8toUtf16(const string & str)
{
if (str.empty())
return wstring();
size_t charsNeeded = ::MultiByteToWideChar(CP_UTF8, 0,
str.data(), (int)str.size(), NULL, 0);
if (charsNeeded == 0)
throw runtime_error("Failed converting UTF-8 string to UTF-16");
vector<wchar_t> buffer(charsNeeded);
int charsConverted = ::MultiByteToWideChar(CP_UTF8, 0,
str.data(), (int)str.size(), &buffer[0], buffer.size());
if (charsConverted == 0)
throw runtime_error("Failed converting UTF-8 string to UTF-16");
return wstring(&buffer[0], charsConverted);
}
Here's a way to combining string
, wstring
and mixed string constants to wstring
. Use the wstringstream
class.
This does NOT work for multi-byte character encodings. This is just a dumb way of throwing away type safety and expanding 7 bit characters from std::string into the lower 7 bits of each character of std:wstring. This is only useful if you have a 7-bit ASCII strings and you need to call an API that requires wide strings.
#include <sstream>
std::string narrow = "narrow";
std::wstring wide = L"wide";
std::wstringstream cls;
cls << " abc " << narrow.c_str() << L" def " << wide.c_str();
std::wstring total= cls.str();
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With