Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Convert wchar_t to char

Tags:

c++

I was wondering is it safe to do so?

wchar_t wide = /* something */;
assert(wide >= 0 && wide < 256 &&);
char myChar = static_cast<char>(wide);

If I am pretty sure the wide char will fall within ASCII range.

like image 405
Cheok Yan Cheng Avatar asked Jun 11 '10 03:06

Cheok Yan Cheng


2 Answers

Why not just use a library routine wcstombs.

like image 147
Igor Zevaka Avatar answered Oct 15 '22 01:10

Igor Zevaka


You are looking for wctomb(): it's in the ANSI standard, so you can count on it. It works even when the wchar_t uses a code above 255. You almost certainly do not want to use it.


wchar_t is an integral type, so your compiler won't complain if you actually do:

char x = (char)wc;

but because it's an integral type, there's absolutely no reason to do this. If you accidentally read Herbert Schildt's C: The Complete Reference, or any C book based on it, then you're completely and grossly misinformed. Characters should be of type int or better. That means you should be writing this:

int x = getchar();

and not this:

char x = getchar(); /* <- WRONG! */

As far as integral types go, char is worthless. You shouldn't make functions that take parameters of type char, and you should not create temporary variables of type char, and the same advice goes for wchar_t as well.

char* may be a convenient typedef for a character string, but it is a novice mistake to think of this as an "array of characters" or a "pointer to an array of characters" - despite what the cdecl tool says. Treating it as an actual array of characters with nonsense like this:

for(int i = 0; s[i]; ++i) {
  wchar_t wc = s[i];
  char c = doit(wc);
  out[i] = c;
}

is absurdly wrong. It will not do what you want; it will break in subtle and serious ways, behave differently on different platforms, and you will most certainly confuse the hell out of your users. If you see this, you are trying to reimplement wctombs() which is part of ANSI C already, but it's still wrong.

You're really looking for iconv(), which converts a character string from one encoding (even if it's packed into a wchar_t array), into a character string of another encoding.

Now go read this, to learn what's wrong with iconv.

like image 16
geocar Avatar answered Oct 15 '22 01:10

geocar