I'm trying to understand how printf works with wide characters (wchar_t).
I've written the following code samples:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    wchar_t *s;
    s = (wchar_t *)malloc(sizeof(wchar_t) * 2);
    s[0] = 42;
    s[1] = 0;
    printf("%ls\n", s);
    free(s);
    return (0);
}
output :
*
Everything is fine here: my character (*) is correctly displayed.
I wanted to display another kind of character. On my system, wchar_t seems to be encoded on 4 bytes. So I tried to display the following character: É
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    wchar_t *s;
    s = (wchar_t *)malloc(sizeof(wchar_t) * 2);
    s[0] = 0xC389;
    s[1] = 0;
    printf("%ls\n", s);
    free(s);
    return (0);
}
But there is no output this time. I tried many values from the "encoding" section (cf. previous link) for s[0] (0xC389, 201, 0xC9...), but the É character is never displayed. I also tried %S instead of %ls.
If I call printf like this: printf("<%ls>\n", s), the only character printed is '<'; the output is truncated.
Why do I have this problem? What should I do?
% introduces a conversion specification, which tells printf() how to format the arguments passed to it. To print a literal % character, you have to escape it by writing %%.
A wide character is a computer character datatype that generally has a size greater than the traditional 8-bit character. The increased datatype size allows for the use of larger coded character sets.
Put an l (lowercase letter L) directly before the conversion specifier. The specifier must still match the signedness of the argument: printf("%ld", ULONG_MAX) typically prints -1, so use printf("%lu", ULONG_MAX) for an unsigned long and keep %ld for a signed long.
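As a minimal sketch of the point above, the helper below (format_ulong is a hypothetical name, not a library function) formats an unsigned long with %lu into a caller-supplied buffer, so the result can be inspected instead of only printed.

```c
#include <stdio.h>

/* Hypothetical helper: formats an unsigned long into buf using %lu.
 * Returns snprintf's result: the number of characters the full text needs,
 * not counting the terminating '\0'. */
int format_ulong(char *buf, size_t size, unsigned long value)
{
    /* 'l' is the length modifier, 'u' says the argument is unsigned. */
    return snprintf(buf, size, "%lu", value);
}
```

Called with ULONG_MAX (from <limits.h>), the buffer holds a large positive number and never starts with a minus sign, which is the difference from passing the same value to %ld.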
The field width can also be specified as an asterisk (*), in which case an additional argument of type int is read to determine the field width. For example, to print an integer x in a field whose width is given by the int variable w, you would write the D statement: printf("%*d", w, x);
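The same %*d form is valid in C. The sketch below uses a hypothetical helper, format_in_field, that writes into a buffer with snprintf so the padding is easy to check.

```c
#include <stdio.h>

/* Hypothetical helper: writes value into buf, right-aligned in a field
 * whose width is only known at run time. Returns snprintf's result. */
int format_in_field(char *buf, size_t size, int width, int value)
{
    /* The * consumes the int argument 'width' before 'value' is read. */
    return snprintf(buf, size, "%*d", width, value);
}
```

With width 5 and value 42 the buffer contains three spaces followed by 42. A negative width is treated as the - flag plus a positive width, so the number is left-justified instead.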
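I found a simple way to print wide chars. The key point is to call setlocale() first: printf converts each wide character to the multibyte encoding of the current locale, and in the default "C" locale a non-ASCII character such as É generally cannot be converted, so printf fails and stops writing at that point.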
#include <stdio.h>
#include <wchar.h>
#include <locale.h>
int main(int argc, char *argv[])
{
    // Use the locale from the environment (e.g. en_US.UTF-8).
    setlocale(LC_ALL, "");
    // setlocale(LC_ALL, "C.UTF-8"); // this also works where that locale exists
    wchar_t hello_eng[] = L"Hello World!";
    wchar_t hello_china[] = L"世界, 你好!";
    const wchar_t *hello_japan = L"こんにちは日本!";
    printf("%ls\n", hello_eng);
    printf("%ls\n", hello_china);
    printf("%ls\n", hello_japan);
    return 0;
}
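To see why the question's program printed nothing, it can help to test whether a wide string is representable in the current locale before handing it to printf. The following is only a sketch: convertible_in_current_locale and print_wide_line are hypothetical helper names, and it assumes that wchar_t holds Unicode code points, as it does with glibc.

```c
#include <locale.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: returns 1 if every character in s can be converted
 * to the current locale's multibyte encoding, 0 otherwise. */
int convertible_in_current_locale(const wchar_t *s)
{
    mbstate_t state;
    const wchar_t *src = s;

    memset(&state, 0, sizeof state);
    /* With a NULL destination, wcsrtombs only counts the bytes needed;
     * it returns (size_t)-1 if it meets a character it cannot convert. */
    return wcsrtombs(NULL, &src, 0, &state) != (size_t)-1;
}

/* Hypothetical helper: prints s and a newline with %ls. Returns the number
 * of bytes written, or -1 without printing anything if s cannot be
 * represented in the current locale. */
int print_wide_line(const wchar_t *s)
{
    if (!convertible_in_current_locale(s)) {
        fprintf(stderr, "string is not representable in locale \"%s\"\n",
                setlocale(LC_CTYPE, NULL));
        return -1;
    }
    return printf("%ls\n", s);
}
```

In a program that calls setlocale(LC_ALL, "") at the start of main and runs in a UTF-8 environment, passing a string whose first element is 0xC9 should display É. Under the assumption above, the value to store is the code point U+00C9 (201), not 0xC389, which is the two-byte UTF-8 encoding of that character. Without the setlocale() call the program stays in the "C" locale and the helper reports the failure instead of silently truncating the output.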