I have accented characters in my source code and have tried replacing them with their Unicode escape equivalents. The program compiles and works properly if I use the actual non-ASCII character, but I'm concerned this may impact portability. When I try the Unicode equivalent, I get either "warning: case label value exceeds maximum value for type" or "warning: character constant too long for its type", and the case is never matched when I run the program.
    for(int i = 0; i < ent->d_namlen; i++)
    {
        switch(ent->d_name[i])
        {
            case 'á' : //0x00E1
            ...
        }
    }
Here ent is a struct dirent *ent that gets passed in from a calling function.
In place of case 'á' : I've tried case '0x00E1' :, case L 'u00E1 :, case \U000000E9 : and case '\u00E1' :.
I've also tried all of these without the single quotes, in which case it won't compile (e.g. the compiler says that \u00E1 was not declared in this scope).
á is a non-ASCII character and is being represented as multiple bytes in either your source code, the struct dirent, or both.
If you turn on -Wmultichar you will probably get the warning "warning: multi-character character constant", indicating that the character constant 'á' consists of more than one byte. In that case it's probably in UTF-8, but check (e.g. using the file command). You'll also need to find out the encoding of the dirent entries.
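To make the multi-byte point concrete, here is a minimal sketch that assumes the directory entries are UTF-8 encoded (you would need to verify that on your system). In UTF-8, á (U+00E1) is stored as the two bytes 0xC3 0xA1, so no single char in the name can ever equal 0xE1, which is why the case is never matched. The function name count_a_acute_utf8 is hypothetical, made up for illustration. The bytes are written as hex escapes so the example does not depend on how the source file itself is encoded.

```cpp
#include <cstddef>

// Hypothetical helper: counts occurrences of U+00E1 (á) in a byte string
// that is ASSUMED to be UTF-8, by looking for its two-byte encoding
// 0xC3 0xA1 rather than a single character value.
std::size_t count_a_acute_utf8(const char *name, std::size_t len)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i + 1 < len; i++) {
        // Cast to unsigned char: plain char may be signed, in which case
        // bytes >= 0x80 would compare as negative values.
        unsigned char b0 = static_cast<unsigned char>(name[i]);
        unsigned char b1 = static_cast<unsigned char>(name[i + 1]);
        if (b0 == 0xC3 && b1 == 0xA1) {
            count++;
            i++; // skip the continuation byte we just consumed
        }
    }
    return count;
}
```

With a dirent this would be called as count_a_acute_utf8(ent->d_name, ent->d_namlen). If the names turn out to be in Latin-1 instead, á is the single byte 0xE1 and a comparison against static_cast<unsigned char>(ent->d_name[i]) == 0xE1 would be the analogous check.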
In order to match non-ASCII characters in a string you need to:
int
), or
Look at http://en.cppreference.com/w/cpp/locale/codecvt_utf8 for an example of how to do the conversions.
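As an alternative to std::codecvt_utf8 (which has been deprecated since C++17), the conversion can also be done by hand. The following is only a sketch, assuming the input is UTF-8 and a C++11 compiler: it decodes the bytes into UTF-32 code points so that each character is a single value you can switch on. The names decode_utf8 and count_accented are hypothetical. The decoder is deliberately minimal: malformed sequences become U+FFFD, and overlong or surrogate encodings are not rejected.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: decodes bytes ASSUMED to be UTF-8 into code points.
std::u32string decode_utf8(const char *s, std::size_t len)
{
    std::u32string out;
    std::size_t i = 0;
    while (i < len) {
        unsigned char b = static_cast<unsigned char>(s[i]);
        char32_t cp;
        std::size_t extra; // number of continuation bytes expected
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else { out.push_back(0xFFFD); i++; continue; } // invalid lead byte
        i++;
        bool ok = true;
        for (std::size_t k = 0; k < extra; k++) {
            if (i >= len) { ok = false; break; }            // truncated input
            unsigned char c = static_cast<unsigned char>(s[i]);
            if ((c & 0xC0) != 0x80) { ok = false; break; }  // not a continuation
            cp = (cp << 6) | (c & 0x3F);
            i++;
        }
        out.push_back(ok ? cp : char32_t(0xFFFD));
    }
    return out;
}

// Hypothetical helper: the switch now works because each case label and
// each element of the string is a whole code point (char32_t).
std::size_t count_accented(const std::u32string &text)
{
    std::size_t n = 0;
    for (char32_t c : text) {
        switch (c) {
        case U'\u00E1': // á
        case U'\u00E9': // é
            n++;
            break;
        default:
            break;
        }
    }
    return n;
}
```

In the loop from the question, you would decode once with decode_utf8(ent->d_name, ent->d_namlen) and then iterate over the resulting std::u32string instead of over the raw bytes.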