I have a C code in which I am using standard library function isalpha() in ctype.h, This is on Visual Studio 2010-Windows. In below code, if char c is '£', the isalpha call returns an assertion as shown in the snapshot below:
char c='£';
if(isalpha(c))
{
printf ("character %c is alphabetic\n",c);
}
else
{
printf ("character %c is NOT alphabetic\n",c);
}
I can see that this might be because 8 bit ASCII does not have this character.
So how do I handle such Non-ASCII characters outside of ASCII table?
What I want to do is if any non-alphabetic character is found(even if it includes such character not in 8-bit ASCII table) i want to be able to neglect it.
You may want to cast the value sent to isalpha
(and the other functions declared in <ctype.h>
) to unsigned char
isalpha((unsigned char)value)
It's one of the (not so) few occasions where a cast is appropriate in C.
Edited to add an explanation.
According to the standard, emphasis is mine
7.4
1 The header
<ctype.h>
declares several functions useful for classifying and mapping characters. In all cases the argument is anint
, the value of which shall be representable as anunsigned char
or shall equal the value of the macroEOF
. If the argument has any other value, the behavior is undefined.
The cast to unsigned char
ensures calling isalpha()
does not invoke Undefined Behaviour.
You must pass an int
to isalpha()
, not a char
. Note the standard prototype for this function:
int isalpha(int c);
Passing an 8-bit signed character will cause the value to be converted into a negative integer, resulting in an illegal negative offset into the internal arrays typically used by isxxxx()
.
However you must ensure that your char
is treated as unsigned
when casting - you can't simply cast it directly to an int
, because if it's an 8-bit character the resulting int
would still be negative.
The typical way to ensure this works is to cast it to an unsigned char
, and then rely on implicit type conversion to convert that into an int
.
e.g.
char c = '£';
int a = isalpha((unsigned char) c);
You may be compiling using wchar (UNICODE) as character type, in that case the isalpha method to use is iswalpha
http://msdn.microsoft.com/en-us/library/xt82b8z8.aspx
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With