I'm fairly new to programming and was just wondering why this code:
for ( ; *p; ++p) *p = tolower(*p);
works to lowercase a string in C, when p points to a string?
In general, this code:
for ( ; *p; ++p) *p = tolower(*p);
does not “work to lowercase a string in C, when p points to a string”.
It does work for pure ASCII, but since char is usually a signed type, and since tolower requires an argument that is either non-negative (representable as unsigned char) or the special value EOF, the code will in general have Undefined Behavior for strings that contain negative char values, such as non-ASCII Latin-1 characters.
To avoid that, cast the argument to unsigned char, like this:
for ( ; *p; ++p) *p = tolower( (unsigned char)*p );
Now it can work for single-byte encodings like Latin-1, provided you have set the correct locale via setlocale, e.g. setlocale( LC_ALL, "" );. However, note that the very common UTF-8 encoding does not use a single byte per character. To deal with UTF-8 text you can convert it to a wide string and lowercase that.
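For the wide-string route, here is a minimal sketch, not a definitive implementation. The helper name lower_multibyte is made up for illustration, and it assumes the string is valid in the encoding of the current locale and that the lowercased text fits in the original buffer (otherwise it reports failure and leaves the string alone).

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: lowercases the multibyte string s in place, using the
   current locale. Returns 0 on success, -1 on failure (s is left unchanged). */
int lower_multibyte(char *s)
{
    size_t len = strlen(s);

    /* A multibyte string never has more characters than bytes. */
    wchar_t *w = malloc((len + 1) * sizeof *w);
    if (w == NULL) return -1;

    size_t n = mbstowcs(w, s, len + 1);       /* multibyte -> wide */
    if (n == (size_t)-1) { free(w); return -1; }

    for (size_t i = 0; i < n; ++i)
        w[i] = (wchar_t)towlower((wint_t)w[i]);

    /* Convert back into a scratch buffer first, since a lowercase character
       can need a different number of bytes than its uppercase form. */
    size_t cap = n * MB_CUR_MAX + 1;
    char *out = malloc(cap);
    if (out == NULL) { free(w); return -1; }

    int rc = -1;
    size_t m = wcstombs(out, w, cap);         /* wide -> multibyte */
    if (m != (size_t)-1 && m <= len) {
        memcpy(s, out, m + 1);                /* include the terminating zero */
        rc = 0;
    }

    free(out);
    free(w);
    return rc;
}
```

Call setlocale( LC_ALL, "" ); once at program start, then lower_multibyte(buf) on a writable buffer. Which non-ASCII characters actually get lowercased depends on the locale the program ends up running with.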
Details:
*p is an expression that denotes the object that p points to, presumably a char.
As a continuation condition for the for loop, any non-zero char value that *p denotes has the effect of logical True, while the zero char value at the end of the string has the effect of logical False, ending the loop.
++p advances the pointer to point to the next char.
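To make those three pieces visible, here is the same loop spelled out longhand, as a sketch. The function name lower_in_place is just something I made up for illustration; it includes the unsigned char cast discussed above.

```c
#include <ctype.h>

/* Hypothetical helper: same effect as the one-line for loop,
   with each step written out. p must point to a writable string. */
void lower_in_place(char *p)
{
    while (*p != '\0') {                          /* same test as the bare *p */
        *p = (char)tolower((unsigned char)*p);    /* replace the current char */
        ++p;                                      /* move on to the next char */
    }
}
```

Note that this needs a writable array such as char s[] = "Hello";. Passing a pointer to a string literal and modifying it is Undefined Behavior.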