Some PHP string functions (such as strtoupper) are locale-dependent. But it is still not clear to me whether the locale matters when I know for certain that a particular string consists of ASCII (0-127) characters only. Can I be guaranteed that strtoupper('abc..xyz') will always return ABC..XYZ regardless of locale? Do PHP string functions behave the same in the ASCII range regardless of locale?
While the answer about strtoupper is important to me, the question is more general and covers the whole string function library.
I want to be sure that a user-selected locale (on a multi-language site) will not break my core functionality, which has nothing to do with internationalization.
Do PHP string functions work the same in ASCII range independent from locale?
No, I'm afraid not. The primary counterexample is the dreaded Turkish dotted-I:
setlocale(LC_CTYPE, "tr_TR");
echo strtoupper('hi!');
// -> "H\xDD!" ('Hİ!' in ISO-8859-9)
In the worst case you may have to provide your own locale-independent string handling. Calling setlocale to revert to C or some other locale is kind of a fix, but the POSIX process-level locale model is a really bad fit for modern client/server apps.
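As a rough illustration of "your own locale-independent string handling", here is a minimal sketch that maps only the 26 ASCII letters and leaves every other byte alone. The helper names ascii_strtoupper and ascii_strtolower are made up for this example and are not part of PHP. As far as I know, PHP 8.2 and later make strtoupper() and strtolower() ASCII-only regardless of locale, so a helper like this mainly matters on older versions.

```php
<?php
// Hypothetical helpers (not built into PHP): ASCII-only case conversion.
// The three-argument form of strtr() is a plain byte-for-byte translation
// and does not consult the current locale.

function ascii_strtoupper(string $s): string
{
    return strtr($s, 'abcdefghijklmnopqrstuvwxyz', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ');
}

function ascii_strtolower(string $s): string
{
    return strtr($s, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz');
}

echo ascii_strtoupper('hi!'), "\n"; // HI!
echo ascii_strtolower('HI!'), "\n"; // hi!
```

Bytes outside a-z and A-Z (including everything above 127) pass through unchanged, so the result does not depend on whatever setlocale() was called with earlier in the request.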
PHP string functions treat one byte as one character. In the ASCII range 0-127 that is fine.
To safely handle multiple languages using UTF-8, use the mb_*() functions or a UTF-8 library, or wait until 2030 when PHP6 is released.
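To make the byte-versus-character point concrete, here is a small sketch comparing the plain functions with their mb_* counterparts. It assumes the mbstring extension is loaded and PHP 7 or later (for the \u{...} escape); the sample string is just an illustration.

```php
<?php
// Requires the mbstring extension.
// "\u{00E9}" is e-acute, which takes two bytes in UTF-8.
$s = "h\u{00E9}llo";

echo strlen($s), "\n";                 // 6 (bytes)
echo mb_strlen($s, 'UTF-8'), "\n";     // 5 (characters)
echo mb_strtoupper($s, 'UTF-8'), "\n"; // HÉLLO
```

Passing the encoding explicitly means the result does not hinge on the mbstring default encoding configured on the server.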