Note: What I think I know is probably wrong, so please kindly fix my knowledge :)
I just answered a question about UTF-8 and PHP.
I suggested using str_ireplace('Волгоград', '', $a).
I didn't expect this to work, but it did.
I always thought PHP treated one byte as one character, which is why you need the mb_* functions to get accurate results with characters outside the ASCII range.
I assumed the Russian characters would take > 1 byte each.
I thought str_replace() would work because the bytes can be matched regardless of whether they belong to multibyte characters, as long as they appear in the same order.
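To make that byte-level view concrete, here is a minimal sketch. It assumes the script is saved as UTF-8 and that the mbstring extension is loaded; the sample sentence in $text is made up for illustration.

```php
<?php
// Assumption: this file is saved as UTF-8 and ext/mbstring is available.
$needle = 'Волгоград';
$text   = 'Я живу в городе Волгоград.'; // hypothetical sample sentence

// strlen() counts bytes; each Cyrillic letter takes two bytes in UTF-8.
$bytes = strlen($needle);              // 18
// mb_strlen() counts characters (code points).
$chars = mb_strlen($needle, 'UTF-8');  // 9

// str_replace() compares raw bytes, so a multibyte needle matches
// as long as its byte sequence appears unchanged in the haystack.
$result = str_replace($needle, '', $text);

echo $bytes, ' ', $chars, PHP_EOL;
echo $result, PHP_EOL;
```

The replacement never needs to know where one character ends and the next begins; it only needs the same run of bytes to be present.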
I thought str_ireplace() would not work because PHP wouldn't know how to map the non-ASCII characters to their alternate-case equivalents. But it did work.
Where and how am I wrong? Give me as much information as you can :)
It works by lowercasing the text through the libc functions, which depend on the locale settings. With an appropriate locale, the text is lowercased correctly, provided the bytes are in the charset that locale expects.
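A rough sketch of the two situations, assuming a UTF-8 source file and the mbstring extension (the sample sentence and the manual removal at the end are only illustrative). When the needle has the same case as the haystack, its bytes already match, so it does not matter how the byte-wise folding treats Cyrillic. When the case differs, the folding has to understand the encoding, and the mb_* functions are the safer route. As far as I know, PHP 8.2 and later fold only ASCII in str_ireplace() regardless of locale.

```php
<?php
// Assumption: this file is saved as UTF-8 and ext/mbstring is available.
$text = 'Я живу в городе Волгоград.'; // hypothetical sample sentence

// Same case as in the haystack: identical bytes stay identical after
// any per-byte case folding, so the match succeeds.
$sameCase = str_ireplace('Волгоград', '', $text);

// Different case: use a Unicode-aware, case-insensitive search.
// mb_stripos() returns a character offset, or false if nothing matches.
$needle = 'ВОЛГОГРАД';
$pos = mb_stripos($text, $needle, 0, 'UTF-8');

// Illustrative removal of the first match, cut out by character offsets.
$removed = $pos === false
    ? $text
    : mb_substr($text, 0, $pos, 'UTF-8')
      . mb_substr($text, $pos + mb_strlen($needle, 'UTF-8'), null, 'UTF-8');

echo $sameCase, PHP_EOL;
echo $removed, PHP_EOL;
```

Both echo statements print the sentence with the city name removed; only the second approach is independent of the locale and of the case used in the needle.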