Why did this str_ireplace() work on a non-ASCII string?

Question

Note: What I think I know is probably wrong, so please kindly fix my knowledge :)

I just answered a question about UTF-8 and PHP.

I suggested using str_ireplace('Волгоград', '', $a).

I didn't expect this to work, but it did.

I always thought PHP treated one byte as one character, which is why you need the mb_* functions to get accurate results with characters outside the ASCII range.

I assumed each Russian character would take more than one byte.

I thought str_replace() would work because the bytes can be matched whether or not they belong to multibyte characters, as long as they appear in the same order.
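The byte-level reasoning above can be checked with a short sketch. It assumes the source file is saved as UTF-8 and that the mbstring extension is loaded; the sample sentence in $a is made up for illustration.

```php
<?php
// Assumption: this file is saved as UTF-8 and ext/mbstring is available.
$city = 'Волгоград';

// strlen() counts bytes. Each Cyrillic letter here takes 2 bytes in UTF-8.
$byteCount = strlen($city);               // 18

// mb_strlen() counts characters when given the encoding.
$charCount = mb_strlen($city, 'UTF-8');   // 9

// str_replace() compares raw bytes, so an exact-case match succeeds
// without any knowledge of the encoding.
$a = 'Город Волгоград, Россия';           // hypothetical sample input
$result = str_replace($city, '', $a);     // 'Город , Россия'
```

The replacement succeeds because the needle's 18 bytes appear in the subject in the same order, not because PHP understands the characters.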

I thought str_ireplace() would not work because PHP wouldn't know how to map the non-ASCII characters to their other-case equivalents. But it did work.

Where and how am I wrong? Give me as much information as you can :)

Ignacio Vazquez-Abrams · Accepted Answer

It works because str_ireplace() lowercases the text by passing it to libc functions, which depend on the locale settings. With appropriate settings, the text is lowercased properly, provided the bytes are in the charset that the locale expects.
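One way to probe this is to compare a subject that already has the needle's case with one that does not. The sketch below assumes a UTF-8 source file, PHP 8 or later, and no setlocale() call, so the default "C" locale applies; the sample strings are made up. Under those assumptions the byte-oriented case folding leaves the Cyrillic bytes untouched, so str_ireplace() matches only when the case already agrees, which may be why the call in the question appeared to work. Results can differ on older PHP versions or with a single-byte locale and a matching charset. preg_replace() with the i and u modifiers is shown as one Unicode-aware alternative, not as the mechanism the answer describes.

```php
<?php
// Assumptions: UTF-8 source file, PHP 8+, default "C" locale (no setlocale()).
$needle = 'Волгоград';

// Same case as the needle: the bytes are identical, so no case mapping is needed.
$sameCase = str_ireplace($needle, '', 'Город Волгоград');

// Different case: the byte-oriented folding does not map these letters,
// so nothing is replaced under the assumptions above.
$upperCase = str_ireplace($needle, '', 'Город ВОЛГОГРАД');

// Unicode-aware alternative: PCRE with i (caseless) and u (UTF-8) modifiers.
$pattern = '/' . preg_quote($needle, '/') . '/iu';
$unicodeAware = preg_replace($pattern, '', 'Город ВОЛГОГРАД');
```

If case-insensitive matching of non-ASCII text is required, a Unicode-aware function such as the preg_replace() call above is the safer choice, since it does not depend on the process locale.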

Tags: php, character-encoding, utf-8

alex
