What does the following sentence from the PHP manual mean in simple terms?
And what is a byte sequence? How many characters are in a byte?
iconv_strlen() counts the occurrences of characters in the given byte sequence str on the basis of the specified character set, the result of which is not necessarily identical to the length of the string in byte.
Let's take the Japanese character 'こ' as an example. In UTF-8 encoding, this is a 3-byte character (0xE3 0x81 0x93). Let's see what happens when we use strlen instead:
$ php -r 'echo strlen("こ") . "\n";'
3
The result is 3, since strlen is counting bytes. However, this is only a single character according to UTF-8 encoding. That's where iconv_strlen comes in. It knows that in UTF-8, this is a single character, even though it's made up of 3 bytes. So if we try this instead:
$ php -r 'echo iconv_strlen("こ", "UTF-8") . "\n";'
1
We get 1. That's what that explanation is meant to point out.
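To answer the other two questions: a byte sequence is just the raw bytes the string is stored as, and in UTF-8 one character takes between 1 and 4 bytes, so it's really "how many bytes in a character" rather than the other way around. Here is a small sketch that prints both counts side by side. The helper name describe() and the sample strings are made up for illustration, and it assumes PHP 7 or later with the iconv extension available and the file saved as UTF-8.

```php
<?php
// Hypothetical helper: report byte count, character count, and raw bytes
// for a string that is assumed to be valid UTF-8.
function describe(string $s): array {
    return [
        'bytes' => strlen($s),                // counts bytes
        'chars' => iconv_strlen($s, 'UTF-8'), // counts characters
        'hex'   => bin2hex($s),               // the byte sequence, in hex
    ];
}

// "\u{00E9}" is é written as a code point escape (PHP 7+).
foreach (["a", "\u{00E9}", "こ", "こんにちは"] as $s) {
    $d = describe($s);
    echo "{$s}: {$d['bytes']} bytes, {$d['chars']} chars, hex {$d['hex']}\n";
}
```

For 'こ' this should report 3 bytes, 1 character, and the hex bytes e38193, which matches the 0xE3 0x81 0x93 above. Plain ASCII like "a" is the only case where the two counts agree.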
"The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)"