
PHP iconv_strlen() meaning question

Tags: php, iconv

What does the following sentence mean, in simple terms?

Also, what is a byte sequence, and how many characters are in a byte?

iconv_strlen() counts the occurrences of characters in the given byte sequence str on the basis of the specified character set, the result of which is not necessarily identical to the length of the string in byte.

HELP asked May 22 '11 04:05


2 Answers

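Take the Japanese character 'こ' as an example. A byte sequence is just the raw bytes a string is stored as, and depending on the encoding a single character can take one or more of those bytes. In UTF-8, 'こ' is a 3-byte character (0xE3 0x81 0x93). Let's see what happens when we use strlen on it: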

$ php -r 'echo strlen("こ") . "\n";'
3

The result is 3, since strlen is counting bytes. However, this is only a single character according to UTF-8 encoding. That's where iconv_strlen comes in. It knows that in UTF-8, this is a single character, even though it's made up of 3 bytes. So if we try this instead:

$ php -r 'echo iconv_strlen("こ", "UTF-8") . "\n";'
1

We get 1. That's what the sentence is pointing out: the number of characters depends on the character set, and it doesn't have to match the number of bytes.
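To make the difference easier to poke at, here is a minimal sketch that reports both counts for a few strings. The helper name byte_and_char_counts is made up for this illustration (it is not part of PHP), and the sketch assumes PHP 7 or later with the iconv extension enabled. The characters are written as "\u{...}" escapes so the result does not depend on how the file itself is saved.

```php
<?php
// Hypothetical helper for illustration only: returns the byte count
// (strlen) and the character count (iconv_strlen) of the same string.
function byte_and_char_counts(string $text, string $charset = 'UTF-8'): array
{
    return [
        'bytes' => strlen($text),                 // counts raw bytes
        'chars' => iconv_strlen($text, $charset), // counts characters in $charset
    ];
}

// One character each, but UTF-8 needs 1, 2, 3 and 4 bytes for them.
$samples = [
    'a',          // U+0061
    "\u{00E9}",   // e with acute accent
    "\u{3053}",   // the Japanese character from the example above
    "\u{1F600}",  // an emoji
];

foreach ($samples as $char) {
    $counts = byte_and_char_counts($char);
    printf("%s: %d byte(s), %d character(s)\n", $char, $counts['bytes'], $counts['chars']);
}

// The same five-character word stored in two encodings:
// the byte count changes, the character count does not.
$utf8  = "\u{3053}\u{3093}\u{306B}\u{3061}\u{306F}";
$utf16 = iconv('UTF-8', 'UTF-16LE', $utf8);

print_r(byte_and_char_counts($utf8, 'UTF-8'));     // bytes: 15, chars: 5
print_r(byte_and_char_counts($utf16, 'UTF-16LE')); // bytes: 10, chars: 5
```

So there is no fixed number of characters per byte. It is the other way around: each character takes some number of bytes, and that number depends on the encoding you name in the second argument.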

onteria_ answered Oct 24 '22 23:10


Read Joel Spolsky's article "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)"

Ignacio Vazquez-Abrams answered Oct 24 '22 22:10