strlen() php function giving the wrong length of unicode characters [duplicate]

Tags: php, strlen

I am trying to get the length of this string of Unicode characters:

$text = 'نام سلطان م';
$length = strlen($text);
echo $length;

output

20

How does PHP determine the length of a string of Unicode characters?

Munib asked Apr 05 '13 08:04


3 Answers

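strlen() does not count multibyte characters the way you expect: it counts bytes, which amounts to assuming that one character equals one byte, and that does not hold for Unicode text. In UTF-8 each of the nine Arabic letters in your string takes two bytes and each of the two spaces takes one, hence 9 × 2 + 2 = 20. This behavior is clearly documented: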

strlen() returns the number of bytes rather than the number of characters in a string.

The solution is to use the mb_strlen() function instead (mb stands for multibyte; see the mb_strlen() docs).
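As a minimal sketch of the difference, the snippet below runs both functions on the string from the question. It assumes the source file is saved as UTF-8 and that the mbstring extension is loaded; the variable names $bytes and $chars are only illustrative.

```php
<?php
// Assumption: this file is saved as UTF-8 and mbstring is available.
$text = 'نام سلطان م';

// strlen() counts bytes: 9 two-byte letters + 2 one-byte spaces.
$bytes = strlen($text);

// mb_strlen() counts characters (code points) in the given encoding.
$chars = mb_strlen($text, 'UTF-8');

echo $bytes, "\n"; // 20
echo $chars, "\n"; // 11
```

Passing the encoding explicitly avoids depending on the mbstring internal encoding setting, which may differ between servers.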

EDIT

If changing the code is not possible for any reason, you can have the string functions automatically overloaded by their multibyte counterparts:

To use function overloading, set mbstring.func_overload in php.ini to a positive value that represents a combination of bitmasks specifying the categories of functions to be overloaded. It should be set to 1 to overload the mail() function. 2 for string functions, 4 for regular expression functions. For example, if it is set to 7, mail, strings and regular expression functions will be overloaded.

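This is supported by PHP and documented in the mbstring section of the manual (note that this feature has been deprecated since PHP 7.2 and was removed in PHP 8.0, so it is not an option on current versions).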

Please note that you may also need to edit your php.ini to ensure the mbstring extension is enabled. The available settings are documented in the same manual section.
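If you cannot be sure mbstring is enabled on every server, one possible approach is to check for it at runtime and fall back to a UTF-8 aware regular expression. This is only a sketch: utf8_length() is a hypothetical helper name, not something PHP provides, and it assumes the input is UTF-8.

```php
<?php
// Hypothetical helper (not part of PHP): counts UTF-8 characters,
// preferring mbstring and falling back to PCRE when it is missing.
function utf8_length(string $s): int
{
    if (extension_loaded('mbstring')) {
        return mb_strlen($s, 'UTF-8');
    }

    // With the /u modifier, "." matches one UTF-8 code point;
    // the /s modifier lets it match newlines as well.
    $count = preg_match_all('/./us', $s);

    // preg_match_all() returns false on invalid UTF-8; as a last
    // resort, report the byte length instead.
    return $count === false ? strlen($s) : $count;
}

echo utf8_length('نام سلطان م'), "\n"; // 11
```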

Marcin Orlowski answered Oct 19 '22 15:10


You are looking for mb_strlen.

Jon answered Oct 19 '22 16:10


The strlen() function does not count the number of characters, but the number of bytes. For strings containing multibyte characters it therefore returns a number higher than the character count.
Use mb_strlen() instead to get the actual number of characters.
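To see where the extra bytes come from, here is a small sketch that splits a string into characters and prints how many bytes each one occupies in UTF-8. The sample string is my own, chosen to contain one character of each UTF-8 width, and the \u{...} escapes require PHP 7.0 or newer.

```php
<?php
// One character per UTF-8 width: "a" (1 byte), "é" (2), "€" (3), "😀" (4).
// Code point escapes are used so the file encoding does not matter.
$sample = "a\u{00E9}\u{20AC}\u{1F600}";

// Split into individual code points; /u makes PCRE treat the input as UTF-8.
$chars = preg_split('//u', $sample, -1, PREG_SPLIT_NO_EMPTY);

// Byte width of each character.
$widths = array_map('strlen', $chars);

echo implode(' + ', $widths), ' = ', strlen($sample), " bytes\n"; // 1 + 2 + 3 + 4 = 10 bytes
echo mb_strlen($sample, 'UTF-8'), " characters\n";                // 4 characters
```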

Tomáš Zato - Reinstate Monica answered Oct 19 '22 14:10