utf8 string length

Tags: string, php, utf-8

The strlen() function in PHP does not return the correct length for strings containing UTF-8 characters. For example, سلام is 4 characters, but strlen() returns 8:

<?php
echo strlen('سلام');
?>
DolDurma asked Nov 22 '12 08:11


2 Answers

The core PHP string functions all assume 1 character = 1 byte; they have no concept of different encodings. To find out how many characters (not bytes) a UTF-8 string contains, use the multibyte equivalent, mb_strlen, and tell it what encoding the string is in:

echo mb_strlen('سلام', 'UTF-8');
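
To make the byte/character difference concrete, here is a minimal sketch comparing the two functions on the string from the question. It assumes the source file is saved as UTF-8 and that the mbstring extension is loaded; the variable names are only illustrative.

```php
<?php
// Assumes this file is saved as UTF-8 and the mbstring extension is loaded.
$word = 'سلام';

// strlen() counts bytes: each of these four Arabic letters takes 2 bytes in UTF-8.
$bytes = strlen($word);

// mb_strlen() counts characters (code points) in the encoding you name.
$chars = mb_strlen($word, 'UTF-8');

echo $bytes, ' bytes, ', $chars, " characters\n"; // prints "8 bytes, 4 characters"
```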
deceze answered Sep 23 '22 03:09


You can count the UTF-8 code points in a binary PHP string (as long as it is valid UTF-8) with a regular expression:

$length = preg_match_all('(.)su', $subject);

You can also use the multibyte extension if you have it installed:

$length = mb_strlen($subject, 'UTF-8');
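
The two approaches can be combined into one helper, roughly as sketched below. The function name utf8_length is hypothetical (it is not part of PHP), and returning null for invalid input is a design choice made for this example only. It assumes PHP 7.1 or later for the nullable return type.

```php
<?php
// Hypothetical helper (not part of PHP): counts code points in a UTF-8 string.
// Returns null when the input is not valid UTF-8.
function utf8_length(string $subject): ?int
{
    // Prefer mbstring when available. Validate first, because mb_strlen()
    // itself does not report invalid byte sequences.
    if (function_exists('mb_strlen') && function_exists('mb_check_encoding')) {
        return mb_check_encoding($subject, 'UTF-8')
            ? mb_strlen($subject, 'UTF-8')
            : null;
    }

    // Fallback: with the "u" modifier, preg_match_all() returns false
    // when the subject is not valid UTF-8. The "s" modifier lets "."
    // match newlines too, so every code point is counted.
    $count = preg_match_all('/./su', $subject);
    return $count === false ? null : $count;
}

var_dump(utf8_length('سلام'));  // the 4-character string from the question
var_dump(utf8_length("\xFF"));  // a lone 0xFF byte is never valid UTF-8
```

Checking validity up front keeps the two branches consistent: both yield null for malformed input rather than a count that depends on which extension happens to be installed.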

See also: PHP UTF-8 String Length

hakre answered Sep 24 '22 03:09