I'm trying to split a UTF-8 encoded string into an array of characters. The function I use used to work, but for some reason it no longer does. What could be the reason, and better yet, how can I fix it?
This is my string:
Zelf heb ik maar één vraag: wie ben jij?
This is my function:
function utf8Split($str, $len = 1)
{
    $arr = array();
    $strLen = mb_strlen($str);
    for ($i = 0; $i < $strLen; $i++)
    {
        $arr[] = mb_substr($str, $i, $len);
    }
    return $arr;
}
This is the result:
Array
(
[0] => Z
[1] => e
[2] => l
[3] => f
[4] =>
[5] => h
[6] => e
[7] => b
[8] =>
[9] => i
[10] => k
[11] =>
[12] => m
[13] => a
[14] => a
[15] => r
[16] =>
[17] => e
[18] => ́
[19] => e
[20] => ́
[21] => n
[22] =>
[23] => v
[24] => r
[25] => a
[26] => a
[27] => g
[28] => :
[29] =>
[30] => w
[31] => i
[32] => e
[33] =>
[34] => b
[35] => e
[36] => n
[37] =>
[38] => j
[39] => i
[40] => j
[41] => ?
)
The best solution I've found comes from the PHP manual pages:
preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
It is also fast: in PHP 5.6.18 it split a 6 MB text file in a matter of seconds.
Best of all, it doesn't need the mbstring (mb_) extension.
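For reuse, the preg_split call can be wrapped in a small function. This is only a sketch: the name utf8SplitChars is made up for illustration, and the type hints and "\u{...}" escapes assume PHP 7 or newer. It passes -1 (no limit) as the third argument and returns an empty array if preg_split fails, which can happen when the input is not valid UTF-8.

```php
<?php
// Hypothetical helper (name chosen for illustration only).
// Splits a UTF-8 string into an array of code points.
function utf8SplitChars(string $str): array
{
    // With the /u modifier PCRE treats the subject as UTF-8, so the empty
    // pattern matches between code points instead of between bytes.
    $parts = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);

    // preg_split returns false on failure, e.g. for invalid UTF-8 input.
    return $parts === false ? [] : $parts;
}

// "één" written with precomposed U+00E9 characters.
$chars = utf8SplitChars("\u{00E9}\u{00E9}n");
```

Note that this splits per code point, just like the mb_substr loop in the question, so an accent stored as a separate combining character still ends up in its own array element.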
Similar answer also here.
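The output in the question shows the acute accent as its own element right after each "e", which suggests the string is stored in decomposed form (e followed by U+0301, the combining acute accent). If that is the case, splitting per code point will always separate the accent from its letter. One possible workaround, sketched below under that assumption, is to match whole grapheme clusters with the PCRE \X escape. The function name utf8SplitGraphemes is made up for illustration, and the syntax assumes PHP 7 or newer.

```php
<?php
// Hypothetical helper (name chosen for illustration only).
// Splits a UTF-8 string into grapheme clusters rather than code points.
function utf8SplitGraphemes(string $str): array
{
    // \X matches one extended grapheme cluster: a base character together
    // with any combining marks that follow it.
    if (preg_match_all('/\X/u', $str, $matches) === false) {
        return []; // e.g. invalid UTF-8 input
    }
    return $matches[0];
}

// "één" in decomposed form: each "e" is followed by U+0301.
$decomposed = "e\u{0301}e\u{0301}n";
$graphemes = utf8SplitGraphemes($decomposed);
```

Here $graphemes holds three elements, and each accented letter stays together with its combining mark.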