
Split utf8 string into array of chars

Tags: php, utf-8

I'm trying to split a UTF-8 encoded string into an array of characters. The function I use used to work, but for some reason it no longer does. What could be the reason? And, better yet, how can I fix it?

This is my string:

Zelf heb ik maar één vraag: wie ben jij?

This is my function:

function utf8Split($str, $len = 1)
{
  $arr = array();
  $strLen = mb_strlen($str);
  for ($i = 0; $i < $strLen; $i++)
  {
    $arr[] = mb_substr($str, $i, $len);
  }
  return $arr;
}

This is the result:

Array
(
    [0] => Z
    [1] => e
    [2] => l
    [3] => f
    [4] =>  
    [5] => h
    [6] => e
    [7] => b
    [8] =>  
    [9] => i
    [10] => k
    [11] =>  
    [12] => m
    [13] => a
    [14] => a
    [15] => r
    [16] =>  
    [17] => e
    [18] => ́
    [19] => e
    [20] => ́
    [21] => n
    [22] =>  
    [23] => v
    [24] => r
    [25] => a
    [26] => a
    [27] => g
    [28] => :
    [29] =>  
    [30] => w
    [31] => i
    [32] => e
    [33] =>  
    [34] => b
    [35] => e
    [36] => n
    [37] =>  
    [38] => j
    [39] => i
    [40] => j
    [41] => ?
)
tersmitten asked Nov 28 '22 02:11



1 Answer

I found this solution in the PHP manual pages, and it is the best one I have come across:

preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
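As a minimal sketch, the call above could be wrapped in a small helper. The function name `utf8Chars` and the exception on failure are assumptions made for illustration, not part of the original answer. The limit is passed as -1 (no limit), because passing null to that parameter is deprecated from PHP 8.1 onward.

```php
<?php
// Hypothetical helper name; it only wraps the preg_split() call shown above.
function utf8Chars(string $str): array
{
    // The /u modifier puts PCRE in UTF-8 mode, so the empty pattern matches
    // between code points rather than between bytes.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);

    // preg_split() returns false on failure, for example on invalid UTF-8.
    if ($chars === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }

    return $chars;
}

// "\u{00E9}" is a precomposed e with acute accent (escape syntax needs PHP 7+).
print_r(utf8Chars("\u{00E9}\u{00E9}n"));
```

Note that this splits by code point, so a letter followed by a separate combining accent still comes out as two array entries.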

It is fast: in PHP 5.6.18 it split a 6 MB text file in a matter of seconds.

Best of all, it doesn't need the Multibyte String (mb_) extension.

Similar answer also here.
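The output in the question shows each "é" split into "e" plus a separate accent, which suggests (this is an assumption, since the raw bytes are not shown) that the input string is in decomposed form, that is, "e" followed by the combining acute accent U+0301. In that case any split by code point, whether with mb_substr() or with the '//u' pattern, keeps the accent as its own entry. A possible sketch that splits by grapheme cluster instead, using PCRE's \X escape; the helper name `utf8Graphemes` is hypothetical:

```php
<?php
// Hypothetical helper: split into user-perceived characters (grapheme clusters).
function utf8Graphemes(string $str): array
{
    // \X matches one extended grapheme cluster: a base character together
    // with any combining marks that follow it. /u enables UTF-8 mode.
    if (preg_match_all('/\X/u', $str, $matches) === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }

    return $matches[0];
}

// "één" written in decomposed form: e + U+0301 (combining acute accent).
$decomposed = "e\u{0301}e\u{0301}n";

// Splitting by code point yields 5 entries: e, accent, e, accent, n.
echo count(preg_split('//u', $decomposed, -1, PREG_SPLIT_NO_EMPTY)), "\n";

// Splitting by grapheme cluster yields 3 entries: é, é, n.
echo count(utf8Graphemes($decomposed)), "\n";
```

If the intl extension is available, another option is to compose the string first with Normalizer::normalize($str, Normalizer::FORM_C) and then split by code point as before.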

Yani2000 answered Dec 08 '22 09:12
