
PHP explode: split a string into words using a space as the delimiter

$str = "This is a    string";
$words = explode(" ", $str);

This works, but the extra spaces end up in the array as empty strings:

$words === array ('This', 'is', 'a', '', '', '', 'string');//true

I would prefer to have only the words, with no empty entries, and keep the number of spaces in a separate array.

$words === array ('This', 'is', 'a', 'string');//true
$spaces === array(1,1,4);//true

Edit: (1, 1, 4) means one space after the first word, one space after the second word, and four spaces after the third word.

Is there any way to do it fast?

Thank you.

Haradzieniec asked Sep 05 '13 14:09



3 Answers

To split the string into an array of words, use preg_split, which treats each run of whitespace as a single delimiter:

$string = 'This is a    string';
$data   = preg_split('/\s+/', $string);

For the second part (counting the spaces in each gap), use preg_match_all:

$string = 'This is a    string';
preg_match_all('/\s+/', $string, $matches);
$result = array_map('strlen', $matches[0]);// [1, 1, 4]
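
One caveat, offered as a sketch rather than a definitive fix: if the string may begin or end with whitespace, preg_split('/\s+/') returns empty strings at the edges. Passing the PREG_SPLIT_NO_EMPTY flag drops them. The helper name splitWordsAndSpaces below is hypothetical, and the code assumes PHP 7.1 or later for the type hints and array destructuring.

```php
<?php
// Hypothetical helper (not part of PHP): returns [words, gap lengths].
function splitWordsAndSpaces(string $string): array
{
    // Split on runs of whitespace; PREG_SPLIT_NO_EMPTY drops the empty
    // strings that leading or trailing whitespace would otherwise produce.
    $words = preg_split('/\s+/', $string, -1, PREG_SPLIT_NO_EMPTY);

    // Collect every whitespace run and measure its length.
    // Note: leading and trailing runs, if any, are counted as well.
    preg_match_all('/\s+/', $string, $matches);
    $spaces = array_map('strlen', $matches[0]);

    return [$words, $spaces];
}

[$words, $spaces] = splitWordsAndSpaces('This is a    string');
// $words  is ['This', 'is', 'a', 'string']
// $spaces is [1, 1, 4]
```

Because leading and trailing runs are also counted, $spaces can have more entries than there are gaps between words when the input is not trimmed.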
Alma Do answered Oct 17 '22 00:10



Here is one way that runs the regex only once: split the string while capturing the delimiters, then walk the results to see which segments were captured as delimiters (and are therefore only whitespace) and which are words:

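// Split on whitespace runs; the capture group keeps each run in the result.
$temp = preg_split('/(\s+)/', $str, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

$spaces = array();
// $result is the accumulator and is returned on every call, so it is taken by value.
// $spaces is captured by reference so the closure can append to it.
$words = array_reduce( $temp, function( $result, $item) use ( &$spaces) {
    if( strlen( trim( $item)) === 0) {
        $spaces[] = strlen( $item); // whitespace-only segment: record its length
    } else {
        $result[] = $item;          // anything else is a word
    }
    return $result;
}, array());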

After this runs, $words is:

Array
(
    [0] => This
    [1] => is
    [2] => a
    [3] => string
)

And $spaces is:

Array
(
    [0] => 1
    [1] => 1
    [2] => 4
)
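
As a sanity check that nothing is lost by the split, the original string can be rebuilt from the two arrays. This is only a sketch: rebuildString is a hypothetical helper, and it assumes the string neither starts nor ends with whitespace (so each entry in $spaces follows the word at the same index) and that every gap consisted of plain space characters.

```php
<?php
// Hypothetical helper: interleave words with runs of spaces.
// Assumes count($spaces) === count($words) - 1 and that gaps were plain spaces.
function rebuildString(array $words, array $spaces): string
{
    $out = '';
    foreach ($words as $i => $word) {
        $out .= $word;
        // Append the gap that follows this word, if there is one.
        if (isset($spaces[$i])) {
            $out .= str_repeat(' ', $spaces[$i]);
        }
    }
    return $out;
}

$words   = ['This', 'is', 'a', 'string'];
$spaces  = [1, 1, 4];
$rebuilt = rebuildString($words, $spaces);
// $rebuilt is 'This is a    string'
```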
nickb answered Oct 16 '22 23:10



You can use preg_split() for the first array:

$str   = 'This is a    string';
$words = preg_split('#\s+#', $str);

And preg_match_all() for the $spaces array:

preg_match_all('#\s+#', $str, $m);
$spaces = array_map('strlen', $m[0]);
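
Note that \s matches tabs and newlines as well as spaces. If only literal space characters should count as separators, a pattern of one or more spaces can be swapped in. The sketch below assumes that is the intent and uses a made-up input to show the difference.

```php
<?php
$str = "one  two\tthree";   // two spaces, then a tab

// \s+ treats the tab as a separator too.
$bySpaceClass = preg_split('#\s+#', $str);
// $bySpaceClass is ['one', 'two', 'three']

// ' +' splits on literal spaces only, so the tab stays inside the word.
$byLiteralSpace = preg_split('# +#', $str);
// $byLiteralSpace is ['one', "two\tthree"]

// Count only runs of literal spaces.
preg_match_all('# +#', $str, $m);
$spaces = array_map('strlen', $m[0]);
// $spaces is [2]
```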
silkfire answered Oct 17 '22 01:10
