
PHP get position of each first character in a string into an array

Given a string, for example:

$string = "  this     is   a   string  ";

What is the best approach to return a comma-separated list (or array) containing one number per word, where each number is the position of that word's first character, like this:

$string = "  this     is   a   string  ";
             ^        ^    ^   ^
             2        11   16  20

Ideally the output would just be an array:

2,11,16,20

So far, here is what I have but I think this is a bit over my head given my limited skills:

$string = "  this     is   a   string  ";
$string = rtrim($string); //just trim the right sides spaces
$len = strlen($string);
$is_prev_white = true;
$result = "";
for( $i = 0; $i <= $len; $i++ ) {
    $char = substr( $string,$i,1);
    if(!preg_match("/\s/", $char) AND $prev_white){
        $result .= $i.",";
        $prev_white = false;
    }else{
        $prev_white = true;
    }   
}
echo $result;

I am getting: 2,4,11,16,20,22,24,26

Shawn Cooke asked Feb 25 '16 17:02



2 Answers

A simple solution using the preg_match_all and array_walk functions. Call preg_match_all with the PREG_OFFSET_CAPTURE flag:

PREG_OFFSET_CAPTURE : If this flag is passed, for every occurring match the appendant string offset will also be returned. Note that this changes the value of matches into an array where every element is an array consisting of the matched string at offset 0 and its string offset into subject at offset 1.

$string = "  this     is   a   string  ";   // subject
preg_match_all("/\b\w+\b/iu", $string, $matches, PREG_OFFSET_CAPTURE);

array_walk($matches[0], function(&$v){   // replace each [match, offset] pair with its offset
    $v = $v[1];
});
var_dump($matches[0]);

// the output:
array (size=4)
  0 => int 2
  1 => int 11
  2 => int 16
  3 => int 20
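
Since the question asks for the comma-separated form 2,11,16,20, here is a minimal sketch that wraps the same idea in a function and joins the offsets with implode. The function name firstCharPositions is hypothetical, and the sketch assumes a "word" is any run of non-whitespace characters (\S+) rather than \w+, so a token such as "don't" counts as one word.

```php
<?php
// Hypothetical helper name; it does not appear in the original answers.
function firstCharPositions(string $string): array
{
    // \S+ matches each run of non-whitespace characters (one "word").
    preg_match_all('/\S+/', $string, $matches, PREG_OFFSET_CAPTURE);

    // Each entry of $matches[0] is [matched text, offset]; keep only the offsets.
    return array_column($matches[0], 1);
}

$positions = firstCharPositions("  this     is   a   string  ");
echo implode(',', $positions), "\n"; // prints 2,11,16,20
```

An empty or all-whitespace string yields an empty array, so implode then produces an empty string.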

http://php.net/manual/en/function.preg-match-all.php

http://php.net/manual/en/function.array-walk.php
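
One caveat: the offsets reported with PREG_OFFSET_CAPTURE are byte offsets, even when the pattern uses the u modifier. For plain ASCII input like the example this makes no difference, but with multibyte UTF-8 text the byte offset and the character position diverge. The following is a rough sketch of one way to convert them, assuming UTF-8 input and PHP 7.4 or later; byteToCharOffset is a hypothetical helper, not a built-in.

```php
<?php
// Hypothetical helper: convert a byte offset into a character offset (UTF-8 input assumed).
function byteToCharOffset(string $subject, int $byteOffset): int
{
    // Count the UTF-8 characters in the bytes that precede the offset.
    return preg_match_all('/./us', substr($subject, 0, $byteOffset));
}

// "  héllo   wörld", written with escapes so the file encoding does not matter.
$string = "  h\u{e9}llo   w\u{f6}rld";
preg_match_all('/\S+/u', $string, $matches, PREG_OFFSET_CAPTURE);

$byteOffsets = array_column($matches[0], 1);
$charOffsets = array_map(
    fn (int $offset): int => byteToCharOffset($string, $offset),
    $byteOffsets
);

echo implode(',', $byteOffsets), "\n"; // prints 2,11
echo implode(',', $charOffsets), "\n"; // prints 2,10
```

If the mbstring extension is available, mb_strlen(substr($subject, 0, $byteOffset), 'UTF-8') should give the same count.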

RomanPerekhrest answered Nov 02 '22 22:11



You want the PREG_OFFSET_CAPTURE flag:

$string = "  this     is   a   string  ";
preg_match_all('/(?:^|\s)([^\s])/', $string, $matches, PREG_OFFSET_CAPTURE);

$result = $matches[1];

var_dump($result);

The regex is:

(?:^|\s) // Matches white space or the start of the string (non capturing group)
([^\s])  // Matches any single character *but* white space (capturing group)

Passing PREG_OFFSET_CAPTURE makes preg_match() or preg_match_all() return matches as two-element arrays that contain both the matching string and that match's index inside the searched string. The result of the above code is:

array(4) { 
    [0]=> array(2) { [0]=> string(1) "t" [1]=> int(2) } 
    [1]=> array(2) { [0]=> string(1) "i" [1]=> int(11) } 
    [2]=> array(2) { [0]=> string(1) "a" [1]=> int(16) } 
    [3]=> array(2) { [0]=> string(1) "s" [1]=> int(20) } 
}

So you could get an array of just the indexes with:

$firstChars = array_column($result, 1);
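
As a possible variation on the pattern above, a negative lookbehind avoids consuming the preceding whitespace, so no capture group is needed and the offsets can be read straight from $matches[0]. This is only a sketch of an alternative, not part of the original answer:

```php
<?php
$string = "  this     is   a   string  ";

// (?<!\S) asserts that the previous character is not a non-whitespace character,
// which holds at the start of the string and right after any whitespace.
// \S then matches the first character of the word.
preg_match_all('/(?<!\S)\S/', $string, $matches, PREG_OFFSET_CAPTURE);

// Keep only the offset from each [character, offset] pair.
$firstChars = array_column($matches[0], 1);
echo implode(',', $firstChars), "\n"; // prints 2,11,16,20
```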
AmericanUmlaut answered Nov 02 '22 20:11
