Get line number from preg_match_all()

Question

I'm using PHP's preg_match_all() to search a string imported using file_get_contents(). The regex returns matches but I would like to know at which line number those matches are found. What's the best technique to achieve this?

I could read the file into an array and run the regex on each line, but the problem is that my regex matches results that span line breaks.

Javier · Accepted Answer

Well, it's kind of late and maybe you already solved this, but I had to do it and it's fairly simple. Using the PREG_OFFSET_CAPTURE flag with preg_match() returns the position (byte offset) of the match alongside the matched text. Let's call that position $charpos, so:

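// Fetch all the text before the match. substr() is used here because
// str_split() rejects a chunk length of 0 (a match at the very start).
$before = substr($content, 0, $charpos);

// The number of "\n" characters removed equals the number of line breaks before the match
$line_number = strlen($before) - strlen(str_replace("\n", "", $before)) + 1;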

Voilà!

Mihai Toader · Answer

You can't do this with regexes alone, at least not cleanly. What you can do is use the PREG_OFFSET_CAPTURE flag of preg_match_all() and post-process the entire file.

That is, once you have the array of matched strings and the starting offset of each one, count how many \r\n, \n, or \r sequences lie between the beginning of the file and the offset of each match. The line number of a match is the number of distinct EOL terminators (\r\n | \n | \r) before it, plus 1.
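The counting step described above could be sketched roughly like this. It is a minimal sketch, not a definitive implementation: the helper name lineNumberAt() is hypothetical, and it assumes the offsets come from PREG_OFFSET_CAPTURE, which reports byte offsets.

```php
<?php
// Hypothetical helper: 1-based line number of a byte offset in $text.
// Treats "\r\n", "\n" and "\r" each as a single line terminator.
function lineNumberAt(string $text, int $offset): int
{
    $before = substr($text, 0, $offset);
    // "\r\n" is listed first so a Windows line ending counts once, not twice.
    return (int) preg_match_all('/\r\n|\n|\r/', $before) + 1;
}

// Sample text mixing all three terminator styles (illustrative data only).
$text = "first\r\nsecond\nthird\rfourth";
preg_match_all('/second|fourth/', $text, $matches, PREG_OFFSET_CAPTURE);

$lines = [];
foreach ($matches[0] as [$value, $offset]) {
    $lines[$value] = lineNumberAt($text, $offset);
}
print_r($lines);
```

Matching the terminators with a single alternation avoids counting "\r\n" as two line breaks, which a separate count of "\r" and "\n" would do.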

B Brendler · Answer

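$data = "Abba\nBeegees\nBeatles";

preg_match_all('/Abba|Beegees|Beatles/', $data, $matches, PREG_OFFSET_CAPTURE);
foreach ($matches[0] as $match) {
    $matchValue = $match[0];
    // Offsets from PREG_OFFSET_CAPTURE are byte offsets, so cut with substr(), not mb_substr().
    // Count "\n" rather than PHP_EOL, which depends on the platform running the script.
    $lineNumber = substr_count(substr($data, 0, $match[1]), "\n") + 1;

    echo "`{$matchValue}` at line {$lineNumber}\n";
}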

Output

`Abba` at line 1
`Beegees` at line 2
`Beatles` at line 3

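(Each match re-scans the text from the start of the string, so check your performance requirements if the input is large and has many matches.)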
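If the repeated scanning turns out to matter, one possible variation is to count newlines only between consecutive matches, so the text is walked once. This is a sketch under assumptions: matchesWithLines() is a hypothetical name, it needs PHP 7.1 or later, and it only counts "\n" (which also covers "\r\n" text).

```php
<?php
// Hypothetical helper: line number for every match, scanning the text once.
// Assumes "\n" line endings ("\r\n" also works, since it contains "\n").
function matchesWithLines(string $pattern, string $text): array
{
    preg_match_all($pattern, $text, $matches, PREG_OFFSET_CAPTURE);

    $result = [];
    $line = 1;      // line number at byte offset $scanned
    $scanned = 0;   // byte offset up to which newlines have been counted
    foreach ($matches[0] as [$value, $offset]) {
        // Offsets from preg_match_all() are ascending, so only the newlines
        // between the previous match and this one need to be counted.
        $line += substr_count($text, "\n", $scanned, $offset - $scanned);
        $scanned = $offset;
        $result[] = ['match' => $value, 'line' => $line];
    }
    return $result;
}

$data = "Abba\nBeegees\nBeatles";
foreach (matchesWithLines('/Abba|Beegees|Beatles/', $data) as $hit) {
    echo "`{$hit['match']}` at line {$hit['line']}\n";
}
```

The four-argument form of substr_count() limits the search to a slice of the string without copying it, which is what keeps the total work proportional to the length of the text.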

iquito · Answer

Using preg_match_all() with the PREG_OFFSET_CAPTURE flag is necessary to solve this problem. The code comments explain what kind of array preg_match_all() returns and how the line numbers can be calculated:

// Given string to do a match with
$string = "\n\nabc\nwhatever\n\ndef";

// Match "abc" and "def" in a string
if (preg_match_all("#(abc).*(def)#si", $string, $matches, PREG_OFFSET_CAPTURE)) {
  // Now $matches[0][0][0] contains the complete matching string
  // $matches[1][0][0] contains the results for the first substring (abc)
  // $matches[2][0][0] contains the results for the second substring (def)
  // $matches[0][0][1] contains the string position of the complete matching string
  // $matches[1][0][1] contains the string position of the first substring (abc)
  // $matches[2][0][1] contains the string position of the second substring (def)

  // First (abc) match line number
  // Cut off the original string at the matching position, then count
  // number of line breaks (\n) for that subset of a string
  $line = substr_count(substr($string, 0, $matches[1][0][1]), "\n") + 1;
  echo $line . "\n";

  // Second (def) match line number
  // Cut off the original string at the matching position, then count
  // number of line breaks (\n) for that subset of a string
  $line = substr_count(substr($string, 0, $matches[2][0][1]), "\n") + 1;
  echo $line . "\n";
}

This prints 3 for the first substring and 6 for the second substring. You can change \n to \r\n or \r if your text uses different newlines.
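Because the question is about matches that span line breaks, the same offset arithmetic can also report the first and last line of a whole match. The following is only a sketch: lineSpan() is a hypothetical helper, and it assumes "\n" line endings.

```php
<?php
// Hypothetical helper: first and last line of one PREG_OFFSET_CAPTURE entry.
// Assumes "\n" line endings.
function lineSpan(string $text, string $matched, int $offset): array
{
    // Line on which the match starts: newlines before the offset, plus 1.
    $start = substr_count($text, "\n", 0, $offset) + 1;
    // Newlines before the last character of the match push the last line down.
    // A trailing "\n" still belongs to the line it terminates, so it is excluded.
    $end = $start + substr_count(substr($matched, 0, -1), "\n");
    return [$start, $end];
}

$string = "\n\nabc\nwhatever\n\ndef";
if (preg_match('#abc.*def#s', $string, $m, PREG_OFFSET_CAPTURE)) {
    [$start, $end] = lineSpan($string, $m[0][0], $m[0][1]);
    echo "match spans lines {$start} to {$end}\n";
}
```

For the sample string this reports lines 3 to 6, in line with the two line numbers computed separately in the answer above.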

Tags: regex, php

bart
