
PHP preg_replace: remove an entire line (from a block of many lines) if it contains an occurrence of a word

Guys ( preg_replace gurus );

I am looking for a preg_replace snippet that I can use in a PHP file whereby, if a word appears in a particular line, that entire line is deleted or replaced with an empty line.

pseudocode:

$unwanted_lines = array("word1", "word2", "word3");
$new_block_of_lines = preg_replace($unwanted_lines, $block_of_lines);

Thanx.

MarcoZen asked Jul 04 '13 09:07


2 Answers

The expression

First, let's work out the expression you will need to match the array of words:

/(?:word1|word2|word3)/

The (?: ... ) expression creates a group without capturing its contents into a memory location. The words are separated by a pipe symbol, so the group matches any one of them.

To generate this expression with PHP, you need the following construct:

$unwanted_words = array("word1", "word2", "word3");
$unwanted_words_match = '(?:' . join('|', array_map(function($word) {
    return preg_quote($word, '/');
}, $unwanted_words)) . ')';

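preg_quote() escapes any characters in a word that have special meaning in a regular expression (and, via its second argument, the / delimiter), so each word is matched literally. A plain word such as "abc" contains no such characters and would work unquoted, but quoting it is harmless and safer.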
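To see why the escaping matters, here is a minimal sketch using words that contain regex metacharacters. The sample words ("c++", "a.b", "1/2") and the helper name build_word_pattern() are assumptions made for illustration; they are not part of the question.

```php
<?php
// Hypothetical helper: turn a list of literal words into a
// non-capturing alternation group.
function build_word_pattern(array $words)
{
    $quoted = array_map(function ($word) {
        // Escape regex metacharacters and the "/" delimiter.
        return preg_quote($word, '/');
    }, $words);

    return '(?:' . implode('|', $quoted) . ')';
}

$pattern = build_word_pattern(array('c++', 'a.b', '1/2'));
echo $pattern, "\n"; // (?:c\+\+|a\.b|1\/2)

// The escaped "." only matches a literal dot, so "axb" is not a hit.
var_dump(preg_match("/$pattern/", 'I like c++ a lot')); // int(1)
var_dump(preg_match("/$pattern/", 'axb'));              // int(0)
```

Without preg_quote(), "c++" would not be a valid pattern at all and "a.b" would also match "axb".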

See also: array_map() preg_quote()

Using an array of lines

You can split the block of text into an array of lines:

$lines = preg_split('/\r?\n/', $block_of_lines);

Then, you can use preg_grep() with PREG_GREP_INVERT to keep only the lines that don't match, producing another array:

$wanted_lines = preg_grep("/$unwanted_words_match/", $lines, PREG_GREP_INVERT);
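Putting the split and filter steps together, a minimal sketch might look like this. The sample text is made up for illustration. preg_grep() preserves the original array keys, so the result is re-indexed with array_values() before it is joined back into a string.

```php
<?php
$unwanted_words = array('word1', 'word2', 'word3');

// Assumed sample input with mixed "\n" and "\r\n" line endings.
$block_of_lines = "keep this\nhas word1 in it\r\nkeep this too\nword3 at the start";

// Build the alternation group, escaping each word.
$unwanted_words_match = '(?:' . implode('|', array_map(function ($word) {
    return preg_quote($word, '/');
}, $unwanted_words)) . ')';

// Split into lines, then keep only the lines that do NOT match.
$lines = preg_split('/\r?\n/', $block_of_lines);
$wanted_lines = array_values(
    preg_grep("/$unwanted_words_match/", $lines, PREG_GREP_INVERT)
);

// Join the surviving lines back into one block.
$result = implode("\n", $wanted_lines);

var_dump($result === "keep this\nkeep this too"); // bool(true)
```

Joining with "\n" normalizes all line endings to "\n"; if the original endings must be preserved, the single preg_replace() approach is the better fit.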

See also: preg_split() preg_grep()

Using a single preg_replace()

To match a whole line containing an unwanted word inside a block of text with multiple lines, you need to use line anchors, like this:

/^.*(?:word1|word2|word3).*$/m

With the /m modifier, the anchors ^ and $ match at the start and end of each line rather than of the whole string. Because . does not match a newline by default, the .* on either side extends the match from the unwanted word out to both ends of its line.

One thing to note is that $ matches just before the newline character, not after it. If you perform the replacement using the above expression, the line endings themselves are not replaced and an empty line is left behind.
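A small check illustrates the leftover line ending; the sample text is an assumption for illustration.

```php
<?php
// Assumed sample input: the middle line contains an unwanted word.
$block_of_lines = "first\nhas word1 here\nlast";

// Only the text of the line is matched, so its "\n" stays behind.
$result = preg_replace('/^.*(?:word1|word2|word3).*$/m', '', $block_of_lines);

var_dump($result === "first\n\nlast"); // bool(true)
```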

You need to match those extra characters by extending the expression like this:

/^.*(?:word1|word2|word3).*$(?:\r\n|\n)?/m

I've added (?:\r\n|\n)? after the $ anchor to match the optional line ending. This is the final code to perform the replacement:

$replace_match = '/^.*' . $unwanted_words_match . '.*$(?:\r\n|\n)?/m';
$result = preg_replace($replace_match, '', $block_of_lines);
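As a self-contained sketch, the pieces above could be wrapped in one function. The function name remove_lines_containing() and the sample text are hypothetical, chosen only for this example.

```php
<?php
// Hypothetical helper: delete every line of $block_of_lines that
// contains one of $unwanted_words, including its line ending.
function remove_lines_containing(array $unwanted_words, $block_of_lines)
{
    $quoted = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $unwanted_words);

    // Whole line, the unwanted word somewhere in it, then an optional line ending.
    $replace_match = '/^.*(?:' . implode('|', $quoted) . ').*$(?:\r\n|\n)?/m';

    return preg_replace($replace_match, '', $block_of_lines);
}

// Assumed sample input with mixed line endings.
$text = "alpha\r\nbeta word2\r\ngamma\nword3 delta\nepsilon";
$result = remove_lines_containing(array('word1', 'word2', 'word3'), $text);

var_dump($result === "alpha\r\ngamma\nepsilon"); // bool(true)
```

Two caveats with this sketch: if the removed line is the last one and has no trailing newline, the newline before it stays in the result, and an empty word list produces a pattern that matches (and removes) every line.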


Ja͢ck answered Oct 13 '22 01:10


This regular expression removes a whole line that contains the word. The m modifier is needed so that ^ and $ match at each line of a multi-line string:

$newstring = preg_replace('/^.*word1.*$/m', '', $string);
DevZer0 answered Oct 13 '22 01:10