Hi preg_replace gurus,
I am looking for a preg_replace snippet that I can use in a PHP file so that, if a word appears in a particular line, that entire line is deleted or replaced with an empty line.
pseudocode:
$unwanted_lines=array("word1","word2","word3");
$new_block_of_lines=preg_replace($unwanted_lines, block_of_lines);
Thanks.
The expression
First, let's work out the expression you will need to match the array of words:
/(?:word1|word2|word3)/
The (?: ... ) expression creates a group without capturing its contents into a memory location. The words are separated by a pipe symbol, so the group matches any one of them.
To generate this expression with PHP, you need the following construct:
$unwanted_words = array("word1", "word2", "word3");
$unwanted_words_match = '(?:' . join('|', array_map(function($word) {
return preg_quote($word, '/');
}, $unwanted_words)) . ')';
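You need preg_quote() to escape any characters in a word that have a special meaning in a regular expression (such as . or +); passing '/' as the second argument also escapes the delimiter. You can skip it only if you're sure a word contains no such characters, e.g. "abc" doesn't need to be quoted.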
See also: array_map(), preg_quote()
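To see why the quoting step matters, here is a minimal sketch that builds the group from words containing regex metacharacters. The words "c++" and "a.b" are made-up examples, not part of the question.

```php
<?php
// Hypothetical word list; "c++" and "a.b" contain regex metacharacters.
$unwanted_words = array("c++", "a.b");

// Escape each word (including the "/" delimiter), then join with "|".
$quoted = array_map(function ($word) {
    return preg_quote($word, '/');
}, $unwanted_words);
$unwanted_words_match = '(?:' . implode('|', $quoted) . ')';

echo $unwanted_words_match, "\n"; // (?:c\+\+|a\.b)

// The escaped dot only matches a literal dot.
var_dump(preg_match("/$unwanted_words_match/", "use a.b here")); // int(1)
var_dump(preg_match("/$unwanted_words_match/", "use axb here")); // int(0)
```

Without preg_quote(), "c++" would not be a valid pattern at all, and "a.b" would also match "axb".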
Using an array of lines
You can split the block of text into an array of lines:
$lines = preg_split('/\r?\n/', $block_of_lines);
Then, you can use preg_grep() with the PREG_GREP_INVERT flag to keep only the lines that don't match and produce another array:
$wanted_lines = preg_grep("/$unwanted_words_match/", $lines, PREG_GREP_INVERT);
See also: preg_split(), preg_grep()
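Putting the split and filter steps together, a small end-to-end sketch might look like this. The sample text is invented for illustration, and joining the result back with "\n" is an assumption about the desired output format.

```php
<?php
// Hypothetical sample input with mixed line endings.
$block_of_lines = "keep this\r\ndrop word2 here\nalso keep\nword1 at start";
$unwanted_words_match = '(?:word1|word2|word3)';

$lines = preg_split('/\r?\n/', $block_of_lines);

// PREG_GREP_INVERT returns the lines that do NOT match; original keys are kept.
$wanted_lines = preg_grep("/$unwanted_words_match/", $lines, PREG_GREP_INVERT);

// Reindex from 0 and rebuild the block with "\n" line endings.
$wanted_lines = array_values($wanted_lines);
$result = implode("\n", $wanted_lines);

print_r($wanted_lines);
```

Note that preg_grep() preserves the keys of the input array, so array_values() is only needed if you want consecutive indexes.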
Using a single preg_replace()
To match a whole line containing an unwanted word inside a block of text with multiple lines, you need to use line anchors, like this:
/^.*(?:word1|word2|word3).*$/m
Using the /m modifier, the anchors ^ and $ match the start and end of each line respectively. The .* on both sides extends the match from the unwanted word to the start and the end of its line.
One thing to note is that $ matches just before the actual line ending character (either \r\n or \n). If you perform the replacement using the above expression, it will not replace the line endings themselves, so each removed line leaves an empty line behind.
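A quick sketch of that behaviour, using a made-up three-line input:

```php
<?php
// Hypothetical sample input.
$block_of_lines = "first\nhas word1\nlast";

// The match stops before "\n", so the line ending survives the replacement.
$result = preg_replace('/^.*(?:word1|word2|word3).*$/m', '', $block_of_lines);

var_dump($result === "first\n\nlast"); // bool(true)
```

If replacing the line with an empty line is what you want, this shorter expression is already enough.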
You need to match those extra characters by extending the expression like this:
/^.*(?:word1|word2|word3).*$(?:\r\n|\n)?/m
I've added (?:\r\n|\n)? behind the $ anchor to match the optional line ending. This is the final code to perform the replacement:
$replace_match = '/^.*' . $unwanted_words_match . '.*$(?:\r\n|\n)?/m';
$result = preg_replace($replace_match, '', $block_of_lines);
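For reuse, the pieces can be wrapped in one function. This is only a sketch: the name remove_lines_containing() is hypothetical, and the early return for an empty word list is my own addition (an empty group would otherwise match every line).

```php
<?php
// Hypothetical helper name; not part of PHP or the original answer.
function remove_lines_containing(array $unwanted_words, $block_of_lines)
{
    // An empty alternation would match every line, so leave the text alone.
    if (!$unwanted_words) {
        return $block_of_lines;
    }

    $quoted = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $unwanted_words);
    $pattern = '/^.*(?:' . implode('|', $quoted) . ').*$(?:\r\n|\n)?/m';

    return preg_replace($pattern, '', $block_of_lines);
}

$block = "keep this\r\ndrop word2 here\r\nalso keep\nword1 at start\nend";
$result = remove_lines_containing(array("word1", "word2", "word3"), $block);

var_dump($result === "keep this\r\nalso keep\nend"); // bool(true)
```

One edge case to be aware of: if the last line matches and has no trailing line ending, the line ending of the previous line stays in the result.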
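This regular expression replaces a whole line containing the word with an empty string. The /m modifier is needed so that ^ and $ match at each line when $string holds more than one line:
$newstring = preg_replace("/^.*word1.*$/m", "", $string);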