I want to highlight given keywords in a given string, together with a random number of surrounding words.
Example sentence:
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed.
Example keyword:
dolore magna
Desired result (mark 0-4 words before and after the keyword; the marked span is on the middle line):
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor
invidunt ut labore et **dolore magna** aliquyam erat, sed
.
What have I tried?
( [\w,\.-\?]+){0,5} ".$myKeyword." (.+ ){2,5}
and
([a-zA-Z,. ]+){1,3} ".$n." ([a-zA-Z,. ]+){1,3}
Any ideas how to improve this and make it more robust?
For highlighting, use the preg_replace function. Here's an idea:
$s = "dolore magna";
$str = preg_replace(
'/\b(?>[\'\w-]+\W+){0,4}'.preg_quote($s, "/").'(?:\W+[\'\w-]+){0,4}/i',
'<b>$0</b>', $str);
echo $str;
Test the pattern at regex101, or run the PHP test at eval.in. Output:
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor <b>invidunt ut labore et dolore magna aliquyam erat, sed</b>.
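The question asks for a random number of surrounding words, whereas the pattern above always takes as many as it can, up to four. A minimal sketch of one way to randomize it, assuming PHP 7+ for random_int: draw an upper bound for each side and put it into the quantifier. The function name highlightWithRandomContext and the $maxWords parameter are made up for illustration and are not part of the answer.

```php
<?php
// Hypothetical helper (not from the answer): wraps $keyword plus up to a
// randomly chosen number of neighbouring words on each side in <b> tags.
function highlightWithRandomContext(string $text, string $keyword, int $maxWords = 4): string
{
    // Draw the upper bound for each side independently (0..$maxWords).
    $before = random_int(0, $maxWords);
    $after  = random_int(0, $maxWords);

    // Same pattern as above, with the fixed {0,4} replaced by the random bounds.
    $pattern = '/\b(?>[\'\w-]+\W+){0,' . $before . '}'
             . preg_quote($keyword, '/')
             . '(?:\W+[\'\w-]+){0,' . $after . '}/i';

    // preg_replace returns null on error; fall back to the unchanged text.
    return preg_replace($pattern, '<b>$0</b>', $text) ?? $text;
}

$str = 'Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed.';
echo highlightWithRandomContext($str, 'dolore magna'), PHP_EOL;
```

Two caveats: the quantifiers are greedy, so each side takes as many words as are available up to its drawn bound, and the same two bounds apply to every occurrence of the keyword within one call.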
Using the i flag for caseless matching; drop it if not wanted.
The first group, (?>, is atomic for performance.
['\w-] is a character class matching \w (shorthand for a word character), ' and -.
\W matches a character that is not a word character (negated \w).
\b matches a word boundary. I used it for better performance.
I think this would accomplish what you are after. Please see the demo for an explanation of everything the regex is doing, or post a comment if you have a question.
Regex:
((?:[\w,.\-?]+\h){0,5})\b' . $term . '\b((?:.+\h){2,5})
Demo: https://regex101.com/r/vG8qT2/1
PHP:
<?php
$string = 'Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed.';
$term = 'dolore magna';
// Minimum and maximum number of surrounding words to capture on each side.
$min = 0;
$max = 5;
// Pass the delimiter (~) to preg_quote() so it is escaped if it occurs in $term.
preg_match('~((?:[\w,.\-?]+\h){'.$min.','.$max.'})\b' . preg_quote($term, '~') . '\b((?:.+\h){'.$min.','.$max.'})~', $string, $matches);
print_r($matches);
Demo: https://eval.in/410063
Note that the captured values will be in $matches[1] and $matches[2].
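To turn those two captures into highlighted output, one option is preg_replace_callback, which hands the same groups to a callback. This is only a sketch: the <mark> and <b> tags and the variable $highlighted are assumptions for illustration, and the pattern is the one from the code above with the delimiter passed to preg_quote.

```php
<?php
$string = 'Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed.';
$term = 'dolore magna';
$min = 0;
$max = 5;

$pattern = '~((?:[\w,.\-?]+\h){' . $min . ',' . $max . '})\b'
         . preg_quote($term, '~')
         . '\b((?:.+\h){' . $min . ',' . $max . '})~';

// $m[1] holds the words before the term, $m[2] the words after it.
// The term itself is not captured, so it is re-inserted literally.
$highlighted = preg_replace_callback(
    $pattern,
    function (array $m) use ($term): string {
        return '<mark>' . $m[1] . '<b>' . $term . '</b>' . $m[2] . '</mark>';
    },
    $string
);

echo $highlighted, PHP_EOL;
```

Because the trailing group uses .+ rather than a word class, what it captures depends on where the last horizontal whitespace falls on the line, so it is worth checking the result against your own input.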