
Highlight text with surrounding words

Tags:

regex

php

I want to highlight given keywords in a given string, and include a random number of surrounding words in the highlight.

Example sentence:

Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed.

Example keyword:

dolore magna

Desired result (mark 0-4 words before and after the keyword):

Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et **dolore magna** aliquyam erat, sed.

What did I try?

( [\w,\.-\?]+){0,5} ".$myKeyword." (.+ ){2,5}

and

([a-zA-Z,. ]+){1,3} ".$n." ([a-zA-Z,. ]+){1,3}

Any ideas how to improve this and make it more robust?

pila asked Aug 01 '15 13:08




2 Answers

For highlighting, use the preg_replace function. Here's an idea:

$s = "dolore magna";

// $str is assumed to already hold the example sentence
$str = preg_replace(
       '/\b(?>[\'\w-]+\W+){0,4}'.preg_quote($s, "/").'(?:\W+[\'\w-]+){0,4}/i',
       '<b>$0</b>', $str);

Test the pattern at regex101, or run the PHP test at eval.in.

echo $str;

Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor <b>invidunt ut labore et dolore magna aliquyam erat, sed</b>.

The i flag makes the match caseless; drop it if not wanted. The first group is atomic, (?>...), for performance.

  • As a word character I used ['\w-]: the \w shorthand for word characters, plus ' and -
  • \W matches a character that is not a word character (negated \w)
  • \b matches a word boundary. I used it so a match attempt can only begin at the start of a word, which improves performance.
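
To get the random number of surrounding words the question asks for, the bounds in the quantifiers can be made parameters. The following is only a sketch built on the pattern above: the function name highlightWithContext and its parameters are hypothetical, and it assumes plain ASCII text that does not need HTML escaping.

```php
<?php
// Hypothetical helper (not part of the original answer): wraps the keyword,
// plus up to $before words on its left and up to $after words on its right,
// in <b> tags.
function highlightWithContext(string $text, string $keyword, int $before, int $after): string
{
    $word = "['\\w-]+"; // same "word" definition as above: \w, ' and -
    $pattern = '/\b(?>' . $word . '\W+){0,' . $before . '}'
             . preg_quote($keyword, '/')
             . '(?:\W+' . $word . '){0,' . $after . '}/i';

    // preg_replace returns null on a regex error; fall back to the input then.
    return preg_replace($pattern, '<b>$0</b>', $text) ?? $text;
}

$text = 'Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed.';

// Pick the context sizes at random: 0 to 4 words on each side.
echo highlightWithContext($text, 'dolore magna', random_int(0, 4), random_int(0, 4)), PHP_EOL;
```

Because the quantifiers are {0,n} rather than {n}, the pattern still matches when fewer than n words are available on one side, for example when the keyword sits at the start of the string.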
Jonny 5 answered Sep 29 '22 06:09


I think this would accomplish what you are after. Please see the demo for an explanation of everything the regex is doing, or post a comment if you have a question.

Regex:

((?:[\w,.\-?]+\h){0,5})\b' . . '\b((?:.+\h){2,5})

Demo: https://regex101.com/r/vG8qT2/1

PHP:

<?php
$string = 'Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed.';
$term = 'dolore magna';
$min = 0;
$max = 5;
// Pass the delimiter to preg_quote so a "~" inside $term is escaped as well
preg_match('~((?:[\w,.\-?]+\h){'.$min.','.$max. '})\b' . preg_quote($term, '~') . '\b((?:.+\h){'.$min.','.$max.'})~', $string, $matches);
print_r($matches);

Demo: https://eval.in/410063

Note the captured values will be in $matches[1] and $matches[2].
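
If the goal is the highlighted sentence rather than the captured pieces, one option is to wrap the whole match in place. This is a rough sketch, not the answer's own code: it assumes the same pattern shape with fixed bounds of 0 and 5, uses PREG_OFFSET_CAPTURE to find where the match starts, and the variable names $highlighted, $full, $offset and $phrase are made up for illustration.

```php
<?php
$string = 'Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed.';
$term = 'dolore magna';

// Same pattern shape as above; bounds fixed to 0..5 for this sketch.
$pattern = '~((?:[\w,.\-?]+\h){0,5})\b' . preg_quote($term, '~') . '\b((?:.+\h){0,5})~';

$highlighted = $string; // unchanged if the term is not found
if (preg_match($pattern, $string, $matches, PREG_OFFSET_CAPTURE)) {
    // With PREG_OFFSET_CAPTURE each entry is [matched text, byte offset].
    [$full, $offset] = $matches[0];
    $phrase = rtrim($full); // the second group can end in whitespace; keep it outside the tag
    $highlighted = substr_replace($string, '<b>' . $phrase . '</b>', $offset, strlen($phrase));
}

echo $highlighted, PHP_EOL;
```

Note that .+ in the second group is greedy, so a single repetition can swallow several words; the upper bound therefore does not strictly limit the number of words after the term.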

chris85 answered Sep 29 '22 06:09