
Search and Replace Words in HTML

What I'm trying to do is make a 'jargon buster'. Basically, I have some HTML and some glossary terms in a database. When the user clicks on the jargon buster, it replaces the glossary words in the text with a tooltip (wztooltip) that shows their meanings.

I've been working hard on this one and have been looking closely at the question "Regex / DOMDocument - match and replace text not in a link".

It seems like the answer lies in the simple_html_dom library, but I'm having trouble getting it to work. Obviously, any words that are already linked shouldn't be touched. Here is a stripped-down version of what I've got.

$html = str_get_html($article['content']);

$query_glossary = "SELECT word,glossary_term_id,info FROM glossary_terms WHERE status = 1  ORDER BY LENGTH(word) DESC";
$result_glossary = mysql_query_run($query_glossary);

while($glossary = mysql_fetch_array($result_glossary)) {
    $glossary_link = SITEURL.'/glossary/term/'.string_to_url($glossary['word']).'-'.$glossary['glossary_term_id'];
    if(strlen($glossary['info'])>400) {
        $glossary_info = substr(strip_tags($glossary['info']),0,350).' ...<br /> <a href="'.$glossary_link.'">Read More</a>';
    }
    else {
        $glossary_info = $glossary['info'];
    }
    $glossary_tip = 'href="javascript:;" onmouseout="UnTip();" class="article_jargon_highligher" onmouseover="'.tooltip_javascript('<a href="'.$glossary_link.'">'.$glossary['word'].'</a>',$glossary_info,400,1,0,1).'"';
    $glossary_word = $glossary['word'];
    $glossary_word = preg_quote($glossary_word,'/');

    //once done we can replace the words with a nice tip
    foreach ($html->find('text') as $element) {
        if (!in_array($element->parent()->tag,array())) {
            //problems are case aren't taken into account and grammer
            $element->innertext = str_ireplace(''.$glossary['word'].' ',' <a '.$glossary_tip.' >'.$glossary['word'].'</a> ', $element->innertext);
            //$element->innertext = str_ireplace(''.$glossary['word'].',',' <a '.$glossary_tip.'>'.$glossary['word'].'</a> ', $element->innertext);
            //$element->innertext = preg_replace ("/\s(".$glossary_word.")\s/ise","nothing(' <a'.'$glossary_tip.'>'.'$1'.'</a> ')" , $element->innertext);
            // $element->innertext = str_replace('__glossary_tip_replace__',$glossary_tip, $element->innertext);
        }
    }
}
$article['content'] = $html->save();
Richard Housham asked Jun 29 '11 12:06



1 Answer

Use the negated word-character class \W in your regex pattern; it matches any character other than a letter, digit, or underscore. Because this alone would still fail at the boundaries of the text blob, you also need to test for those cases. Using the word 'term' as the text you are searching for:

(^term$)|(^term\W)|(\Wterm\W)|(\Wterm$) 

The first alternative matches when term is the entire contents of the blob, the second when it is the first word, the third when it is contained within the blob, and the last when it is the last word.
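As a rough illustration, the four alternatives can be folded into a single pattern, (^|\W)term(\W|$), and the surrounding characters captured so they can be put back around the link. This is only a sketch: wrap_term() and the class="jargon" attribute string are hypothetical names invented for the example, not part of the question's code.

```php
<?php
// Hypothetical helper (not from the question): wraps each whole-word,
// case-insensitive occurrence of $term in an <a> tag.
function wrap_term(string $text, string $term, string $linkAttrs): string
{
    // Escape regex metacharacters in the term, including the '/' delimiter.
    $quoted = preg_quote($term, '/');

    // (^|\W) and (\W|$) cover the same cases as the four alternatives.
    $pattern = '/(^|\W)(' . $quoted . ')(\W|$)/i';

    // A callback keeps '$' or '\' in $linkAttrs from being read as
    // backreferences, and $m[2] preserves the original casing.
    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($linkAttrs): string {
            return $m[1] . '<a ' . $linkAttrs . '>' . $m[2] . '</a>' . $m[3];
        },
        $text
    );

    // preg_replace_callback() returns null on a PCRE error.
    return $result ?? $text;
}

echo wrap_term('Term, a term. Not terminal.', 'term', 'class="jargon"'), "\n";
```

This prints `<a class="jargon">Term</a>, a <a class="jargon">term</a>. Not terminal.` Note that each match consumes the non-word characters on both sides, so two occurrences separated by a single character (for example "term term") would only have the first one replaced.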

If you want to treat any other characters as word characters (say, a hyphen), you would need to replace the \W with [^\w\-].
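A small sketch of how that substitution might look when the pattern is built in code. The function name term_pattern() and its $extraWordChars parameter are assumptions made for this example only.

```php
<?php
// Hypothetical helper: builds a whole-word pattern in which the characters
// in $extraWordChars (e.g. '-') also count as part of a word.
function term_pattern(string $term, string $extraWordChars = ''): string
{
    $quoted = preg_quote($term, '/');

    // With no extra characters this is [^\w], which is equivalent to \W.
    // With '-' it becomes [^\w\-], as suggested above.
    $boundary = '[^\w' . preg_quote($extraWordChars, '/') . ']';

    return '/(^|' . $boundary . ')(' . $quoted . ')(' . $boundary . '|$)/i';
}

// "open" inside "open-source" is skipped once '-' counts as a word character.
var_dump(preg_match(term_pattern('open', '-'), 'an open-source tool'));
var_dump(preg_match(term_pattern('open'), 'an open-source tool'));
```

With the hyphen included, a glossary entry for "open" would leave "open-source" alone, while the default pattern would still match it.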

Hope this helps. There are probably optimizations that can be performed as well, but this should at least be a good starting point.
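One possible refinement, offered as a variant rather than as part of the answer above: PCRE lookaround assertions can check the neighbouring characters without consuming them. That removes the need for separate start-of-text and end-of-text cases and lets adjacent occurrences such as "term term" both match. The helper name wrap_term_lookaround() is hypothetical.

```php
<?php
// Hypothetical variant using lookarounds instead of \W alternatives.
function wrap_term_lookaround(string $text, string $term, string $linkAttrs): string
{
    // (?<!\w) : not preceded by a word character (also true at the start)
    // (?!\w)  : not followed by a word character (also true at the end)
    $pattern = '/(?<!\w)(' . preg_quote($term, '/') . ')(?!\w)/i';

    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($linkAttrs): string {
            return '<a ' . $linkAttrs . '>' . $m[1] . '</a>';
        },
        $text
    );

    // Fall back to the untouched text if PCRE reports an error.
    return $result ?? $text;
}

echo wrap_term_lookaround('term term, terminal', 'term', 'class="jargon"'), "\n";
```

To treat a hyphen as a word character here as well, \w in both assertions could be swapped for [\w\-].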

Rodaine answered Oct 19 '22 09:10