Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

keep HTMLformat after replace some text (using PHP and JS)

I would like modify HTML like

I am <b>Sadi, novice</b> programmer.

to

I am <b>Sadi, learner</b> programmer.

To do it I will search using a string "novice programmer". How can I do it please? Any idea?

It search using more than one word "novice programmer". It could be a whole sentence. The extra white space (e.g. new line, tab) should be ignored and any tag must be ignored during the search. But during the replacement tag must be preserved.

It is a sort of converter. It will be better if it is case insensitive.

Thank you

Sadi


More clarification:

I get some nice reply with possible solution. But please keep posting if you have any idea in mind.

I would like to more clarify the problem just in case anyone missed it. Main post shows the problem as an example scenario.

1) Now the problem is find and replace some string without considering the tags. The tags may shows up within a single word. String may contain multiple word. Tag only appear in the content string or the document. The search phrase never contain any tags.

We can easily remove all tags and do some text operation. But here the another problem shows up.

2) The tags must be preserve, even after replacing the text. That is what the example shows.

Thank you Again for helping

like image 900
Sadi Avatar asked Apr 01 '10 21:04

Sadi


2 Answers

ok i think this is what you want. it takes your input search and replace, splits them into arrays of strings delimited by space, generates a regexp that finds the input sentence with any number of whitespace/html tags, and replaces it with the replacement sentence with the same tags replaced between the words.

if the wordcount of the search sentence is higher than that of the replacement, it just uses spaces between any extra words, and if the replacement wordcount is higher than the search, it will add all 'orphaned' tags on the end. it also handles regexp chars in the find and replace.

<?php
function htmlFriendlySearchAndReplace($find, $replace, $subject) {
    $findWords = explode(" ", $find);
    $replaceWords = explode(" ", $replace);

    $findRegexp = "/";
    for ($i = 0; $i < count($findWords); $i++) {
        $findRegexp .= preg_replace("/([\\$\\^\\|\\.\\+\\*\\?\\(\\)\\[\\]\\{\\}\\\\\\-])/", "\\\\$1", $findWords[$i]);
        if ($i < count($findWords) - 1) {
            $findRegexp .= "(\s?(?:<[^>]*>)?\s(?:<[^>]*>)?)";
        }
    }
    $findRegexp .= "/i";

    $replaceRegexp = "";
    for ($i = 0; $i < count($findWords) || $i < count($replaceWords); $i++) {
        if ($i < count($replaceWords)) {
            $replaceRegexp .= str_replace("$", "\\$", $replaceWords[$i]);
        }
        if ($i < count($findWords) - 1) {
            $replaceRegexp .= "$" . ($i + 1);
        } else {
            if ($i < count($replaceWords) - 1) {
                $replaceRegexp .= " ";
            }
        }
    }

    return preg_replace($findRegexp, $replaceRegexp, $subject);
}
?>

here are the results of a few tests :

Original : <b>Novice Programmer</b>
Search : Novice Programmer
Replace : Advanced Programmer
Result : <b>Advanced Programmer</b>

Original : Hi, <b>Novice Programmer</b>
Search : Novice Programmer
Replace : Advanced Programmer
Result : Hi, <b>Advanced Programmer</b>

Original : I am not a <b>Novice</b> Programmer
Search : Novice Programmer
Replace : Advanced Programmer
Result : I am not a <b>Advanced</b> Programmer

Original : Novice <b>Programmer</b> in the house
Search : Novice Programmer
Replace : Advanced Programmer
Result : Advanced <b>Programmer</b> in the house

Original : <i>I am not a <b>Novice</b> Programmer</i>
Search : Novice Programmer
Replace : Advanced Programmer
Result : <i>I am not a <b>Advanced</b> Programmer</i>

Original : I am not a <b><i>Novice</i> Programmer</b> any more
Search : Novice Programmer
Replace : Advanced Programmer
Result : I am not a <b><i>Advanced</i> Programmer</b> any more

Original : I am not a <b><i>Novice</i></b> Programmer any more
Search : Novice Programmer
Replace : Advanced Programmer
Result : I am not a <b><i>Advanced</i></b> Programmer any more

Original : I am not a Novice<b> <i> </i></b> Programmer any more
Search : Novice Programmer
Replace : Advanced Programmer
Result : I am not a Advanced<b> <i> </i></b> Programmer any more

Original : I am not a Novice <b><i> </i></b> Programmer any more
Search : Novice Programmer
Replace : Advanced Programmer
Result : I am not a Advanced <b><i> </i></b> Programmer any more

Original : <i>I am a <b>Novice</b> Programmer</i> too, now
Search : Novice Programmer too
Replace : Advanced Programmer
Result : <i>I am a <b>Advanced</b> Programmer</i> , now

Original : <i>I am a <b>Novice</b> Programmer</i>, now
Search : Novice Programmer
Replace : Advanced Programmer Too
Result : <i>I am a <b>Advanced</b> Programmer Too</i>, now

Original : <i>I make <b>No money</b>, now</i>
Search : No money
Replace : Mucho$1 Dollar$
Result : <i>I make <b>Mucho$1 Dollar$</b>, now</i>

Original : <i>I like regexp, you can do [A-Z]</i>
Search : [A-Z]
Replace : [Z-A]
Result : <i>I like regexp, you can do [Z-A]</i>
like image 135
chris Avatar answered Oct 05 '22 21:10

chris


I would do this:

if (preg_match('/(.*)novice((?:<.*>)?\s(?:<.*>)?programmer.*)/',$inString,$attributes) {
  $inString = $attributes[1].'learner'.$attributes[2];
}

It should match any of the following:

novice programmer
novice</b> programmer
novice </b>programmer
novice<span> programmer

A test version of what the regex states would be something like: Match any set of characters until you reach "novice" and put it into a capturing group, then maybe match something that starts with a '<' and has any number of characters after it and then ends with '>' (but don't capture it), but then there only match something with a white space and then maybe match again something that starts with a '<' and has any number of characters after it and then ends with '>' (but don't capture it) which must then be followed by programmer followed by any number of characters and put that into a capture group.

I would do some specific testing though, as I may have missed some stuff. Regex is a programmers best friend!

like image 33
Kitson Avatar answered Oct 05 '22 23:10

Kitson