
What is the Java equivalent to this preg_replace?

<?php
    $str = "word <a href=\"word\">word</word>word word";
    $str = preg_replace("/word(?!([^<]+)?>)/i","repl",$str);
    echo $str;
    # repl <a href="word">repl</word>repl repl
?>

source: http://pureform.wordpress.com/2008/01/04/matching-a-word-characters-outside-of-html-tags/

Unfortunately, my project needs semantic libraries that are available only for Java...

// Thanks Celso

celsowm asked Jul 22 '10 00:07


3 Answers

Use the String.replaceAll() method:

class Test {
  public static void main(String[] args) {
    String str = "word <a href=\"word\">word</word>word word";
    // (?i) makes the match case-insensitive, like the trailing i in the PHP pattern
    str = str.replaceAll("(?i)word(?!([^<]+)?>)", "repl");
    System.out.println(str);
  }
}

Hope this helps.
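If the replacement runs many times, the pattern can also be compiled once and reused. The following is only a sketch: the class name ReplaceOutsideTags and the helper replaceOutsideTags are made up for illustration, and Pattern.CASE_INSENSITIVE is used here in place of PHP's trailing i modifier.

```java
import java.util.regex.Pattern;

class ReplaceOutsideTags {
    // Same lookahead as the PHP original; CASE_INSENSITIVE stands in for the trailing i.
    private static final Pattern WORD_OUTSIDE_TAGS =
            Pattern.compile("word(?!([^<]+)?>)", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: replaces every match of the precompiled pattern.
    static String replaceOutsideTags(String input, String replacement) {
        return WORD_OUTSIDE_TAGS.matcher(input).replaceAll(replacement);
    }

    public static void main(String[] args) {
        // Mixed case on purpose, to exercise the case-insensitive flag.
        String str = "Word <a href=\"word\">WORD</word>word word";
        System.out.println(replaceOutsideTags(str, "repl"));
        // prints: repl <a href="word">repl</word>repl repl
    }
}
```

Note that Matcher.replaceAll() treats $ and \ in the replacement string specially, so a replacement taken from user input should probably go through Matcher.quoteReplacement() first.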

kolrie answered Nov 16 '22 18:11


To translate that regex for use in Java, all you have to do is get rid of the / delimiters and change the trailing i to an inline modifier, (?i). But it's not a very good regex; I would use this instead:

(?i)word(?![^<>]++>)

According to RegexBuddy's Debug feature, when it tries to match the word in <a href="word">, the original regex requires 23 steps to reject it, while this one takes only seven steps. The actual Java code is

str = str.replaceAll("(?i)word(?![^<>]++>)", "repl");
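One thing worth checking before adopting this pattern: the ++ quantifier needs at least one character between the word and the closing >, whereas the original ([^<]+)? group was optional. On the sample string from the question this seems to mean that the tag name in </word> is replaced too. The sketch below runs the pattern as given next to a *+ variant, which is an assumption added here for comparison and not part of the answer; the class and constant names are hypothetical as well.

```java
class LookaheadVariants {
    // Pattern exactly as given in the answer: "++" needs at least one character before '>'.
    static final String ANSWER_PATTERN = "(?i)word(?![^<>]++>)";
    // Hypothetical variant (not from the answer): "*+" also accepts zero characters before '>'.
    static final String ZERO_OR_MORE_PATTERN = "(?i)word(?![^<>]*+>)";

    // Replaces every match of the given pattern with "repl".
    static String replace(String pattern, String input) {
        return input.replaceAll(pattern, "repl");
    }

    public static void main(String[] args) {
        String str = "word <a href=\"word\">word</word>word word";
        System.out.println(replace(ANSWER_PATTERN, str));
        // prints: repl <a href="word">repl</repl>repl repl
        System.out.println(replace(ZERO_OR_MORE_PATTERN, str));
        // prints: repl <a href="word">repl</word>repl repl
    }
}
```

The second line of output matches what the PHP original produces for the same input, so the *+ form may be the safer choice when a word can sit directly before a >.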
Alan Moore answered Nov 16 '22 17:11


Before providing a further answer, are you trying to parse an HTML document? If so, don't use regexes; use an HTML parser.

Zak answered Nov 16 '22 18:11