
Differences in regex patterns between JavaScript and Java?

In JavaScript, I have the following:

function replaceURLWithHTMLLinks(text) {
    var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
    return text.replace(exp,"<a href='$1'>$1</a>"); 
}

It replaces all of the URLs in the input string with a version of the URL that has an anchor tag wrapped around it to turn it into a link. I'm trying to duplicate this functionality in Java with the following function:

private String replaceURLWithHTMLLinks(String text) {
    String pattern = "/(\\b(https?|ftp|file):\\/\\/[-A-Z0-9+&@#\\/%?=~_|!:,.;]*[-A-Z0-9+&@#\\/%=~_|])/i";
    return text.replaceAll(pattern, "<a href=\"$1\">$1</a>");
}

However, while this works fine in JavaScript, the Java version doesn't find any matches, even for the same input string. Do I need to change something in the pattern, or is something else going on?

Alex Zylman asked Dec 25 '11 09:12


2 Answers

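For the Java version, you need to get rid of the slashes around the expression and the i at the end. In JavaScript they are the regex literal delimiters and a flag; in a Java string they are ordinary characters that the input would have to contain, which is why nothing matches. You can specify the i flag separately, for example inline: JavaScript's /blarg/i becomes "(?i)blarg" in Java.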

Your code would become something like:

private String replaceURLWithHTMLLinks(String text) {
  String pattern = "(?i)(\\b(https?|ftp|file):\\/\\/[-A-Z0-9+&@#\\/%?=~_|!:,.;]*[-A-Z0-9+&@#\\/%=~_|])";
  return text.replaceAll(pattern, "<a href=\"$1\">$1</a>");
}
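To make the difference concrete, here is a minimal sketch that runs both pattern strings from this thread against the same input. The class name `InlineFlagDemo`, the method names, and the sample URL are made up for illustration; only the two pattern strings come from the question and the answer above.

```java
// Hypothetical demo class; the names and the sample text are illustrative only.
class InlineFlagDemo {
    // Pattern from the question: the leading "/" and trailing "/i" are literal characters in Java.
    static final String JS_STYLE =
        "/(\\b(https?|ftp|file):\\/\\/[-A-Z0-9+&@#\\/%?=~_|!:,.;]*[-A-Z0-9+&@#\\/%=~_|])/i";

    // Pattern from the answer: delimiters removed, case-insensitivity requested inline with (?i).
    static final String INLINE_FLAG =
        "(?i)(\\b(https?|ftp|file):\\/\\/[-A-Z0-9+&@#\\/%?=~_|!:,.;]*[-A-Z0-9+&@#\\/%=~_|])";

    static String jsStyle(String text) {
        return text.replaceAll(JS_STYLE, "<a href=\"$1\">$1</a>");
    }

    static String linkify(String text) {
        return text.replaceAll(INLINE_FLAG, "<a href=\"$1\">$1</a>");
    }

    public static void main(String[] args) {
        String text = "See http://example.com/a?b=1, then reply.";
        System.out.println(jsStyle(text)); // unchanged: no "/http..." followed by "/i" in the input
        System.out.println(linkify(text)); // the URL is wrapped in an anchor tag
    }
}
```

With the JavaScript-style string, Java looks for a literal slash immediately before the protocol and a literal "/i" after the URL, so ordinary text comes back unchanged. The inline-flag version wraps the URL and leaves the trailing comma outside the link, because the final character class excludes punctuation such as commas.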
Tikhon Jelvis answered Sep 30 '22 17:09


That is expected: Java's Pattern does not take its flags this way.

The regex itself is compatible with both engines, but Java has no /.../i literal syntax, so the delimiters must go and the modifier has to be passed separately.

Do this instead:

Pattern pattern = Pattern.compile("(\\b(https?|ftp|file):\\/\\/[-A-Z0-9+&@#\\/%?=~_|!:,.;]*[-A-Z0-9+&@#\\/%=~_|])", Pattern.CASE_INSENSITIVE);
return pattern.matcher(text).replaceAll("<a href=\"$1\">$1</a>");
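As a rough sketch of how the flag-based approach might look in a complete class, the example below compiles the pattern once and reuses it. The class name `CompiledPatternDemo`, the method names, and the sample hosts are assumptions made for illustration. The `\\/` escapes are dropped here because `/` has no special meaning in a Java regex. JavaScript's `g` flag has no Java flag counterpart: `replaceAll` replaces every match, while `replaceFirst` behaves like a JavaScript replace without `g`.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names and the sample text are illustrative only.
class CompiledPatternDemo {
    // Compiled once and reused; CASE_INSENSITIVE plays the role of JavaScript's "i" flag.
    static final Pattern URL = Pattern.compile(
        "(\\b(https?|ftp|file)://[-A-Z0-9+&@#/%?=~_|!:,.;]*[-A-Z0-9+&@#/%=~_|])",
        Pattern.CASE_INSENSITIVE);

    static final String ANCHOR = "<a href=\"$1\">$1</a>";

    // Replaces every URL, like a JavaScript regex with the "g" flag.
    static String linkAll(String text) {
        return URL.matcher(text).replaceAll(ANCHOR);
    }

    // Replaces only the first URL, like a JavaScript regex without "g".
    static String linkFirst(String text) {
        return URL.matcher(text).replaceFirst(ANCHOR);
    }

    public static void main(String[] args) {
        String text = "HTTP://A.example and ftp://b.example";
        System.out.println(linkAll(text));
        System.out.println(linkFirst(text));
    }
}
```

Keeping the compiled Pattern in a constant avoids recompiling the regex on every call, which `String.replaceAll` would otherwise do.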
fge answered Sep 30 '22 17:09