
Why do regular expressions in Java and Perl act differently?

Tags: java, regex, perl

My understanding is that Java's implementation of regular expressions is based on Perl's. However, in the following example, if I execute the same regex with the same string, Java and Perl return different results.

Here's the Java example:

public class RegexTest {
    public static void main(String[] args) {
        String sentence = "This is a test of regular expressions.";
        System.out.println(sentence.matches("\\w") ? "Matches" : "Doesn't match");
    }
}

This returns: Doesn't match

Here's the Perl example:

my $sentence = 'This is a test of regular expressions.';
# Outer parentheses make print receive the whole concatenation, newline included.
print( ( $sentence =~ /\w/ ? "Matches" : "Doesn't match" ) . "\n" );

This returns: Matches

To me, the Perl result makes sense. It looks for a match for a single word character. I don't understand why Java doesn't consider it a match. What's the reason for the difference?

Phillip Lemky asked Apr 24 '09 02:04


2 Answers

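Java's String.matches method tests whether the regex matches the entire String, so "\\w" only matches a string that consists of exactly one word character. Perl's =~ operator, by contrast, succeeds if the pattern matches anywhere in the string. To test whether a regex can be found anywhere in a string in Java, create a Matcher and use its find method.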
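As a minimal sketch of that difference, the following compares matches and find on the sentence from the question. The class name RegexFindDemo and its two helper methods are made up for illustration; Pattern.compile, Matcher.matches and Matcher.find are the standard java.util.regex calls.

```java
import java.util.regex.Pattern;

public class RegexFindDemo {
    // \w matches a single word character (letter, digit or underscore).
    private static final Pattern WORD_CHAR = Pattern.compile("\\w");

    // True only if the whole input is exactly one word character.
    public static boolean matchesEntirely(String input) {
        return WORD_CHAR.matcher(input).matches();
    }

    // True if a word character occurs anywhere in the input,
    // which is how Perl's =~ behaves for an unanchored pattern.
    public static boolean foundAnywhere(String input) {
        return WORD_CHAR.matcher(input).find();
    }

    public static void main(String[] args) {
        String sentence = "This is a test of regular expressions.";
        System.out.println(matchesEntirely(sentence) ? "Matches" : "Doesn't match"); // Doesn't match
        System.out.println(foundAnywhere(sentence) ? "Matches" : "Doesn't match");   // Matches
    }
}
```

If you want to keep using String.matches, the pattern has to describe the whole string, for example ".*\\w.*" for single-line input.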

erickson answered Sep 21 '22 17:09


Additionally, the Perl regex syntax is NOT the Java regex syntax.

That doesn't necessarily apply in this case, but it is more of an answer to your more general question.

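Java's regular expression syntax is often described as Perl-like, in the same spirit as "PCRE" (Perl Compatible Regular Expressions), although java.util.regex is its own implementation and not the PCRE library.

The "Perl compatible" label is however misleading, because these engines support only part of what Perl's own regex engine does.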

For instance, Perl regular expressions permit executing code in the expression itself, along with lots of other advanced operators, and some syntax differs between Perl and other tools (e.g. many tools use \> and \< as word boundary markers, but Perl just uses '\b').
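To illustrate the word boundary point from the Java side, here is a small sketch. Like Perl, java.util.regex uses \b for word boundaries; it treats \< and \> as escaped literal angle brackets, because a backslash before a non-alphabetic character just quotes that character. The class name WordBoundaryDemo and the helper found are invented for this example.

```java
import java.util.regex.Pattern;

public class WordBoundaryDemo {
    // Hypothetical helper: true if the regex is found anywhere in the input.
    public static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // \b is the word boundary, so "cat" must stand alone as a word.
        System.out.println(found("\\bcat\\b", "concat cat")); // true
        System.out.println(found("\\bcat\\b", "concat"));     // false

        // \< and \> are not boundaries here, they match literal < and >.
        System.out.println(found("\\<cat\\>", "a cat"));      // false
        System.out.println(found("\\<cat\\>", "<cat>"));      // true
    }
}
```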

Spend a few minutes reading the perlre documentation and you'll discover lots of awesome tricks that Perl's regular expression engine can do and that few other engines offer.

Kent Fredric answered Sep 20 '22 17:09