
java comparing two Pattern objects

Tags:

java

regex

Is there an easy way to compare two Pattern objects?

I have a Pattern that was compiled from the regex "//" to check for comments in code.

Since several different regexes can describe comments, I want a way to tell them apart.

How can this be done? The Pattern class does not implement the equals method.

La bla bla asked Apr 07 '12 13:04



3 Answers

You can compare Pattern objects by comparing the results of calling pattern() or toString(), but this doesn't do what you want (if I understand your question correctly). Specifically, it compares the strings that were passed to the Pattern.compile(...) factory method. It takes no account of flags that were passed separately from the pattern string.
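As a minimal sketch of that comparison, the helper below checks both the source string and the flags. The class name PatternCompare and the method sameSourceAndFlags are hypothetical names chosen for illustration; only Pattern.pattern() and Pattern.flags() come from the JDK. It still only detects textually identical patterns, not equivalent ones.

```java
import java.util.regex.Pattern;

public class PatternCompare {

    // Hypothetical helper (not part of the JDK): treats two patterns as
    // "the same" when both the source string and the flag bitmask agree.
    public static boolean sameSourceAndFlags(Pattern a, Pattern b) {
        return a.pattern().equals(b.pattern()) && a.flags() == b.flags();
    }

    public static void main(String[] args) {
        Pattern p1 = Pattern.compile("//.*");
        Pattern p2 = Pattern.compile("//.*");
        Pattern p3 = Pattern.compile("//.*", Pattern.CASE_INSENSITIVE);

        // Same source string, same (empty) flags.
        System.out.println(sameSourceAndFlags(p1, p2)); // true

        // The source strings alone cannot see the extra flag on p3.
        System.out.println(p1.pattern().equals(p3.pattern())); // true

        // Including flags() in the comparison distinguishes them.
        System.out.println(sameSourceAndFlags(p1, p3)); // false
    }
}
```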

There is no simple way to test if two non-identical regexes are equivalent. For example ".+" and "..*" represent equivalent regexes, but there is no straight-forward way to determine this using the Pattern API.
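To illustrate the point, here is a small sketch showing that ".+" and "..*" differ as strings yet agree on a handful of inputs. The class EquivalentRegexes and the method agreeOn are hypothetical names, and the sample inputs are arbitrary. Agreement on samples is only evidence, not a proof of equivalence.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class EquivalentRegexes {

    // Hypothetical helper: returns true if both patterns give the same
    // full-match result on every sample. This cannot prove equivalence;
    // it can only fail to find a counterexample.
    public static boolean agreeOn(Pattern a, Pattern b, List<String> samples) {
        for (String s : samples) {
            if (a.matcher(s).matches() != b.matcher(s).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Pattern p1 = Pattern.compile(".+");
        Pattern p2 = Pattern.compile("..*");
        List<String> samples = Arrays.asList("", "a", "ab", "// comment", "\n");

        // The source strings are different ...
        System.out.println(p1.pattern().equals(p2.pattern())); // false

        // ... but the patterns agree on every sample tried here.
        System.out.println(agreeOn(p1, p2, samples)); // true
    }
}
```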


I don't know if the problem is theoretically solvable ... in the general case. @Akim comments:

There is no finite axiomatization to regex equivalence, so the short answer is "this is not doable by tree transformations of the regexes themselves". However one can compare the languages of two automata (test their equality), so one can compute whether two regexes are equivalent. Note that I'm referring to the "genuine" regexes, with no extensions such as back-references to capture groups, which escape the realm of rational languages, i.e., that of automata.


I also want to comment on the accepted answer. The author provides some code that he claims shows that Pattern's equals method is inherited from Object. In fact, the output he is seeing is consistent with that ... but it doesn't show it.

The correct way to know if this is the case is to look at the javadoc ... where the equals method is listed in the list of inherited methods. That is definitive.

So why doesn't the example show what the author says it shows?

  1. It is possible for two methods to behave the same way, but be implemented differently. If we treat the Pattern class as a black box, then we cannot show that this is not happening. (Or at least ... not without using reflection.)

  2. The author has only run this on one platform. Other platforms could behave differently.
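The reflection check mentioned in point 1 might look like the sketch below. The class EqualsOrigin and the method patternOverridesEquals are hypothetical names; the reflection calls themselves (Class.getMethod and Method.getDeclaringClass) are standard. Note that this only reports what the JDK you run it on does, so it is subject to the same platform caveat as point 2.

```java
import java.lang.reflect.Method;
import java.util.regex.Pattern;

public class EqualsOrigin {

    // Hypothetical helper: asks the running JDK which class declares the
    // public equals(Object) method that Pattern instances will use.
    public static boolean patternOverridesEquals() throws NoSuchMethodException {
        Method equals = Pattern.class.getMethod("equals", Object.class);
        // If Pattern (or a superclass other than Object) overrode equals,
        // the declaring class would not be Object.
        return equals.getDeclaringClass() != Object.class;
    }

    public static void main(String[] args) throws NoSuchMethodException {
        Method equals = Pattern.class.getMethod("equals", Object.class);
        System.out.println("equals is declared by: " + equals.getDeclaringClass().getName());
        System.out.println("Pattern overrides equals: " + patternOverridesEquals());
    }
}
```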

On the second point, my recollection is that in the earlier implementation of Pattern (in Java 1.4) the Pattern.compile(...) methods kept a cache of recently compiled pattern objects [1]. If you compiled a particular pattern string twice, the second time you might get the same object as was returned the first time. That would cause the test code to output:

  true
  true
  true
  true

But what does that show? Does it show that Pattern overrides Object.equals? No!

The lesson here is that you should figure out how a Java library method behaves primarily by looking at the javadocs:

  • If you write a "black box" test, you are liable to draw incorrect conclusions ... or at least, conclusions that may not be true for all platforms.

  • If you base your conclusions on "reading the code", you run the risk of drawing conclusions that are invalid for other platforms.


1 - Even if my recollection is incorrect, such an implementation would be consistent with the javadocs for the Pattern.compile(...) methods. They do not say that each compile call returns a new Pattern object.

Stephen C answered Nov 15 '22 19:11


Maybe I do not fully understand the question. But as you can see in the following example, every Java object has a default java.lang.Object.equals(Object) method. This method compares the object references, i.e. it uses the == operator.


package test;

import java.util.regex.Pattern;

public class Main {

  private static final Pattern P1 = Pattern.compile("//.*");
  private static final Pattern P2 = Pattern.compile("//.*");

  public static void main(String[] args) {
    System.out.println(P1.equals(P1));
    System.out.println(P1.equals(P2));
    System.out.println(P1.pattern().equals(P1.pattern()));
    System.out.println(P1.pattern().equals(P2.pattern()));
  }
}

Outputs:


true
false
true
true

Jiri Patera answered Nov 15 '22 20:11


For mysterious reasons, the Pattern class doesn't override equals(). For example, this simple unit test will fail:

    @Test
    public void testPatternEquals() {
        Pattern p1 = Pattern.compile("test");
        Pattern p2 = Pattern.compile("test");
        assertEquals(p1, p2); // fails: equals() is inherited from Object and compares references
    }

The most common workaround seems to be to compare the string representations of the Pattern objects (toString() returns the String used to create the Pattern):

    @Test
    public void testPatternEquals() {
        Pattern p1 = Pattern.compile("test");
        Pattern p2 = Pattern.compile("test");
        assertEquals(p1.toString(), p2.toString()); // succeeds!
    }

njudge answered Nov 15 '22 20:11