Is there an easy way to compare two Pattern objects?
I have a Pattern that was compiled from the regex "//" to check for comments in code.
Since several regexes can describe comments, I want a way to tell them apart.
How can this be done? The Pattern class does not implement the equals method.
You can compare Pattern objects by comparing the results of calling pattern() or toString(), but this doesn't do what you want (if I understand your question correctly). Specifically, it compares the strings that were passed to the Pattern.compile(...) factory method, and it takes no account of flags passed separately from the pattern string.
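If the flags matter, one possible workaround is to compare the value of flags() alongside the pattern string. The following is only a sketch: the class name PatternCompare and the helper sameSourceAndFlags are made up for illustration, and the check only tells you that two patterns were compiled from the same string with the same flags. It says nothing about whether two different regexes are equivalent.

```java
import java.util.regex.Pattern;

public class PatternCompare {
    // Hypothetical helper (not part of the Pattern API): true when both
    // patterns were compiled from the same regex string with the same flags.
    static boolean sameSourceAndFlags(Pattern a, Pattern b) {
        return a.pattern().equals(b.pattern()) && a.flags() == b.flags();
    }

    public static void main(String[] args) {
        Pattern p1 = Pattern.compile("//.*");
        Pattern p2 = Pattern.compile("//.*");
        Pattern p3 = Pattern.compile("//.*", Pattern.CASE_INSENSITIVE);

        // The pattern strings are equal even though the flags differ.
        System.out.println(p1.pattern().equals(p3.pattern()));
        // Same string, same flags.
        System.out.println(sameSourceAndFlags(p1, p2));
        // Same string, different flags.
        System.out.println(sameSourceAndFlags(p1, p3));
    }
}
```

Comparing pattern() alone would treat p1 and p3 as the same, which is the limitation described above.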
There is no simple way to test if two non-identical regexes are equivalent. For example, ".+" and "..*" represent equivalent regexes, but there is no straightforward way to determine this using the Pattern API.
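To make the ".+" versus "..*" example concrete, here is a rough sketch that compares the two patterns on a handful of sample strings. The class name EquivalenceSketch, the helper agreeOnSamples and the sample inputs are all assumptions chosen for illustration. Agreement on samples is not a proof of equivalence; it can only show that two regexes differ, never that they are the same.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class EquivalenceSketch {
    // Hypothetical helper: true if both patterns give the same full-match
    // verdict on every sample. This is evidence, not a proof of equivalence.
    static boolean agreeOnSamples(Pattern a, Pattern b, List<String> samples) {
        for (String s : samples) {
            if (a.matcher(s).matches() != b.matcher(s).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Pattern p1 = Pattern.compile(".+");
        Pattern p2 = Pattern.compile("..*");
        List<String> samples = Arrays.asList("", "a", "ab", "// comment", "a\nb");

        // The source strings differ, so a string comparison reports "not equal".
        System.out.println(p1.pattern().equals(p2.pattern()));
        // Yet the two patterns agree on every sample above.
        System.out.println(agreeOnSamples(p1, p2, samples));
    }
}
```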
I don't know if the problem is theoretically solvable ... in the general case. @Akim comments:
There is no finite axiomatization to regex equivalence, so the short answer is "this is not doable by tree transformations of the regexes themselves". However one can compare the languages of two automata (test their equality), so one can compute whether two regexes are equivalent. Note that I'm referring to the "genuine" regexes, with no extensions such as back-references to capture groups, which escape the realm of rational languages, i.e., that of automata.
I also want to comment on the accepted answer. The author provides some code that he claims shows that Pattern's equals method is inherited from Object. In fact, the output he is seeing is consistent with that ... but it doesn't show it.
The correct way to know if this is the case is to look at the javadoc ... where the equals method is listed in the list of inherited methods. That is definitive.
So why doesn't the example show what the author says it shows?
1. It is possible for two methods to behave the same way, but be implemented differently. If we treat the Pattern class as a black box, then we cannot show that this is not happening. (Or at least ... not without using reflection.)
2. The author has only run this on one platform. Other platforms could behave differently.
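As an aside on the reflection remark in the first point, a check along these lines would ask the running JVM which class declares the equals(Object) method that Pattern resolves to. This is only a sketch: the class name EqualsDeclarer and the helper declarerOfEquals are invented for illustration, and the answer applies only to the JDK the code runs on, so it does not replace reading the javadoc.

```java
import java.lang.reflect.Method;
import java.util.regex.Pattern;

public class EqualsDeclarer {
    // Hypothetical helper: returns the class that declares the public
    // equals(Object) method that the given type resolves to.
    static Class<?> declarerOfEquals(Class<?> type) throws NoSuchMethodException {
        Method m = type.getMethod("equals", Object.class);
        return m.getDeclaringClass();
    }

    public static void main(String[] args) throws NoSuchMethodException {
        // If Pattern does not override equals, this names java.lang.Object.
        System.out.println(declarerOfEquals(Pattern.class).getName());
        // String does override equals, so this names java.lang.String.
        System.out.println(declarerOfEquals(String.class).getName());
    }
}
```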
On the second point, my recollection is that in the earlier implementation of Pattern (in Java 1.4) the Pattern.compile(...) methods kept a cache of recently compiled pattern objects (see footnote 1). If you compiled a particular pattern string twice, the second time you might get the same object as was returned the first time. That would cause the test code to output:
true
true
true
true
But what does that show? Does it show that Pattern overrides Object.equals? No!
The lesson here is that you should figure out how a Java library method behaves primarily by looking at the javadocs:
If you write a "black box" test, you are liable to draw incorrect conclusions ... or at least, conclusions that may not be true for all platforms.
If you base your conclusions on "reading the code", you run the risk of drawing conclusions that are invalid for other platforms.
1 - Even if my recollection is incorrect, such an implementation would be consistent with the javadocs for the Pattern.compile(...) methods. They do not say that each compile call returns a new Pattern object.
Maybe I do not fully understand the question. But as you can see in the following example, there is a default java.lang.Object.equals(Object) method for every Java object. This method compares the references to the objects, i.e. it uses the == operator.
package test;

import java.util.regex.Pattern;

public class Main {
    // Two separately compiled patterns built from the same regex string.
    private static final Pattern P1 = Pattern.compile("//.*");
    private static final Pattern P2 = Pattern.compile("//.*");

    public static void main(String[] args) {
        System.out.println(P1.equals(P1));                     // same reference
        System.out.println(P1.equals(P2));                     // different references
        System.out.println(P1.pattern().equals(P1.pattern())); // same regex string
        System.out.println(P1.pattern().equals(P2.pattern())); // equal regex strings
    }
}
Outputs:
true
false
true
true
For mysterious reasons, the Pattern class doesn't override equals(). For example, this simple unit test will fail:
@Test
public void testPatternEquals() {
    Pattern p1 = Pattern.compile("test");
    Pattern p2 = Pattern.compile("test");
    assertEquals(p1, p2); // fails!
}
The most common workaround for this seems to be to compare the string representations of the Pattern objects (toString() returns the String used to create the Pattern):
@Test
public void testPatternEquals() {
    Pattern p1 = Pattern.compile("test");
    Pattern p2 = Pattern.compile("test");
    assertEquals(p1.toString(), p2.toString()); // succeeds!
}