In my program, I have a string (obtained from an external library) which doesn't match any regular expression.
    String content = // extract text from PDF
    assertTrue(content.matches(".*")); // fails
    assertTrue(content.contains("S P E C I A L")); // passes
    assertTrue(content.matches("S P E C I A L")); // fails
Any idea what might be wrong? When I print content to stdout, it looks OK.
Here is the code for extracting text from the PDF (I am using iText 5.0.1):
    PdfReader reader = new PdfReader(source);
    PdfTextExtractor extractor = new PdfTextExtractor(reader, new SimpleTextExtractingPdfContentRenderListener());
    return extractor.getTextFromPage(1);
The String.matches() method tells whether or not this string matches the given regular expression. An invocation of the form str.matches(regex) yields exactly the same result as the expression Pattern.matches(regex, str).
By default, the . metacharacter does not match line breaks. So my guess is that your content contains a line break.
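One way to check this guess is to print the string with its line-break characters made visible. The sketch below is only an illustration: the class name ShowLineBreaks, the helper reveal, and the sample text are hypothetical stand-ins, not part of the original code or of iText.

```java
class ShowLineBreaks {
    // Replace carriage returns and line feeds with visible escape sequences.
    // (Hypothetical helper, used only for this illustration.)
    static String reveal(String s) {
        return s.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        // Assumed stand-in for the text extracted from the PDF.
        String content = "S P E C I A L\nmore text";

        // Line breaks now show up as a literal backslash followed by 'n'.
        System.out.println(reveal(content));

        // '.' does not match '\n' by default, so the whole-string match fails.
        System.out.println(content.matches(".*")); // false
    }
}
```

If the revealed output contains \n or \r, that would explain why ".*" fails to match the whole string even though the printed text looks fine.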
Also note that matches will match the entire string, not just a part of it: it does not do what contains does!
Some examples:
    String s = "foo\nbar";
    System.out.println(s.matches(".*")); // false
    System.out.println(s.matches("foo")); // false
    System.out.println(s.matches("foo\nbar")); // true
    System.out.println(s.matches("(?s).*")); // true
The (?s) in the last example will cause the . to match line breaks as well. So (?s).* will match any string.
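As a rough sketch of two alternatives: the Pattern.DOTALL flag has the same effect as the inline (?s), and Matcher.find() looks for a match anywhere in the input, which is closer to what contains does. The class and method names below (DotallDemo, matchesAcrossLines, containsMatch) are made up for this example.

```java
import java.util.regex.Pattern;

class DotallDemo {
    // Whole-string match in which '.' is also allowed to match line breaks.
    static boolean matchesAcrossLines(String regex, String input) {
        return Pattern.compile(regex, Pattern.DOTALL).matcher(input).matches();
    }

    // Partial match: true if the regex matches anywhere inside the input.
    static boolean containsMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String s = "foo\nbar";
        System.out.println(matchesAcrossLines(".*", s)); // true
        System.out.println(containsMatch("foo", s));     // true
        System.out.println(s.matches("foo"));            // false
    }
}
```

Compiling the Pattern once and reusing it is usually preferable when the same expression is applied to many strings.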