I'm supporting this Java application where the devs implemented some filtering based on RegEx. To be as generic as possible, they compile the patterns with the MULTILINE flag.
The other day I noticed something unexpected.
In Java, the pattern "^\\s*$" does not match "" with the MULTILINE flag. It does match without that flag.
Pattern pattern = Pattern.compile("^\\s*$", Pattern.MULTILINE);
Matcher matcher = pattern.matcher("");
System.out.println("Multiline: "+matcher.find());
pattern = Pattern.compile("^\\s*$");
matcher = pattern.matcher("");
System.out.println("No-multiline: "+matcher.find());
This produces the following output:
Multiline: false
No-multiline: true
The same results can be seen for matches():
System.out.println("Multiline: " + ("".matches("(?m)^\\s*$")));
System.out.println("No-multiline: " + ("".matches("^\\s*$")));
I would expect all cases to match.
In Python, this is the case. This:
import re
print(re.search(r'^\s*$', "", re.MULTILINE))
print(re.search(r'^\s*$', ""))
gives:
<_sre.SRE_Match object; span=(0, 0), match=''>
<_sre.SRE_Match object; span=(0, 0), match=''>
In Perl, both cases match as well and I think I remember it being the same for PHP.
I'd really appreciate it if someone could explain the reasoning behind the way Java handles this case.
You pass an empty string to the matcher. With Pattern.MULTILINE, the ^ is expected to match at the beginning of the string, but in Java it can be a bit different:
If MULTILINE mode is activated then ^ matches at the beginning of input and after any line terminator except at the end of input.
Since the string is empty, the beginning of input is its end.
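To see the quoted rule in action, here is a minimal sketch that counts the positions where ^ matches in MULTILINE mode. The class CaretDemo and the helper countLineStarts are hypothetical names made up for this illustration. Going by the documentation quoted above, the position after a trailing line terminator and the only position in an empty string both sit at the end of input, so they should not be counted; running the sketch on your own JDK is the easiest way to confirm that.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaretDemo {
    // Hypothetical helper: counts how many positions ^ matches at in MULTILINE mode.
    static int countLineStarts(String input) {
        Matcher m = Pattern.compile("^", Pattern.MULTILINE).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Two lines: ^ matches at index 0 and right after the \n.
        System.out.println(countLineStarts("a\nb"));
        // Trailing \n: the position after it is the end of input,
        // which the quoted documentation excludes.
        System.out.println(countLineStarts("a\n"));
        // Empty string: the only position is both the start and the end of input.
        System.out.println(countLineStarts(""));
    }
}
```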
Note: If the flag is always passed by default but you actually want a pattern to match at the start of the string, use \A instead of ^, and \z instead of $ for the end of the string. These match the string start/end even with Pattern.MULTILINE (so even an empty string will pass the \\A\\s*\\z test).
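A short sketch of the \A and \z workaround, assuming the pattern has to be compiled with Pattern.MULTILINE as in the question. The class AnchorDemo and the helper isBlank are hypothetical names used only for illustration.

```java
import java.util.regex.Pattern;

class AnchorDemo {
    // \A and \z anchor to the start and end of the whole input,
    // so the MULTILINE flag does not change what they match.
    private static final Pattern BLANK =
            Pattern.compile("\\A\\s*\\z", Pattern.MULTILINE);

    // Hypothetical helper: true if the input is empty or whitespace only.
    static boolean isBlank(String input) {
        return BLANK.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(isBlank(""));        // true: empty input
        System.out.println(isBlank("  \n\t"));  // true: whitespace only
        System.out.println(isBlank("a\n\nb"));  // false: an empty line inside does not count
    }
}
```

Note that this checks the input as a whole. If the original filters rely on ^ and $ matching at individual lines, swapping in \A and \z changes that behavior.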