So, some way or another (playing around), I found myself with a regex like \d{1}{2}.
Logically, to me, it should mean:
(A digit exactly once) exactly twice, i.e. a digit exactly twice.
But it, in fact, appears to just mean "a digit exactly once" (thus ignoring the {2}).
String regex = "^\\d{1}{2}$"; // ^$ to make those not familiar with 'matches' happy
System.out.println("1".matches(regex)); // true
System.out.println("12".matches(regex)); // false
Similar results can be seen using {n}{m,n} or similar.
Why does this happen? Is it explicitly stated somewhere in the regex / Java documentation, is it just a decision the Java developers made on the fly, or is it maybe a bug?
Or is it in fact not ignored and it actually means something else entirely?
Not that it matters much, but this is not across-the-board regex behaviour: Rubular does what I expect.
Note - the title is mainly for searchability for users who want to know how it works (not why).
The {n,m} quantifier matches the preceding element at least n times, but no more than m times, where n and m are integers. {n,m} is a greedy quantifier whose lazy equivalent is {n,m}?.
For instance, the pattern ou?r looks for o followed by zero or one u, and then r. The * quantifier means "zero or more", the same as {0,}. That is, the character may repeat any number of times or be absent.
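To make the quantifier descriptions above concrete, here is a minimal Java sketch. The class name QuantifierDemo and the helper firstMatch are made up for illustration; only java.util.regex.Pattern and Matcher come from the standard library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierDemo {
    // Hypothetical helper: returns the first substring matched by regex, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Greedy {2,4} takes as many digits as it can (up to 4).
        System.out.println(firstMatch("\\d{2,4}", "12345"));   // 1234
        // Lazy {2,4}? stops as soon as the minimum of 2 is satisfied.
        System.out.println(firstMatch("\\d{2,4}?", "12345"));  // 12

        // ou?r: o, then zero or one u, then r.
        System.out.println("or".matches("ou?r"));    // true
        System.out.println("our".matches("ou?r"));   // true
        System.out.println("ouur".matches("ou?r"));  // false

        // * and {0,} are two spellings of "zero or more".
        System.out.println(firstMatch("a*", "aaab"));     // aaa
        System.out.println(firstMatch("a{0,}", "aaab"));  // aaa
    }
}
```

Each of these patterns has exactly one quantifier per element, so none of them runs into the adjacent-quantifier case the question is about.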
IEEE-Standard 1003.1 says:
The behavior of multiple adjacent duplication symbols ( '*' and intervals) produces undefined results.
So every implementation can do as it pleases, just don't rely on anything specific...
When I input your regex in RegexBuddy using the Java regex syntax, it displays the following message:
Quantifiers must be preceded by a token that can be repeated «{2}»
Changing the regex to explicitly use a grouping, ^(\d{1}){2}, solves that error and works as you expect.
I assume that the Java regex engine simply ignores the erroneous part of the expression and works with what has been compiled so far.
Edit
The reference to the IEEE-Standard in @piet.t's answer seems to support that assumption.
Edit 2 (kudos to @fncomp)
For completeness, one would typically use (?:) to avoid capturing the group. The complete regex then becomes ^(?:\d{1}){2}
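A quick way to check the grouped form is a small Java sketch like the one below. The class name NestedQuantifierDemo and the helper fullMatch are hypothetical, and a trailing $ has been added to the pattern to mirror the question's original regex. The ungrouped pattern is printed only for comparison: the question reports true for "1" and false for "12", but since adjacent quantifiers are undefined by the standard quoted above, that output should not be relied on across engines or versions.

```java
import java.util.regex.Pattern;

public class NestedQuantifierDemo {
    // Hypothetical helper: true if the entire input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Explicit non-capturing group: (a digit exactly once) exactly twice.
        String grouped = "^(?:\\d{1}){2}$";
        // Adjacent quantifiers from the question; behaviour is implementation-specific.
        String ungrouped = "^\\d{1}{2}$";

        for (String input : new String[] {"1", "12", "123"}) {
            System.out.println(input
                    + " grouped=" + fullMatch(grouped, input)
                    + " ungrouped=" + fullMatch(ungrouped, input));
        }
    }
}
```

With the group in place, only "12" matches the grouped pattern; "1" and "123" do not. If two digits are all that is needed, \d{2} says the same thing without any nesting.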