
Regex exactly n OR m times

Tags: java, regex, php



There is no single quantifier that means "exactly m or n times". The way you are doing it is fine.

An alternative is:

X{m}(X{k})?

where m < n and k is the value of n-m.
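As a quick check of the optional-group form, here is a minimal Python sketch. The helper name `exactly_m_or_n` and the values m = 3, n = 6 are illustrative assumptions, not part of the answer above.

```python
import re

def exactly_m_or_n(atom: str, m: int, n: int) -> "re.Pattern":
    """Build atom{m}(atom{k})? with k = n - m; assumes m < n."""
    if not m < n:
        raise ValueError("expected m < n")
    # The atom is wrapped in a non-capturing group so multi-character atoms work too.
    return re.compile(f"(?:{atom}){{{m}}}(?:(?:{atom}){{{n - m}}})?")

pattern = exactly_m_or_n("a", 3, 6)

# Try strings of 1..8 "a"s and keep the lengths that match in full.
matched_lengths = [i for i in range(1, 9) if pattern.fullmatch("a" * i)]
print(pattern.pattern)   # (?:a){3}(?:(?:a){3})?
print(matched_lengths)   # [3, 6]
```

Note that `fullmatch` is what enforces "exactly" here; with a plain search, the pattern would also match inside a longer run.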


Here is the complete list of quantifiers (ref. http://www.regular-expressions.info/reference.html):

  • ?, ?? - 0 or 1 occurrences (?? is lazy, ? is greedy)
  • *, *? - zero or more occurrences
  • +, +? - at least one occurrence
  • {n} - exactly n occurrences
  • {n,m} - n to m occurrences, inclusive
  • {n,m}? - n to m occurrences, lazy
  • {n,}, {n,}? - at least n occurrences
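To make the greedy/lazy distinction in this list concrete, here is a small Python sketch. The sample text "aaaa" is an arbitrary choice for illustration.

```python
import re

text = "aaaa"

# Greedy quantifiers take as many repetitions as they are allowed.
greedy_range = re.match(r"a{2,3}", text).group()
greedy_optional = re.match(r"a?", text).group()
greedy_plus = re.match(r"a+", text).group()

# Lazy quantifiers (trailing ?) take as few repetitions as they are allowed.
lazy_range = re.match(r"a{2,3}?", text).group()
lazy_optional = re.match(r"a??", text).group()
lazy_plus = re.match(r"a+?", text).group()

print(repr(greedy_range), repr(lazy_range))        # 'aaa' 'aa'
print(repr(greedy_optional), repr(lazy_optional))  # 'a' ''
print(repr(greedy_plus), repr(lazy_plus))          # 'aaaa' 'a'
```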

To get "exactly n or m", you need to write the quantified regex twice, unless m and n are related in a special way:

  • X{n,m} if m = n+1
  • (?:X{n}){1,2} if m = 2n
  • ...
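The two special cases listed above can be checked with a short Python sketch. The helper `lengths_matched` and the choice n = 3 are assumptions made for the example.

```python
import re

def lengths_matched(pattern, max_len=10):
    """Return every length i in 0..max_len for which 'X' * i matches in full."""
    return [i for i in range(max_len + 1) if re.fullmatch(pattern, "X" * i)]

# m = n + 1 (here n = 3, m = 4): a plain range quantifier is enough.
range_form = lengths_matched(r"X{3,4}")
range_alternation = lengths_matched(r"X{3}|X{4}")

# m = 2n (here n = 3, m = 6): repeat the whole group once or twice.
doubled_form = lengths_matched(r"(?:X{3}){1,2}")
doubled_alternation = lengths_matched(r"X{3}|X{6}")

print(range_form, range_alternation)      # [3, 4] [3, 4]
print(doubled_form, doubled_alternation)  # [3, 6] [3, 6]
```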

No, there is no such quantifier. But I'd restructure it to /X{n}(X{m-n})?/ (assuming n < m) to prevent problems in backtracking.


Very old post, but I'd like to contribute something that might be of help. I tried it exactly the way stated in the question and it does work, but there's a catch: the order of the alternatives matters. Consider this:

#[a-f0-9]{6}|#[a-f0-9]{3}

This will find all occurrences of hex colour codes (they're either 3 or 6 digits long). But when I flip it around like this

#[a-f0-9]{3}|#[a-f0-9]{6}

it will only find the 3-digit ones or the first 3 digits of the 6-digit ones. This does make sense, because the engine tries the alternatives from left to right and stops at the first one that succeeds, and a regex pro might spot it right away, but to many it may look like peculiar behaviour. There are some advanced regex features that avoid this trap regardless of the order, but not everyone is knee-deep in regex patterns.
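The effect of the ordering can be reproduced with a few lines of Python. The sample text is made up, and the last pattern, which adds a negative lookahead, is only one possible order-independent variant, not something taken from the answer.

```python
import re

text = "#fff #a1b2c3"  # hypothetical sample input

# Longer alternative first: both colour codes are found in full.
long_first = re.findall(r"#[a-f0-9]{6}|#[a-f0-9]{3}", text)

# Shorter alternative first: the 6-digit code is cut off after 3 digits.
short_first = re.findall(r"#[a-f0-9]{3}|#[a-f0-9]{6}", text)

# One order-independent option: forbid a further hex digit after the match,
# which forces the engine to backtrack into the longer alternative.
guarded = re.findall(r"#(?:[a-f0-9]{3}|[a-f0-9]{6})(?![a-f0-9])", text)

print(long_first)   # ['#fff', '#a1b2c3']
print(short_first)  # ['#fff', '#a1b']
print(guarded)      # ['#fff', '#a1b2c3']
```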


TL;DR: (?<=[^x]|^)(x{n}|x{m})(?:[^x]|$)

It looks like you want "x n times" or "x m times". I think a literal translation to regex would be (x{n}|x{m}), like this: https://regex101.com/r/vH7yL5/1

Or, in a case where you can have a sequence of more than m "x"s (assuming m > n), you can add "following no 'x'" and "followed by no 'x'", translating to [^x](x{n}|x{m})[^x]. But that assumes there is always a character before and after your "x"s, as you can see here: https://regex101.com/r/bB2vH2/1

You can change it to (?:[^x]|^)(x{n}|x{m})(?:[^x]|$), translating to "following no 'x' or following line start" and "followed by no 'x' or followed by line end". But it still won't match two sequences with only one character between them (because the first match consumes the character after it, and the second match needs that same character before it), as you can see here: https://regex101.com/r/oC5oJ4/1
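The missed neighbour is easy to reproduce. This is a minimal Python sketch with n = 2 and m = 4 chosen arbitrarily; the sample strings are made up.

```python
import re

# Consuming version: the characters before and after the run are part of the match.
consuming = re.compile(r"(?:[^x]|^)(x{2}|x{4})(?:[^x]|$)")

# One separator: the first match swallows the space, so the second run
# has neither a preceding non-x character nor the start of the string.
one_apart = consuming.findall("xx xx")

# Two separators: each run gets its own neighbouring character.
two_apart = consuming.findall("xx  xx")

print(one_apart)  # ['xx']
print(two_apart)  # ['xx', 'xx']
```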

Finally, to match sequences that are only one character apart, you can use a positive lookahead (?=) for the "no 'x' after" part or a positive lookbehind (?<=) for the "no 'x' before" part, like this: https://regex101.com/r/mC4uX3/1

(?<=[^x]|^)(x{n}|x{m})(?:[^x]|$)

This way you will match only the exact number of 'x's you want.
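A sketch of the same idea in Python, with n = 2 and m = 4 as arbitrary values. One caveat: Python's `re` module only accepts fixed-width lookbehind, so (?<=[^x]|^) is written here as (?:(?<=[^x])|^). The second pattern, using negative lookarounds on both sides, is a swapped-in compact variant and not the expression from the answer.

```python
import re

text = "xx xx xxx xxxx"  # hypothetical sample input

# Lookbehind for a non-x character, or the start of the string; the
# character after the run is still consumed, as in the expression above.
with_lookbehind = re.compile(r"(?:(?<=[^x])|^)(x{2}|x{4})(?:[^x]|$)")

# Variant: "not preceded by x" and "not followed by x", consuming neither side.
negative_lookarounds = re.compile(r"(?<!x)(x{2}|x{4})(?!x)")

found_a = with_lookbehind.findall(text)
found_b = negative_lookarounds.findall(text)

print(found_a)  # ['xx', 'xx', 'xxxx']
print(found_b)  # ['xx', 'xx', 'xxxx']
```

Both patterns skip the run of three, and both find the two runs that are separated by a single space.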


Looking at Enhardened's answer, they state that their penultimate expression won't match sequences with only one character between them. There is an easy way to fix this without lookahead/lookbehind: replace the start/end anchors (^ and $) with the word-boundary assertion \b. A word boundary also matches at the start and end of the input, provided x is a word character. As such, the appropriate expression should be:

(?:[^x]|\b)(x{n}|x{m})(?:[^x]|\b)

As you can see here: https://regex101.com/r/oC5oJ4/2.
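A short Python check of the word-boundary version, again with n = 2 and m = 4 as arbitrary values. The second sample is an added observation rather than part of the answer: when the single separator is itself a word character, there is no boundary next to the second run, so it appears to be missed again.

```python
import re

boundary = re.compile(r"(?:[^x]|\b)(x{2}|x{4})(?:[^x]|\b)")

# Separator is a space: \b matches between the space and the second run.
space_separated = boundary.findall("xx xx")

# Separator is a letter: the first match consumes it, and there is no word
# boundary between "a" and "x", so the second run is not matched.
letter_separated = boundary.findall("xxaxx")

print(space_separated)   # ['xx', 'xx']
print(letter_separated)  # ['xx']
```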