Regex group match exactly n times

I have to validate the following string format:

text-text-id-text

The separator is the character '-'. The third column must always be id. I wrote the following regex (in Python) to validate the string:

import re

s = 'col1-col2-col3-id' # any additional text at the end
                        # is allowed e.g. -col4-col5
print(re.match(r'^(.*-){3}id(-.*)?$', s)) # ok
print(re.match(r'^(.*-){1}id(-.*)?$', s)) # still ok, but it should not be

I tried adding non-greedy mode, but the result is still the same:

^(.*?-){1}id(-.*)?$

What am I missing in my regex? I could just validate the string like this:

>>> import re
>>> print(re.split('-', 'col1-col2-col3-id'))
['col1', 'col2', 'col3', 'id']

And then check whether the third element matches id, but I am interested in why the first regex behaves as described above.

broadband asked Aug 15 '14 11:08


1 Answer

Your first regex is incorrect because it requires id to appear after the first three items rather than as the third item.
Your second regex matches the string when it should not because .* matches hyphens as well, so a single repetition of the group can span several columns.
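
To see why the count in {1} does not limit the number of columns, it can help to print what the group actually captured. The snippet below is only a small sketch that reuses the sample string from the question; the variable names greedy and lazy are made up for illustration.

```python
import re

s = 'col1-col2-col3-id'

# Greedy and lazy versions of the group from the question.
greedy = re.match(r'^(.*-){1}id(-.*)?$', s)
lazy = re.match(r'^(.*?-){1}id(-.*)?$', s)

# In both cases the single repetition swallows three columns,
# because "." is allowed to match "-" and the engine keeps
# extending the group until "id" can follow it.
print(greedy.group(1))  # col1-col2-col3-
print(lazy.group(1))    # col1-col2-col3-
```

Laziness only changes the order in which the engine tries candidates; it does not forbid the longer ones, which is why both patterns succeed.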

You should use this regex:

/^(?:[^-]+-){2}id/

Here is a regex demo!

If you need to anchor the regex to the end of the string, use /^(?:[^-]*-){2}id.*$/ (note that this variant uses [^-]*, so it also accepts empty columns).


As mentioned by Tim Pietzcker, consider asserting that id is the whole item, so that a third column such as idx is rejected:

/^(?:[^-]+-){2}id(?![^-])/

Here is an UPDATED regex demo!
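
The patterns above are written with /.../ delimiters, so here is a rough sketch of how the last one might be used from Python, the language in the question. The helper name third_column_is_id and the test strings are assumptions added for illustration only.

```python
import re

# [^-]+ cannot cross a separator, so {2} really means "two columns".
# (?![^-]) requires that "id" is followed by "-" or the end of the string.
PATTERN = re.compile(r'^(?:[^-]+-){2}id(?![^-])')

def third_column_is_id(s):
    """Return True if the third '-'-separated column is exactly 'id'."""
    return PATTERN.match(s) is not None

for s in ['text-text-id-text', 'a-b-id', 'col1-col2-col3-id', 'a-b-idx-c']:
    print(s, third_column_is_id(s))
```

With these inputs the first two strings are accepted and the last two are rejected: in 'col1-col2-col3-id' the id sits in the fourth column, and in 'a-b-idx-c' the third column only starts with id.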

Unihedron answered Oct 06 '22 21:10