
Is there a better way to match two different repetitions of the same character class in a regex?

Tags: python, regex

I had been using [0-9]{9,12} all along to signify that the numeric string has a length of either 9 or 12 characters. However, I have now realized that it also matches input strings of length 10 or 11. So I came up with the naive:

([0-9]{9}|[0-9]{12})

Is there a more succinct regex to represent this?
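To make the problem concrete, here is a minimal sketch comparing the two patterns from the question. The helper name lengths_accepted is hypothetical, and re.fullmatch (available since Python 3.4) is used so that only whole-string matches count.

```python
import re

RANGE = re.compile(r"[0-9]{9,12}")          # the original pattern
EITHER = re.compile(r"[0-9]{9}|[0-9]{12}")  # the "naive" alternative

def lengths_accepted(pattern, lengths=range(8, 14)):
    """Return the digit-string lengths that the pattern matches in full.

    Hypothetical helper, for illustration only.
    """
    return [n for n in lengths if pattern.fullmatch("1" * n)]

print(lengths_accepted(RANGE))   # [9, 10, 11, 12]
print(lengths_accepted(EITHER))  # [9, 12]
```

The {9,12} quantifier is a range, so every length from 9 through 12 is accepted, while the alternation accepts only the two intended lengths.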

asked Jun 28 '11 06:06 by Frankie Ribery




1 Answer

You could save one character by using

[0-9]{9}([0-9]{3})?

but in my opinion your way is better because it conveys your intention more clearly. Regexes are hard enough to read already.

Of course you could use \d instead of [0-9].

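(Edit: I first thought you could drop the parens around [0-9]{3}, but you can't; without them the question mark no longer makes the group optional. It is parsed as a lazy modifier on {3}, which has no effect on a fixed repeat count. So you only save one character, not three.)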
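A short sketch of why the parentheses matter, assuming Python's re module. The names GROUPED and UNGROUPED are just labels for this illustration.

```python
import re

GROUPED = re.compile(r"[0-9]{9}([0-9]{3})?")   # last three digits optional
UNGROUPED = re.compile(r"[0-9]{9}[0-9]{3}?")   # "{3}?" is a lazy {3}: still exactly 3

nine = "123456789"
twelve = "123456789012"

print(bool(GROUPED.fullmatch(nine)))      # True
print(bool(GROUPED.fullmatch(twelve)))    # True
print(bool(UNGROUPED.fullmatch(nine)))    # False: 12 digits are still required
print(bool(UNGROUPED.fullmatch(twelve)))  # True
```

Without the group, the pattern is equivalent to [0-9]{12}, so nine-digit input is rejected.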

(Edit 2: You will also need to anchor the regex with ^ and $ (or \b); otherwise re.match() will happily match the first nine digits, 123456789, of the ten-digit string 1234567890.)
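The anchoring point can be checked with a small sketch like the following. The pattern names are illustrative, and re.fullmatch (Python 3.4+) is shown as one possible alternative to writing the anchors by hand.

```python
import re

UNANCHORED = re.compile(r"[0-9]{9}([0-9]{3})?")
ANCHORED = re.compile(r"^[0-9]{9}([0-9]{3})?$")

ten = "1234567890"

# re.match only anchors at the start, so a nine-digit prefix is accepted.
print(UNANCHORED.match(ten).group())  # 123456789

# With ^ and $ the ten-digit string is rejected outright.
print(ANCHORED.match(ten))            # None

# fullmatch requires the whole string to match, even without anchors.
print(UNANCHORED.fullmatch(ten))      # None
print(UNANCHORED.fullmatch("123456789012").group())  # 123456789012
```

Note that $ also matches just before a trailing newline, so fullmatch is the stricter choice if the input may end with "\n".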

answered Sep 18 '22 12:09 by Tim Pietzcker