Reuse part of a Regex pattern

No, when using the standard library re module, regular expression patterns cannot be 'symbolized'.

You can always do so by re-using Python variables, of course:

import re
digit_letter_letter_digit = r'\d\w\w\d'

then use string formatting to build the larger pattern:

re.match(r"{0},{0}".format(digit_letter_letter_digit), inputtext)

or, using Python 3.6+ f-strings:

dlld = r'\d\w\w\d'
re.match(fr"{dlld},{dlld}", inputtext)

I often use this technique to compose larger, more complex patterns from reusable sub-patterns.
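As a minimal sketch of this composition approach with the standard library only (the names DLLD, PAIR and split_pair and the sample strings are illustrative, not taken from the question):

```python
import re

# Reusable sub-pattern: digit, two word characters, digit.
DLLD = r"\d\w\w\d"

# Compose the larger pattern; the sub-pattern text is simply repeated.
# Each copy gets its own capturing group here.
PAIR = re.compile(rf"({DLLD}),({DLLD})")


def split_pair(text):
    """Return both halves if text looks like 'dlld,dlld', else None."""
    m = PAIR.fullmatch(text)
    return m.groups() if m else None


print(split_pair("1ab2,3cd4"))  # ('1ab2', '3cd4')
print(split_pair("1ab2,xcd4"))  # None
```

One thing to watch for: literal regex braces such as {2} have to be doubled ({{2}}) inside an f-string or a str.format() template, otherwise they are treated as replacement fields.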

If you are prepared to install an external library, then the regex project can solve this problem with a regex subroutine call. The syntax (?N), where N is a group number, re-uses the pattern of an already defined (implicitly numbered) capturing group:

(\d\w\w\d),(?1)
^........^ ^..^
|           \
|             re-use pattern of capturing group 1  
\
  capturing group 1

You can do the same with named capturing groups, where (?<groupname>...) is the named group groupname, and (?&groupname), (?P&groupname) or (?P>groupname) re-use the pattern defined by groupname, not the text it matched (the latter two forms are alternatives for compatibility with other engines).

And finally, regex supports the (?(DEFINE)...) block to 'define' subroutine patterns without them actually matching anything at that stage. You can put multiple (..) and (?<name>...) capturing groups in that construct to then later refer to them in the actual pattern:

(?(DEFINE)(?<dlld>\d\w\w\d))(?&dlld),(?&dlld)
          ^...............^ ^......^ ^......^
          |                    \       /          
 creates 'dlld' pattern      uses 'dlld' pattern twice

Just to be explicit: the standard library re module does not support subroutine patterns.
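A small sketch to check this on your own installation. It assumes nothing beyond the standard library; the third-party regex module is used only if it happens to be installed, and the helper names are made up for illustration:

```python
import re

# Pattern that relies on a subroutine call to group 1.
SUBROUTINE = r"(\d\w\w\d),(?1)"


def re_supports_subroutines():
    """Try to compile the pattern with the standard library re module."""
    try:
        re.compile(SUBROUTINE)
    except re.error:
        # re rejects (?1) as an unknown extension.
        return False
    return True


try:
    import regex  # third-party package: pip install regex
except ImportError:
    regex = None


def regex_matches(text):
    """Return True/False when regex is installed, None when it is not."""
    if regex is None:
        return None
    return regex.fullmatch(SUBROUTINE, text) is not None


print(re_supports_subroutines())
print(regex_matches("1ab2,3cd4"))
```

With the standard library the first call should report False; the second call only gives a real answer when the regex package is available.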

Note: this works with the PyPI regex module, not with the standard library re module.

You could use the notation (?group-number), in your case:

(\d\w\w\d),(?1)

This matches the same text as writing the sub-pattern out twice:

(\d\w\w\d),(\d\w\w\d)

Be aware that \w also matches digits and the underscore, not just letters. If the two middle characters must be letters, the regex becomes:

(\d[a-zA-Z]{2}\d),(?1)
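To see the difference between the two character classes, here is a short sketch using the standard library re module on a single block (no subroutine call is needed for this; the names LOOSE, STRICT and compare are illustrative):

```python
import re

# \w accepts letters, digits and the underscore.
LOOSE = re.compile(r"\d\w\w\d")
# [a-zA-Z] accepts ASCII letters only.
STRICT = re.compile(r"\d[a-zA-Z]{2}\d")


def compare(sample):
    """Return (loose_matches, strict_matches) for one candidate string."""
    return (
        LOOSE.fullmatch(sample) is not None,
        STRICT.fullmatch(sample) is not None,
    )


for sample in ("1ab2", "1234", "1_a2"):
    print(sample, compare(sample))
```

Only "1ab2" passes both patterns; "1234" and "1_a2" are accepted by the \w version but rejected by the letters-only version.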

Tags: python, regex