Capturing repeating subpatterns in Python regex

Tags:

python

regex

While matching an email address, after I match something like yasar@webmail, I want to capture one or more occurrences of (\.\w+) (what I am doing is a little more complicated; this is just an example). I tried adding (\.\w+)+, but it only captures the last match. For example, yasar@webmail.something.edu.tr matches, but the group only includes .tr after the yasar@webmail part, so I lose the .something and .edu groups. Can I do this with Python regular expressions, or would you suggest matching everything first and splitting the subpatterns later?

992

asked Mar 19 '12 04:03

yasar

1 Answer

The re module doesn't support repeated captures: a repeated group keeps only its last match. The third-party regex module does support them:

>>> import regex
>>> m = regex.match(r'([.\w]+)@((\w+)(\.\w+)+)', 'yasar@webmail.something.edu.tr')
>>> m.groups()
('yasar', 'webmail.something.edu.tr', 'webmail', '.tr')
>>> m.captures(4)
['.something', '.edu', '.tr']

In your case I'd go with splitting the repeated subpatterns later. It leads to simple, readable code; see, for example, the code in @Li-aung Yip's answer.
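
For illustration, here is a minimal sketch of the split-later approach using only the standard re module. It is not the code from the answer referenced above. The helper name split_email and the constant EMAIL_RE are hypothetical, and the pattern is the simplified one from the question, not a real email validator.

```python
import re

# Simplified pattern from the question; NOT a full email validator.
# Group 1: local part, group 2: first domain label,
# group 3: every ".label" part captured together as one string.
EMAIL_RE = re.compile(r'([.\w]+)@(\w+)((?:\.\w+)+)')


def split_email(address):
    """Return (local, first_label, [".sub", ...]), or None if there is no match."""
    m = EMAIL_RE.match(address)
    if m is None:
        return None
    local, first_label, rest = m.groups()
    # Second pass: pull each ".label" out of the span the outer group matched.
    return local, first_label, re.findall(r'\.\w+', rest)


print(split_email('yasar@webmail.something.edu.tr'))
# ('yasar', 'webmail', ['.something', '.edu', '.tr'])
```

The repeated part sits in a non-capturing group wrapped in a single capturing group, so re returns the whole repeated span at once, and re.findall then recovers the individual pieces.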

answered Oct 13 '22 18:10

jfs
