Regular expression result

Tags:

python

regex

I have the code below:

import re

line = "78349999234"

searchObj = re.search(r'9*', line)

if searchObj:
    print("searchObj.group() : ", searchObj.group())
else:
    print("Nothing found!!")

However, the matched text is empty. I thought * means: "Causes the resulting RE to match 0 or more repetitions of the preceding RE, as many repetitions as are possible. ab* will match ‘a’, ‘ab’, or ‘a’ followed by any number of ‘b’s." Why don't I see any result in this case?

284

asked Oct 14 '14 23:10

user3369157

2 Answers

The regular expression engine scans the string from left to right, and re.search returns the first match it finds. Because 9* also accepts zero repetitions, the first match is the empty string before the 7. When the engine does find a 9 at the position it is trying, it matches greedily: it tries to "eat" (that's the correct terminology) as many characters as possible.

If you query for:

>>> print(re.findall(r'9*', line))
['', '', '', '', '9999', '', '', '', '']

It matches the empty string at every position where no 9 starts and, as you can see, 9999 is matched as well.

Returning the leftmost match rather than the longest one is probably a matter of performance: if you search for a pattern in a string of 10M+ characters, you are happy if a match already turns up in the first 10k characters. You don't want to waste effort on finding the "nicest" match...
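To see where each match sits, here is a minimal sketch that prints the span of every match. The variable names (first_span, all_spans) are mine and only serve the illustration:

```python
import re

line = "78349999234"

# re.search returns only the leftmost match; for 9* that is the
# zero-width match at index 0, in front of the 7.
first = re.search(r'9*', line)
first_span = first.span()    # (start, end) of the leftmost match
first_text = first.group()   # the matched text, here an empty string

# re.finditer yields every non-overlapping match, so the run of 9s
# shows up further along, surrounded by zero-width matches.
all_spans = [(m.span(), m.group()) for m in re.finditer(r'9*', line)]

print(first_span, repr(first_text))
for span, text in all_spans:
    print(span, repr(text))
```

A zero-width match has equal start and end indices, which makes it easy to tell apart from the run of 9s in the printed spans.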

EDIT

"0 or more occurrences" means that the group (in this case 9) is repeated zero or more times. In an empty string, the character is repeated exactly 0 times. If you want to match patterns where the character is repeated one or more times, you should use

9+

This results in:

>>> print(re.search(r'9+', line))
<_sre.SRE_Match object; span=(4, 8), match='9999'>

Using re.search with a pattern that accepts the empty string is probably not very helpful: unless the string itself begins with a non-empty match, it will always match the empty string at the very start of the string first.
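The difference between the two quantifiers can be compared side by side. This is only a sketch: first_nines is a hypothetical helper written for this illustration, and the extra input strings are made up:

```python
import re

def first_nines(pattern, text):
    """Return the text of the leftmost match, or None if nothing matches."""
    m = re.search(pattern, text)
    return m.group() if m else None

# 9* succeeds at index 0 no matter what; the match is only non-empty
# when the string happens to start with a 9.
star_middle = first_nines(r'9*', "78349999234")
star_front = first_nines(r'9*', "99123")

# 9+ needs at least one 9, so it skips ahead to the first run of 9s
# and reports no match at all when the string contains none.
plus_middle = first_nines(r'9+', "78349999234")
plus_none = first_nines(r'9+', "12345")

print(repr(star_middle), repr(star_front), repr(plus_middle), repr(plus_none))
```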

138

answered Sep 27 '22 18:09

Willem Van Onsem

The main reason is that the re.search function stops searching once it finds a match. 9* means "match the digit 9 zero or more times". Because an empty string exists before each and every character, re.search stops after finding the first empty string. That's why you got an empty string as output...
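This also explains why the question's if branch ran instead of printing "Nothing found!!": a match object is always truthy, even when the text it matched is empty. A small sketch (the variable names are illustrative only):

```python
import re

line = "78349999234"
m = re.search(r'9*', line)

is_match = m is not None      # a match object was returned
is_truthy = bool(m)           # match objects always evaluate as True
matched_text = m.group()      # the match itself is zero-width
has_text = bool(matched_text) # an empty string evaluates as False

print(is_match, is_truthy, repr(matched_text), has_text)
```

If empty matches should count as "nothing found", testing m.group() rather than m is one way to tell the two cases apart.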

answered Sep 27 '22 17:09

Avinash Raj
