Capture groups with Regular Expression (Python)

Tags: python, regex

Kind of a noob here, apologies if I misstep.

I'm learning regular expressions and am on this lesson: https://regexone.com/lesson/capturing_groups.

In the Python interpreter, I'm trying to use parentheses to capture only what precedes the .pdf part of the search string, but my result still includes .pdf despite the parens. What am I doing wrong?

    import re

    string_one = 'file_record_transcript.pdf'
    string_two = 'file_07241999.pdf'
    string_three = 'testfile_fake.pdf.tmp'

    # Raw string so the backslash in \. is passed to the regex engine as-is
    pattern = r'^(file.+)\.pdf$'
    a = re.search(pattern, string_one)
    b = re.search(pattern, string_two)
    c = re.search(pattern, string_three)

    print(a.group() if a is not None else 'Not found')
    print(b.group() if b is not None else 'Not found')
    print(c.group() if c is not None else 'Not found')

Returns

    file_record_transcript.pdf
    file_07241999.pdf
    Not found

But should return

    file_record_transcript
    file_07241999
    Not found

Thanks!

652

asked Feb 10 '18 10:02

L. Robinson


1 Answer

You need the first captured group:

    a.group(1)
    b.group(1)
    ...

Called without an argument, group() defaults to group(0) and returns the full match, which is what you're getting now. Passing 1 returns only the text matched by the first pair of parentheses.

Here's an example:

    In [8]: string_one = 'file_record_transcript.pdf'

    In [9]: re.search(r'^(file.*)\.pdf$', string_one).group()
    Out[9]: 'file_record_transcript.pdf'

    In [10]: re.search(r'^(file.*)\.pdf$', string_one).group(1)
    Out[10]: 'file_record_transcript'
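Applied to all three strings from the question, the same idea could be sketched roughly like this. The helper name stem_or_not_found is made up for illustration and is not part of the re module; the pattern and the 'Not found' fallback are taken from the question.

```python
import re

# Pattern from the question, written as a raw string and compiled once.
pattern = re.compile(r'^(file.+)\.pdf$')


def stem_or_not_found(name):
    """Hypothetical helper: return capture group 1, or 'Not found' if no match."""
    match = pattern.search(name)
    # group(1) is the text inside the first parentheses; group() would be the full match.
    return match.group(1) if match is not None else 'Not found'


names = [
    'file_record_transcript.pdf',
    'file_07241999.pdf',
    'testfile_fake.pdf.tmp',
]
results = [stem_or_not_found(name) for name in names]

for result in results:
    print(result)
```

This should print the three lines the question expected. The None check still matters, because calling .group(1) on a failed search would raise an AttributeError.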

106

answered Oct 15 '22 08:10

heemayl
