Regular expression in Python won't match end of a string

Tags:

python

regex

I'm just learning Python, and I can't seem to figure out regular expressions.

r1 = re.compile("$.pdf")
if r1.match("spam.pdf"):
    print 'yes'
else:
    print 'no'

I want this code to print 'yes', but it obstinately prints 'no'. I've also tried each of the following:

r1 = re.compile(r"$.pdf")

r1 = re.compile("$ .pdf")

r1 = re.compile('$.pdf')

if re.match("$.pdf", "spam.pdf")

r1 = re.compile(".pdf")

Plus countless other variations. I've been searching for quite a while, but can't find/understand anything that solves my problem. Can someone help out a newbie?

asked Aug 29 '12 23:08

user1634426

2 Answers

You've tried all the variations except the one that works. The $ goes at the end of the pattern. Also, you'll want to escape the period so it actually matches a period (unescaped, it matches any character except a newline).

r1 = re.compile(r"\.pdf$")
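To see why the escape matters, here is a small sketch comparing the unescaped and escaped patterns. The variable names and the sample string "spam_pdf" are made up for illustration, and search() is used because the pattern is anchored at the end of the string rather than the start.

```python
import re

unescaped = re.compile(r".pdf$")   # "." matches any character except a newline
escaped = re.compile(r"\.pdf$")    # "\." matches only a literal period

# "spam_pdf" has no ".pdf" extension, yet the unescaped pattern accepts it
# because "_" satisfies the "." wildcard.
unescaped_hit = unescaped.search("spam_pdf") is not None
escaped_hit = escaped.search("spam_pdf") is not None
real_pdf_hit = escaped.search("spam.pdf") is not None

print(unescaped_hit, escaped_hit, real_pdf_hit)   # True False True
```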

However, an easier and clearer way to do this is using the string's .endswith() method:

if filename.endswith(".pdf"):
    pass  # do something with the PDF file

That way you don't have to decipher the regular expression to understand what's going on.
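As a rough sketch of how the .endswith() approach might be wrapped up, the helpers below also ignore letter case and accept several extensions. The function names and the default suffix tuple are hypothetical, not part of the question; the only library behaviour relied on is that str.endswith() accepts either a string or a tuple of strings.

```python
def has_pdf_extension(filename):
    """Return True if filename ends with ".pdf", ignoring letter case."""
    return filename.lower().endswith(".pdf")


def has_document_extension(filename, suffixes=(".pdf", ".txt")):
    """Return True if filename ends with any of the given suffixes.

    `suffixes` must be a tuple, because str.endswith() does not accept a list.
    """
    return filename.lower().endswith(suffixes)


print(has_pdf_extension("spam.pdf"))        # True
print(has_pdf_extension("SPAM.PDF"))        # True
print(has_pdf_extension("spam.pdf.bak"))    # False
print(has_document_extension("notes.txt"))  # True
```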

answered Sep 30 '22 23:09

kindall

Behaviour of `re.match()` and `re.search()`

There is one significant difference: re.match() only matches at the beginning of the string, whereas re.search() scans the whole string for a match. You are most likely looking for re.search().

The Python documentation compares both methods in the section called "search() vs. match()".
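A minimal sketch of that difference, using the file name from the question (the variable names are only illustrative):

```python
import re

pattern = re.compile(r"\.pdf$")

# match() only tries the pattern at index 0, where "s" is not a period.
match_result = pattern.match("spam.pdf")

# search() tries every starting position and succeeds at the period.
search_result = pattern.search("spam.pdf")

# match() can still be used if the pattern consumes the leading text itself.
prefixed_result = re.match(r".*\.pdf$", "spam.pdf")

print(match_result)                   # None
print(search_result.span())           # (4, 8)
print(prefixed_result is not None)    # True
```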

Special characters in regular expression

Also, the special characters in regular expressions do not mean what your pattern assumes (see Regular Expression Syntax for details):

^ matches the beginning:

(Caret.) Matches the start of the string, and in MULTILINE mode also matches immediately after each newline.

$ matches the end:

Matches the end of the string or just before the newline at the end of the string, and in MULTILINE mode also matches before a newline. foo matches both ‘foo’ and ‘foobar’, while the regular expression foo$ matches only ‘foo’. More interestingly, searching for foo.$ in 'foo1\nfoo2\n' matches ‘foo2’ normally, but ‘foo1’ in MULTILINE mode; searching for a single $ in 'foo\n' will find two (empty) matches: one just before the newline, and one at the end of the string.
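The examples from the quoted documentation can be reproduced with a short sketch like the one below, which also shows why the original "$.pdf" pattern cannot succeed. The variable names are illustrative only.

```python
import re

text = "foo1\nfoo2\n"

# Without MULTILINE, "$" only matches at the very end of the string
# or just before a trailing newline, so the last line is found.
default_hit = re.search(r"foo.$", text).group()

# With MULTILINE, "$" also matches before every newline, so the first line wins.
multiline_hit = re.search(r"foo.$", text, re.MULTILINE).group()

# A bare "$" matches twice in "foo\n": before the newline and at the end.
empty_matches = re.findall(r"$", "foo\n")

# "$.pdf" demands the end of the string followed by four more characters,
# which is impossible here, so neither search() nor match() can succeed.
original_attempt = re.search(r"$.pdf", "spam.pdf")

print(default_hit)        # foo2
print(multiline_hit)      # foo1
print(empty_matches)      # ['', '']
print(original_attempt)   # None
```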

Complete answer

The solution you are looking for may be:

import re

r1 = re.compile(r"\.pdf$")  # "$" moved to the end, period escaped, raw string used
if r1.search("spam.pdf"):   # re.match() replaced with re.search()
    print("yes")
else:
    print("no")

This checks whether the string ends with ".pdf". For ordinary file names it gives the same result as kindall's answer with .endswith(); if kindall's answer works for you, prefer it (it is cleaner, as you may not need regular expressions at all).
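One edge case where the two approaches can differ is a trailing newline, since "$" also matches just before a newline at the end of the string. The sketch below assumes a hypothetical file name that was read from a text file without stripping; \Z is the re module's anchor for the absolute end of the string.

```python
import re

name_with_newline = "spam.pdf\n"   # hypothetical input, e.g. an unstripped line

# "$" tolerates a single trailing newline, so this still matches.
dollar_hit = re.search(r"\.pdf$", name_with_newline) is not None

# "\Z" matches only at the very end of the string, so this does not.
strict_hit = re.search(r"\.pdf\Z", name_with_newline) is not None

# endswith() compares the literal suffix, so the newline makes it fail too.
endswith_hit = name_with_newline.endswith(".pdf")

print(dollar_hit, strict_hit, endswith_hit)   # True False False
```

Stripping the input first (for example with name_with_newline.strip()) makes all three checks agree.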

answered Oct 01 '22 01:10

Tadeck
