
How can I match the start and end in Python's regex?

Tags:

python

regex

People also ask

How do you match a pattern exactly at the beginning in Python?

The re.match() function checks for a match only at the beginning of the string. If the pattern matches there, it returns a match object; otherwise it returns None. To find the first occurrence anywhere in the string, use re.search() instead.

How do I match a pattern in regex?

Most characters, including all letters ( a-z and A-Z ) and digits ( 0-9 ), match themselves. For example, the regex x matches the substring "x" ; z matches "z" ; and 9 matches "9" . Non-alphanumeric characters without special meaning in regex also match themselves. For example, = matches "=" ; @ matches "@" .

What matches the end of the string in Python?

The string method endswith() returns True if the string ends with the specified suffix and False otherwise. Optional start and end indices restrict the comparison to a slice of the string.

How do you match part of a string in Python?

Use the string method startswith() for forward matching, i.e., whether a string starts with the specified string. You can also specify a tuple of strings. True is returned if the string starts with one of the elements of the tuple, and False is returned if the string does not start with any of them.


How about not using a regular expression at all?

if string.startswith("ftp://") and string.endswith(".jpg"):

Don't you think this reads nicer?

You can also support multiple options for start and end:

if (string.startswith(("ftp://", "http://")) and 
    string.endswith((".jpg", ".png"))):
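As a minimal sketch, the tuple form above can be wrapped in a small helper. The function name is_image_url and the sample URLs are hypothetical, chosen only for illustration:

```python
def is_image_url(s):
    """Return True if s starts with an allowed scheme and ends with an allowed extension."""
    # Both methods accept a tuple and succeed if any element matches.
    return s.startswith(("ftp://", "http://")) and s.endswith((".jpg", ".png"))


print(is_image_url("ftp://www.somewhere.com/over/the/rainbow/image.jpg"))  # True
print(is_image_url("https://www.somewhere.com/image.jpg"))                 # False: "https://" is not in the tuple
```

Note that both checks are case-sensitive, so ".JPG" would be rejected unless you lowercase the string first.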

re.match only matches at the beginning of the string, in contrast to re.search, which looks for a match anywhere in it:

re.match(r'(ftp|http)://.*\.(jpg|png)$', s)

Three things to note here:

  • r'' is used for the string literal to make it trivial to have backslashes inside the regex
  • string is the name of a standard library module, so I chose s as the variable name
  • If you use a regex more than once, you can use r = re.compile(...) to build the state machine once and then use r.match(s) afterwards to match the strings
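The re.compile approach from the last point might look like this. It is only a sketch; the name URL_RE and the candidate strings are made up for the example:

```python
import re

# Compile once, then reuse the compiled pattern for every string.
URL_RE = re.compile(r'(ftp|http)://.*\.(jpg|png)$')

candidates = [
    "ftp://www.somewhere.com/over/the/rainbow/image.jpg",
    "http://example.com/logo.png",
    "see ftp://example.com/a.jpg",   # does not start with the scheme
    "ftp://example.com/readme.txt",  # wrong extension
]

# .match() anchors at the start of the string; the $ in the pattern anchors the end.
matches = [s for s in candidates if URL_RE.match(s)]
print(matches)
```

Only the first two candidates end up in matches, because match() refuses to skip the leading "see " and the $ rejects the .txt file.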

If you want, you can also use urlparse to parse the URL for you (though you still need to extract the extension). On Python 3 it lives in urllib.parse; on Python 2 it was the urlparse module:

>>> allowed_schemes = ('http', 'ftp')
>>> allowed_exts = ('png', 'jpg')
>>> from urllib.parse import urlparse  # Python 2: from urlparse import urlparse
>>> url = urlparse("ftp://www.somewhere.com/over/the/rainbow/image.jpg")
>>> url.scheme in allowed_schemes
True
>>> url.path.rsplit('.', 1)[1] in allowed_exts
True
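One caveat with the session above: rsplit('.', 1)[1] raises an IndexError when the path contains no dot. A possible variant, assuming Python 3, uses posixpath.splitext instead, which returns an empty extension in that case. The function name is_allowed and the constant names are hypothetical:

```python
from posixpath import splitext
from urllib.parse import urlparse

ALLOWED_SCHEMES = ("http", "ftp")
ALLOWED_EXTS = (".png", ".jpg")  # splitext keeps the leading dot


def is_allowed(url):
    """Check the scheme and the file extension of the URL's path component."""
    parts = urlparse(url)
    ext = splitext(parts.path)[1]  # "" if the path has no extension
    return parts.scheme in ALLOWED_SCHEMES and ext in ALLOWED_EXTS


print(is_allowed("ftp://www.somewhere.com/over/the/rainbow/image.jpg"))  # True
print(is_allowed("http://example.com/download"))                         # False
```

Because the check looks only at the path component, a query string such as ?size=2 after the file name does not interfere with the extension test.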

Don't be greedy; use a non-greedy quantifier: ^ftp://(.*?)\.jpg$
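To see what the non-greedy quantifier changes, here is a small comparison on a made-up string that contains ".jpg" twice. With the $ anchor both forms have to reach the end of the string, so they capture the same text; the difference only shows once the anchor is dropped:

```python
import re

s = "ftp://a.jpg/b.jpg"  # hypothetical input with two ".jpg" occurrences

# Anchored at the end: greedy and non-greedy capture the same group.
greedy_anchored = re.search(r'^ftp://(.*)\.jpg$', s).group(1)
lazy_anchored = re.search(r'^ftp://(.*?)\.jpg$', s).group(1)

# Without $: the non-greedy form stops at the first ".jpg".
greedy_open = re.search(r'^ftp://(.*)\.jpg', s).group(1)
lazy_open = re.search(r'^ftp://(.*?)\.jpg', s).group(1)

print(greedy_anchored, lazy_anchored)  # a.jpg/b a.jpg/b
print(greedy_open, lazy_open)          # a.jpg/b a
```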


Try

re.search(r'^ftp://.*\.jpg$', string)

if you want a regular expression search. Note that you have to escape the period because it has a special meaning in regular expressions.
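A quick illustration of why the escape matters, using a made-up string that has no dot before "jpg". An unescaped period matches any character, so the first pattern accepts it by mistake:

```python
import re

s = "ftp://www.somewhere.com/over/the/rainbow/imagexjpg"  # note: no "." before "jpg"

unescaped = re.search(r'^ftp://.*.jpg$', s)   # "." matches the "x"
escaped = re.search(r'^ftp://.*\.jpg$', s)    # "\." requires a literal period

print(unescaped is not None)  # True
print(escaped is not None)    # False
```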


import re

s = "ftp://www.somewhere.com/over/the/rainbow/image.jpg"
print(re.search(r"^ftp://.*\.jpg$", s).group(0))