The match() function in Python's re module checks for a match only at the beginning of the string; it does not scan the string for the first occurrence the way re.search() does. If the pattern matches at the start of the string, match() returns a match object; otherwise it returns None.
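A minimal sketch of the difference between re.match and re.search; the sample string and variable names are made up for illustration:

```python
import re

text = "see ftp://example.com/image.jpg"

# re.match only tries the pattern at position 0, so it finds nothing here.
at_start = re.match(r"ftp://", text)

# re.search scans forward and returns the first occurrence.
anywhere = re.search(r"ftp://", text)

print(at_start)          # None
print(anywhere.start())  # 4
```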
Most characters, including all letters (a-z and A-Z) and digits (0-9), match themselves. For example, the regex x matches the substring "x"; z matches "z"; and 9 matches "9". Non-alphanumeric characters without special meaning in regex also match themselves. For example, = matches "="; @ matches "@".
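As a small illustration (the sample strings are invented), the following contrasts a character that matches itself with the period, which does have a special meaning and must be escaped to be taken literally:

```python
import re

# "=" has no special meaning, so it matches itself.
literal_eq = re.search(r"=", "a=b").group(0)

# An unescaped "." matches any character (except a newline).
dot_any = re.findall(r".", "a.b")

# An escaped "." matches only a literal period.
dot_literal = re.findall(r"\.", "a.b")

# re.escape adds the backslashes for you.
escaped = re.escape(".jpg")

print(literal_eq)   # =
print(dot_any)      # ['a', '.', 'b']
print(dot_literal)  # ['.']
print(escaped)      # \.jpg
```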
The string method endswith() returns True if the string ends with the specified suffix and False otherwise. The optional start and end indices restrict the comparison to that slice of the string.
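A short sketch of how the optional start and end indices affect endswith(); the file name is an arbitrary example:

```python
name = "image.jpg.bak"

ends_with_bak = name.endswith(".bak")
ends_with_jpg = name.endswith(".jpg")

# start=0, end=9 restricts the check to name[0:9], i.e. "image.jpg".
ends_with_jpg_in_slice = name.endswith(".jpg", 0, 9)

print(ends_with_bak)           # True
print(ends_with_jpg)           # False
print(ends_with_jpg_in_slice)  # True
```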
Use the string method startswith() for forward matching, i.e., whether a string starts with the specified string. You can also specify a tuple of strings. True is returned if the string starts with one of the elements of the tuple, and False is returned if the string does not start with any of them.
How about not using a regular expression at all?
if string.startswith("ftp://") and string.endswith(".jpg"):
Don't you think this reads nicer?
You can also support multiple options for start and end:
if (string.startswith(("ftp://", "http://")) and
        string.endswith((".jpg", ".png"))):
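The check above can be wrapped in a small function so it is easy to reuse. This is only a sketch; the function name is_allowed_image_url is a hypothetical name chosen for illustration:

```python
def is_allowed_image_url(url):
    """Return True if url starts with an allowed prefix and ends with an allowed suffix."""
    # Both methods accept a tuple and return True if any element matches.
    return (url.startswith(("ftp://", "http://")) and
            url.endswith((".jpg", ".png")))


print(is_allowed_image_url("ftp://www.somewhere.com/over/the/rainbow/image.jpg"))  # True
print(is_allowed_image_url("ftp://www.somewhere.com/notes.txt"))                   # False
```

Note that "https://" URLs are rejected by this sketch, because the string "https://..." does not start with "http://".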
re.match will match the string at the beginning, in contrast to re.search:
re.match(r'(ftp|http)://.*\.(jpg|png)$', s)
A few things to note here:
- r'' is used for the string literal to make it trivial to have backslashes inside the regex
- string is a standard module, so I chose s as a variable
- If you use the regex more than once, you can use r = re.compile(...) to build the state machine once and then use r.match(s) afterwards to match the strings
If you want, you can also use urlparse from the urllib.parse module (the urlparse module in Python 2) to parse the URL for you (though you still need to extract the extension):
>>> allowed_schemes = ('http', 'ftp')
>>> allowed_exts = ('png', 'jpg')
>>> from urllib.parse import urlparse
>>> url = urlparse("ftp://www.somewhere.com/over/the/rainbow/image.jpg")
>>> url.scheme in allowed_schemes
True
>>> url.path.rsplit('.', 1)[1] in allowed_exts
True
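One caveat with the rsplit approach is that indexing [1] raises IndexError when the path contains no period. A possible variation, sketched here under the assumption that Python 3 is in use, lets posixpath.splitext extract the extension instead; the names is_allowed, ALLOWED_SCHEMES and ALLOWED_EXTS are hypothetical:

```python
import posixpath
from urllib.parse import urlparse

ALLOWED_SCHEMES = ("http", "ftp")
ALLOWED_EXTS = (".png", ".jpg")


def is_allowed(url):
    """Check the scheme and the file extension of the URL path."""
    parts = urlparse(url)
    # splitext returns ("", "") style pairs safely; ext is "" when there is no period.
    ext = posixpath.splitext(parts.path)[1].lower()
    return parts.scheme in ALLOWED_SCHEMES and ext in ALLOWED_EXTS


print(is_allowed("ftp://www.somewhere.com/over/the/rainbow/image.jpg"))  # True
print(is_allowed("http://www.somewhere.com/noextension"))                # False
```

Because only the parsed path is inspected, a query string such as ?size=2 after the file name does not interfere with the extension check.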
Don't be greedy, use ^ftp://(.*?)\.jpg$
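To see what greediness changes, here is a small sketch on an invented string that contains two URLs. With the ^ and $ anchors and a single URL, both forms match the same text; the difference shows up once the pattern is unanchored:

```python
import re

s = "ftp://host/a.jpg and ftp://host/b.jpg"

# Greedy: .* grabs as much as possible, so the match runs to the last ".jpg".
greedy = re.search(r"ftp://(.*)\.jpg", s).group(0)

# Non-greedy: .*? stops at the first ".jpg" that lets the pattern succeed.
lazy = re.search(r"ftp://(.*?)\.jpg", s).group(0)

print(greedy)  # ftp://host/a.jpg and ftp://host/b.jpg
print(lazy)    # ftp://host/a.jpg
```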
Try
re.search(r'^ftp://.*\.jpg$', string)
if you want a regular expression search. Note that you have to escape the period because it has a special meaning in regular expressions.
import re
s = "ftp://www.somewhere.com/over/the/rainbow/image.jpg"
print(re.search(r"^ftp://.*\.jpg$", s).group(0))
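When the same pattern is applied to many strings, it can be compiled once with re.compile and reused. A minimal sketch, with a made-up list of URLs and the hypothetical name IMAGE_URL for the compiled pattern:

```python
import re

# Compile once, then reuse the pattern object for every URL.
IMAGE_URL = re.compile(r"(ftp|http)://.*\.(jpg|png)$")

urls = [
    "ftp://www.somewhere.com/over/the/rainbow/image.jpg",
    "http://example.com/logo.png",
    "ftp://example.com/notes.txt",
]

# Pattern.match anchors at the start of the string, like re.match.
matching = [u for u in urls if IMAGE_URL.match(u)]
print(matching)
```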