I am having trouble matching and replacing certain words only when they are not part of an http:// URL.
Current regex:
http://.*?\s+
This matches the URLs themselves, e.g. http://www.egg1.com and http://www.egg2.com.
I need a regex that matches certain words only when they appear outside an http:// URL.
Example:
"This is a sample. http://www.egg1.com and http://egg2.com. This regex will only match
this egg1 and egg2 and not the others contained inside http:// "
Match: egg1 egg2
Replaced: replaced1 replaced2
Final Output :
"This is a sample. http://www.egg1.com and http://egg2.com. This regex will only
match this replaced1 and replaced2 and not the others contained inside http:// "
QUESTION: I need to match certain patterns (egg1 and egg2 in the example) unless they are part of an http:// URL. egg1 and egg2 must not be matched when they occur inside http://.
In this article, you will learn how to match a regex pattern inside a target string using the match(), search(), and findall() functions of the re module. re.match() tries to match the pattern starting at the very first character of the text, and if a match is found, it returns a re.Match object.
The following regex is used in Python to match a string of three digits, a hyphen, three more digits, another hyphen, and four digits. Regular expressions can be much more sophisticated. For example, adding a 3 in curly brackets ({3}) after a pattern is like saying, "Match this pattern three times." So the slightly shorter regex
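As an illustration of the {3} quantifier described above, here is a minimal sketch. The two patterns and the sample text are assumptions chosen to fit the description (three digits, a hyphen, three digits, a hyphen, four digits); they are not quoted from the original article.

```python
import re

# Hypothetical sample text; the number is made up for illustration.
text = "Call 415-555-4242 tomorrow."

# Spelled-out form: one \d per digit.
long_form = re.compile(r"\d\d\d-\d\d\d-\d\d\d\d")
# Shorter form: {3} and {4} repeat the preceding \d.
short_form = re.compile(r"\d{3}-\d{3}-\d{4}")

long_match = long_form.search(text)
short_match = short_form.search(text)

print(long_match.group())
print(short_match.group())
```

Both patterns should find the same substring; the quantifier only shortens the notation.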
If zero or more characters at the beginning of the string match the regular expression pattern, re.match() returns a corresponding match object, i.e. a re.Match instance; otherwise it returns None. The match object contains the positions at which the match starts and ends, as well as the matched text.
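A small sketch of what the match object exposes, and of how re.match() differs from re.search(). The sample string is a hypothetical one chosen to echo the question.

```python
import re

text = "egg1 and egg2"  # hypothetical sample string

# re.match() only succeeds if the pattern matches at position 0.
m = re.match(r"egg\d", text)
if m is not None:
    # group() is the matched text; start()/end() are its boundaries.
    print(m.group(), m.start(), m.end())

# "egg2" is not at the beginning, so re.match() returns None here...
no_match = re.match(r"egg2", text)
# ...while re.search() scans the whole string and finds it.
found = re.search(r"egg2", text)
print(no_match, found.start(), found.end())
```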
One solution I can think of is to form a combined pattern for HTTP-URLs and your pattern, then filter the matches accordingly:
    import re

    t = "http://www.egg1.com http://egg2.com egg3 egg4"
    # Group 1 consumes whole URLs; group 2 captures eggs outside them.
    p = re.compile(r'(http://\S+)|(egg\d)')
    for url, egg in p.findall(t):
        if egg:
            print(egg)
prints:
    egg3
    egg4
UPDATE: To use this idiom with re.sub(), just supply a filter function:
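    p = re.compile(r'(http://\S+)|(egg(\d+))')

    def repl(match):
        # Group 2 is set only when an egg matched outside a URL.
        if match.group(2):
            return 'spam{0}'.format(match.group(3))
        # Otherwise the match is a URL; return it unchanged.
        return match.group(0)

    print(p.sub(repl, t))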
prints:
http://www.egg1.com http://egg2.com spam3 spam4
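Applied to the sample text from the question, the same idiom could look like the sketch below. The `replacements` dictionary and the `\b` word boundaries are assumptions added for illustration; the question only states that egg1 and egg2 should become replaced1 and replaced2.

```python
import re

text = ("This is a sample. http://www.egg1.com and http://egg2.com. "
        "This regex will only match this egg1 and egg2 and not the "
        "others contained inside http:// ")

# Hypothetical mapping from matched word to its replacement.
replacements = {"egg1": "replaced1", "egg2": "replaced2"}

# Group 1 swallows whole URLs; group 2 captures the target words.
pattern = re.compile(r"(http://\S+)|\b(egg1|egg2)\b")

def repl(match):
    word = match.group(2)
    if word:
        # A target word outside any URL: substitute it.
        return replacements[word]
    # A URL: return it unchanged.
    return match.group(0)

result = pattern.sub(repl, text)
print(result)
```

Because the URL alternative comes first and consumes the whole URL, the scanner never gets a chance to try the word alternative inside it.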
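This pattern matches http:// URLs without capturing them, so group 1 is only set for an egg1 that appears outside a URL:
(?:http://.*?\s+)|(egg1)
Note that .*?\s+ requires whitespace after the URL, so a URL at the very end of the string is not skipped.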
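A minimal sketch of using the non-capturing pattern with re.sub(). The sample strings, the "replaced1" replacement, and the `\S+` variant are assumptions for illustration. The second string shows a limitation: a URL at the end of the string has no trailing whitespace, so the URL branch fails and the egg1 inside it gets replaced.

```python
import re

pattern = re.compile(r"(?:http://.*?\s+)|(egg1)")
# Assumed variant: \S+ does not need whitespace after the URL.
variant = re.compile(r"(?:http://\S+)|(egg1)")

def repl(match):
    # Group 1 is None when the URL branch matched.
    if match.group(1):
        return "replaced1"
    return match.group(0)

url_first = "http://www.egg1.com egg1"
url_last = "egg1 http://www.egg1.com"

out_first = pattern.sub(repl, url_first)
out_last = pattern.sub(repl, url_last)
out_variant = variant.sub(repl, url_last)

print(out_first)
print(out_last)
print(out_variant)
```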