Python regex not to match http://

I am having trouble matching and replacing certain words only when they are not part of an http:// URL.

Current regex:

 http://.*?\s+

This matches the URLs themselves, such as http://www.egg1.com and http://www.egg2.com.
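
For illustration, a quick check of what the current regex picks up on the sample text could look like this. The names sample and matches are only placeholders for this sketch.

```python
import re

# Shortened version of the sample text from the example.
sample = ("This is a sample. http://www.egg1.com and http://egg2.com. "
          "This regex will only match this egg1 and egg2")

# The current pattern: a URL prefix, a lazy run of characters, then whitespace.
matches = re.findall(r"http://.*?\s+", sample)
print(matches)
```

It returns only the URLs (each with its trailing whitespace), never the standalone words, which is the opposite of what is needed.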

I need a regex that matches certain words only where they appear outside an http:// URL.

Example:

"This is a sample. http://www.egg1.com and http://egg2.com. This regex will only match 
 this egg1 and egg2 and not the others contained inside http:// "

 Match: egg1 egg2

 Replaced: replaced1 replaced2

Final output:

 "This is a sample. http://www.egg1.com and http://egg2.com. This regex will only 
  match this replaced1 and replaced2 and not the others contained inside http:// "

QUESTION: I need to match certain patterns (egg1 and egg2 in the example) unless they are part of an http:// URL. egg1 and egg2 must not be matched when they appear inside an http:// URL.

asked Jul 28 '11 13:07

c_prog_90

2 Answers

One solution I can think of is to form a combined pattern for HTTP-URLs and your pattern, then filter the matches accordingly:

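import re

t = "http://www.egg1.com http://egg2.com egg3 egg4"

# Group 1 captures URLs, group 2 captures the words of interest.
p = re.compile(r'(http://\S+)|(egg\d)')
for url, egg in p.findall(t):
    # egg is an empty string whenever the URL alternative matched.
    if egg:
        print(egg)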

prints:

egg3
egg4

UPDATE: To use this idiom with re.sub(), pass a replacement function that rewrites the word matches and returns URL matches unchanged:

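p = re.compile(r'(http://\S+)|(egg(\d+))')

def repl(match):
    # Replace only when the word alternative (group 2) matched.
    if match.group(2):
        return 'spam{0}'.format(match.group(3))
    # Otherwise a URL matched: return it unchanged.
    return match.group(0)

print(p.sub(repl, t))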

prints:

http://www.egg1.com http://egg2.com spam3 spam4
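
Applied to the exact example from the question, the same idea might look like the sketch below. The replacements mapping and the replace_outside_urls helper are names invented for this illustration, and the \b word boundaries are an added assumption so that a longer word such as egg10 is left alone.

```python
import re

# Hypothetical mapping of words to replacements, taken from the question's example.
replacements = {"egg1": "replaced1", "egg2": "replaced2"}

# The URL alternative comes first so URLs are consumed before the words are tried.
words = "|".join(re.escape(w) for w in replacements)
pattern = re.compile(r"(http://\S+)|\b(" + words + r")\b")


def replace_outside_urls(text):
    def repl(match):
        # Group 2 is set only when a word matched outside a URL.
        if match.group(2):
            return replacements[match.group(2)]
        # A URL matched: put it back unchanged.
        return match.group(0)
    return pattern.sub(repl, text)


sample = ("This is a sample. http://www.egg1.com and http://egg2.com. "
          "This regex will only match this egg1 and egg2 and not the others "
          "contained inside http:// ")
print(replace_outside_urls(sample))
```

Because the URL is always tried first at each position, anything following http:// up to the next whitespace is skipped as a whole, and only the standalone words reach the mapping.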

answered Oct 14 '22 23:10

Ferdinand Beyer

This pattern matches http:// URLs without capturing them, so group 1 is set only for an egg1 that appears outside a URL:

(?:http://.*?\s+)|(egg1)
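
A minimal sketch of how this pattern could be used with re.sub(). The repl function and the sample string are assumptions made for illustration. One caveat worth noting: http://.*?\s+ requires whitespace after the URL, so a URL at the very end of the string is not protected by it.

```python
import re

pattern = re.compile(r"(?:http://.*?\s+)|(egg1)")


def repl(match):
    # Group 1 is None when the non-capturing URL alternative matched.
    if match.group(1):
        return "replaced1"
    return match.group(0)


# The last URL has no trailing whitespace, which exposes the caveat.
text = "http://www.egg1.com egg1 http://www.egg1.com"
result = pattern.sub(repl, text)
print(result)
```

The first URL is left intact and the standalone egg1 is replaced, but the egg1 inside the final URL is replaced as well. Using http://\S+ for the URL part, as in the other answer, avoids depending on trailing whitespace.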

answered Oct 15 '22 00:10

Karolis

Tags: python, regex, regex-negation
