Regex matching between two strings?

I can't seem to find a way to extract all the comments, as in the following example.

>>> import re
>>> string = '''
... <!-- one 
... -->
... <!-- two -- -- -->
... <!-- three -->
... '''
>>> m = re.findall ( '<!--([^\(-->)]+)-->', string, re.MULTILINE)
>>> m
[' one \n', ' three ']

The block with two -- -- is not matched, most likely because of a bad regex. Can someone please point me in the right direction on how to extract matches between two strings?

Hi, I've tested what you suggested in the comments. Here is a working solution with a small upgrade.

>>> m = re.findall ( '<!--(.*?)-->', string, re.MULTILINE)
>>> m
[' two -- -- ', ' three ']
>>> m = re.findall ( '<!--(.*\n?)-->', string, re.MULTILINE)
>>> m
[' one \n', ' two -- -- ', ' three ']

thanks!

318

asked Oct 04 '12 21:10

2 Answers

This should do the trick:

 m = re.findall ( '<!--(.*?)-->', string, re.DOTALL)
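To illustrate why the flag matters, here is a minimal sketch using the sample text from the question (the variable names are mine). re.MULTILINE only changes where ^ and $ match, while re.DOTALL lets . match a newline, which is what allows the non-greedy .*? to span lines.

```python
import re

# Same sample as in the question, written with explicit escapes so the
# trailing space after "one" is not lost.
text = "\n<!-- one \n-->\n<!-- two -- -- -->\n<!-- three -->\n"

# re.MULTILINE only affects ^ and $; "." still stops at a newline,
# so the comment that spans two lines is missed.
multiline_matches = re.findall(r'<!--(.*?)-->', text, re.MULTILINE)

# re.DOTALL lets "." match newlines too, so every comment is captured.
dotall_matches = re.findall(r'<!--(.*?)-->', text, re.DOTALL)

print(multiline_matches)  # [' two -- -- ', ' three ']
print(dotall_matches)     # [' one \n', ' two -- -- ', ' three ']
```

The non-greedy .*? is also what keeps each match from running on to the last --> in the text.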

129

answered Oct 18 '22 02:10

iruvar

In general, it is impossible to match arbitrary content between two delimiters with a regular grammar.

Specifically, if you allow nesting, as in

<!-- how do you deal <!-- with nested --> comments? -->

you'll run into issues. So, while you may be able to solve this specific problem with a regular expression, any regular expression you write can be broken by some other, stranger nesting of comments.

To parse arbitrarily nested comments, you'll need a method for parsing context-free grammars. A simple way to do that is to use a pushdown automaton.
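As a rough sketch of the pushdown idea, assuming a hypothetical comment syntax in which <!-- and --> are allowed to nest (real HTML comments do not nest), a scanner can push on each opener and pop on each closer. The function name and its parameters are mine, not from any library.

```python
def extract_nested_comments(text, opener="<!--", closer="-->"):
    """Return the bodies of top-level comments; nested comments stay inside them."""
    stack = []   # body start offsets of the comments that are currently open
    bodies = []
    i = 0
    while i < len(text):
        if text.startswith(opener, i):
            # Push: remember where this comment's body begins.
            stack.append(i + len(opener))
            i += len(opener)
        elif stack and text.startswith(closer, i):
            # Pop: this closer matches the most recent opener.
            start = stack.pop()
            if not stack:
                # Back at depth 0, so a top-level comment just ended.
                bodies.append(text[start:i])
            i += len(closer)
        else:
            # Ordinary character (or a stray closer outside any comment).
            i += 1
    # Comments that are never closed are silently dropped in this sketch.
    return bodies


sample = "<!-- how do you deal <!-- with nested --> comments? --> <!-- flat -->"
print(extract_nested_comments(sample))
```

On the flat sample from the question this returns the same three bodies as the re.DOTALL pattern; the difference only shows up once comments nest.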

answered Oct 18 '22 01:10

Wilduck

Tags:

python

regex

regex-negation

python-3.x

Hrvoje Špoljar
