Regex matching an open and close tag and a certain text pattern inside that tag [duplicate]

Q: What is regex pattern?

A regex pattern is matched against a target string. The pattern is composed of a sequence of atoms, where an atom is a single unit of the pattern that is matched against the target string. The simplest atom is a literal character; to make several parts of a pattern behave as one atom, group them with the metacharacters ( ).

Q: What are word characters in regex?

A word character is any of a-z, A-Z, 0-9, or _ (underscore). This is the class written as \w.
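As a small illustration of the word-character class, here is a minimal Python sketch. The sample string is made up for the example. Note that in Python 3 `\w` also matches non-ASCII letters and digits unless `re.ASCII` is passed, so the flag is used here to get exactly the set listed above.

```python
import re

# Hypothetical sample text, chosen only to show where \w stops matching.
text = "user_name-42 ok!"

# re.ASCII restricts \w to a-z, A-Z, 0-9 and underscore.
words = re.findall(r"\w+", text, flags=re.ASCII)
print(words)
```

The hyphen, the space, and the exclamation mark are not word characters, so each of them ends a run.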

Tags:

regex

xml

Here is a sample entry from my sitemap.xml:

<url>
  <loc>http://sitename.com/programming/php/?C=D;O=A</loc>
  <changefreq>weekly</changefreq>
  <priority>0.64</priority>
</url>

There are many entries like this. Note that the URL in the loc tag ends with C=D;O=A. I want to remove every entry that starts with <url>, ends with </url>, and contains C=D;O=A or a similar pattern.

The following expression matches the whole of the entry shown above:

<url>(.|\r\n)*?<\/url>

but I want it to match only entries that meet the condition described above.

How do I write a regex that matches on a condition like this?

495

asked Jun 16 '11 08:06

Jayapal Chandran

1 Answer

Try this:

/<url>(?:(?!<\/url>).)*C=D;O=A.*?<\/url>/m

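The negative lookahead guarantees that the match never spans more than one <url> node. The /m flag is needed so that . also matches newlines (this is how Ruby, which Rubular uses, treats /m; in PCRE or Python the equivalent is the s or DOTALL flag).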
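To see what the lookahead prevents, here is a minimal Python sketch that compares the pattern with a naive variant that uses a plain lazy dot. It assumes `re.DOTALL` as the Python counterpart of the flag in the pattern; the two-entry input and the variable names are made up for the comparison.

```python
import re

# Two entries on one line; only the second contains the sort query.
xml = (
    "<url><loc>http://sitename.com/a/</loc></url>"
    "<url><loc>http://sitename.com/b/?C=D;O=A</loc></url>"
)

# Naive variant: the lazy dot is free to run across </url> into the next node.
naive = re.compile(r"<url>.*?C=D;O=A.*?</url>", re.DOTALL)

# Pattern from the answer: each character before C=D;O=A is consumed only
# if "</url>" does not start at that position.
tempered = re.compile(r"<url>(?:(?!</url>).)*C=D;O=A.*?</url>", re.DOTALL)

naive_match = naive.search(xml).group(0)
tempered_match = tempered.search(xml).group(0)

print(naive_match.count("<url>"))     # 2: the match swallowed the first entry too
print(tempered_match.count("<url>"))  # 1: only the entry containing C=D;O=A
```

The naive pattern starts at the first `<url>` and keeps expanding until it finds `C=D;O=A`, so it would delete an unrelated entry along with the intended one.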

See here: rubular
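For the removal itself, a rough Python sketch might look like the following. Several things here are assumptions rather than part of the question or answer: the function name `strip_sort_entries`, the reading of "similar patterns" as any `C=<letter>;O=<letter>` query (for example `C=N;O=D`), the trailing `\s*` that also removes the whitespace left behind by a deleted entry, and the sample sitemap text.

```python
import re

# Assumption: "similar patterns" means any query of the form C=<letter>;O=<letter>.
SORT_QUERY = r"C=[A-Z];O=[A-Z]"

# Same structure as the answer's pattern, with the literal C=D;O=A generalized.
# re.DOTALL lets "." match newlines, since entries span several lines.
ENTRY = re.compile(
    r"<url>(?:(?!</url>).)*" + SORT_QUERY + r".*?</url>\s*",
    re.DOTALL,
)

def strip_sort_entries(sitemap_text):
    """Return the sitemap text without <url> entries that contain a sort query."""
    return ENTRY.sub("", sitemap_text)

# Hypothetical input: one entry to keep, two to remove.
sample = """<urlset>
<url>
  <loc>http://sitename.com/programming/php/</loc>
  <priority>0.80</priority>
</url>
<url>
  <loc>http://sitename.com/programming/php/?C=D;O=A</loc>
  <changefreq>weekly</changefreq>
  <priority>0.64</priority>
</url>
<url>
  <loc>http://sitename.com/programming/php/?C=N;O=D</loc>
  <priority>0.64</priority>
</url>
</urlset>"""

cleaned = strip_sort_entries(sample)
print(cleaned)
```

For a one-off cleanup of a well-formed file this is usually enough; if the sitemap contains comments or CDATA sections that might include the text `</url>`, an XML parser would be the safer tool.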

196

answered Sep 23 '22 09:09

morja
