Regular expression for getting text between XML elements

Question

I am looking at this regular expression:

<(\w*)>\.*</(\w*)>

Having gone through tutorials, I read it as: match anything of the form

<tag1>blah</tag1>

i.e. an opening XML tag, some text, and a closing XML tag. However, when I run it in regular expression checkers such as Expresso, it does not match what I think it should.

Note: to complicate matters further, this regular expression is used in Java, which as I understand it has some subtle differences.

What am I missing?

Anything appreciated...

Thanks

Kirill Polishchuk · Accepted Answer

Use:

<(\w*)>.*</(\w*)>

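In a regex tester, which takes the pattern as typed rather than as a Java string literal:

\\w – literal \, then w
\\ – literal \
\.* – zero or more literal dots, so it cannot match text such as blah; an unescaped .* matches any characters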
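As a minimal sketch of the difference in Java, the following compares the escaped-dot pattern with the corrected one. The class name TagTextDemo and the helper matches are hypothetical names chosen for illustration; only java.util.regex from the standard library is used.

```java
import java.util.regex.Pattern;

class TagTextDemo {
    // As the regex engine sees it: <(\w*)>\.*</(\w*)>  (\.* matches only literal dots)
    static final Pattern BROKEN = Pattern.compile("<(\\w*)>\\.*</(\\w*)>");
    // Corrected: <(\w*)>.*</(\w*)>  (an unescaped . matches any character except a line terminator)
    static final Pattern FIXED = Pattern.compile("<(\\w*)>.*</(\\w*)>");

    // True when the whole input matches the pattern.
    static boolean matches(Pattern p, String input) {
        return p.matcher(input).matches();
    }

    public static void main(String[] args) {
        String xml = "<tag1>blah</tag1>";
        System.out.println(matches(BROKEN, xml));                 // false
        System.out.println(matches(FIXED, xml));                  // true
        System.out.println(matches(BROKEN, "<tag1>...</tag1>"));  // true: the text is only dots
    }
}
```

Note that each backslash is doubled in the Java source because the string literal consumes one level of escaping before the regex engine sees the pattern.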

Aram Kocharyan · Answer

Escaping is only needed to match a metacharacter literally, but some languages use \ to escape characters in strings themselves, forcing you to write \\ in the string to mean \ in regex land. And matching a literal \ (written \\ in regex) can take \\\\ in such languages. I think this can be the cause of the confusion when seeing \\ in example code.
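The two levels of escaping can be checked directly in Java. This is only an illustrative sketch; the class and constant names are made up for the example, while String.length, Pattern.matches and Pattern.quote are standard library calls.

```java
import java.util.regex.Pattern;

class EscapeLevelsDemo {
    // Java source "\\w" is the two-character string \w, which the regex engine reads as "word character".
    static final String WORD_CHAR = "\\w";
    // Java source "\\\\" is the two-character string \\, which the regex engine reads as one literal backslash.
    static final String LITERAL_BACKSLASH = "\\\\";

    public static void main(String[] args) {
        System.out.println(WORD_CHAR.length());                        // 2
        System.out.println(LITERAL_BACKSLASH.length());                // 2
        System.out.println(Pattern.matches(WORD_CHAR, "a"));           // true
        System.out.println(Pattern.matches(LITERAL_BACKSLASH, "\\"));  // true: input is a single backslash
        // Pattern.quote builds a pattern that matches its argument literally,
        // which avoids hand-escaping metacharacters such as the dot.
        System.out.println(Pattern.matches(Pattern.quote("a.b"), "a.b"));  // true
        System.out.println(Pattern.matches(Pattern.quote("a.b"), "axb"));  // false
    }
}
```

When pasting a pattern from Java source into a regex tester, halve the backslashes; when going the other way, double them.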

Improving the regex:

If someone wanted to be difficult and construct an irregular expression like:

< _some_tag some="stuff" >
    some <strong>content</strong>
< / _some_tag >

You can use this more generic regex, which captures the tag name (group 1), the attributes (group 2) and the content (group 3):

<\s*([A-Za-z_]\w*)\s*([^>]*)>(.*?)<\s*/\s*\1\s*>

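Note that the lazy .*? is required in case the same tag occurs again further down the page; a greedy .* would capture everything up to the last closing tag with that name. Because the content in the example spans several lines, . must also be allowed to match line breaks (Pattern.DOTALL in Java). Also, <tag1>blah</tag2> is obviously bogus, but if you wanted to allow it you could replace the backreference \1 in the last part of the regex with another tag-name pattern.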
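A short Java sketch of the lazy versus greedy behaviour follows. The pattern in the sketch allows an empty attribute part ([^>]*) and optional whitespace before the backreference (\s*\1), and it is compiled with Pattern.DOTALL so that the content may span lines. Those choices, and the names LazyTagDemo and firstContent, are assumptions made for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyTagDemo {
    // Group 1: tag name, group 2: attributes, group 3: content (lazy).
    static final Pattern LAZY = Pattern.compile(
        "<\\s*([A-Za-z_]\\w*)\\s*([^>]*)>(.*?)<\\s*/\\s*\\1\\s*>", Pattern.DOTALL);
    // Same pattern with a greedy content group, for comparison.
    static final Pattern GREEDY = Pattern.compile(
        "<\\s*([A-Za-z_]\\w*)\\s*([^>]*)>(.*)<\\s*/\\s*\\1\\s*>", Pattern.DOTALL);

    // Content of the first matching element, or null when nothing matches.
    static String firstContent(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group(3) : null;
    }

    public static void main(String[] args) {
        String twoTags = "<p>one</p><p>two</p>";
        System.out.println(firstContent(LAZY, twoTags));    // one
        System.out.println(firstContent(GREEDY, twoTags));  // one</p><p>two

        String messy = "< _some_tag some=\"stuff\" >\n"
                     + "    some <strong>content</strong>\n"
                     + "< / _some_tag >";
        System.out.println(firstContent(LAZY, messy).trim());  // some <strong>content</strong>
    }
}
```

The lazy group still stops at the first closing tag with the same name, so an element nested inside another element of the same name would be cut short.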

Tags: java, regex, xml

dublintech
