
Regular expression to match all characters between <h1> tag


I'm using the Sublime Text 2 editor. I would like to use a regex to match all characters between all h1 tags.

As of now I'm using this:

<h1>.+</h1>

It works fine if the h1 tag doesn't contain line breaks.

I mean, for

<h1>Hello this is a header</h1>

it works fine.

But it doesn't work if the tag looks like this:

<h1>
   Hello this is a header
</h1>

Can someone help me with the syntax?

PrivateUser asked Jan 25 '13 15:01



1 Answer

By default, . matches any character except a newline.

In this case you need the DOTALL option, which makes . match any character, including a newline. DOTALL can be specified inline as (?s). For example:

(?s)<h1>.+</h1>
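
To see the effect of the option outside the editor, here is a minimal sketch in Python. Python's re module is used only for illustration (Sublime Text has its own regex engine), but it accepts the same inline (?s) flag at the start of a pattern. The sample text is made up.

```python
import re

# Hypothetical sample: an h1 whose content spans several lines.
text = "<h1>\n   Hello this is a header\n</h1>"

# Without DOTALL, "." cannot cross the newline after <h1>, so there is no match.
plain = re.search(r"<h1>.+</h1>", text)

# With the inline (?s) flag, "." also matches newlines.
dotall = re.search(r"(?s)<h1>.+</h1>", text)

print(plain is None)        # the default pattern finds nothing
print(dotall.group(0))      # the DOTALL pattern finds the whole element
```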

However, this will not work as intended when the text contains more than one h1 element. Quantifiers are greedy by default (here, the +), which means .+ consumes as many characters as possible, so the match runs from the first <h1> to the last </h1>. You need to make the quantifier lazy (consume as few characters as possible) by adding a ? after it, giving +?:

(?s)<h1>.+?</h1>
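
The difference between the greedy and lazy forms only shows up with two or more headers. A small Python sketch, assuming a made-up input with two h1 elements (Python is again just a stand-in for the editor's engine):

```python
import re

# Hypothetical sample with two h1 elements and some content in between.
text = "<h1>\nFirst\n</h1>\n<p>body</p>\n<h1>Second</h1>"

# Greedy: .+ runs to the last </h1>, so everything is swallowed into one match.
greedy = re.findall(r"(?s)<h1>.+</h1>", text)

# Lazy: .+? stops at the nearest </h1>, so each header is matched separately.
lazy = re.findall(r"(?s)<h1>.+?</h1>", text)

print(len(greedy))  # number of greedy matches
print(lazy)         # list of individual headers
```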

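Alternatively, the regex can be <h1>[^<>]*</h1>. A negated character class such as [^<>] matches newlines as well, so you don't need to specify any option. Note that this pattern will not match an h1 that contains other tags, because [^<>]* stops at the first < it meets.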
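
A quick Python sketch of the character-class variant, using a made-up input; the second h1 is included to show what happens when a header contains another tag:

```python
import re

pattern = re.compile(r"<h1>[^<>]*</h1>")

# Hypothetical sample: one multi-line h1 and one h1 with a nested <em> tag.
text = "<h1>\n   Multi-line\n</h1>\n<h1>Has <em>nested</em> tag</h1>"

# [^<>] matches newlines without any flag, but it cannot cross the "<" of <em>.
matches = pattern.findall(text)

print(matches)
```

If headers in your files can contain inline markup, the lazy DOTALL pattern is the safer choice of the two.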

Anirudha answered Oct 11 '22 21:10
