What's the best way to select all text between 2 comment tags? E.g.
<!-- Text 1
Text 2
Text 3
-->
<\!--.* will capture <!-- Text 1 but not Text 2, Text 3, or -->.
Edit
As per Basti M's answer, <\!--((?:.*\n)*)--> will select everything between the first <!-- and the last -->, i.e. lines 1 to 11 below.
How would I modify this to select just the lines within each separate pair of tags, e.g. lines 1 to 4:
1 <!-- Text 1 //First
2 Text 2
3 Text 3
4 -->
5
6 More text
7
8 <!-- Text 4
9 Text 5
10 Text 6
11 --> //Last
Depending on your underlying engine, use the s-modifier (and add --> at the end of your expression).
This will make the . match newline characters as well.
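As a rough illustration, here is how this might look in Python. That is an assumption, since the question does not name an engine. In Python's re module the s-modifier is spelled re.DOTALL. The sample string and variable names are made up for this sketch.

```python
import re

# Sample input: a single multi-line comment, as in the question.
text = "<!-- Text 1\nText 2\nText 3\n-->"

# Without DOTALL, "." stops at the first newline, so "-->" is never reached.
without_flag = re.search(r"<!--(.*)-->", text)

# With DOTALL (Python's spelling of the s-modifier), "." also matches "\n".
with_flag = re.search(r"<!--(.*)-->", text, re.DOTALL)

print(without_flag)              # None
print(repr(with_flag.group(1)))  # ' Text 1\nText 2\nText 3\n'
```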
If the s-flag is not available to you, you may use
<!--((?:.*\r?\n?)*)-->
Explanation:
<!-- #start of comment
( #start of capturing group
(?: #start of non-capturing group
.*\r?\n? #match the rest of the line, then an optional line-break
)* #end of non-capturing group, repeated between zero and unlimited times
) #end of capturing group
--> #end of comment
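A minimal sketch of the flag-free expression, assuming Python's re module as the engine (the question does not say which one is in use). The helper name comment_body and both sample strings are hypothetical.

```python
import re

# The flag-free pattern from above, unchanged; note that no re.DOTALL is passed.
FLAG_FREE = re.compile(r"<!--((?:.*\r?\n?)*)-->")

def comment_body(text):
    """Return the text between <!-- and -->, or None if no comment is found."""
    match = FLAG_FREE.search(text)
    return match.group(1) if match else None

unix_sample = "<!-- Text 1\nText 2\nText 3\n-->"
windows_sample = "<!-- Text 1\r\nText 2\r\n-->"

print(repr(comment_body(unix_sample)))     # ' Text 1\nText 2\nText 3\n'
print(repr(comment_body(windows_sample)))  # ' Text 1\r\nText 2\r\n'
```

The expression is greedy, so on input with several comments it runs from the first <!-- to the last -->. Nested quantifiers like this can also backtrack heavily on long input that contains an opening <!-- with no closing -->.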
To match multiple comment blocks you can use
/(?:<!--((?:.*?\r?\n?)*)-->)+/g
Demo @ Regex101
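One way to check the effect on the question's two-comment sample is the Python sketch below. Python is an assumption here, and the sketch swaps in a simpler lazy pattern, <!--(.*?)--> with re.DOTALL, rather than the exact expression above. The greedy variant is included only for contrast.

```python
import re

# The 11-line sample from the question, without line numbers or annotations.
text = (
    "<!-- Text 1\n"
    "Text 2\n"
    "Text 3\n"
    "-->\n"
    "\n"
    "More text\n"
    "\n"
    "<!-- Text 4\n"
    "Text 5\n"
    "Text 6\n"
    "-->\n"
)

# Greedy: a single match spanning from the first "<!--" to the last "-->".
greedy = re.findall(r"<!--(.*)-->", text, re.DOTALL)

# Lazy: ".*?" stops at the nearest "-->", so each comment is matched on its own.
lazy = re.findall(r"<!--(.*?)-->", text, re.DOTALL)

print(len(greedy))  # 1
print(lazy)         # [' Text 1\nText 2\nText 3\n', ' Text 4\nText 5\nText 6\n']
```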
Use the s modifier to match new lines. E.g.:
/<!--(.*)-->/s
Demo: http://regex101.com/r/lH0jK9
Regex is not the right tool to parse HTML or XML; use a proper parser. I use XPath here:
$ cat file.xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<test>
<!-- Text 1
Text 2
Text 3
-->
</test>
The test:
$ xmllint --xpath '/test/comment()' file.xml
<!-- Text 1
Text 2
Text 3
-->
If you parse HTML, use the --html switch.
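If xmllint is not at hand, the same parser-based idea can be sketched with Python's standard library. This is a different tool from the one used above: html.parser.HTMLParser calls handle_comment for each comment it meets. The class name CommentCollector, the function extract_comments and the sample markup are hypothetical.

```python
from html.parser import HTMLParser

class CommentCollector(HTMLParser):
    """Hypothetical helper: collects the body of every comment it sees."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        # data is the text between "<!--" and "-->".
        self.comments.append(data)

def extract_comments(markup):
    """Parse markup and return the list of comment bodies, in document order."""
    parser = CommentCollector()
    parser.feed(markup)
    parser.close()
    return parser.comments

sample = (
    "<test>\n"
    "<!-- Text 1\nText 2\nText 3\n-->\n"
    "<p>More text</p>\n"
    "<!-- Text 4 -->\n"
    "</test>"
)
print(extract_comments(sample))
# [' Text 1\nText 2\nText 3\n', ' Text 4 ']
```

Each comment comes back as a separate item, so the greedy versus lazy question does not arise.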