What's the best way to select all text between 2 comment tags? E.g.
<!-- Text 1
Text 2
Text 3
-->
<\!--.* will capture <!-- Text 1 but not Text 2, Text 3, or -->
Edit
As per Basti M's answer, <\!--((?:.*\n)*)--> will select everything between the first <!-- and last -->. I.e. lines 1 to 11 below.
How would I modify this to select just lines within separate tags? i.e. lines 1 to 4:
1 <!-- Text 1 //First
2 Text 2
3 Text 3
4 -->
5
6 More text
7
8 <!-- Text 4
9 Text 5
10 Text 6
11 --> //Last
Depending on your underlying engine, use the s modifier (and add --> at the end of your expression).
This makes the . match newline characters as well.
If the s-flag is not available to you, you may use
<!--((?:.*\r?\n?)*)-->
Explanation:
<!-- #start of comment
( #start of capturing group
(?: #start of non-capturing group
.*\r?\n? #match the rest of the line, then an optional line break
)* #end of non-capturing group, repeated between zero and unlimited times
) #end of capturing group
--> #end of comment
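As a quick check of the flag-free pattern above, here is a minimal sketch in Python (an assumption, since the question does not name an engine). Python's `re` does not let `.` match newlines unless `re.DOTALL` is passed, so with no flags it behaves like an engine without the s flag. The names `sample`, `pattern` and `body` are illustrative only.

```python
import re

# The sample comment from the question.
sample = "<!-- Text 1\nText 2\nText 3\n-->"

# No flags are passed, so "." stops at each newline; the optional
# \r?\n? inside the repeated group is what lets the match cross lines.
pattern = re.compile(r"<!--((?:.*\r?\n?)*)-->")

match = pattern.search(sample)
body = match.group(1) if match else None
print(repr(body))
```

Because the repeated group is greedy, input containing several comments is matched from the first `<!--` to the last `-->`, which is the behaviour described in the question's edit.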
To match multiple comment blocks you can use
/(?:<!--((?:.*?\r?\n?)*)-->)+/g
Demo @ Regex101
Use the s modifier so that . also matches newlines, e.g.:
/<!--(.*)-->/s
Demo: http://regex101.com/r/lH0jK9
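The pattern above is greedy, so on input with several comments it runs from the first `<!--` to the last `-->`, the situation raised in the question's edit. One common way around that is a lazy quantifier, `.*?`. The sketch below assumes Python, where `re.DOTALL` plays the role of the s modifier; the variable names are illustrative.

```python
import re

# Two separate comments with ordinary text in between,
# as in the question's edit.
text = (
    "<!-- Text 1\nText 2\nText 3\n-->\n"
    "\nMore text\n\n"
    "<!-- Text 4\nText 5\nText 6\n-->"
)

# Greedy: "(.*)" runs on to the last "-->", so both comments and the
# text between them end up in a single match.
greedy = re.findall(r"<!--(.*)-->", text, re.DOTALL)

# Lazy: "(.*?)" stops at the first "-->" after each "<!--".
lazy = re.findall(r"<!--(.*?)-->", text, re.DOTALL)

print(len(greedy))  # one match spanning everything
print(lazy)         # one entry per comment
```

With the lazy version each comment body comes back as its own list entry, so the lines between the two comments are left out.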
Regex is not the right tool for parsing HTML or XML; use a proper parser instead. I use XPath here:
$ cat file.xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<test>
<!-- Text 1
Text 2
Text 3
-->
</test>
The test:
$ xmllint --xpath '/test/comment()' file.xml
<!-- Text 1
Text 2
Text 3
-->
If you are parsing HTML, use the --html switch.
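For readers working from a script rather than the shell, the same parser-first idea can be sketched with Python's standard-library `html.parser` module, which calls `handle_comment` once per comment. This is an illustration only, not part of the xmllint approach above: the class name `CommentCollector` and the sample `document` are made up, and the module is a lenient HTML tokenizer, not an XPath engine or a validating XML parser.

```python
from html.parser import HTMLParser


class CommentCollector(HTMLParser):
    """Hypothetical helper that records the body of every comment it sees."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        # data is the text between "<!--" and "-->", newlines included.
        self.comments.append(data)


# Sample input with two separate comments (assumed for illustration).
document = (
    "<test>\n"
    "<!-- Text 1\nText 2\nText 3\n-->\n"
    "<p>More text</p>\n"
    "<!-- Text 4\nText 5\nText 6\n-->\n"
    "</test>"
)

collector = CommentCollector()
collector.feed(document)
collector.close()
print(collector.comments)
```

Each comment arrives as a separate string, so there is no greedy-versus-lazy question to settle.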