
How to match content between HTML specific tags with attribute using grep?

Which regular expression should I use with the command grep if I wanted to match the text contained within the tag <div class="Message"> and its closing tag </div> in an HTML file?

Albz asked Nov 26 '12 14:11





1 Answer

Here's one way using GNU grep:

grep -oP '(?<=<div class="Message"> ).*?(?= </div>)' file
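
As a quick check of the command above, here is a minimal sketch that runs it on a made-up one-line file. The file contents and the variable names `sample` and `single` are assumptions for illustration, and `-P` needs a GNU grep built with PCRE support.

```shell
# Hypothetical sample input; not taken from the original question.
sample=$(mktemp)
printf '%s\n' '<div class="Message"> Hello world </div>' > "$sample"

# -o prints only the matched part; -P enables Perl-compatible regex (GNU grep).
# The lookbehind and lookahead each contain one literal space, so the text
# must be separated from both tags by a space for the pattern to match.
single=$(grep -oP '(?<=<div class="Message"> ).*?(?= </div>)' "$sample")
echo "$single"   # prints: Hello world

rm -f "$sample"
```

If your file has no space between the tags and the text, the spaces inside the two lookarounds would have to be removed for the pattern to match.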

If your tags span multiple lines, try:

< file tr -d '\n' | grep -oP '(?<=<div class="Message"> ).*?(?= </div>)'
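
One thing to watch with the multi-line variant: `tr -d '\n'` deletes the newlines outright, so words on either side of a line break get glued together. The sketch below shows this on a made-up two-line file (the contents and the names `multi`, `joined` and `spaced` are assumptions), along with a possible alternative that turns newlines into spaces instead.

```shell
# Hypothetical two-line input; not taken from the original question.
multi=$(mktemp)
printf '%s\n' '<div class="Message"> Hello' 'world </div>' > "$multi"

# Deleting newlines joins "Hello" and "world" with nothing in between.
joined=$(< "$multi" tr -d '\n' | grep -oP '(?<=<div class="Message"> ).*?(?= </div>)')
echo "$joined"   # prints: Helloworld

# Replacing each newline with a space keeps the words apart.
spaced=$(< "$multi" tr '\n' ' ' | grep -oP '(?<=<div class="Message"> ).*?(?= </div>)')
echo "$spaced"   # prints: Hello world

rm -f "$multi"
```

Which one you want depends on whether the line breaks in your HTML fall inside words or between them.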
Steve answered Sep 18 '22 08:09
