
Regex: Scrub HTML

I have a bunch of HTML from which I want to remove all the markup.

I think this is possible with a regular expression (regex). How would I do this with search and replace?

I tried <*>, thinking * was a wildcard, but apparently it is not. How would I make a regex match every < text > ?

dukevin asked Dec 02 '22 03:12


2 Answers

A simple version would be:

<[^>]+>

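[] defines a character class, and a ^ at the start of the class negates it, so [^>]+ matches one or more characters that are not >. The whole pattern therefore matches a <, everything up to the next >, and that > itself. In a regex, * is not a wildcard but a quantifier meaning "zero or more of the preceding item", which is why <*> does not work.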
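To make this concrete, here is a minimal sketch in Python (the question does not name a language or editor, so Python is an assumption, and strip_tags is a hypothetical helper name). It applies the pattern above as a search and replace with an empty replacement, and also shows what the <*> attempt actually matches.

```python
import re

# The pattern from this answer: "<", one or more non-">" characters, then ">".
TAG_PATTERN = re.compile(r"<[^>]+>")


def strip_tags(html: str) -> str:
    """Replace everything that looks like a tag with nothing (hypothetical helper)."""
    return TAG_PATTERN.sub("", html)


sample = "<p>Hello, <b>world</b>!</p>"
print(strip_tags(sample))  # Hello, world!

# The attempt from the question: "<*" means "zero or more '<' characters",
# so "<*>" only ever finds the closing ">" of a tag here.
print(re.findall(r"<*>", "<p>"))  # ['>']
```

The same pattern should work in most editors with regex search and replace: search for <[^>]+> and replace with an empty string.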

miku answered Dec 04 '22 18:12


Take a look at this: http://haacked.com/archive/2004/10/25/usingregularexpressionstomatchhtml.aspx
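One reason to look beyond the simplest pattern is that a > can legally appear inside a quoted attribute value. The sketch below illustrates that case in Python. Note that QUOTE_AWARE is my own illustrative pattern and is not claimed to be the one from the linked article, and the variable names are hypothetical.

```python
import re

# The simple pattern: stops at the first ">" it sees.
SIMPLE = re.compile(r"<[^>]+>")

# Illustrative refinement (an assumption, not taken from the linked article):
# inside a tag, accept either a complete quoted string or any single
# character that is not a quote and not ">".
QUOTE_AWARE = re.compile(r"""<(?:"[^"]*"|'[^']*'|[^'">])+>""")

html = '<a title="a > b" href="#">link</a>'

# The simple pattern ends the tag at the ">" inside the title attribute,
# so part of the tag is left behind.
print(repr(SIMPLE.sub("", html)))       # ' b" href="#">link'

# The quote-aware pattern skips over the quoted value as a unit.
print(repr(QUOTE_AWARE.sub("", html)))  # 'link'
```

Neither pattern handles comments, script contents, or malformed markup, so treat both as a quick clean-up tool rather than a full HTML parser.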

limc answered Dec 04 '22 17:12