I am using the following regular expression to remove HTML tags from a string. It works, except that the closing tag is left behind. If I attempt to remove <a href="blah">blah</a>, it leaves the </a>.
I do not know regular expression syntax at all and fumbled through this. Can someone with regex knowledge please provide me with a pattern that will work?
Here is my code:
string sPattern = @"<\/?!?(img|a)[^>]*>";
Regex rgx = new Regex(sPattern);
Match m = rgx.Match(sSummary);
string sResult = "";
if (m.Success)
    sResult = rgx.Replace(sSummary, "", 1);
I am looking to remove the first occurrence of the <a> and <img> tags.
Approach: Select the HTML element that needs to be removed, then use the JavaScript remove() or removeChild() method to remove it from the HTML document.
Regular expressions, or regex for short, are a series of special characters that define a search pattern. These expressions can remove lengthy validation functions and replace them with simple expressions.
To remove HTML tags from a string in React, use the /(<([^>]+)>)/ig regex with the replace() method; it removes the tags along with their attributes and returns a new string.
PHP's strip_tags() function strips HTML, XML, and PHP tags from a string.
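As a rough illustration of the regex quoted above, here is a minimal sketch in plain JavaScript (nothing in it is React-specific). The helper name stripTags and the sample string are made up for this example, and the approach shares the usual limits of regex-based tag stripping.

```javascript
// Hypothetical helper name; the regex is the one quoted above.
function stripTags(input) {
  // Replace every "<...>" run with an empty string.
  return input.replace(/(<([^>]+)>)/ig, '');
}

const sample = '<a href="blah">blah</a> and <b>bold</b>';
console.log(stripTags(sample)); // blah and bold
```

Note that this removes every tag in the string, not just the first one.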
Using a regular expression to parse HTML is fraught with pitfalls. HTML is not a regular language and hence can't be 100% correctly parsed with a regex. This is just one of many problems you will run into. The best approach is to use an HTML / XML parser to do this for you.
Here is a link to a blog post I wrote a while back which goes into more detail about this problem.
That being said, here's a solution that should fix this particular problem. It is by no means a perfect solution, though.
var pattern = @"<(img|a)[^>]*>(?<content>[^<]*)<";
var regex = new Regex(pattern);
var m = regex.Match(sSummary);
if ( m.Success ) {
    // Keep only the text captured between the opening tag and the next "<".
    sResult = m.Groups["content"].Value;
}
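For comparison, here is a sketch of a related idea in JavaScript. Unlike the C# snippet, it keeps the rest of the string: it removes only the first <a> or <img> tag and, for a link, its closing tag as well. The function name removeFirstLinkOrImage is hypothetical, and the pattern assumes the link contains plain text only (no nested tags). It is no more robust than any other regex-based approach.

```javascript
// Hypothetical helper: removes the first <a> or <img> tag from the input.
function removeFirstLinkOrImage(html) {
  // Opening <a ...> or <img ...>, optionally followed by plain text
  // and the matching closing tag (\1 refers back to "a" or "img").
  const pattern = /<(a|img)\b[^>]*>(?:([^<]*)<\/\1\s*>)?/i;
  // Without the "g" flag, replace() only touches the first match.
  return html.replace(pattern, (match, tag, content) => content || '');
}

console.log(removeFirstLinkOrImage('<a href="blah">blah</a> more')); // blah more
```

If the link wraps other tags, only the opening tag is removed and the closing tag stays behind, which is one reason a parser is the safer choice.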
To turn this:
'<td>mamma</td><td><strong>papa</strong></td>'
into this:
'mamma papa'
You need to replace the tags with spaces:
.replace(/<[^>]*>/g, ' ')
and reduce any duplicate spaces into single spaces:
.replace(/\s{2,}/g, ' ')
then trim away leading and trailing spaces with:
.trim();
Meaning that your remove-tags function looks like this:
function removeTags(string) {
  return string
    .replace(/<[^>]*>/g, ' ')
    .replace(/\s{2,}/g, ' ')
    .trim();
}
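A quick check of the function against the sample input from earlier in this answer. The function is repeated here only so that the snippet runs on its own; the variable name input is just for illustration.

```javascript
function removeTags(string) {
  return string
    .replace(/<[^>]*>/g, ' ')   // each tag becomes a single space
    .replace(/\s{2,}/g, ' ')    // runs of whitespace collapse to one space
    .trim();                    // drop leading and trailing whitespace
}

const input = '<td>mamma</td><td><strong>papa</strong></td>';
console.log(removeTags(input)); // mamma papa
```

Replacing tags with a space rather than an empty string is what keeps the two words from running together as "mammapapa".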