For example, I'm trying to replace
<script type='text/javascript'>some stuff</script>
with:
<div type='text/javascript'>some stuff</div>
I'm currently testing with:
alert( o.replace( /(?:<\s*\/?\s*)(script)(?:\s*([^>]*)?\s*>)/gi ,'div') );
But what I'm getting is:
divsomestuffdiv
How can I get this to only replace the "script" portion and preserve the other markup and attribute characters?
In .NET, the Regex.Replace(String, String, MatchEvaluator, RegexOptions) method is useful for replacing a regular expression match when, among other conditions, the replacement string cannot readily be specified by a regular expression replacement pattern.
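The method quoted above belongs to .NET, but JavaScript's String.prototype.replace accepts a function as its second argument, which plays a similar role. Here is a minimal sketch, assuming a hypothetical lookup table (renames) and sample string (markup) that are not part of the original question:

```javascript
// Hypothetical mapping of tag names to their replacements (an assumption for illustration).
const renames = { script: 'div', style: 'span' };

// Hypothetical sample input.
const markup = '<SCRIPT>a</SCRIPT><style>b</style>';

// Group 1 captures "<" or "</"; group 2 captures the tag name.
// The callback is invoked once per match and returns the replacement text.
const swapped = markup.replace(
  /(<\s*\/?\s*)(script|style)\b/gi,
  (match, open, name) => open + renames[name.toLowerCase()]
);

console.log(swapped); // <div>a</div><span>b</span>
```

A callback like this is only worth it when the replacement depends on what was matched; for a single fixed tag name, a plain replacement string is enough.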
Regular expressions are a tool that is insufficiently sophisticated to understand the constructs employed by HTML. HTML is not a regular language and hence cannot be parsed by regular expressions. Regex queries are not equipped to break down HTML into its meaningful parts.
While parsing arbitrary HTML with only a regex is impossible, it's sometimes appropriate to use one for a limited, known set of HTML. If you have a small set of HTML pages that you want to scrape data from and then stuff into a database, regexes might work fine.
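You have to keep the opening and closing tag brackets. Your non-capturing groups are still part of the match, so they get replaced along with "script". Capture them instead and put them back in the replacement. So try this: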
o.replace(/(<\s*\/?\s*)script(\s*([^>]*)?\s*>)/gi ,'$1div$2')
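To see why this works, here is the same call run against the markup from the question. This is just a sketch; the variable names (input, tagPattern, output) are made up for the example:

```javascript
// Sample input mirroring the markup in the question.
const input = "<script type='text/javascript'>some stuff</script>";

// $1 is "<" or "</" (with any whitespace), $2 is everything after the tag
// name up to and including ">" (attributes are kept as-is).
const tagPattern = /(<\s*\/?\s*)script(\s*([^>]*)?\s*>)/gi;

const output = input.replace(tagPattern, '$1div$2');

console.log(output); // <div type='text/javascript'>some stuff</div>
```

Note that the pattern has no word boundary after "script", so a tag whose name merely starts with "script" would also be rewritten; adding \b after the name would tighten that if it matters for your input.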
A naive but readable way would be to do it in two passes, I suppose. First match and replace the
<script
part with
<div
and then run a second pass that matches
</script>
and replace it with
</div>
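Something along these lines, as a rough sketch (the variable names source and renamed are only for illustration, and the \b and \s* are my own additions to avoid matching longer tag names and to tolerate whitespace before the closing bracket):

```javascript
// Sample input mirroring the markup in the question.
const source = "<script type='text/javascript'>some stuff</script>";

const renamed = source
  // Pass 1: opening tags. Only "<script" is replaced, so attributes are untouched.
  .replace(/<script\b/gi, '<div')
  // Pass 2: closing tags.
  .replace(/<\/script\s*>/gi, '</div>');

console.log(renamed); // <div type='text/javascript'>some stuff</div>
```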