Let's say I have the following string:
var myString = "<p>hello</p><script>console.log('hello')</script><h1>Test</h1><script>console.log('world')</script>"
I would like to use split to get an array with the contents of the script tags. For example, I want my output to be:
["console.log('hello')", "console.log('world')"]
I tried myString.split(/[<script></script>]/), but did not get the expected output.
Any help is appreciated.
You can't parse (X)HTML with regex.
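As for why the attempt in the question misbehaves: square brackets in a regex form a character class, so /[<script></script>]/ matches any single one of the characters <, s, c, r, i, p, t, > and /, not the tags themselves. A minimal sketch on a shortened input (the input string is picked here for readability and is not the one from the question):

```javascript
// Square brackets make a character class: the pattern matches any ONE of
// the characters <, s, c, r, i, p, t, >, / (repeated characters add nothing).
var pattern = /[<script></script>]/;

// Hypothetical short input, chosen only to keep the result readable.
var pieces = "<h1>Test</h1>".split(pattern);

console.log(pieces);
// → ["", "h1", "Te", "", "", "", "h1", ""]
```

The string is cut at every <, >, /, s and t, which is why the output looks nothing like the script contents.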
Instead, you can parse it using innerHTML.
var element = document.createElement('div');
element.innerHTML = myString; // Parse HTML properly (but unsafely)
However, this is not safe. Even if innerHTML doesn't run the JS inside script elements, malicious strings can still run arbitrary JS, e.g. with <img src="//" onerror="alert()">.
To avoid that problem, you can use DOMImplementation.createHTMLDocument to create a new document, which can be used as a sandbox.
var doc = document.implementation.createHTMLDocument(); // Sandbox
doc.body.innerHTML = myString; // Parse HTML properly
Alternatively, modern browsers support DOMParser:
var doc = new DOMParser().parseFromString(myString, 'text/html');
Once the HTML string has been parsed to the DOM, you can use DOM methods like getElementsByTagName or querySelectorAll to get all the script elements.
var scriptElements = doc.getElementsByTagName('script');
Finally, [].map can be used to obtain an array with the textContent of each script element.
var arrayScriptContents = [].map.call(scriptElements, function(el) {
  return el.textContent;
});
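In environments that support ES2015, Array.from with a mapping function is a possible alternative to [].map.call for turning the collection into an array. A minimal sketch, assuming a hypothetical helper named textContents and a hand-written stand-in for the element collection so that it also runs outside a browser:

```javascript
// Hypothetical helper: accepts any array-like collection of elements
// (HTMLCollection, NodeList, or a plain array) and returns their text.
function textContents(elements) {
  return Array.from(elements, function (el) {
    return el.textContent;
  });
}

// Stand-in for the script element collection (an assumption for this sketch);
// in a browser you would pass doc.getElementsByTagName('script') instead.
var fakeScripts = {
  0: { textContent: "console.log('hello')" },
  1: { textContent: "console.log('world')" },
  length: 2
};

console.log(textContents(fakeScripts));
// → ["console.log('hello')", "console.log('world')"]
```

Both forms produce a real array; Array.from just avoids borrowing map from an empty array literal.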
The full code would be:
var doc = document.implementation.createHTMLDocument(); // Sandbox
doc.body.innerHTML = myString; // Parse HTML properly
var arrayScriptContents = [].map.call(doc.getElementsByTagName('script'), function(el) {
  return el.textContent;
});
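If this is needed in more than one place, the same steps could be wrapped in a function. The sketch below is one way to do that; the names scriptContentsOf and getScriptContents are made up for illustration, and the browser-only part is guarded so the snippet does not throw where document is missing:

```javascript
// Hypothetical helper: collects the text of every script element in a document.
function scriptContentsOf(doc) {
  return [].map.call(doc.getElementsByTagName('script'), function (el) {
    return el.textContent;
  });
}

// Hypothetical wrapper around the steps above; requires a browser DOM.
function getScriptContents(html) {
  var doc = document.implementation.createHTMLDocument(''); // Sandbox
  doc.body.innerHTML = html; // Parse HTML properly
  return scriptContentsOf(doc);
}

// document only exists in a browser, so run the example only there.
if (typeof document !== 'undefined') {
  var myString = "<p>hello</p><script>console.log('hello')</script>" +
    "<h1>Test</h1><script>console.log('world')</script>";
  console.log(getScriptContents(myString));
  // → ["console.log('hello')", "console.log('world')"]
}
```

Keeping the collection step separate from the parsing step means it works the same way on a document produced by DOMParser.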