I spent some time looking for the best way to escape an HTML string and found some discussions on the topic: discussion 1, discussion 2. They led me to the replaceAll function. I then ran performance tests and tried to find another solution with similar speed, with no success :(
Here is my final test case set. I found it on the net and expanded it with my own attempts (the 4 cases at the bottom), and I still cannot reach replaceAll() performance.
What is the secret that makes the replaceAll() solution so fast?
Greets!
Code snippets:
String.prototype.replaceAll = function(str1, str2, ignore)
{
return this.replace(new RegExp(str1.replace(/([\/\,\!\\\^\$\{\}\[\]\(\)\.\*\+\?\|\<\>\-\&])/g,"\\$&"),(ignore?"gi":"g")),(typeof(str2)=="string")?str2.replace(/\$/g,"$$$$"):str2);
};
Credit goes to qwerty.
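To make the two escaping steps in this helper easier to see, here is a minimal sketch of the same idea as a standalone function. The name replaceAllLiteral and the shortened character class are assumptions for illustration; they are not part of the original snippet.

```javascript
// Hypothetical standalone version of the helper above (the name is an assumption).
function replaceAllLiteral(input, search, replacement, ignoreCase) {
  // Step 1: backslash-escape regex metacharacters so `search` is matched literally.
  // In a replacement string "$&" stands for the matched text, so "\\$&"
  // produces a backslash followed by the matched character.
  var escaped = search.replace(/[.*+?^${}()|[\]\\\/]/g, "\\$&");
  var re = new RegExp(escaped, ignoreCase ? "gi" : "g");

  // Step 2: double every "$" in a string replacement so that sequences such as
  // "$&" or "$1" are inserted literally instead of being expanded.
  var safe = typeof replacement === "string"
    ? replacement.replace(/\$/g, "$$$$")
    : replacement;

  return input.replace(re, safe);
}

replaceAllLiteral("a.b.c", ".", "-");      // "a-b-c" (the dot is not a wildcard here)
replaceAllLiteral("1+1=2", "+", "$&");     // "1$&1=2" ("$&" is kept literally)
replaceAllLiteral("Aa", "a", "x", true);   // "xx"
```

Both steps rely on "$" sequences in replacement strings, which is the kind of text the jsperf issue mentioned further down seems to have interfered with.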
Fastest case so far:
html.replaceAll('&', '&amp;').replaceAll('"', '&quot;').replaceAll("'", '&#39;').replaceAll('<', '&lt;').replaceAll('>', '&gt;');
The replaceAll() method returns a new string with all matches of a pattern replaced by a replacement. The pattern can be a string or a RegExp, and the replacement can be a string or a function to be called for each match. The original string is left unchanged.
Finally I found it! Thanks to Jack for pointing me to the jsperf-specific behavior.
The answer is:
replaceAll - this hits a jsperf limitation/bug triggered by the special sequence "\\$&", so the results were wrong.
compile() - when called with no arguments it changes the regexp definition to /(?:)/. I don't know whether that is a bug or intended behavior, but the performance results were unreliable after it was called.
Here are my result-safe tests.
Finally, I prepared proper test cases.
The result is that, for HTML escaping, the best way is to use a native DOM-based solution, like:
document.createElement('div').appendChild(document.createTextNode(html)).parentNode.innerHTML
or, if you repeat it many times, you can do it with variables that are prepared once:
// prepare variables once
var DOMtext = document.createTextNode("test");
var DOMnative = document.createElement("span");
DOMnative.appendChild(DOMtext);

// main work for each case
function HTMLescape(html) {
    DOMtext.nodeValue = html;
    return DOMnative.innerHTML;
}
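One caveat with the DOM approach: serializing a text node through innerHTML escapes &, < and > but leaves single and double quotes alone, which matters if the result ends up inside an attribute value. The sketch below is one possible way to cover that. The name escapeHtmlDom, the quote handling and the non-browser fallback are all assumptions added for illustration, not part of the tested solution.

```javascript
// Hypothetical wrapper (the name, quote handling and fallback are assumptions).
var escapeHtmlDom = (function () {
  var hasDom = typeof document !== "undefined";
  var textNode, holder;
  if (hasDom) {
    // Prepared once and reused on every call, as in the snippet above.
    textNode = document.createTextNode("");
    holder = document.createElement("span");
    holder.appendChild(textNode);
  }
  return function (html) {
    var out;
    if (hasDom) {
      textNode.nodeValue = html;
      out = holder.innerHTML; // escapes &, < and >, but not quotes
    } else {
      // Fallback for environments without a DOM (e.g. Node).
      out = String(html)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;");
    }
    // Quotes are handled separately because innerHTML leaves them untouched.
    return out.replace(/"/g, "&quot;").replace(/'/g, "&#39;");
  };
})();

escapeHtmlDom('<a href="x">Tom & Jerry\'s</a>');
// "&lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&#39;s&lt;/a&gt;"
```

The two extra replace calls cost some speed, so this only makes sense when the escaped text may be used in attribute context.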
Thank you all for collaborating and for posting comments and directions.
jsperf bug description
The String.prototype.replaceAll was defined as follows:
function (str1, str2, ignore) {
return this.replace(new RegExp(str1.replace(repAll, "\\#{setup}"), (ignore ? "gi" : "g")), (typeof(str2) == "string") ? str2.replace(/\$/g, "$$") : str2);
}
As far as performance goes, I find that the below function is as good as it gets:
String.prototype.htmlEscape = function() {
    var amp_re = /&/g, sq_re = /'/g, quot_re = /"/g, lt_re = /</g, gt_re = />/g;
    return function() {
        return this
            .replace(amp_re, '&amp;')
            .replace(sq_re, '&#39;')
            .replace(quot_re, '&quot;')
            .replace(lt_re, '&lt;')
            .replace(gt_re, '&gt;');
    }
}();
It initializes the regular expressions and returns a closure that actually performs the replacement.
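As a usage sketch, the same closure pattern can be written as a plain function instead of a String.prototype extension. The names escapeHtml and escapeHtmlWrongOrder are assumptions for illustration. The second function is deliberately wrong and only shows why & has to be replaced first.

```javascript
// Hypothetical standalone variant of the closure above (names are assumptions).
var escapeHtml = (function () {
  // The regular expressions are created once, when the outer function runs.
  var amp_re = /&/g, sq_re = /'/g, quot_re = /"/g, lt_re = /</g, gt_re = />/g;
  return function (str) {
    return String(str)
      .replace(amp_re, "&amp;") // must run first, see below
      .replace(sq_re, "&#39;")
      .replace(quot_re, "&quot;")
      .replace(lt_re, "&lt;")
      .replace(gt_re, "&gt;");
  };
})();

// Deliberately wrong: escaping "&" last re-escapes the "&" of "&lt;".
function escapeHtmlWrongOrder(str) {
  return String(str).replace(/</g, "&lt;").replace(/&/g, "&amp;");
}

escapeHtml('<b>"R&D"</b>');     // "&lt;b&gt;&quot;R&amp;D&quot;&lt;/b&gt;"
escapeHtmlWrongOrder("a < b");  // "a &amp;lt; b" (double-escaped)
```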
Performance test
I should note that the test results are strange; when .replaceAll() is defined inside Benchmark.prototype.setup it runs twice as fast compared to when it's defined globally (i.e. inside a <script> tag). I'm still not sure why that is, but it definitely must be related to how jsperf itself works.
Using RegExp.compile()
I wanted to avoid using a deprecated function, mostly because this kind of optimization should be done automatically by modern browsers. Here's a version with compiled expressions:
String.prototype.htmlEscape2 = function() {
    var amp_re = /&/g, sq_re = /'/g, quot_re = /"/g, lt_re = /</g, gt_re = />/g;
    if (RegExp.prototype.compile) {
        amp_re.compile();
        sq_re.compile();
        quot_re.compile();
        lt_re.compile();
        gt_re.compile();
    }
    return function() {
        return this
            .replace(amp_re, '&amp;')
            .replace(sq_re, '&#39;')
            .replace(quot_re, '&quot;')
            .replace(lt_re, '&lt;')
            .replace(gt_re, '&gt;');
    }
}();
Doing so blows everything else out of the water!
Performance test
The reason .compile() appears to give such a performance boost is that when you compile a global expression, e.g. /a/g, without arguments, it gets converted to /(?:)/ (on Chrome), which renders it useless: the expression no longer matches the intended characters, so the measured speed is meaningless.
If compilation can't be done, a browser should throw an error instead of silently destroying the expression.
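The effect can be checked directly. The sketch below is an illustration under the assumption that the engine implements the legacy compile() method the way V8 (Chrome, Node) does; the variable names are made up for the example.

```javascript
// Sketch: what a no-argument compile() does to a global regex.
var re = /</g;
var before = { source: re.source, global: re.global }; // { source: "<", global: true }

// compile() is a legacy method, so guard in case an engine does not provide it.
var hasCompile = typeof re.compile === "function";
if (hasCompile) {
  re.compile(); // no pattern and no flags: both are reset to empty
}

var after = { source: re.source, global: re.global };
// With compile() available: { source: "(?:)", global: false }

// The empty, non-global pattern matches once at position 0, so the replacement
// is simply prepended and nothing in the input is actually escaped.
var result = "a<b<c".replace(re, "&lt;");
// With compile() available: "&lt;a<b<c" instead of "a&lt;b&lt;c"
```

A single empty match at the start of the string is far cheaper than a global scan, which would explain the misleading benchmark numbers.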