
How to write a regex to remove whitespace between tags and words for an HTML minifier [duplicate]

I am building a very simple HTML minifier. So far so good.

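var fs = require("fs");
var file = process.argv[2]; // path of the HTML file to minify

var html = fs.readFileSync(file, "utf8");
var string = html.replace(/\n/g, "");          // drop newlines
var x = string.replace(/[\t ]+\</g, "<");      // whitespace before a tag
var y = x.replace(/\>[\t ]+\</g, "><");        // whitespace between tags
var z = y.replace(/\>[\t ]+$/g, ">");          // trailing whitespace after the last tag

console.log(z);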

This prints the string: <div id="hello"><p class="new"> Hello</p></div>

How do I write a regex that removes any whitespace between words and tags, both before and after the text? The expected result is: <div id="hello"><p class="new">Hello</p></div>

Dear1ofGdBear asked Dec 21 '25 21:12


1 Answer

This should work for you:

var html = '<div id="hello"><p class="new">            Hello  friend  </p></div>';

// Match whitespace right after ">" or right before "<" and strip it.
var result = html.replace(/>\s+|\s+</g, function(m) {
    return m.trim();
});

https://jsfiddle.net/5gbhhh25/

It only removes whitespace that comes directly after a > or directly before a <, that is, between a tag and the adjacent word (for both opening and closing tags). Spaces between words in the text are left alone.
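To make that behaviour concrete, here is a minimal sketch that wraps the same regex in a helper and runs it on two inputs. The helper name trimAroundTags and the sample strings are made up for illustration; only the regex comes from the answer.

```javascript
// Hypothetical helper wrapping the regex from the answer above.
function trimAroundTags(html) {
  // Whitespace right after ">" or right before "<" is matched and trimmed away.
  return html.replace(/>\s+|\s+</g, function (m) {
    return m.trim();
  });
}

// The double space between the two words is kept.
console.log(trimAroundTags('<p class="new">   Hello  friend  </p>'));
// → <p class="new">Hello  friend</p>

// Newlines and tabs count as whitespace too, so indentation between tags goes away.
console.log(trimAroundTags('<div>\n\t<p> Hi </p>\n</div>'));
// → <div><p>Hi</p></div>
```

Because \s also matches newlines and tabs, this sketch does not need the separate newline-stripping step from the question.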

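torazaburo makes a good point about a potential pitfall in the OP's requirements: sometimes a single space has to be kept to preserve the structure of the text, for example between two adjacent inline elements such as <b>Hello</b> <i>friend</i>. In that case Tushar's solution, str.replace(/\s+/g, ' ');, which collapses each run of whitespace into one space, is the better fit.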
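The difference between the two approaches can be sketched on a small input. The sample string below is invented for illustration; the two regexes are the ones quoted in this answer.

```javascript
// Made-up sample: the whitespace between the two inline elements is significant.
var html = '<p><b>Hello</b>   <i>friend</i></p>';

// Stripping all whitespace next to tags joins the two words when rendered.
var stripped = html.replace(/>\s+|\s+</g, function (m) {
  return m.trim();
});

// Collapsing each whitespace run to one space keeps the words apart.
var collapsed = html.replace(/\s+/g, ' ');

console.log(stripped);  // → <p><b>Hello</b><i>friend</i></p>
console.log(collapsed); // → <p><b>Hello</b> <i>friend</i></p>
```

Note that collapsing runs everywhere also touches whitespace inside elements such as pre, where it is significant, so a real minifier would likely need to skip those regions.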

lintmouse answered Dec 23 '25 10:12


