
A simple regex search and replacement in php for minifying/compressing javascript?

Can you post a regex search and replacement in php for minifying/compressing javascript?

For example, here's a simple one for CSS

  header('Content-type: text/css');
  ob_start("compress");
  function compress($buffer) {
    /* remove comments */
    $buffer = preg_replace('!/\*[^*]*\*+([^/][^*]*\*+)*/!', '', $buffer);
    /* remove tabs, spaces, newlines, etc. */
    $buffer = str_replace(array("\r\n", "\r", "\n", "\t", '  ', '    ', '    '), '', $buffer);
    return $buffer;
  }

  /* put CSS here */

  ob_end_flush();

And here's one for html:

<?php
/* Minify All Output - based on the search and replace regexes. */
function sanitize_output($buffer)
{
    $search = array(
        '/\>[^\S ]+/s', //strip whitespaces after tags, except space
        '/[^\S ]+\</s', //strip whitespaces before tags, except space
        '/(\s)+/s'  // shorten multiple whitespace sequences
        );
    $replace = array(
        '>',
        '<',
        '\\1'
        );
    $buffer = preg_replace($search, $replace, $buffer);
    return $buffer;
}
ob_start("sanitize_output");
?>
<html>...</html>

But what about one for javascript?

asked May 14 '11 03:05 by trusktr


2 Answers

A simple regex for minifying/compressing javascript is unlikely to exist anywhere. There are several good reasons for this; here are a couple:

Line breaks and semicolons

Good javascript minifiers remove all extra line breaks. But because javascript engines accept statements that end with only a line break and no semicolon, a minifier that strips line breaks can easily break such code unless it is sophisticated enough to detect and handle different coding styles.
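To make the semicolon problem concrete, here is a minimal sketch in JavaScript (rather than PHP, so it can be pasted straight into Node or a browser console). The sample source and the `parses` helper are made up for illustration; the newline stripping imitates what the CSS compressor in the question does.

```javascript
// Hypothetical sample: two statements separated only by a line break.
const source = "var a = 1\nvar b = 2\n";

// Naive "minification": delete every line break, as the CSS example does.
const naive = source.replace(/[\r\n]+/g, "");

// Helper (an assumption for this sketch): true if the code parses as a function body.
function parses(code) {
  try {
    new Function(code);
    return true;
  } catch (e) {
    return false;
  }
}

console.log(parses(source)); // true: the engine inserts the missing semicolons
console.log(parses(naive));  // false: "var a = 1var b = 2" is a syntax error
```

A real minifier has to tokenize the code to know where a line break was acting as a statement terminator before it can safely remove it.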

Dynamic Language Constructs

Many of the good javascript minifiers available will also rename your variables and functions to shrink the code. For instance, a function named 'strip_white_space' that is called 12 times in your file might be renamed simply 'a', for a savings of 192 characters in your minified code (16 characters saved at each of the 12 calls). Unless your file has a lot of comments and/or whitespace, optimizations like these are where the majority of your filesize savings will come from.

Unfortunately, this is much more complicated than a simple regex should try to handle. Say you do something as simple as:

var length = 12, height = 15;
    // other code that uses these length and height values

var arr = [1, 2, 3, 4];
for (i = (arr.length - 1); i >= 0; --i) {
    //loop code
}

This is all valid code. BUT, how does the minifier know what to replace? The first "length" has "var" before it (but it doesn't have to), while "height" just has a comma before it. And if the minifier is smart enough to replace the first "length" properly, how smart does it have to be to know NOT to change the word "length" when it is used as a property of the array? It would get even more complicated if you defined a javascript object with its own "length" property and referred to it with the same dot-notation.
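As a rough illustration of that failure, the sketch below runs the example before and after a naive whole-word rename. The `count` variable, the `return` statement, the `var` on the loop index, and the `run` helper are additions made only so the snippet can be executed; they are not part of the original example.

```javascript
// The example from above, wrapped as a function body; the counter is added for illustration.
const source = `
var length = 12, height = 15;
var arr = [1, 2, 3, 4];
var count = 0;
for (var i = (arr.length - 1); i >= 0; --i) {
    count++;
}
return count;
`;

// Naive rename: replace every whole word "length" with "a".
const renamed = source.replace(/\blength\b/g, "a");

// Helper (an assumption for this sketch): run a source string as a function body.
function run(code) {
  return new Function(code)();
}

console.log(run(source));               // 4: the loop visits every element
console.log(run(renamed));              // 0: arr.a is undefined, so the loop never runs
console.log(renamed.includes("arr.a")); // true: the array property was renamed too
```

The renamed code still parses and runs without an error, which is what makes this kind of breakage hard to notice.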

Non-regex Options

Several projects exist to solve this problem using more complex solutions than just a simple regex, but many of them make no attempt to change variable names, so I still stick with Dean Edwards' packer, Douglas Crockford's JSMin, or something like the YUI Compressor.

PHP implementation of Douglas Crockford's JSMin

https://github.com/mrclay/minify

answered Oct 23 '22 19:10 by Lane


I had better luck with this Gist by orangexception than with Jan's or B.F's answers.

preg_replace('#(?s)\s|/\*.*?\*/|//[^\r\n]*#', '', $javascript);

https://gist.github.com/orangexception/1301150/ed16505e2cb200dee0b0ab582ebbc67d5f060fe8
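Before relying on this pattern, it may help to see what it does to a few inputs. The sketch below is a JavaScript port of the PHP pattern, written as an assumption for illustration: `(?s).` is spelled `[\s\S]`, and the `naiveMinify` name and the sample inputs are made up.

```javascript
// JavaScript port (an assumption) of the PHP pattern; (?s). is written as [\s\S].
const pattern = /\s|\/\*[\s\S]*?\*\/|\/\/[^\r\n]*/g;

// Hypothetical helper: strip everything the pattern matches.
function naiveMinify(code) {
  return code.replace(pattern, "");
}

// Comments and padding are removed as intended.
console.log(naiveMinify("/* setup */\nfoo( 1, 2 ); // call")); // foo(1,2);

// But ALL whitespace goes, including the space that separates "var" from the name.
console.log(naiveMinify("var x = 1;")); // varx=1;

// And "//" inside a string literal is treated as the start of a comment.
console.log(naiveMinify('var u = "http://a.b/c";')); // varu="http:
```

So the pattern appears safe only for code that has no significant whitespace and no `//` or `/*` inside strings or regex literals; for anything else, a tokenizing minifier such as JSMin is the more cautious choice.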

answered Oct 23 '22 20:10 by tim