Joshua I. Miller

NAME

JavaScript::Squish - Reduce/Compact JavaScript code to as few characters as possible.

SYNOPSIS

 use JavaScript::Squish;
 my $compacted = JavaScript::Squish->squish(
     $javascript,
     remove_comments_exceptions => [ qr/copyright/i ] )
     or die $JavaScript::Squish::err_msg;

# OR, to just do a few steps #

 my $c = JavaScript::Squish->new();
 $c->data( $javascript );
 $c->extract_strings_and_comments();
 $c->replace_white_space();
 my $new = $c->data();

DESCRIPTION

This module provides methods to compact javascript source down to just what is needed. It can remove all comments, put everything on one line (semi-)safely, and remove extra whitespace.

Any of the compacting techniques can be applied individually, or in any combination.

It also provides a means to extract all string literals or comments into separate arrays, in the order they appear.

Since JavaScript eats up bandwidth, this can be very helpful: you are free to comment your JavaScript properly without fear of burning up too much bandwidth.

EXPORT

None by default.

"squish" may be exported via "use JavaScript::Squish qw(squish);"

METHODS

JavaScript::Squish->squish($js [, %options] )

Class method. This is a wrapper around all methods in here, to allow you to do all compacting operations in one call.

     my $squished = JavaScript::Squish->squish( $javascript );

Currently supported options:

remove_comments_exceptions : array ref of regexps
 JavaScript::Squish->squish($js, remove_comments_exceptions => [ qr/copyright/i ] )

Any comment matching any of the supplied regexps will not be removed. This is the recommended way to retain copyright notices while still compacting out all other comments.
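
As a rough illustration of the matching rule described above (not the module's actual code), a comment could be kept whenever any of the supplied regexps matches it. The helper name keep_comment and the sample comments are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $comment matches any regexp in @$exceptions.
sub keep_comment {
    my ($comment, $exceptions) = @_;
    for my $re (@$exceptions) {
        return 1 if $comment =~ $re;
    }
    return 0;
}

# Sample comments, made up for this sketch.
my @comments = (
    '/* Copyright (c) 2005 Some Company */',
    '// TODO: tidy this up',
);

# Only comments matching an exception survive the filter.
my @kept = grep { keep_comment($_, [ qr/copyright/i ]) } @comments;
print "$_\n" for @kept;
```

Only the copyright comment is printed; the TODO comment matches no exception and would be removed.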

JavaScript::Squish->new()

Constructor. Currently takes no options. Returns JavaScript::Squish object.

NOTE: if you want to specify a "remove_comments_exceptions" option when working with one of these objects, you must pass it directly to the remove_comments() method (see below).

$djc->data($js)

If the optional $js is passed in, this sets the javascript that will be worked on.

If not passed in, this returns the javascript in whatever state it happens to be in (so you can step through, and pull the data out at any time).

$djc->strings()

Returns all strings extracted by either extract_literal_strings() or extract_strings_and_comments() (NOTE: be sure to call one of the aforementioned extract methods prior to strings(), or you won't get anything back).

$djc->comments()

Returns all comments extracted by either extract_comments() or extract_strings_and_comments() (NOTE: be sure to call one of the aforementioned extract methods prior to comments(), or you won't get anything back).

$djc->determine_line_ending()

Method to automatically determine the line ending character in the source data.
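
One plausible way to do this, assuming only "\r\n" and "\n" need to be told apart, is to take the first line ending found in the data. The helper name guess_eol is hypothetical and this is a sketch, not the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first line ending found in $text,
# falling back to "\n" when the text contains no line break at all.
# A bare "\r" (old Mac style) is not recognised by this sketch.
sub guess_eol {
    my ($text) = @_;
    return $1 if $text =~ /(\r\n|\n)/;
    return "\n";
}

my $eol = guess_eol("var a = 1;\r\nvar b = 2;\n");
print $eol eq "\r\n" ? "DOS line endings\n" : "Unix line endings\n";
```

With mixed line endings, whichever style appears first wins.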

$djc->eol_char("\n")

Method to set/override the line ending character which will be used to parse/join lines. Set to "\r\n" if you are working on a DOS / Windows formatted file.

$djc->extract_strings_and_comments()

Finds all string literals (e.g. things in quotes) and comments (// or /*...*/) and replaces them with tokens of the form "\0\0N\0\0" and "\0\0_N_\0\0" respectively, where N is the occurrence number in the file and \0 is the null byte. The extracted strings and comments are stored inside the object so they can be restored later.

After calling this, you may retrieve a list of all extracted strings or comments using the strings() or comments() methods.
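
The token scheme can be sketched in plain Perl as follows. This is a simplified illustration, not the module's code: the names extract_tokens and stash are hypothetical, numbering from 0 with separate counters for strings and comments is an assumption, and regex literals and other JavaScript corner cases are ignored.

```perl
use strict;
use warnings;

# Quoted strings (with backslash escapes) and the two comment styles.
my $string  = qr{ "(?:\\.|[^"\\])*" | '(?:\\.|[^'\\])*' }xs;
my $comment = qr{ /\*.*?\*/ | //[^\n]* }xs;

# Store $text in @$store and return its placeholder token.
# $mark is '' for strings and '_' for comments.
sub stash {
    my ($store, $text, $mark) = @_;
    push @$store, $text;
    return "\0\0" . $mark . $#{$store} . $mark . "\0\0";
}

# Replace every string literal and comment with a token;
# return the tokenized source plus both lists.
sub extract_tokens {
    my ($js) = @_;
    my (@strings, @comments);
    $js =~ s{($string)|($comment)}{
        defined $1 ? stash(\@strings, $1, '') : stash(\@comments, $2, '_')
    }ge;
    return ($js, \@strings, \@comments);
}

my ($out, $strings, $comments) =
    extract_tokens(qq{var a = "x // y"; // note\n/* c */ b = 'q';});
print scalar(@$strings), " strings, ", scalar(@$comments), " comments\n";
```

Matching strings and comments in a single pass matters: the "//" inside the first string literal is not mistaken for a comment, because the string is consumed first.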

$djc->extract_literal_strings()

This is a wrapper around extract_strings_and_comments(), which will restore all comments afterwards (if they had not been stripped prior to its call).

NOTE: sets $djc->strings()

$djc->extract_comments()

This is a wrapper around extract_strings_and_comments(), which will restore all literal strings afterwards (if they had not been stripped prior to its call).

NOTE: sets $djc->comments()

$djc->replace_white_space()

For each line:

  • Removes all beginning of line whitespace.

  • Removes all end of line whitespace.

  • Combines each series of whitespace into one space character (e.g. s/\s+/ /g)

Comments and string literals (if still embedded) are untouched.
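
The three per-line steps above can be sketched like this. The helper name squeeze_lines is hypothetical, and the sketch assumes strings and comments have already been extracted, since it makes no attempt to protect them.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the three whitespace steps to every line.
# $eol is the line ending used to split and re-join the text.
sub squeeze_lines {
    my ($text, $eol) = @_;
    my @lines = split /\Q$eol\E/, $text, -1;   # -1 keeps trailing empty fields
    for my $line (@lines) {
        $line =~ s/^\s+//;     # beginning of line whitespace
        $line =~ s/\s+$//;     # end of line whitespace
        $line =~ s/\s+/ /g;    # runs of whitespace become one space
    }
    return join $eol, @lines;
}

print squeeze_lines("  var  a\t=  1;  \n\tb = 2;\n", "\n");
```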

$djc->remove_blank_lines()

...does what it says.

Comments and string literals (if still embedded) are untouched.

$djc->combine_concats()

Removes any string literal concatenations. Eg.

    "bob and " +   "sam " + someVar;

Becomes:

    "bob and sam " + someVar

Comments (if still embedded) are untouched.
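
A minimal sketch of the idea, assuming only double-quoted literals and no comments in the input. The helper name join_adjacent_literals is hypothetical, and the module may well go about this differently.

```perl
use strict;
use warnings;

# Hypothetical helper: merge "..." + "..." into a single literal.
# Each pass drops the closing quote, the "+" and the next opening quote;
# the loop repeats until no adjacent pair is left.
sub join_adjacent_literals {
    my ($js) = @_;
    1 while $js =~ s/("(?:\\.|[^"\\])*)"\s*\+\s*"/$1/;
    return $js;
}

print join_adjacent_literals('"bob and " +   "sam " + someVar;'), "\n";
```

The concatenation with someVar is left alone, because only two literals sitting next to each other can be merged safely.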

$djc->join_all()

Puts everything on one line.

Comments beginning with "//", if still embedded, are the exception, as they require a newline character at the end of the comment.

$djc->replace_extra_whitespace()

This removes any excess whitespace. Eg.

    if (someVar = "foo") {

Becomes:

    if(someVar="foo"){

Comments and string literals (if still embedded) are untouched.
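
One naive way to get this effect is to drop whitespace on either side of punctuation. The sketch below is an assumption about the general approach, not the module's code; the helper name strip_around_punctuation is hypothetical, and it does not protect whitespace inside string literals, so strings would need to be extracted first.

```perl
use strict;
use warnings;

# Hypothetical helper: remove whitespace before and after punctuation.
# Whitespace between two words (e.g. "var x") is left in place.
sub strip_around_punctuation {
    my ($js) = @_;
    $js =~ s/\s*([{}()\[\];,=+\-*<>!&|?:])\s*/$1/g;
    return $js;
}

print strip_around_punctuation('if (someVar = "foo") {'), "\n";
```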

$djc->remove_comments(%options)

Currently supported options:

exceptions : array ref of regexps
 $djc->remove_comments( exceptions => [ qr/copyright/i ] )

Any comment matching any of the supplied regexps will not be removed. This is the recommended way to retain copyright notices while still compacting out all other comments.

NOTE: this is destructive (ie. you cannot restore comments after this has been called).

$djc->restore_comments()

All comments that were extracted with $djc->extract_strings_and_comments() or $djc->extract_comments() are restored. Comments retain all spacing and extra lines and such.

$djc->restore_literal_strings()

All string literals that were extracted with $djc->extract_strings_and_comments() or $djc->extract_literal_strings() are restored. String literals retain all spacing and extra lines and such.

$djc->replace_final_eol()

Before this is called, the data may not be terminated with a newline character (especially after some of the steps above). This ensures the data ends in at least one of whatever is set in $djc->eol_char().
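
In plain Perl the behaviour amounts to something like the following sketch, where the helper name ensure_final_eol is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: append $eol unless $text already ends with it.
sub ensure_final_eol {
    my ($text, $eol) = @_;
    $text .= $eol unless $text =~ /\Q$eol\E\z/;
    return $text;
}

print ensure_final_eol('var a=1;', "\n");
```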

NOTES

The following should only cause an issue in rare and odd situations. If the input file is in DOS format (lines terminated with "\r\n", i.e. CR LF / carriage return, line feed), we'll attempt to make the output the same. If you have a mixture of embedded "\r\n" and "\n" characters (not escaped ones, those are still safe), then this script may get confused and make them all conform to whichever is seen first in the file.

The line-feed stripping isn't as thorough as it could be. It matches the behavior of JSMIN, and goes one step better with replace_extra_whitespace(), but I'm certain there are edge cases that could be optimised further. This shouldn't cause a noticeable increase in size though.

TODO

Function and variable renaming, and other more dangerous compacting techniques.

Currently, JavaScript::Squish::err_msg never gets set, as we die on any real errors. We should look into returning proper error codes and setting this if needed.

Fix Bugs :-)

BUGS

There are a few bugs, which may rear their head in some minor situations.

Statements not terminated by semi-colon.

These should be ok now - leaving a note here because this hasn't been thoroughly tested (I don't have any javascript to test with that meets this criteria).

This would affect statements like the following:

    i = 5.4
    j = 42

This used to become "i=5.4 j=42", and would generate an error along the lines of "expected ';' before statement".

The linebreak should be retained now. Please let me know if you see otherwise.

Ambiguous operator precedence

Operator precedence may get mangled in ambiguous statements. E.g. "x = y + ++b;" will be compacted into "x=y+++b;", which JavaScript reads as "x = y++ + b;" and so means something different.
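
Perl's tokenizer happens to take the longest operator first in the same way, so the effect can be reproduced without a JavaScript engine. The snippet below is only an analogy in Perl, with made-up variable names and values; it does not use or test the module.

```perl
use strict;
use warnings;

my ($y, $n) = (1, 2);
my $spaced = $y + ++$n;    # pre-increments $n, then adds: 1 + 3

($y, $n) = (1, 2);
my $packed = $y+++$n;      # read as ($y++) + $n: 1 + 2, then $y becomes 2

print "spaced=$spaced packed=$packed\n";
```

The two expressions differ only in whitespace, yet they produce different results and increment different variables.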

Still looking for them. If you find some, let us know.

SEE ALSO

Latest releases, bugzilla, cvs repository, etc:

https://developer.berlios.de/projects/jscompactor/

Similar projects:
    http://crockford.com/javascript/jsmin
    http://search.cpan.org/%7Epmichaux/JavaScript-Minifier/lib/JavaScript/Minifier.pm
    http://dojotoolkit.org/docs/shrinksafe
    http://dean.edwards.name/packer/

AUTHOR

Joshua I. Miller <jmiller@puriifeddata.net>

COPYRIGHT AND LICENSE

Copyright (c) 2005 by CallTech Communications, Inc.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.3 or, at your option, any later version of Perl 5 you may have available.



