07 Nov 2004 21:28:45 UTC
- Distribution: Apache-Dynagzip
- Module version: 0.16
- NAME
- ABSTRACT
- SYNOPSIS
- INTRODUCTION
- DESCRIPTION
- CUSTOMIZATION
- TROUBLESHOOTING
- DEPENDENCIES
- AUTHOR
- COPYRIGHT AND LICENSE
- SEE ALSO
NAME
Apache::Dynagzip - mod_perl extension for Apache-1.3.X to compress the response with gzip format.
ABSTRACT
This Apache handler provides dynamic content compression of the response data stream for HTTP/1.0 and HTTP/1.1 requests. Standard gzip compression is optionally combined with an extra light compression that eliminates leading blank spaces and/or blank lines within the source document. The extra light compression can be applied even when the client (browser) is not capable of decompressing gzip format.
The handler helps to compress the outbound HTML content, usually by 3 to 20 times, and provides a list of useful features. This is particularly useful for compressing outgoing web content that is dynamically generated on the fly (using templates, DB data, XML, etc.), when at the time of the request it is impossible to determine the length of the document to be transmitted. Support for Perl, Java, and C source generators is provided.
Besides the benefits of reduced document size, this approach gains efficiency from being able to overlap the various phases of data generation, compression, transmission, and decompression. In fact, the browser can start to decompress a document that has not yet been completely generated.
SYNOPSIS
There is more than one way to configure Apache to use this handler...
Compress regular (static) HTML files
    ======================================================
    Static html file (size=149208) no light compression:
    ======================================================

    httpd.conf:

      PerlModule Apache::Dynagzip
      <Files ~ "*\.html">
        SetHandler perl-script
        PerlHandler Apache::Dynagzip
      </Files>

    client-side log:

      C05 --> S06 GET /html/wowtmovie.html HTTP/1.1
      C05 --> S06 Accept: */*
      C05 --> S06 Referer: http://devl4.outlook.net/html/
      C05 --> S06 Accept-Language: en-us
      C05 --> S06 Accept-Encoding: gzip, deflate
      C05 --> S06 User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)
      C05 --> S06 Host: devl4.outlook.net
      C05 --> S06 Pragma: no-cache
      C05 --> S06 Accept-Charset: ISO-8859-1
      == Body was 0 bytes ==

      C05 <-- S06 HTTP/1.1 200 OK
      C05 <-- S06 Date: Fri, 31 May 2002 17:36:57 GMT
      C05 <-- S06 Server: Apache/1.3.22 (Unix) Debian GNU/Linux mod_perl/1.26
      C05 <-- S06 X-Module-Sender: Apache::Dynagzip
      C05 <-- S06 Transfer-Encoding: chunked
      C05 <-- S06 Expires: Friday, 31-May-2002 17:41:57 GMT
      C05 <-- S06 Vary: Accept-Encoding
      C05 <-- S06 Content-Type: text/html; charset=iso-8859-1
      C05 <-- S06 Content-Encoding: gzip
      C05 <-- S06
      == Incoming Body was 9411 bytes ==
      == Transmission: text gzip chunked ==
      == Chunk Log ==
      a (hex) = 10 (dec)
      1314 (hex) = 4884 (dec)
      3ed (hex) = 1005 (dec)
      354 (hex) = 852 (dec)
      450 (hex) = 1104 (dec)
      5e6 (hex) = 1510 (dec)
      0 (hex) = 0 (dec)
      == Latency = 0.170 seconds, Extra Delay = 0.440 seconds
      == Restored Body was 149208 bytes ==

    ======================================================
    Static html file (size=149208) with light compression:
    ======================================================

    httpd.conf:

      PerlModule Apache::Dynagzip
      <Files ~ "*\.html">
        SetHandler perl-script
        PerlHandler Apache::Dynagzip
        PerlSetVar LightCompression On
      </Files>

    client-side log:

      C05 --> S06 GET /html/wowtmovie.html HTTP/1.1
      C05 --> S06 Accept: */*
      C05 --> S06 Referer: http://devl4.outlook.net/html/
      C05 --> S06 Accept-Language: en-us
      C05 --> S06 Accept-Encoding: gzip, deflate
      C05 --> S06 User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)
      C05 --> S06 Host: devl4.outlook.net
      C05 --> S06 Pragma: no-cache
      C05 --> S06 Accept-Charset: ISO-8859-1
      == Body was 0 bytes ==

      C05 <-- S06 HTTP/1.1 200 OK
      C05 <-- S06 Date: Fri, 31 May 2002 17:49:06 GMT
      C05 <-- S06 Server: Apache/1.3.22 (Unix) Debian GNU/Linux mod_perl/1.26
      C05 <-- S06 X-Module-Sender: Apache::Dynagzip
      C05 <-- S06 Transfer-Encoding: chunked
      C05 <-- S06 Expires: Friday, 31-May-2002 17:54:06 GMT
      C05 <-- S06 Vary: Accept-Encoding
      C05 <-- S06 Content-Type: text/html; charset=iso-8859-1
      C05 <-- S06 Content-Encoding: gzip
      C05 <-- S06
      == Incoming Body was 8515 bytes ==
      == Transmission: text gzip chunked ==
      == Chunk Log ==
      a (hex) = 10 (dec)
      119f (hex) = 4511 (dec)
      3cb (hex) = 971 (dec)
      472 (hex) = 1138 (dec)
      736 (hex) = 1846 (dec)
      0 (hex) = 0 (dec)
      == Latency = 0.280 seconds, Extra Delay = 0.820 seconds
      == Restored Body was 128192 bytes ==
Default values for minChunkSizeSource and minChunkSize will be in effect in this case. In order to override them one can try, for example:

    <IfModule mod_perl.c>
      PerlModule Apache::Dynagzip
      <Files ~ "*\.html">
        SetHandler perl-script
        PerlHandler Apache::Dynagzip
        PerlSetVar minChunkSizeSource 36000
        PerlSetVar minChunkSize 9
      </Files>
    </IfModule>
Compress the output stream of the Perl scripts
    ===============================================================================
    GET dynamically generated (by perl script) html file with no light compression:
    ===============================================================================

    httpd.conf:

      PerlModule Apache::Filter
      PerlModule Apache::Dynagzip
      <Directory /var/www/perl/>
        SetHandler perl-script
        PerlHandler Apache::RegistryFilter Apache::Dynagzip
        PerlSetVar Filter On
        PerlSendHeader Off
        PerlSetupEnv On
        AllowOverride None
        Options ExecCGI FollowSymLinks
        Order allow,deny
        Allow from all
      </Directory>

    client-side log:

      C05 --> S06 GET /perl/start_example.cgi HTTP/1.1
      C05 --> S06 Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/msword, */*
      C05 --> S06 Accept-Language: en-us
      C05 --> S06 Accept-Encoding: gzip, deflate
      C05 --> S06 User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)
      C05 --> S06 Host: devl4.outlook.net
      C05 --> S06 Accept-Charset: ISO-8859-1
      == Body was 0 bytes ==

      C05 <-- S06 HTTP/1.1 200 OK
      C05 <-- S06 Date: Sat, 01 Jun 2002 16:59:47 GMT
      C05 <-- S06 Server: Apache/1.3.22 (Unix) Debian GNU/Linux mod_perl/1.26
      C05 <-- S06 X-Module-Sender: Apache::Dynagzip
      C05 <-- S06 Transfer-Encoding: chunked
      C05 <-- S06 Expires: Saturday, 01-June-2002 17:04:47 GMT
      C05 <-- S06 Vary: Accept-Encoding
      C05 <-- S06 Content-Type: text/html; charset=iso-8859-1
      C05 <-- S06 Content-Encoding: gzip
      C05 <-- S06
      == Incoming Body was 758 bytes ==
      == Transmission: text gzip chunked ==
      == Chunk Log ==
      a (hex) = 10 (dec)
      2db (hex) = 731 (dec)
      0 (hex) = 0 (dec)
      == Latency = 0.220 seconds, Extra Delay = 0.050 seconds
      == Restored Body was 1434 bytes ==

    ============================================================================
    GET dynamically generated (by perl script) html file with light compression:
    ============================================================================

    httpd.conf:

      PerlModule Apache::Filter
      PerlModule Apache::Dynagzip
      <Directory /var/www/perl/>
        SetHandler perl-script
        PerlHandler Apache::RegistryFilter Apache::Dynagzip
        PerlSetVar Filter On
        PerlSetVar LightCompression On
        PerlSendHeader Off
        PerlSetupEnv On
        AllowOverride None
        Options ExecCGI FollowSymLinks
        Order allow,deny
        Allow from all
      </Directory>

    client-side log:

      C05 --> S06 GET /perl/start_example.cgi HTTP/1.1
      C05 --> S06 Accept: */*
      C05 --> S06 Accept-Language: en-us
      C05 --> S06 Accept-Encoding: gzip, deflate
      C05 --> S06 User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)
      C05 --> S06 Host: devl4.outlook.net
      C05 --> S06 Pragma: no-cache
      C05 --> S06 Accept-Charset: ISO-8859-1
      == Body was 0 bytes ==

      C05 <-- S06 HTTP/1.1 200 OK
      C05 <-- S06 Date: Sat, 01 Jun 2002 17:09:13 GMT
      C05 <-- S06 Server: Apache/1.3.22 (Unix) Debian GNU/Linux mod_perl/1.26
      C05 <-- S06 X-Module-Sender: Apache::Dynagzip
      C05 <-- S06 Transfer-Encoding: chunked
      C05 <-- S06 Expires: Saturday, 01-June-2002 17:14:14 GMT
      C05 <-- S06 Vary: Accept-Encoding
      C05 <-- S06 Content-Type: text/html; charset=iso-8859-1
      C05 <-- S06 Content-Encoding: gzip
      C05 <-- S06
      == Incoming Body was 750 bytes ==
      == Transmission: text gzip chunked ==
      == Chunk Log ==
      a (hex) = 10 (dec)
      2d3 (hex) = 723 (dec)
      0 (hex) = 0 (dec)
      == Latency = 0.280 seconds, Extra Delay = 0.000 seconds
      == Restored Body was 1416 bytes ==
Compress the outgoing stream from the CGI binary
    ====================================================================================
    GET dynamically generated (by C-written binary) html file with no light compression:
    ====================================================================================

    httpd.conf:

      PerlModule Apache::Dynagzip
      <Directory /var/www/cgi-bin/>
        SetHandler perl-script
        PerlHandler Apache::Dynagzip
        AllowOverride None
        Options +ExecCGI
        PerlSetupEnv On
        PerlSetVar BinaryCGI On
        Order allow,deny
        Allow from all
      </Directory>

    client-side log:

      C05 --> S06 GET /cgi-bin/mylook.cgi HTTP/1.1
      C05 --> S06 Accept: */*
      C05 --> S06 Accept-Language: en-us
      C05 --> S06 Accept-Encoding: gzip, deflate
      C05 --> S06 User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)
      C05 --> S06 Host: devl4.outlook.net
      C05 --> S06 Pragma: no-cache
      C05 --> S06 Accept-Charset: ISO-8859-1
      == Body was 0 bytes ==

      C05 <-- S06 HTTP/1.1 200 OK
      C05 <-- S06 Date: Fri, 31 May 2002 23:18:17 GMT
      C05 <-- S06 Server: Apache/1.3.22 (Unix) Debian GNU/Linux mod_perl/1.26
      C05 <-- S06 X-Module-Sender: Apache::Dynagzip
      C05 <-- S06 Transfer-Encoding: chunked
      C05 <-- S06 Expires: Friday, 31-May-2002 23:23:17 GMT
      C05 <-- S06 Vary: Accept-Encoding
      C05 <-- S06 Content-Type: text/html; charset=iso-8859-1
      C05 <-- S06 Content-Encoding: gzip
      C05 <-- S06
      == Incoming Body was 1002 bytes ==
      == Transmission: text gzip chunked ==
      == Chunk Log ==
      a (hex) = 10 (dec)
      3cf (hex) = 975 (dec)
      0 (hex) = 0 (dec)
      == Latency = 0.110 seconds, Extra Delay = 0.110 seconds
      == Restored Body was 1954 bytes ==

    =================================================================================
    GET dynamically generated (by C-written binary) html file with light compression:
    =================================================================================

    httpd.conf:

      PerlModule Apache::Dynagzip
      <Directory /var/www/cgi-bin/>
        SetHandler perl-script
        PerlHandler Apache::Dynagzip
        AllowOverride None
        Options +ExecCGI
        PerlSetupEnv On
        PerlSetVar BinaryCGI On
        PerlSetVar LightCompression On
        Order allow,deny
        Allow from all
      </Directory>

    client-side log:

      C05 --> S06 GET /cgi-bin/mylook.cgi HTTP/1.1
      C05 --> S06 Accept: */*
      C05 --> S06 Accept-Language: en-us
      C05 --> S06 Accept-Encoding: gzip, deflate
      C05 --> S06 User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)
      C05 --> S06 Host: devl4.outlook.net
      C05 --> S06 Pragma: no-cache
      C05 --> S06 Accept-Charset: ISO-8859-1
      == Body was 0 bytes ==

      C05 <-- S06 HTTP/1.1 200 OK
      C05 <-- S06 Date: Fri, 31 May 2002 23:37:45 GMT
      C05 <-- S06 Server: Apache/1.3.22 (Unix) Debian GNU/Linux mod_perl/1.26
      C05 <-- S06 X-Module-Sender: Apache::Dynagzip
      C05 <-- S06 Transfer-Encoding: chunked
      C05 <-- S06 Expires: Friday, 31-May-2002 23:42:45 GMT
      C05 <-- S06 Vary: Accept-Encoding
      C05 <-- S06 Content-Type: text/html; charset=iso-8859-1
      C05 <-- S06 Content-Encoding: gzip
      C05 <-- S06
      == Incoming Body was 994 bytes ==
      == Transmission: text gzip chunked ==
      == Chunk Log ==
      a (hex) = 10 (dec)
      3c7 (hex) = 967 (dec)
      0 (hex) = 0 (dec)
      == Latency = 0.170 seconds, Extra Delay = 0.110 seconds
      == Restored Body was 1862 bytes ==
INTRODUCTION
From a historical point of view, this package was developed primarily to compress the output of a proprietary CGI binary written in C that had been widely used by Outlook Technologies, Inc. to deliver uncompressed dynamically generated HTML content over the Internet using HTTP/1.0 since the mid-'90s. We were then presented with the challenge of using the content compression features over HTTP/1.1 on busy production servers, especially those serving heavy traffic on virtual hosts of popular American broadcasting companies.
Our very first attempts to implement a static gzip approach in order to compress the dynamic content helped us to scale the bandwidth effectively, at the cost of significantly increased latency of the content delivery.
That was why I came up with the idea of using chunked data transmission of the gzipped content, sharing real time between the server-side data creation/compression, media data transmission, and the client-side data decompression/presentation, in order to provide end users with partially displayed content as soon as possible under the particular conditions of the user's connection.
At the time we decided to go for dynamic compression there was no appropriate software on the market. Even later, in February 2002, Nicholas Oxhøj wrote to the mod_perl mailing list about his experience of searching for an Apache gzipper for streaming outgoing content:
"... I have been experimenting with all the different Apache compression modules I have been able to find, but have not been able to get the desired result. I have tried Apache::GzipChain, Apache::Compress, mod_gzip and mod_deflate, with different results. One I cannot get to work at all. Most work, but seem to collect all the output before compressing it and sending it to the browser...
... Wouldn't it be nice to have some option to specify that the handler should flush and send the currently compressed output every time it had received a certain amount of input or every time it had generated a certain amount of output?..
... So I am basically looking for anyone who has had any success in achieving this kind of "streaming" compression, who could direct me at an appropriate Apache module."
Unfortunately for him, Apache::Dynagzip was not yet publicly available at that time...
Since its release, this handler has been especially useful when one needs to compress outgoing web content that is dynamically generated on the fly using templates, DB data, XML, etc., and when at the time of the request it is impossible to determine the length of the response.
The content provider can benefit additionally from the fact that the handler begins the transmission of compressed data concurrently with further document creation. On the other hand, the internal buffer inside the handler prevents Apache from creating chunks that are too short over HTTP/1.1.
In order to simplify the use of this handler on public/open-source sites, the capability of content compression over HTTP/1.0 has been available in this handler since version 0.06. This helps to avoid dynamic invocation of other Apache handlers for the content generation phase.
Acknowledgments
Thanks to Tom Evans, Valerio Paolini, and Serge Bizyayev for their valuable ideas and extensive testing. Thanks to Igor Sysoev and Henrik Nordstrom, who helped me to better understand the HTTP/1.0 compression features. Thanks to Vlad Jebelev for the patch that helps to survive a possible dynamic Apache downgrade from HTTP/1.1 to HTTP/1.0 (especially when serving MSIE requests over SSL). Thanks to Rob Bloodgood and Damyan Ivanov for the patches that help to eliminate some unnecessary warnings. Thanks to John Siracusa for the hint that helps to control the content type properly. Thanks to Richard Chen for the bug report concerning some uncompressed responses.
Obviously, I take full responsibility for how all those contributions are implemented.
DESCRIPTION
The main purpose of this package is to serve the content generation phase within the mod_perl enabled Apache 1.3.X, providing dynamic on-the-fly compression of outgoing web content. This is done through the use of the zlib library via the Compress::Zlib perl interface to serve both HTTP/1.0 and HTTP/1.1 requests from clients/browsers capable of understanding gzip format and decompressing it on the fly. This handler never gzips content for clients/browsers that do not declare the ability to decompress gzip format.
In fact, this handler mainly serves as a kind of customizable filter of outbound web content for Apache 1.3.X.
This handler is supposed to be used within the Apache::Filter chain, mostly in order to serve outgoing content that is dynamically generated on the fly by Perl and/or Java. It can also serve regular CGI binaries (written in C, for example) as a standalone handler outside the Apache::Filter chain. As an extra option, this handler can be used to dynamically compress huge static files and to transfer the gzipped content in the form of a stream back to the client browser. For this last purpose the Apache::Dynagzip handler should be configured as a standalone handler outside the Apache::Filter chain too.
Working over HTTP/1.0, this handler indicates the end of the data stream by closing the connection. Over HTTP/1.1, the outgoing data is compressed within a chunked outgoing stream, keeping the connection alive. Reasonable control over the chunk size is provided in this case.
In order to better serve older web clients, an extra light compression is provided independently in order to remove unnecessary leading blank spaces and/or blank lines from the outgoing web content. This extra light compression can be combined with the main gzip compression when necessary.
The list of features of this handler includes:
- Support for both HTTP/1.0 and HTTP/1.1 requests.
- Reasonable control over the size of content chunks for HTTP/1.1.
- Support for Perl, Java, or C/C++ CGI applications in order to provide dynamic on-the-fly compression of outbound content.
- Optional extra light compression for all browsers, including older ones that are incapable of decompressing gzipped content.
- Optional control over the duration of the content's life in the client/proxy local cache.
- Limited control over the proxy caching.
- Optional support for server-side caching of dynamically generated content.
Compression Features
Apache::Dynagzip provides content compression for both HTTP/1.0 and HTTP/1.1 in accordance with the type of the initial request.
There are two types of compression, which could be applied to outgoing content by this handler:
- extra light compression
- gzip compression
These compressions could be applied independently, or in combination.
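As a rough illustration of what the extra light compression does, the following sketch removes leading blanks and blank lines from a complete string. It is not the module's own code: the name light_compress is hypothetical, the handler itself works on a stream rather than on a whole document, and this simplified version makes no attempt to protect whitespace-sensitive content such as preformatted text.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Apache::Dynagzip:
# strip leading blanks and blank lines from a whole document.
sub light_compress {
    my ($text) = @_;
    $text =~ s/^[ \t]+//mg;    # drop leading spaces and tabs on every line
    $text =~ s/^\r?\n//mg;     # drop lines that are now empty
    return $text;
}

my $html = "  <html>\n\n   <body>\n\t\n</body>\n";
print light_compress($html);
```

Because only whitespace at the start of lines is touched, the result stays readable by any browser, including those that cannot decompress gzip.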
An extra light compression is provided in order to remove leading blank spaces and/or blank lines from the outgoing web content. It is supposed to serve ASCII data types like html, JavaScript, css, etc. The extra light compression is turned off by default. It can be turned on with the statement

    PerlSetVar LightCompression On

in httpd.conf. The value "On" is case-insensitive. Any other value turns the extra light compression off.
The main gzip format is described in rfc1952. This type of compression is applied when the client is recognized as one capable of decompressing gzip format on the fly. In this version the decision depends on whether or not the client sends the Accept-Encoding: gzip HTTP header within the request.
On HTTP/1.1, when the gzip compression is in effect, the handler keeps reasonable control over the size of the chunks and over the compression ratio using the combination of two internal variables (which can be set in httpd.conf):

    minChunkSizeSource
    minChunkSize

minChunkSizeSource defines the minimum length of the source stream that zlib may accumulate in its internal buffer.
Note:
The compression ratio depends on the length of the data accumulated in that buffer; the more data we keep, the better the ratio that will be achieved...
When the length defined by minChunkSizeSource is exceeded, the handler flushes the internal buffer of zlib and transfers the accumulated portion of the compressed data into its own internal buffer in order to create a chunk when appropriate.
This buffer is not necessarily transferred to Apache immediately. The decision is under the control of the minChunkSize internal variable. When the size of the buffer exceeds the value of minChunkSize, the handler chunks the internal buffer and transfers the accumulated data to the client.
This approach helps to combine effective compression with limited latency.
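The two-level buffering described above can be sketched with Compress::Zlib as follows. This is a minimal illustration under stated assumptions, not the module's source: the function gzip_streamer, its emit callback, and the hand-built 10-byte gzip header are all choices made for this example; only the names minChunkSizeSource and minChunkSize and their defaults come from the text.

```perl
use strict;
use warnings;
use Compress::Zlib;   # deflateInit, crc32, MAX_WBITS, Z_* constants

# Hypothetical sketch: returns { feed => sub, finish => sub }. Compressed
# pieces go to the caller-supplied emit callback, one call per would-be chunk.
sub gzip_streamer {
    my (%arg) = @_;
    my $min_source = defined $arg{minChunkSizeSource} ? $arg{minChunkSizeSource} : 32768;
    my $min_chunk  = defined $arg{minChunkSize}       ? $arg{minChunkSize}       : 8;
    my $emit       = $arg{emit} or die "emit callback required";

    # Raw deflate stream; the gzip header and trailer are added by hand.
    my ($z, $status) = deflateInit(-Level      => Z_DEFAULT_COMPRESSION,
                                   -WindowBits => -MAX_WBITS());
    die "deflateInit failed: $status" unless $status == Z_OK;

    # Minimal gzip header: magic, deflate method, no flags, no mtime, OS = Unix.
    my $header = pack('C10', 0x1f, 0x8b, 8, 0, 0, 0, 0, 0, 0, 3);
    my ($crc, $total, $unflushed, $started) = (0, 0, 0, 0);
    my $buffer = '';   # handler-side buffer of compressed bytes

    my $maybe_emit = sub {
        my ($force) = @_;
        return unless length $buffer;
        return unless $force || length($buffer) > $min_chunk;
        $emit->($buffer);
        $buffer = '';
    };

    my $feed = sub {
        my ($data) = @_;
        return unless defined $data && length $data;
        unless ($started) { $buffer .= $header; $started = 1; }
        $crc        = crc32($data, $crc);
        $total     += length $data;
        $unflushed += length $data;
        my $out = $z->deflate($data);
        $buffer .= $out if defined $out;
        if ($unflushed > $min_source) {
            # Enough source accumulated: make zlib hand over what it holds.
            my $flushed = $z->flush(Z_SYNC_FLUSH);
            $buffer .= $flushed if defined $flushed;
            $unflushed = 0;
        }
        $maybe_emit->(0);
    };

    my $finish = sub {
        unless ($started) { $buffer .= $header; $started = 1; }
        my $tail = $z->flush();   # Z_FINISH by default
        $buffer .= $tail if defined $tail;
        $buffer .= pack('V V', $crc, $total & 0xffffffff);   # gzip trailer
        $maybe_emit->(1);
    };

    return { feed => $feed, finish => $finish };
}
```

With the default minChunkSize of 8, the 10-byte gzip header alone already exceeds the threshold, so the first piece handed to emit is the header on its own while zlib still holds the data.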
For example, when I use

    PerlSetVar minChunkSizeSource 16000
    PerlSetVar minChunkSize 8

in my httpd.conf in order to compress dynamically generated content of some 54,000 bytes, the client side log

    C05 --> S06 GET /pipe/pp-pipe.pl/big.html?try=chunkOneMoreTime HTTP/1.1
    C05 --> S06 Accept: */*
    C05 --> S06 Accept-Language: en-us
    C05 --> S06 Accept-Encoding: gzip, deflate
    C05 --> S06 User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)
    C05 --> S06 Host: devl4.outlook.net
    C05 --> S06 Accept-Charset: ISO-8859-1
    == Body was 0 bytes ==

    ## Sockets 6 of 4,5,6 need checking ##
    C05 <-- S06 HTTP/1.1 200 OK
    C05 <-- S06 Date: Thu, 21 Feb 2002 20:01:47 GMT
    C05 <-- S06 Server: Apache/1.3.22 (Unix) Debian GNU/Linux mod_perl/1.26
    C05 <-- S06 Transfer-Encoding: chunked
    C05 <-- S06 Vary: Accept-Encoding
    C05 <-- S06 Content-Type: text/html; charset=iso-8859-1
    C05 <-- S06 Content-Encoding: gzip
    C05 <-- S06
    == Incoming Body was 6034 bytes ==
    == Transmission: text gzip chunked ==
    == Chunk Log ==
    a (hex) = 10 (dec)
    949 (hex) = 2377 (dec)
    5e6 (hex) = 1510 (dec)
    5c5 (hex) = 1477 (dec)
    26e (hex) = 622 (dec)
    0 (hex) = 0 (dec)
    == Latency = 0.990 seconds, Extra Delay = 0.110 seconds
    == Restored Body was 54655 bytes ==

shows that the first chunk consists of the gzip header only (10 bytes). This chunk was sent back to the web client as soon as the handler received the first portion of the data generated by the CGI script. The data itself was at that moment stored in zlib's internal buffer, because minChunkSizeSource is big enough.
Note:
The longer we allow zlib to keep its internal buffer, the better the compression ratio it achieves for us...
So far, in this example we have obtained a compression ratio of about 9 times.
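The figures in this log can be cross-checked with a little arithmetic. The sketch below assumes plain HTTP/1.1 chunk framing (a hexadecimal size line, CRLF, the data, CRLF, and a terminating zero-size chunk) with no chunk extensions or trailers; the function name chunked_wire_size is made up for this example.

```perl
use strict;
use warnings;

# Bytes on the wire for a chunked body: each chunk is
# "<hex size>\r\n<data>\r\n", and the body ends with "0\r\n\r\n".
sub chunked_wire_size {
    my @sizes = @_;
    my $total = 0;
    for my $n (@sizes) {
        $total += length(sprintf '%x', $n) + 2 + $n + 2;
    }
    return $total + 5;   # the terminating "0\r\n\r\n"
}

my @chunk_log = (10, 2377, 1510, 1477, 622);   # sizes from the log above
my $wire      = chunked_wire_size(@chunk_log);
printf "wire bytes: %d, ratio: %.2f\n", $wire, 54655 / $wire;
# prints "wire bytes: 6034, ratio: 9.06"
```

The result agrees with the "Incoming Body was 6034 bytes" line, so the chunk framing costs 38 bytes here and the restored 54655 bytes correspond to a ratio of roughly 9.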
In this version the handler provides defaults:

    minChunkSizeSource = 32768
    minChunkSize = 8

In the case of a gzip compressed response to an HTTP/1.0 request, the handler uses the minChunkSize and minChunkSizeSource values in order to limit the minimum size of internal buffers, providing an appropriate compression ratio and avoiding multiple short outputs to the core Apache.
Chunking Features
On HTTP/1.1 this handler overrides the default Apache behavior and keeps its own control over the chunk size when possible. In fact, the handler provides only soft control over the chunk size: it never cuts the incoming string in order to create a chunk of a particular size. Instead, it controls only the minimum size of the chunk. I consider this approach reasonable, because to date the HTTP chunk size is not coordinated with the packet size on the transport level.
In the case of gzipped output, the minimum size of the chunk is under the control of the internal variable

    minChunkSize

In the case of uncompressed output, or the extra light compression only, the minimum size of the chunk is under the control of the internal variable

    minChunkSizePP

In this version the handler provides defaults:

    minChunkSize = 8
    minChunkSizePP = 8192

You may override the default values of these variables in your httpd.conf if necessary.
Note:
The internal variable minChunkSize should be treated carefully together with minChunkSizeSource (see Compression Features).
In this version the handler does not keep control over the chunk size when it serves an internally redirected request. An appropriate warning is placed in error_log in this case.
Filter Chain Features
As a member of the Apache::Filter chain, the Apache::Dynagzip handler is supposed to be the last executable filter in the chain due to the nature of its functions.
CGI Compatibility
When serving a CGI binary, this version of the handler is CGI/1.1 compatible. It accepts CGI headers from the binary and produces a set of required HTTP headers followed by gzipped content.
POST Request Features
I have to serve POST requests for a CGI binary with special care, because in this case the handler stands alone and has to serve all data flow in both directions while stdin is tied into Apache and cannot be exposed to the CGI binary transparently.
To solve the problem I replace the POST request with a GET request internally, doing the required incoming data transformations on the fly.
This could cause a problem when you have a huge incoming stream from your client (more than 4K bytes). Another problem could appear if your CGI binary is capable of distinguishing POST and GET requests internally.
Control over the Client Cache
The control over the lifetime of the response in the client's cache is provided through the implementation of the Expires HTTP header:
The Expires entity-header field gives the date/time after which the response should be considered stale. A stale cache entry may not normally be returned by a cache (either a proxy cache or an user agent cache) unless it is first validated with the origin server (or with an intermediate cache that has a fresh copy of the entity). The format is an absolute date and time as defined by HTTP-date in section 3.3; it MUST be in rfc1123-date format:

    Expires = "Expires" ":" HTTP-date

This handler creates the Expires HTTP header, adding the pageLifeTime to the date-time of the request. The internal variable pageLifeTime has the default value

    pageLifeTime = 300 # sec.

which can be overridden in httpd.conf, for example as:

    PerlSetVar pageLifeTime 1800

to make the pageLifeTime = 30 minutes.
During the lifetime the client (browser) will not even try to access the server when the user requests the same URL again. Instead, it restarts the page from the local cache.
It's important to point out here that all initial JavaScripts will indeed be restarted, so you can rotate your advertisements and dynamic content when needed.
A second important point should be mentioned here: when the user clicks the "Refresh" button, the browser will reload the page from the server unconditionally. This is the right behavior, because it is exactly what the human user expects from the "Refresh" button.
Notes:
The lifetime defined by Expires depends on the accuracy of the time settings on the client machine. If the client's local clock is running 1 hour behind, the cached copy of the page will stay alive 60 minutes longer on that machine.
Apache::Dynagzip never overwrites an Expires header set by an earlier handler inside the filter chain.
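The Expires rule described in this section (request time plus pageLifeTime, never replacing a value that is already set) can be sketched as below. The helper names http_date and expires_value are hypothetical and this is not the module's code. The sketch produces the rfc1123-date layout named in the quoted text, whereas the logs in this document show the handler sending a longer layout such as "Friday, 31-May-2002 17:41:57 GMT".

```perl
use strict;
use warnings;

my @DAY = qw(Sun Mon Tue Wed Thu Fri Sat);
my @MON = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Format an epoch time as an rfc1123-date, always in GMT.
sub http_date {
    my ($epoch) = @_;
    my ($s, $m, $h, $mday, $mon, $year, $wday) = gmtime($epoch);
    return sprintf '%s, %02d %s %04d %02d:%02d:%02d GMT',
        $DAY[$wday], $mday, $MON[$mon], $year + 1900, $h, $m, $s;
}

# Hypothetical helper: keep an existing Expires value, otherwise
# add pageLifeTime (in seconds, default 300) to the request time.
sub expires_value {
    my ($request_time, $page_life_time, $existing) = @_;
    return $existing if defined $existing && length $existing;
    $page_life_time = 300 unless defined $page_life_time;
    return http_date($request_time + $page_life_time);
}

# 1022866617 is Fri, 31 May 2002 17:36:57 GMT, the Date of the first log.
print expires_value(1022866617), "\n";   # prints "Fri, 31 May 2002 17:41:57 GMT"
```

Building the date from gmtime with fixed English names avoids any dependence on the server's locale.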
Support for the Server-Side Cache
In order to support the Server-Side Cache I place a reference to the dynamically generated document into notes() when Server-Side Cache support is requested. The referenced document may already be compressed with the extra light compression (if the extra light compression is in effect for the current request).
In this case the regular dynamic gzip compression takes place as usual, and the effective gzip compression is supposed to take place within the log stage of the request processing flow.
You usually should not care about this feature of Apache::Dynagzip unless you use it in your own chain of handlers for the various phases of the request processing.
Control over the Proxy Cache.
Control over the (possible) proxy cache is provided through the implementation of the Vary HTTP header. Within Apache::Dynagzip this header is under the control of a few simple rules:
- Apache::Dynagzip never generates this header unless gzip compression is provided.
- The value of Accept-Encoding is always provided for this header, accompanying gzip compression.
Advanced control over the proxy cache has been provided since version 0.07 with an optional extension of the Vary HTTP header. This extension can be placed into your configuration file using the directive

    PerlSetVar Vary <value>

In particular, it might be helpful to indicate content that depends on conditions other than just compression features. For example, when the content is personalized, someone might wish to use the "*" Vary extension in order to prevent any proxy caching.
When the outgoing content is gzipped, this extension is appended to the regular Vary header, as in the following example:
Using the following fragment within the httpd.conf:

    PerlModule Apache::Dynagzip
    <Files ~ "*\.html">
      SetHandler perl-script
      PerlHandler Apache::Dynagzip
      PerlSetVar LightCompression On
      PerlSetVar Vary *
    </Files>

We can observe the client-side log in the form of:
    C05 --> S06 GET /devdoc/Dynagzip/Dynagzip.html HTTP/1.1
    C05 --> S06 Accept: */*
    C05 --> S06 Referer: http://devl4.outlook.net/devdoc/Dynagzip/
    C05 --> S06 Accept-Language: en-us
    C05 --> S06 Accept-Encoding: gzip, deflate
    C05 --> S06 User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)
    C05 --> S06 Host: devl4.outlook.net
    C05 --> S06 Pragma: no-cache
    C05 --> S06 Accept-Charset: ISO-8859-1
    == Body was 0 bytes ==

    C05 <-- S06 HTTP/1.1 200 OK
    C05 <-- S06 Date: Sun, 11 Aug 2002 21:28:43 GMT
    C05 <-- S06 Server: Apache/1.3.22 (Unix) Debian GNU/Linux mod_perl/1.26
    C05 <-- S06 X-Module-Sender: Apache::Dynagzip
    C05 <-- S06 Expires: Sunday, 11-August-2002 21:33:43 GMT
    C05 <-- S06 Vary: Accept-Encoding,*
    C05 <-- S06 Transfer-Encoding: chunked
    C05 <-- S06 Content-Type: text/html; charset=iso-8859-1
    C05 <-- S06 Content-Encoding: gzip
    C05 <-- S06
    == Incoming Body was 11311 bytes ==
    == Transmission: text gzip chunked ==
    == Chunk Log ==
    a (hex) = 10 (dec)
    1c78 (hex) = 7288 (dec)
    f94 (hex) = 3988 (dec)
    0 (hex) = 0 (dec)
    == Latency = 0.160 seconds, Extra Delay = 0.170 seconds
    == Restored Body was 47510 bytes ==

The simple form

    Vary: Accept-Encoding

is provided as a default for the gzipped content.
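The Vary rules of this section can be condensed into a few lines of Perl. This is only a sketch under assumptions: client_accepts_gzip and vary_value are hypothetical names, the gzip check is deliberately simplistic (it ignores q-values such as "gzip;q=0"), and neither function is taken from the module.

```perl
use strict;
use warnings;

# Simplified check: true when the Accept-Encoding request header names gzip.
# q-values (for example "gzip;q=0") are deliberately ignored in this sketch.
sub client_accepts_gzip {
    my ($accept_encoding) = @_;
    return 0 unless defined $accept_encoding;
    return $accept_encoding =~ /(?:^|[\s,])gzip(?:$|[\s,;])/i ? 1 : 0;
}

# Returns the Vary value, or undef when the response is not gzipped.
# $extension stands for the optional "PerlSetVar Vary <value>" setting.
sub vary_value {
    my ($gzipped, $extension) = @_;
    return undef unless $gzipped;
    my $vary = 'Accept-Encoding';
    $vary .= ",$extension" if defined $extension && length $extension;
    return $vary;
}

my $gzipped = client_accepts_gzip('gzip, deflate');
print vary_value($gzipped, '*'), "\n";   # prints "Accept-Encoding,*"
```

The printed value matches the Vary header in the log above, where the "*" extension follows the default Accept-Encoding entry.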
CUSTOMIZATION
Apache::Dynagzip can be used in order
- to compress dynamic web content generated in the Apache::Filter chain;
- to compress the output of a CGI-compatible binary program;
- to stream huge static files, providing on-the-fly compression of the stream.
These are the main modes, which one can implement through the appropriate configuration of the handler. Every main mode can be tuned with some specific settings and can be accompanied by various control features. All these specific settings and control features can be addressed through additional configuration parameters unless the provided defaults are sufficient.
Note:
Do your best to avoid using this handler in internally redirected requests. It does not help much in this case. Read your error_log carefully in order to find the appropriate warnings. Tune your httpd.conf carefully in order to make the most of the opportunities offered by this handler.
Always use the accompanying Apache::CompressClientFixup handler in order to avoid gzip compression for known buggy web clients.
Apache::Filter Chain
If your application is initially configured something like
    PerlModule HTML::Mason::ApacheHandler
    <Directory /path/to/subdirectory>
      <FilesMatch "\.html$">
        SetHandler perl-script
        PerlHandler HTML::Mason::ApacheHandler
      </FilesMatch>
    </Directory>

you might want just to replace it with the following:
    PerlModule HTML::Mason::ApacheHandler
    PerlModule Apache::Dynagzip
    PerlModule Apache::CompressClientFixup
    <Directory /path/to/subdirectory>
      <FilesMatch "\.html$">
        SetHandler perl-script
        PerlHandler HTML::Mason::ApacheHandler Apache::Dynagzip
        PerlSetVar Filter On
        PerlFixupHandler Apache::CompressClientFixup
        PerlSetVar LightCompression On
      </FilesMatch>
    </Directory>

in order to provide gzip compression of your content. You should be all set after that.
In more common cases you need to replace the line

    PerlHandler HTML::Mason::ApacheHandler

in your initial configuration file with the following set of lines:

    PerlHandler HTML::Mason::ApacheHandler Apache::Dynagzip
    PerlSetVar Filter On
    PerlFixupHandler Apache::CompressClientFixup

You might optionally want to add

    PerlSetVar LightCompression On

to reduce the size of the stream even for clients incapable of speaking gzip (like Microsoft Internet Explorer over HTTP/1.0).
Finally, make sure you have somewhere declared

    PerlModule Apache::Dynagzip
    PerlModule Apache::CompressClientFixup

The outgoing Content-Type will be set to the default text/html unless another value is defined by core Apache or generated by another perl handler included in the chain.
In order to control the compression ratio and the minimum size of the chunk/buffer for gzipped content you can optionally use the directives
PerlSetVar minChunkSizeSource <value>
PerlSetVar minChunkSize <value>
For example, you can try
PerlSetVar minChunkSizeSource 32768
PerlSetVar minChunkSize 8
which are the defaults in this version.
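Why a source-side buffer size can affect the compression ratio may be illustrated with Compress::Zlib directly. The sketch below is not the handler's code: it assumes that the source is compressed block by block and flushed after each block, so that the client can decode every chunk as soon as it arrives. The helper names and block sizes are hypothetical, and the output is a zlib-format stream rather than the gzip framing the handler sends.

```perl
use strict;
use warnings;
use Compress::Zlib;    # exports deflateInit, inflateInit and the Z_* constants

# Hypothetical helper: compress $source in blocks of $block_size bytes,
# flushing after each block so every returned chunk is decodable on arrival.
sub deflate_in_blocks {
    my ($source, $block_size) = @_;
    my ($deflater, $status) = deflateInit(-Level => Z_BEST_COMPRESSION);
    die "deflateInit failed" unless $status == Z_OK;

    my @chunks;
    for (my $pos = 0; $pos < length($source); $pos += $block_size) {
        my $block = substr($source, $pos, $block_size);
        my ($out, $st) = $deflater->deflate($block);
        die "deflate failed" unless $st == Z_OK;
        my ($sync, $st2) = $deflater->flush(Z_SYNC_FLUSH);
        die "sync flush failed" unless $st2 == Z_OK;
        push @chunks, $out . $sync;
    }
    my ($tail, $st3) = $deflater->flush();    # default Z_FINISH ends the stream
    die "final flush failed" unless $st3 == Z_OK;
    push @chunks, $tail;
    return @chunks;
}

# Decode the chunks one by one, as a streaming client would.
sub inflate_chunks {
    my (@chunks) = @_;
    my ($inflater, $status) = inflateInit();
    die "inflateInit failed" unless $status == Z_OK;
    my $text = '';
    for my $chunk (@chunks) {
        my $copy = $chunk;                    # inflate() consumes its input
        my ($out, $st) = $inflater->inflate($copy);
        die "inflate failed" unless $st == Z_OK || $st == Z_STREAM_END;
        $text .= $out;
    }
    return $text;
}

my $source = "row of table data\n" x 2000;
for my $size (512, 32768) {
    my @chunks = deflate_in_blocks($source, $size);
    my $bytes  = length join '', @chunks;
    printf "block size %5d: %d chunks, %d bytes\n", $size, scalar @chunks, $bytes;
}
```

Smaller blocks let the client start decoding sooner, while larger blocks cost fewer flushes and therefore tend to compress better.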
In order to control the minimum size of the chunk for uncompressed content over HTTP/1.1 you can optionally use the directive
PerlSetVar minChunkSizePP <value>
Default value is 8192 bytes in this version.
In order to control the pageLifeTime in the client's local cache you can optionally use the directive
PerlSetVar pageLifeTime <value>
where the value stands for the lifetime in seconds.
PerlSetVar pageLifeTime 300
is the default in this version.
Apache::Dynagzip does not overwrite any existing Expires HTTP header, whether it is set by core Apache or by a previous perl handler.
You might wish to place
PerlSetVar Vary User-Agent
in your httpd.conf file in order to notify possible proxies that you distinguish browsers in your content. Alternatively, you might want to place
PerlSetVar Vary *
in order to prevent all proxies from caching your content.
You may use the Apache::Filter chain to serve other sources when you know what you are doing. You might wish to write your own handler and include it in the Apache::Filter chain, preprocessing the outgoing stream if necessary.
In order to use your own handler (that might be generating its own HTTP headers) inside the Apache::Filter chain, make sure to register your handler with the Apache::Filter chain like
$r->filter_register();
when necessary. See the Apache::Filter documentation for details.
CGI-Compatible Binary
Use the directives like
PerlModule Apache::Dynagzip
PerlModule Apache::CompressClientFixup
<Directory /path/to/subdirectory>
  SetHandler perl-script
  PerlHandler Apache::Dynagzip
  PerlSetVar BinaryCGI On
  Options +ExecCGI
  PerlFixupHandler Apache::CompressClientFixup
  PerlSetVar LightCompression On
</Directory>
in order to indicate that the source generator is supposed to be a CGI binary. Don't use the Apache::Filter chain in this case. Support for CGI/1.1 headers defaults to "On" for this type of source generator.
The outgoing Content-Type will be set to the default text/html unless another value is defined by core Apache for this binary, or the binary itself generates the appropriate CGI header.
When your source is a very old CGI application that fails to provide a correct Content-Type CGI header, use
PerlSetVar UseCGIHeadersFromScript Off
in your httpd.conf in order to overwrite the document's Content-Type with text/html. All other CGI headers generated by the binary will be disregarded in this case too.
Make sure that your POST requests never exceed 4K bytes in body length. A longer POST body is not supported in this version of Apache::Dynagzip.
In order to control the compression ratio and the minimum size of the chunk/buffer for gzipped content you can optionally use the directives
PerlSetVar minChunkSizeSource <value>
PerlSetVar minChunkSize <value>
For example, you can try
PerlSetVar minChunkSizeSource 32768
PerlSetVar minChunkSize 8
which are the defaults in this version.
In order to control the minimum size of the chunk for uncompressed content over HTTP/1.1 you can optionally use the directive
PerlSetVar minChunkSizePP <value>
Default value is 8192 bytes in this version.
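How a minimum chunk size might govern buffering of an uncompressed HTTP/1.1 response can be sketched as follows. The chunk framing itself (hexadecimal length, CRLF, payload, CRLF, then a zero-length chunk at the end) follows RFC 2616 and matches the hex sizes shown in the chunk logs of the SYNOPSIS. The buffering policy and the helper names are assumptions made for illustration, not the handler's actual logic.

```perl
use strict;
use warnings;

# Frame one HTTP/1.1 chunk: hex length, CRLF, payload, CRLF.
# Assumes $data is a byte string, so length() counts bytes.
sub frame_chunk {
    my ($data) = @_;
    return sprintf("%x\r\n%s\r\n", length($data), $data);
}

# Hypothetical helper: accumulate output pieces until at least $min_size
# bytes are buffered, then emit the buffer as a single chunk.
sub chunk_pieces {
    my ($min_size, @pieces) = @_;
    my $buffer = '';
    my @frames;
    for my $piece (@pieces) {
        $buffer .= $piece;
        if (length($buffer) >= $min_size) {
            push @frames, frame_chunk($buffer);
            $buffer = '';
        }
    }
    push @frames, frame_chunk($buffer) if length $buffer;  # leftover data
    push @frames, "0\r\n\r\n";       # zero-length chunk terminates the body
    return @frames;
}

# Three small pieces with a minimum chunk size of 8 bytes.
my @frames = chunk_pieces(8, 'abc', 'defgh', 'ij');
print scalar(@frames), " frames\n";
```

A larger minimum reduces the per-chunk overhead at the price of a later first byte for the client.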
In order to control the extra light compression you can optionally use the directive
PerlSetVar LightCompression <On/Off>
In order to turn the extra light compression "On", use the directive
PerlSetVar LightCompression On
Any other value turns the extra light compression "Off" (default).
In order to control the pageLifeTime in the client's local cache you can optionally use the directive
PerlSetVar pageLifeTime <value>
where the value stands for the lifetime in seconds.
PerlSetVar pageLifeTime 300
is the default in this version.
You might wish to place
PerlSetVar Vary User-Agent
in your httpd.conf file in order to notify possible proxies that you distinguish browsers in your content. Alternatively, you might place
PerlSetVar Vary *
in order to prevent all proxies from caching your content.
Stream Compression of Static File
A plain file transfer is assumed when you use the standalone handler with no BinaryCGI directive:
PerlModule Apache::Dynagzip
PerlModule Apache::CompressClientFixup
<Directory /path/to/subdirectory>
  SetHandler perl-script
  PerlHandler Apache::Dynagzip
  PerlFixupHandler Apache::CompressClientFixup
  PerlSetVar LightCompression On
</Directory>
The Content-Type is determined by Apache in this case.
In order to control the compression ratio and the minimum size of the chunk/buffer for gzipped content you can optionally use the directives
PerlSetVar minChunkSizeSource <value>
PerlSetVar minChunkSize <value>
For example, you can try
PerlSetVar minChunkSizeSource 32768
PerlSetVar minChunkSize 8
which are the defaults in this version.
In order to control the minimum size of the chunk for uncompressed content over HTTP/1.1 you can optionally use the directive
PerlSetVar minChunkSizePP <value>
In order to control the extra light compression you can optionally use the directive
PerlSetVar LightCompression <On/Off>
In order to turn the extra light compression "On", use the directive
PerlSetVar LightCompression On
Any other value turns the extra light compression "Off" (default).
In order to control the pageLifeTime in the client's local cache you can optionally use the directive
PerlSetVar pageLifeTime <value>
where the value stands for the lifetime in seconds.
PerlSetVar pageLifeTime 300
is the default in this version.
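A pageLifeTime of 300 seconds corresponds to an expiry time five minutes after the response is generated, which is the gap between the Date and Expires headers in the SYNOPSIS logs. The sketch below shows one way such a value could be turned into an HTTP date. The function names are hypothetical, and the RFC 1123 date format used here is an assumption that may differ from the format the handler actually emits.

```perl
use strict;
use warnings;

my @DAY = qw(Sun Mon Tue Wed Thu Fri Sat);
my @MON = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Format an epoch time as an RFC 1123 date in GMT, independent of locale.
sub http_date {
    my ($epoch) = @_;
    my ($sec, $min, $hour, $mday, $mon, $year, $wday) = gmtime($epoch);
    return sprintf("%s, %02d %s %04d %02d:%02d:%02d GMT",
        $DAY[$wday], $mday, $MON[$mon], $year + 1900, $hour, $min, $sec);
}

# Hypothetical helper: expiry time for a page generated at $now (epoch
# seconds) with a lifetime of $page_life_time seconds.
sub expires_for {
    my ($now, $page_life_time) = @_;
    return http_date($now + $page_life_time);
}

print "Expires: ", expires_for(time, 300), "\n";
```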
You might wish to place
PerlSetVar Vary *
in order to prevent all proxies from caching your content.
Dynamic Setup/Configuration from the Perl Code
Alternatively, you can control this handler from your own perl handler serving an earlier phase of the request processing flow. For example, I use dynamic installation of Apache::Dynagzip from my PerlTransHandler in order to serve the server-side content cache appropriately.
use Apache::RegistryFilter;
use Apache::Dynagzip;
. . .
$r->handler("perl-script");
$r->push_handlers(PerlHandler => \&Apache::RegistryFilter::handler);
$r->push_handlers(PerlHandler => \&Apache::Dynagzip::handler);
In your perl code you can even extend the main config settings (for the current request) with, for example:
$r->dir_config->set(minChunkSizeSource => 36000);
$r->dir_config->set(minChunkSize => 6);
TROUBLESHOOTING
This handler fails to keep control over the chunk size when it serves an internally redirected request. At the same time it fails to provide gzip compression. A corresponding warning is written to error_log in this case. Tune your configuration in order to avoid using this handler for internally redirected requests.
The handler logs error, warn, info, and debug messages to the Apache error_log file. Please read it first in case of any trouble.
DEPENDENCIES
This module requires these other modules and libraries:
Apache::Constants;
Apache::File;
Apache::Filter 1.019;
Apache::Log;
Apache::URI;
Apache::Util;
Fcntl;
FileHandle;
Compress::LeadingBlankSpaces;
Compress::Zlib 1.16;
Note 1: Compress::Zlib 1.16 requires the Info-zip zlib 1.0.2 or better (it is NOT compatible with versions of zlib <= 1.0.1). The zlib compression library is available at http://www.gzip.org/zlib/
Note 2: it is recommended to have mod_perl compiled with the EVERYTHING=1 switch. However, Apache::Dynagzip uses just a few phases of the request processing flow:
Content generation phase
Logging phase
It is strongly recommended to use the Apache::CompressClientFixup handler in order to avoid compression for known buggy browsers. The Apache::CompressClientFixup package can be found on CPAN at http://search.cpan.org/author/SLAVA/.
AUTHOR
Slava Bizyayev <slava@cpan.org> - Freelance Software Developer & Consultant.
COPYRIGHT AND LICENSE
Copyright (C) 2002 - 2004, Slava Bizyayev. All rights reserved.
This package is free software. You can use it, redistribute it, and/or modify it under the same terms as Perl itself.
The latest version of this module can be found on CPAN.
SEE ALSO
"Web Content Compression FAQ" at http://perl.apache.org/docs/tutorials/client/compression/compression.html
The Compress::LeadingBlankSpaces module can be found on CPAN.
The Compress::Zlib module can be found on CPAN.
The primary site for the zlib compression library is http://www.info-zip.org/pub/infozip/zlib/.
The Apache::Filter module can be found on CPAN.
The Apache::CompressClientFixup module can be found on CPAN at http://search.cpan.org/author/SLAVA/.
RFC 1945 Hypertext Transfer Protocol HTTP/1.0.
RFC 2616 Hypertext Transfer Protocol HTTP/1.1.
http://www.ietf.org/rfc.html - rfc search by number (+ index list)
http://cgi-spec.golux.com/draft-coar-cgi-v11-03-clean.html CGI/1.1 rfc
http://perl.apache.org/docs/general/correct_headers/correct_headers.html "Issuing Correct HTTP Headers" by Andreas Koenig