Nginx - full-featured perl support for nginx
    use Nginx;

    # nginx's asynchronous resolver
    #     "resolver 1.2.3.4;" in nginx-perl.conf
    ngx_resolver "www.google.com", 15, sub {
        my (@IPs) = @_;
        if ($!) {
            my ($errcode, $errstr) = @_;
            ngx_log_error $!, "Cannot resolve google's IP: $errstr";
        }
    };

    # timer
    ngx_timer 5, 0, sub {
        ngx_log_notice 0, "5 seconds gone";
    };

    # asynchronous connections
    # with explicit flow control
    ngx_connector "1.2.3.4", 80, 15, sub {
        if ($!) {
            ngx_log_error $!, "Connect error: $!";
            return NGX_CLOSE;
        }

        my $c = shift;  # connection
        my $wbuf = "GET /\x0d\x0a";
        my $rbuf;

        ngx_writer $c, $wbuf, 15, sub {
            if ($!) {
                ngx_log_error $!, "Write error: $!";
                return NGX_CLOSE;
            }
            return NGX_READ;
        };

        ngx_reader $c, $rbuf, 0, 0, 15, sub {
            if ($! && $! != NGX_EOF) {
                ngx_log_error $!, "Read error: $!";
                return NGX_CLOSE;
            }
            if ($! == NGX_EOF) {
                ngx_log_info 0, "response length: " . length ($rbuf);
                return NGX_CLOSE;
            }
            return NGX_READ;  # no errors - read again
        };

        return NGX_WRITE;  # what to do on connect
    };

    # SSL handshake
    ngx_connector "1.2.3.4", 80, 15, sub {
        ...
        my $c = shift;

        ngx_ssl_handshaker $c, 15, sub {
            ...
            ngx_writer $c, $wbuf, 15, sub {
                ...
            };
            ngx_reader $c, $rbuf, 0, 0, 15, sub {
                ...
            };
            return NGX_WRITE;
        };

        return NGX_SSL_HANDSHAKE;
    };

    # asynchronous response
    # via HTTP API
    sub handler {
        my ($r) = shift;

        $r->main_count_inc;

        ngx_resolver "www.google.com", 15, sub {
            $r->send_http_header ('text/html');
            unless ($!) {
                local $" = ', ';  # separator used when "@_" is interpolated
                $r->print ("OK, @_\n");
            } else {
                $r->print ("FAILED, $_[1]\n");
            }
            $r->send_special (NGX_HTTP_LAST);
            $r->finalize_request (NGX_OK);
        };

        return NGX_DONE;
    }

    # and more...
nginx-perl.conf:
    http {
        perl_inc  /path/to/lib;
        perl_inc  /path/to/apps;
        perl_require  My/App.pm;
        perl_init_worker  My::App::init_worker;
        perl_exit_worker  My::App::exit_worker;
        perl_eval  '$My::App::SOME_VAR = "foo"';
        ...
        server {
            location / {
                perl_handler  My::App::handler;
        ...
My/App.pm:
    package My::App;

    use Nginx;

    sub handler {
        my $r = shift;
        ...
    }

    ...
Nginx with a capital N is part of the nginx-perl distribution.
Nginx-perl brings asynchronous functions and other useful features into embedded perl to turn it into a nice and powerful perl web server.
Nginx is a very popular and stable asynchronous web server. Reusing as much of its internals as possible gives this project the same level of stability nginx has. Maybe not right from the beginning, but it can catch up with very little effort.
The internal HTTP parser, dispatcher (locations) and different types of handlers free perl modules from reinventing all that, like most of the perl frameworks do. It's already there, native and extremely fast.
All of the output filters are there as well, so everything you do can be gzipped, processed with XSLT or passed through any filter module for nginx. Again, extremely fast.
Nginx has a pretty decent master-worker model, which gives you process management right out of the box.
And probably some other things I can't remember at the moment.
So, why use any of those perl frameworks if we already have nginx with a nice native implementation of almost everything they offer? It just needed a little touch.
Additionally, I wanted to implement a new asynchronous API with proper flow control and explicit parameters to avoid complexity as much as possible.
As usual for perl extensions:
    % perl Makefile.PL
    % make
    % make test
    % make install
Makefile.PL supports everything ./configure does. To build it with SSL support use something like:
% perl Makefile.PL --with-http_ssl_module
Or, if you want to install it into a different perl, simply run Makefile.PL under it:
% /home/zzz/perl5/perlbrew/perls/perl-5.14.2/bin/perl Makefile.PL
It is safe to install nginx-perl alongside nginx. It uses capital N for perl modules and nginx-perl for binaries.
You don't have to install nginx-perl to try it. There are a couple of ready-to-try examples in eg/:
% ./objs/nginx-perl -p eg/helloworld
Now open another terminal or your web browser and go to http://127.0.0.1:55555/ or whatever IP you're on.
The easiest way to benchmark nginx-perl against node.js is to run the redis examples, eg/redis and eg/redis.js, and compare the results. But first you need to install Redis::Parser::XS from cpan:
    % cpan Redis::Parser::XS
    ...
    % ./objs/nginx-perl -p eg/redis
    ...
    % ab -c10 -n10000 http://127.0.0.1:55555/
    % ab -c10 -n10000 http://127.0.0.1:55555/single
    % ab -c10 -n10000 http://127.0.0.1:55555/multi
Same goes for node.js:
    % npm install redis
    % npm install hiredis
    ...
    % node eg/redis.js
    ...
    % ab -c10 -n10000 http://127.0.0.1:55555/
    % ab -c10 -n10000 http://127.0.0.1:55555/single
    % ab -c10 -n10000 http://127.0.0.1:55555/multi
perl_inc works just like Perl's use lib '/path/to/lib'. It supports only one argument, but you can specify it multiple times.
    http {
        perl_inc  /path/to/lib;
        perl_inc  /path/to/myproject/lib;
perl_require is the same as Perl's own require.
    http {
        perl_inc  /path/to/lib;
        perl_require  My/App.pm;
perl_init_worker adds a handler to call when a worker starts.
    http {
        perl_inc  /path/to/lib;
        perl_require  My/App.pm;
        perl_init_worker  My::App::init_worker;
        perl_init_worker  My::AnotherApp::init_worker;
perl_exit_worker adds a handler to call when a worker exits.
    http {
        perl_inc  /path/to/lib;
        perl_require  My/App.pm;
        perl_exit_worker  My::App::exit_worker;
        perl_exit_worker  My::AnotherApp::exit_worker;
perl_handler sets the current location's http content handler (a.k.a. http handler).
    http {
        server {
            location / {
                perl_handler  My::App::Handler;
perl_access adds an http access handler to the access phase of the current location.
    http {
        server {
            location / {
                perl_access  My::App::access_handler;
                perl_handler  My::App::Handler;
perl_eval evaluates some perl code at configuration level. Useful if you need to configure some perl modules directly from nginx-perl.conf.
http { perl_eval '$My::App::CONF{foo} = "bar"';
perl_app sets the http content handler to the sub { } returned from the app. Internally it does a simple $handler = do '/path/to/app.pl', so you can put your app somewhere in @INC to get a shorter path. Additionally, it prereads the entire request body before calling the handler, which means there is no need to call $r->has_request_body there.
    http {
        server {
            location / {
                perl_app  /path/to/app.pl;
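A minimal sketch of what such an app file could look like, and of how do hands the file's last value back as the handler. It is meant to run outside nginx-perl, so the StubRequest class, the temporary file and the literal 0 used as the return code are stand-ins invented for this illustration; inside the server the request object and constants come from use Nginx.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for the request object: it only records output.
package StubRequest {
    sub new   { return bless { out => '' }, shift }
    sub print { my $self = shift; $self->{out} .= join '', @_; return }
}

# Write a tiny app file whose last expression is the sub { } that
# would become the content handler.
my ($fh, $app_path) = tempfile('appXXXXXX', SUFFIX => '.pl',
                               TMPDIR => 1, UNLINK => 1);
print {$fh} <<'APP';
sub {
    my $r = shift;
    $r->print("hello from app\n");
    return 0;   # stand-in for the OK constant
};
APP
close $fh or die "close: $!";

# The mechanism the text describes: do FILE returns the file's last value.
my $handler = do $app_path;
die "cannot load app: " . ($@ || $!) unless ref $handler eq 'CODE';

# Call the handler the way a content handler is called: with the request.
my $r  = StubRequest->new;
my $rc = $handler->($r);
```

Because the file only has to evaluate to a code reference, any setup it needs (loading modules, building closures) can happen above the final sub { }.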
    NGX_FOO_BAR   -- constants
    ngx_*r        -- asynchronous functions (creators)
    NGX_VERB      -- flow control constants
    ngx_verb      -- flow control functions
    $r->foo_bar   -- request object's methods
Each asynchronous function has an r at the end of its name. This is because those functions are creators of handlers with some parameters. E.g. ngx_writer creates write handler for some connection with some scalar as a buffer.
All the things from official embedded perl are there and almost completely untouched. There are quite a few new methods though:
$r->ctx sets and gets a context scalar. It is useful, for example, for picking up data left by an access handler.
$r->location_name returns the name of the location.
Returns the root path.
$r->main_count_inc increases the value of the internal r->main->count by 1, which allows the response to be sent later from some other callback.
Sends response.
$r->finalize_request decreases r->main->count and finalizes the request.
Allows moving to the next phase handler from an access handler.
Allows breaking out of an access handler and continuing later from some other callback.
This is where the response should be generated and sent to the client. Here's how to send a response completely asynchronously:
    sub handler {
        my $r = shift;

        $r->main_count_inc;

        ngx_timer 1, 0, sub {
            $r->send_http_header('text/html');
            $r->print("OK\n");
            $r->send_special(NGX_HTTP_LAST);
            $r->finalize_request(NGX_OK);
        };

        return NGX_DONE;
    }
Notice return NGX_DONE instead of return OK. This is important because it avoids post-processing the response the old way.
todo
To specify what to do after each callback we can either call some function or return some value and let the handler do it for us. Most of the ngx_* handlers support return values and are even optimized for that kind of behavior.
Functions take connection as an argument:
    ngx_read($c)
    ngx_write($c)
    ngx_ssl_handshake($c)
    ngx_close($c)
Return values only work on current connection:
    return NGX_READ;
    return NGX_WRITE;
    return NGX_SSL_HANDSHAKE;
    return NGX_CLOSE;
As an example, let's connect and close the connection. We will do flow control via a single return for this:
ngx_connector '1.2.3.4', 80, 15, sub { return NGX_CLOSE; };
Now, if we want to connect and then read exactly 10 bytes, we need to create a reader and return NGX_READ from the connector's callback:
    ngx_connector '1.2.3.4', 80, 15, sub {
        my $c = shift;

        ngx_reader $c, $buf, 10, 10, 15, sub {
            ...
        };

        return NGX_READ;
    };
This will be different if we already have a connection somehow:
    ngx_reader $c, $buf, 10, 10, 15, sub {
        ...
    };
    ngx_read($c);
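The idea of steering the flow through return values can be pictured with a toy dispatcher in plain Perl. This is only a model of the concept, not how nginx-perl is implemented: the FLOW_* constants and run_flow are hypothetical names made up for the sketch, and no network I/O takes place.

```perl
use strict;
use warnings;

# Stand-in flow constants (hypothetical; the real NGX_* ones come from Nginx).
use constant {
    FLOW_READ  => 'read',
    FLOW_WRITE => 'write',
    FLOW_CLOSE => 'close',
};

# Toy dispatcher: run the handler named by the previous return value
# until some callback asks to close.
sub run_flow {
    my ($first, %handlers) = @_;
    my @trace;
    my $next = $first;
    while ($next ne FLOW_CLOSE) {
        push @trace, $next;
        $next = $handlers{$next}->();
    }
    push @trace, FLOW_CLOSE;
    return @trace;
}

my @trace = run_flow(
    FLOW_WRITE,                                  # what to do after connect
    FLOW_WRITE() => sub { return FLOW_READ },    # request sent, now read
    FLOW_READ()  => sub { return FLOW_CLOSE },   # response read, done
);
```

Each callback only names the next step, so the whole exchange reads as a chain of returns rather than a series of explicit calls on the connection.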
Each ngx_* handler will call back on any error with $! set to some value and reset to 0 otherwise. For simplicity, EOF is considered to be an error as well, and $! will be set to NGX_EOF in that case.
Example:
    ngx_reader $c, $buf, 0, 0, 15, sub {
        return NGX_WRITE if $! == NGX_EOF;
        return NGX_CLOSE if $!;
        ...
    };
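These checks lean on a property of Perl's $! itself: it is a number in numeric context and a message in string context, and it is false when it is 0. A small standalone sketch using only core Perl (ENOENT from Errno stands in for whatever error a handler might see; NGX_EOF is not available outside nginx-perl):

```perl
use strict;
use warnings;
use Errno qw(ENOENT);

# Simulate a failed operation by setting errno by hand.
$! = ENOENT;

my $code    = $! + 0;   # numeric view: the error code
my $message = "$!";     # string view: the system's error text

# This is why one variable can serve as both the code and the text,
# as in: ngx_log_error $!, "Connect error: $!";

# Success resets it to 0, which is false in a boolean test.
$! = 0;
my $failed = $! ? 1 : 0;
```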
Creates a new timer and calls back after $after seconds. If $repeat is set, reschedules the timer to call back again after $repeat seconds; otherwise destroys it.
Internally $repeat is stored as a reference, so changing it will influence rescheduling behaviour.
Simple example calls back just once after 1 second:
ngx_timer 1, 0, sub { warn "tada\n"; };
This one is a bit trickier, calls back after 5, 4, 3, 2, 1 seconds and destroys itself:
    my $repeat = 5;
    ngx_timer $repeat, $repeat, sub {
        $repeat--;
    };
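Why the countdown works can be seen in a plain-Perl simulation of the rescheduling rule described above. This is a sketch under the assumption that the timer re-reads $repeat after every callback and stops once it is 0; simulate_timer is a hypothetical helper that records delays instead of actually waiting.

```perl
use strict;
use warnings;

# Hypothetical model of the timer: keeps a reference to $repeat and
# re-reads it after each callback to decide whether to reschedule.
sub simulate_timer {
    my ($after, $repeat_ref, $callback) = @_;
    my @delays;
    my $delay = $after;
    while ($delay > 0) {
        push @delays, $delay;    # "wait" $delay seconds
        $callback->();
        $delay = $$repeat_ref;   # the callback may have changed it
    }
    return @delays;
}

my $repeat = 5;
my @delays = simulate_timer($repeat, \$repeat, sub { $repeat-- });
```

In this model the recorded delays are 5, 4, 3, 2 and 1, after which $repeat is 0 and nothing is rescheduled.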
Creates connect handler and attempts to connect to $ip:$port within $timeout seconds. Calls back with connection in @_ afterwards. On error calls back with $! set to some value.
Expects one of the following control flow constants as a result of callback:
    NGX_CLOSE
    NGX_READ
    NGX_WRITE
    NGX_SSL_HANDSHAKE
    ngx_connector $ip, 80, 15, sub {
        return NGX_CLOSE if $!;

        my $c = shift;
        ...

        return NGX_READ;
    };
Creates read handler for $connection with buffer $buf. $min indicates how much data should be present in $buf before the callback and $max limits total length of $buf.
Internally $buf, $min, $max and $timeout are stored as references, so you can change them at any time to influence the reader's behavior.
On error calls back with $! set to some value, including NGX_EOF in case of EOF.
    my $buf;

    ngx_reader $c, $buf, $min, $max, $timeout, sub {
        return NGX_CLOSE if $! && $! != NGX_EOF;
        ...

        return NGX_WRITE;
    };
Be aware that $min and $max don't apply to the amount of data you want to read but rather to the buffer size at which to call back.
Creates write handler for $connection with buffer $buf and write timeout in $timeout.
Internally $buf and $timeout are stored as references, so changing them will influence writer's behavior.
On error calls back with $! set to some value. NGX_EOF should be treated as fatal error here.
    my $buf = "GET /\n";

    ngx_writer $c, $buf, 15, sub {
        return NGX_CLOSE if $!;
        ...

        return NGX_READ;
    };
Creates its own internal handler for both reading and writing and tries to do SSL handshake.
On error calls back with $! set to some value.
It's important to understand that handshaker will replace your previous reader and writer, so you have to create new ones.
Typically it should be called inside connector's callback:
    ngx_connector ... sub {
        return NGX_CLOSE if $!;

        my $c = shift;

        ngx_ssl_handshaker $c, 15, sub {
            return NGX_CLOSE if $!;
            ...

            ngx_writer ... sub { };
            ngx_reader ... sub { };

            return NGX_WRITE;
        };

        return NGX_SSL_HANDSHAKE;
    };
Creates resolver's handler and tries to resolve $name in $timeout seconds using resolver specified in nginx-perl.conf.
On success returns all resolved IP addresses into @_.
On error calls back with $! set to some value, $_[0] set to one of the resolver-specific error constants and with textual explanation in $_[1]:
    NGX_RESOLVE_FORMERR
    NGX_RESOLVE_SERVFAIL
    NGX_RESOLVE_NXDOMAIN
    NGX_RESOLVE_NOTIMP
    NGX_RESOLVE_REFUSED
    NGX_RESOLVE_TIMEDOUT
This is a thin wrapper around nginx's internal resolver. All its current problems apply. To use it in production you'll need a local resolver, like named, that does the actual resolving.
    ngx_resolver $host, $timeout, sub {
        if ($!) {
            my $errcode = $_[0];
            my $errstr  = $_[1];
            warn "failed to resolve $host: $errstr\n";
            ...
            return;
        }

        my @IPs = @_;  # list of all resolved IP addresses
        ...
    };
It is possible to take over the client connection completely and create your own reader and writer on that connection. You need this for websockets and protocol upgrades in general.
There are two methods to support this:
$r->take_connection initializes internal data structure and replaces connection's data with it. Returns connection on success or undef on error.
$r->give_connection attaches request $r back to its connection. Doesn't return anything.
So, to take over you need to take the connection from the request, tell nginx that you are going to finalize it later by calling $r->main_count_inc, create a reader and/or writer on that connection, start the reading and/or writing flow and return NGX_DONE from your HTTP handler:
    sub handler {
        my $r = shift;

        my $c = $r->take_connection()
            or return HTTP_SERVER_ERROR;

        $r->main_count_inc;

        my $buf;

        ngx_reader $c, $buf, ... , sub {
            if ($!) {
                $r->give_connection;
                $r->finalize_request(NGX_DONE);
                return NGX_NOOP;
            }
            ...
        };

        ngx_writer $c, ... , sub {
            if ($!) {
                $r->give_connection;
                $r->finalize_request(NGX_DONE);
                return NGX_NOOP;
            }
            ...
        };

        ngx_read($c);

        return NGX_DONE;
    }
Once you are done with the connection or connection failed with some error you MUST give connection back to the request and finalize it:
    $r->give_connection;
    $r->finalize_request(NGX_DONE);
    return NGX_NOOP;
Usually you will also need to return NGX_NOOP instead of NGX_CLOSE, since your connection is going to be closed within the http request's finalizer. But it shouldn't cause any problems either way.
It's important to know how to create self-sufficient reusable handlers for nginx-perl, and it's actually fairly easy.
Just remember a couple of things:
1. Use $r->location_name as a prefix:
    location /foo/ {
        perl_handler My::handler;
    }

    sub handler {
        ...
        my $prefix = $r->location_name;
        $prefix =~ s/\/$//;

        $out = "<a href=$prefix/something > do something </a>";
        # will result in "<a href=/foo/something > do something </a>"
        ...
    }
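The prefix handling can be isolated into a small helper and checked outside the server. In this sketch link_for is a hypothetical name, and its first argument stands in for the value of $r->location_name:

```perl
use strict;
use warnings;

# Build a link relative to the location the handler is mounted at.
# $location stands in for the value of $r->location_name.
sub link_for {
    my ($location, $path) = @_;
    (my $prefix = $location) =~ s{/\z}{};   # drop one trailing slash
    return "$prefix/$path";
}

my $under_foo  = link_for('/foo/', 'something');   # "/foo/something"
my $under_root = link_for('/',     'something');   # "/something"
```

The same handler then produces correct links whether it is mounted at /foo/ or at the root location, which is the point of using the location name as a prefix.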
2. Use $r->variable to configure handlers and to access per-server and per-location variables:
    location /foo/ {
        set $conf_bar "baz";
        perl_handler My::handler;
    }

    sub handler {
        ...
        my $conf_bar      = $r->variable('conf_bar');
        my $document_root = $r->variable('document_root');
        ...
    }
3. Use $r->ctx to exchange arbitrary data between handlers:
    sub handler {
        ...
        my $ctx = { foo => 'bar' };
        $r->ctx($ctx);

        $ctx = $r->ctx;  # read it back, e.g. in a later handler
        ...
    }
4. Use perl_eval to configure your modules directly from nginx-perl.conf:
    http {
        perl_require MyModule.pm;
        perl_eval ' $My::CONF{foo} = "bar" ';
    }

    package My;

    our %CONF = ();

    sub handler {
        ...
        warn $CONF{foo};
        ...
    }
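The effect of that configuration can be reproduced in plain Perl. The sketch below models perl_eval with an ordinary string eval, which is an assumption about the effect rather than a statement about how nginx-perl implements the directive; conf_value is a hypothetical accessor added for the illustration.

```perl
use strict;
use warnings;

# The module being configured: a package-level hash that handlers read.
package My {
    our %CONF = ();
    sub conf_value { return $CONF{ $_[0] } }
}

# Stand-in for: perl_eval ' $My::CONF{foo} = "bar" ';
# The fully qualified name is what lets the snippet reach into the package.
my $snippet = ' $My::CONF{foo} = "bar" ';
eval $snippet;
die "config snippet failed: $@" if $@;

my $foo = My::conf_value('foo');
```

Order matters here: the module has to be loaded before the snippet runs, otherwise the module's own initialisation of %CONF would wipe the configured value.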
Check out eg/self-sufficient to see all this in action:
% ./objs/nginx-perl -p eg/self-sufficient
Nginx::Test, Nginx::Util, Nginx::Redis, http://zzzcpan.github.com/nginx-perl, http://wiki.nginx.org/EmbeddedPerlModule, http://nginx.net/
Igor Sysoev, Alexandr Gomoliako <zzz@zzz.org.ua>
Copyright (C) Igor Sysoev
Copyright 2011 Alexandr Gomoliako. All rights reserved.
This module is free software. It may be used, redistributed and/or modified under the same terms as nginx itself.