
NAME

EmbedIT::WebIT - A small yet very effective embedded web server for any Perl application

Synopsis

  use EmbedIT::WebIT;

  $server = new EmbedIT::WebIT( SERVER_NAME   => 'www.my.org',
                                SERVER_IP     => '127.0.0.1',
                                SERVER_PORT   => 8080,
                                SOFTWARE      => 'MyApp web server',
                                QUEUE_SIZE    => 100,
                                RUN_AS_USER   => 'nobody',
                                RUN_AS_GROUP  => 'nogroup',
                                WAIT_RESPONSE => 1,
                                IMMED_CLOSE   => 1,
                                EMBED_PERL    => 1,
                                FORK_CONN     => 0,
                                SETUP_ENV     => 1,
                                SERVER_ADMIN  => 'info@my.org',
                                SERVERS       => 3,
                                WORKERS       => 1,
                                DOCUMENT_ROOT => '/opt/my/web',
                                DOCUMENTS     => {
                                                   '/index.html'    => 'WPages::index',
                                                   '/error.html'    => 'WPages::error',
                                                   '/style.css'     => 'WPages::style',
                                                   '/print.css'     => 'WPages::print',
                                                   '/404.html'      => 'WPages::error404',
                                                   '*'              => 'WPages::pageHandle',
                                                 },
                                ERROR_PAGES   => { 
                                                   '404' => '/404.html',    # embedded subroutine error
                                                   'ALL' => '/error.html',  # simple html file error
                                                 },
                                EXPIRATIONS   => { 
                                                   'image/jpg' => 86400,
                                                   'ALL' => 3600, 
                                                 },
                                PROC_PREFIX   => 'my:',
                                CHILD_START   => 'WControl::start_db',
                                CHILD_END     => 'WControl::stop_db',
                                LOG_METHOD    => 'WControl::logInfo',
                                DEBLOG_METHOD => 'WControl::logDebug',
                                LOG_HEADERS   => 0,
                                LOG_PACKETS   => 0,
                                CGI_PATH      => '/cgi',
                                ENV_KEEP      => [ 'PERL5LIB', 'LD_LIBRARY_PATH' ],
                                NO_LOGGING    => 0,
                              );

  $server->execute();

Description

The WebIT embedded web server was created a long time ago as a pure Perl application that interacts directly with Kannel. The goal was to relieve Kannel from having to wait for the web server to run its scripts before going back to serve another SMS message. In this respect WebIT is a hack, and it can be configured to behave in ways that do not conform to the HTTP RFCs. Yet Perl applications built with WebIT, using embedded HTML pages written as Perl functions, outperform Apache with mod_perl installations.

For this reason I was asked by a few to release this code so that they can use it for their applications.

So even though WebIT is not complete (workers and SSL are not implemented yet), it is already used by 14 Perl applications that I know of, excluding my personal work.

To work with WebIT, create a new server object with the parameters you want, and then at any point call the execute method to run the server. The execute method returns only when the server has finished execution, which happens only when the process receives a TERM signal.

Once the server has started it forks the predefined number of servers and workers. Since workers are not implemented yet, you are advised to ask for 0 workers on startup. From then on, WebIT serves HTTP requests using external files in a configured directory and/or internal pages served by Perl subroutines. CGI pages and subroutines are written the way you already know from Apache and mod_perl: you can use the CGI module to get the request parameters, print to standard output to form the response to the caller, and print to standard error to send text to the server's logger.
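As a rough sketch of what an embedded page could look like, the subroutine below writes a CGI-style response to standard output and one log line to standard error. The name WPages::index is taken from the synopsis; the exact header lines WebIT expects from a page are an assumption here, modelled on ordinary CGI scripts.

```perl
package WPages;
use strict;
use warnings;

# Embedded page handler, written like a plain CGI script (assumed convention).
sub index {
    # Text printed to STDERR is meant for the server's logger.
    print STDERR "serving /index.html\n";

    # Text printed to standard output forms the response to the caller.
    print "Content-Type: text/html\r\n\r\n";
    print "<html><body><h1>Hello from WebIT</h1></body></html>\n";
    return;
}

1;
```

Such a subroutine would be registered under DOCUMENTS, for example '/index.html' => 'WPages::index'.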

Things to avoid

  • Don't use Perl threads! Perl does not really have threads anyway. Threads that do not share their data by default are not threads, they are forks, and in Perl threads are isolated. If you are really inclined to use threads, move to another language like Java.

  • Don't use IPC. The server already uses IPC, and some things you can do might break the server.

Just use the server for what it is: an embedded web server for applications, not for hacks. You should not need any of the above to create your application. If for any reason you really have to use some of the above, then WebIT is not for you.

Configuration

Now let's take a look at the configuration hash of the server.

SERVER_NAME

The DNS name of the server (default is localhost)

SERVER_IP

The IP address to bind to (default is 127.0.0.1)

SERVER_PORT

The TCP port to use (default is 80)

QUEUE_SIZE

The number of connections to queue per child (default is 5)

USE_SSL

The server will work in SSL mode, accepting https connections only. (default is undef) This feature is not implemented yet.

SSL_CERTIFICATE

The server's SSL certificate path and file. If not defined, no certificate file will be used for the connection. You can also pass the actual certificate here as is. The value is first tested to see if it matches an existing file, and if not it is used as the actual certificate. (default is undef) This feature is not implemented yet.

SSL_KEY

The server's SSL key path and file. If not defined, no key will be used for the connection. You can also pass the actual key here as is. The value is first tested to see if it matches an existing file, and if not it is used as the actual key. (default is undef) This feature is not implemented yet.

WAIT_RESPONSE

Directs the server to wait until a response is generated. If set to 0, the server closes the connection before running scripts or fetching pages and returns 204 (No Content) to the client. (default is 1, and the server waits for responses)

NO_WAIT_REPLY

The code to send when WAIT_RESPONSE is 0. (default is undef, and 204 is returned)

IMMED_CLOSE

Close the connection immediately after serving a request. Ignored if WAIT_RESPONSE is 0. (default is 0) If set to 0, the server respects the client's request about the handling of the connection (which might be immediate close or keep open).

RUN_AS_USER

The user the server should run as.

RUN_AS_GROUP

The group the server should run as.

SETUP_ENV

Allow the server to set up the children's environment. This costs some milliseconds per request served, since the server has to construct the environment for each call. If you are not using the CGI module and you know what you are doing, you can set this to 0 (false) and save some time when running requests. (default is 1)

ENV_KEEP

List of environment variables to keep for scripts. In normal execution all environment variables are cleared, and CGI and embedded pages run in a clean environment. If you need to preserve some, such as database variables, you can specify their names here in an array, and they will be preserved for your scripts.

ENV_ADD

Hash with environment variables and values to set for scripts. These environment variables and their values will be added to the environment of your CGI and embedded pages.
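A minimal sketch of how the two options might be combined is shown below. The variable names ORACLE_HOME and APP_MODE are purely illustrative, and passing ENV_ADD as a hash reference is an assumption based on how the other hash options appear in the synopsis.

```perl
use strict;
use warnings;

# Fragment of a WebIT configuration hash (illustrative values only).
my %env_options = (
    # Keep these variables from the parent environment.
    ENV_KEEP => [ 'PERL5LIB', 'LD_LIBRARY_PATH', 'ORACLE_HOME' ],

    # Add these variables for every CGI script and embedded page.
    ENV_ADD  => { 'APP_MODE' => 'production' },
);
```

These pairs would be passed to the constructor together with the rest of the configuration.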

MIME_TYPES

Path and file where the server can find valid mimetypes. (default is /etc/mime.types)

EMBED_PERL

Run perl CGI scripts inside the server, not in a separate process. Faster than Apache and mod_perl. (default is 0)

SERVER_ADMIN

The email of the server administrator. This text will appear in the environment variables of the CGI / embedded pages. (default is empty)

DOCUMENT_ROOT

The path where the site documents and scripts are stored. (default is undef)

DOCUMENTS

A hash of documents and the subroutines to execute for them within the server. This is used to create fully embedded web servers that respond to specific URLs using specific subroutines. The special page name '*' directs all unknown page requests to the subroutine of this special page. Can be used in conjunction with, and has precedence over, DOCUMENT_ROOT. (default is undef)

ERROR_PAGES

A hash with the site-supplied error pages. Each key is an error code, and each value is the path of the page for that error within DOCUMENT_ROOT or DOCUMENTS. Alternatively there can be an entry with the keyword ALL, which serves all errors that have no specific entry in the hash. Error pages can be CGIs or plain HTML. (default is undef) For all error pages the server sets 4 extra environment variables. These are:

ERROR_CODE

This contains the numeric value of the error, eg 404.

ERROR_TEXT

This contains the text value of the error, eg Page not found.

ERROR_URI

This contains the URI that generated the error.

ERROR_METHOD

This contains the method used to access the URI, eg POST

Together with the other environment variables, these let you track every error in full detail and handle it not just for display but for administrator notifications as well.
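The sketch below shows one way an embedded error page could use the four variables. The name WPages::error comes from the synopsis; the escape_html helper and the CGI-style Content-Type header are assumptions made for this illustration.

```perl
package WPages;
use strict;
use warnings;

# Minimal HTML escaping so the request URI cannot inject markup.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Embedded error page: reads the ERROR_* variables set by the server.
sub error {
    my $code   = $ENV{ERROR_CODE}   // 'unknown';
    my $text   = $ENV{ERROR_TEXT}   // 'Unknown error';
    my $uri    = $ENV{ERROR_URI}    // '';
    my $method = $ENV{ERROR_METHOD} // '';

    # Log the failure; this could also trigger an administrator notification.
    print STDERR "error $code ($text) for $method $uri\n";

    print "Content-Type: text/html\r\n\r\n";
    printf "<html><body><h1>%s %s</h1><p>%s %s could not be served.</p></body></html>\n",
        map { escape_html($_) } $code, $text, $method, $uri;
    return;
}

1;
```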

EXPIRATIONS

A hash with expiration times. It contains the content type as a key and the expiration time in seconds. A special entry called ALL specifies the expiration time of any type NOT already defined in the hash.
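The lookup rule described above can be sketched as follows. This is not WebIT's own code; expiration_for is a hypothetical helper that only illustrates how a specific content type takes priority over the ALL entry.

```perl
use strict;
use warnings;

# Same shape as the EXPIRATIONS hash in the synopsis.
my %expirations = (
    'image/jpg' => 86400,
    'ALL'       => 3600,
);

# Hypothetical helper: lifetime in seconds for a content type,
# falling back to the ALL entry, or undef when neither is present.
sub expiration_for {
    my ($table, $content_type) = @_;
    return $table->{$content_type} if exists $table->{$content_type};
    return $table->{ALL};
}
```

With the table above, 'image/jpg' resolves to its own entry, and any other type falls back to the ALL entry.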

SERVERS

Number of servers to prefork. Default is 0 where only the master instance exists

WORKERS

Number of page workers to prefork. Default is 0, where only the master instance exists. Workers are not implemented yet.

FORK_CONN

Create a child every time a new connection arrives. (default is 0) Useful for hard-headed Perl modules like SOAP::WSDL that retain information between calls and confuse the server. Not to be used with time-sensitive HTTP applications, like SMS applications with Kannel, because forking in Perl takes quite some time.

STARTUP

Run this script at startup to load the environment for the pages. Can only be an external Perl script. Startup code for embedded pages can be done in many ways without the need for external scripts.

CHILD_START

Subroutine to call on every fork for initialization. The values returned by this subroutine are passed to internally called functions. (default is undef) Persistent database connections and other paraphernalia required by your application should be initialized in the method defined here. All values needed by your application should be returned in a hash or an array by your method, so that they can be retrieved later by your CGIs and embedded pages.

CHILD_END

Subroutine to call on termination of a forked child. It is passed the return values of the start subroutine. (default is undef) All values initialized by the method defined in CHILD_START that require some form of proper termination should be handled by the method defined here. The parameter passed to that method is the reference returned by CHILD_START, so you should know how to deal with it.
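A matching pair of CHILD_START and CHILD_END subroutines might look like the sketch below. The names WControl::start_db and WControl::stop_db follow the synopsis. The CONNECTED flag is only a stand-in for a real resource such as a database handle, so that the example runs without a database.

```perl
package WControl;
use strict;
use warnings;

# CHILD_START: build the per-child data and return it as a hash reference.
sub start_db {
    my %res = (
        CONNECTED  => 1,       # stand-in for a real handle, e.g. from DBI->connect
        STARTED_AT => time,    # when this child was initialized
    );
    return \%res;
}

# CHILD_END: receives what start_db returned and releases it.
sub stop_db {
    my ($res) = @_;
    return unless ref $res eq 'HASH';
    $res->{CONNECTED} = 0;     # a real handle would be disconnected here
    return;
}

1;
```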

PROC_PREFIX

Text to be used as a prefix for the process names of the children. (default is WebIT) This only decorates the ps listing on operating systems that allow a process to change its name.

LOG_METHOD

Subroutine to call for logging. It is passed a single string to log. (default is internal logging to stderr)

DEBLOG_METHOD

Subroutine to call for debug logging. It is passed a single string to log. (default is the same as LOG_METHOD)
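Since both logging hooks receive a single string, a pair of subroutines along the following lines should be enough. The names logInfo and logDebug come from the synopsis; the format_line helper and the line layout (timestamp, process id, level) are choices made for this sketch only.

```perl
package WControl;
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: one log line with timestamp, process id and level.
sub format_line {
    my ($level, $text) = @_;
    $text = '' unless defined $text;
    $text =~ s/\s+\z//;    # drop trailing newlines and spaces
    my $stamp = strftime('%Y-%m-%d %H:%M:%S', localtime);
    return "$stamp [$$] $level $text\n";
}

# Candidate for LOG_METHOD: receives the string to log.
sub logInfo {
    my ($text) = @_;
    print STDERR format_line('INFO', $text);
    return;
}

# Candidate for DEBLOG_METHOD: same contract, different level tag.
sub logDebug {
    my ($text) = @_;
    print STDERR format_line('DEBUG', $text);
    return;
}

1;
```

Including the process id helps tell the preforked servers apart in a shared log.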

LOG_HEADERS

Log input and output packet headers as they come to and go from the server. (default is 0)

LOG_PACKETS

Log input and output packets as they come to and go from the server. Turning on packet logging implicitly turns on header logging. (default is 0)

NO_LOGGING

When set to 1 (true), the server avoids all possible logging, speeding up processing to the max. (default is 0)

CGI_PATH

A colon- or semicolon-separated list of paths under the DOCUMENT_ROOT where CGI scripts exist. (default is undef)

AUTH_PATH

A colon- or semicolon-separated list of paths under the DOCUMENT_ROOT where authentication is needed. Works with embedded pages as well. (default is undef)

AUTH_REALM

A string specifying the realm of the authentication for the AUTH_PATH entries. There is only one realm. (default is undef)

AUTH_METHOD

Subroutine to call for authenticating remote users. Parameters are the returned values of the child start subroutine, preceded by a username and a password. (default is undef)
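Given that parameter order, an authentication subroutine could be sketched as below. The name WControl::check_user and the in-memory user table are hypothetical. That the server expects a true value for success and a false value for failure is also an assumption.

```perl
package WControl;
use strict;
use warnings;

# Hypothetical user table; a real application would check a database
# and would not keep plain-text passwords.
my %users = ( admin => 'secret' );

# Candidate for AUTH_METHOD: username and password come first,
# followed by whatever the CHILD_START subroutine returned.
sub check_user {
    my ($user, $pass, @child_data) = @_;
    return 0 unless defined $user && defined $pass;
    return 0 unless exists $users{$user};
    return $users{$user} eq $pass ? 1 : 0;
}

1;
```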

SOFTWARE

Text with software name and version. This text will appear in the environment variables of the CGI / embedded pages. (default is WebIT/$VERSION)

SIGNATURE

Text with web server signature. This text will appear in the environment variables of the CGI / embedded pages. (default is <br\>WebIT/$VERSION for Perl<br\>)

Methods

The methods that are available to use are the following:

new()

This is the constructor of the object. It takes as a parameter a hash with keys and values as described above.

execute()

This is the routine that enters the execution loop of the server. It does not return until the server terminates, so if you need to do anything more in your application you might want to call this method from a forked process.
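The forking approach can be sketched as follows. To keep the example runnable without a configured server, run_server is a stand-in that simply blocks until it receives TERM; in a real application the child would call $server->execute() instead. The helper names start_in_background and stop_background are hypothetical.

```perl
use strict;
use warnings;
use POSIX ();

# Stand-in for $server->execute(): blocks until the process receives TERM.
sub run_server {
    local $SIG{TERM} = sub { POSIX::_exit(0) };
    sleep 1 while 1;
}

# Run the given code in a child process and return the child's pid.
sub start_in_background {
    my ($code) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        $code->();          # child: enter the blocking loop
        POSIX::_exit(0);    # leave without running the parent's cleanup code
    }
    return $pid;            # parent: carry on with other work
}

# Ask the child to stop with TERM and wait until it is gone.
sub stop_background {
    my ($pid) = @_;
    kill 'TERM', $pid;
    return waitpid($pid, 0) == $pid;
}
```

The parent keeps the pid returned by start_in_background and passes it to stop_background when the application shuts down.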

data()

This method returns the server child data as those were returned by the CHILD_START method.

Let's assume that you have a CHILD_START method as follows:

  sub start_up {
    my %res = ();
    my $db = DBI->connect("DBI:Oracle:sid=pits;host=127.0.0.2;port=3127", "user", "pass");

    $res{DATABASE} = $db;

    return \%res;
  }

If you want to retrieve that connection from inside a CGI script or an embedded page, do the following:

  my $res = EmbedIT::WebIT::data();
  my $db = $res->{DATABASE};

or, if you have access to your server object, you can do the following:

  my $res = $server->data();
  my $db = $res->{DATABASE};

start_time()

This method returns the timestamp of the server startup time. Useful for applications that need to know when the server started in order to perform some functions.

WebIT and SOAP::WSDL

One of the main reasons I use WebIT nowadays is to expose SOAP methods. SOAP::WSDL (and not SOAP::Lite) is the best possible SOAP package available for Perl. If you want to use WebIT as a server for SOAP::WSDL, this is what you have to do:

First of all you need to set FORK_CONN to true (1 in Perl) to force the server to fork a new child for each new connection. Then you need to specify the embedded pages that will serve the methods exposed by the WSDL. For example, assume you need to expose a method Test that takes a string as input and returns another string as output.

Create your WSDL

  <?xml version="1.0" encoding="utf-8"?>
  <wsdl:definitions xmlns:http="http://schemas.xmlsoap.org/wsdl/http/"
                    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
                    xmlns:xs="http://www.w3.org/2001/XMLSchema"
                    xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
                    xmlns:tns="http://tempuri.org/"
                    xmlns:tm="http://microsoft.com/wsdl/mime/textMatching/"
                    xmlns:mime="http://schemas.xmlsoap.org/wsdl/mime/"
                    targetNamespace="http://tempuri.org/"
                    xmlns:wc="http://tempuri.org/"
                    xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/">
    <wsdl:types>
      <xs:schema elementFormDefault="unqualified" targetNamespace="http://tempuri.org/">
        <xs:element name="InputFlag">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="Flag" type="xs:string" minOccurs="1"  maxOccurs="1"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
  
        <xs:element name="OutputFlag">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="Flag" type="xs:string" minOccurs="1" maxOccurs="1"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:schema>
    </wsdl:types>
  
    <wsdl:message name="MsgIn"> <wsdl:part element="tns:InputFlag" name="MessageIn"/> </wsdl:message>
    <wsdl:message name="MsgOut"> <wsdl:part element="tns:OutputFlag" name="MessageOut"/> </wsdl:message>
  
    <wsdl:portType name="TestPort">
      <wsdl:operation name="Test">
        <wsdl:input  message="tns:MsgIn" />
        <wsdl:output message="tns:MsgOut" />
      </wsdl:operation>
    </wsdl:portType>
  
    <wsdl:binding name="TestBind" type="tns:TestPort">
      <soap:binding transport="http://schemas.xmlsoap.org/soap/http" style="document" />
  
      <wsdl:operation name="Test">
        <soap:operation soapAction="urn:Test#Test" style="document" />
  
        <wsdl:input>  <soap:body use="literal"/> </wsdl:input>
        <wsdl:output> <soap:body use="literal"/> </wsdl:output>
      </wsdl:operation>
  
    </wsdl:binding>

    <wsdl:service name="Test">
      <wsdl:port name="Test" binding="tns:TestBind">
        <soap:address location="http://127.0.0.1:8089/WS/Test" />
      </wsdl:port>

    </wsdl:service>
  </wsdl:definitions>

and compile it with wsdl2perl

Then create your handling object (use SOAP::WSDL documentation to see what you need to do) as follows:

  package WebService;

  our $VERSION = "1.0";

  sub new {
    my $self = {};
    bless $self;
    return $self;
  }

  sub Test {
    my ($self,$body,$header) = @_;
    my %idata = ();
  
    $idata{Flag} = $body->get_Flag() . "";
  
    return MyElements::OutputFlag->new(\%idata);
  }

and finally create your embedded page that will handle the HTTP request.

  sub WebService {
    eval {
      # $lib_path must already hold the directory of the classes generated by wsdl2perl
      unshift @INC, $lib_path;      # add the library path at run time
      require MyServer::Test::Test; # use the server class generated by wsdl2perl

      my $t = WebService->new();    # create a WebService handling object
      my $server = MyServer::Test::Test->new({ dispatch_to     => 'WebService',
                                               transport_class => 'SOAP::WSDL::Server::CGI' });
      $server->handle();
    };
    if ($@) { print "just do something ...the call has failed\n"; }
  }

In your WebIT configuration hash, remember to add the above subroutine as the handler for a page, like so:

  $server = new EmbedIT::WebIT( SERVER_NAME => 'name.org',
                                ...
                                FORK_CONN   => 1,
                                ...
                                DOCUMENTS   => {
                                                 'WS/Test' => 'main::WebService',
                                               },
                                ...
                              );

And that's it. You have exposed web services working with WebIT as an embedded web server.

Requirements

You need to have installed the following packages for WebIT to work.

HTTP::Date
IO::Socket
IO::Select
LWP::MediaTypes
IPC::Open3
Taint::Runtime
MIME::Base64

Copyright

Copyright 2008 D. Evmorfopoulos <devmorfo@gmail.com>

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts.