NAME

RayApp - Framework for data-centric Web applications

SYNOPSIS

        use RayApp;
        my $rayapp = new RayApp;
        my $dsd = $rayapp->load_dsd('structure.xml');
        print $dsd->serialize_data( $data );

INTRODUCTION

RayApp provides a framework for data-centric Web applications. Instead of writing Perl code that prints HTML, or embedding code inside HTML markup, Web applications only process and return Perl data. No markup handling is done in the code of the individual applications, inside the business logic. This reduces presentation noise in individual applications, improves maintainability and speeds up development.

The data returned by the application is then serialized to XML and postprocessed by XSLT to the desired output format, which may be HTML, XHTML, WML or anything else. In order to provide all parties involved (analysts, application programmers, Web designers, ...) with a common specification of the data format, a data structure description (DSD) file is a mandatory part of each application. The data returned by the Perl code is fitted into the data structure, creating an XML file with agreed-on elements.

This way, application programmers know in advance what data is expected from their applications, and Web designers know what XML the postprocessing stage will be dealing with. In addition, application code can be tested separately from the presentation part, and tests for both the application and the presentation part can be written independently, in parallel.

Of course, the data structure description can change if necessary; it is not set in stone. Both the application programmer and the Web designer can use the old DSD file and regression tests to migrate to the new structure. A change in the DSD leads to a change in the DOCTYPE of the resulting XML and is thus easily detected by external parties. The system will never produce unexpected data output, since the output is based on the DSD, which is known.

CONFIGURATION

RayApp is expected to be used mostly in the Web context. This section summarizes the configuration steps needed for the Apache HTTP server.

Assume you have a Web application that should reside at the URL

        http://server/sub/app.html

The application consists of three files:

        /cont/www/app.dsd
        /cont/www/app.pl
        /cont/www/app.xsl

Whenever a request for /sub/app.html comes in, the DSD /cont/www/app.dsd is to be loaded, app.pl executed and the output serialized to HTML with app.xsl. You will need to configure Apache to do these steps for you and generate the HTML on the fly.

Pure mod_perl approach

If you have mod_perl support in your Apache and want to use it to run your RayApp-based applications, the following setup will give you the correct result:

        Alias /sub/ /cont/www/
        <LocationMatch /sub/.*\.(html|xml)$>
                SetHandler perl-script
                PerlResponseHandler RayApp::mod_perl
        </LocationMatch>

The Alias directive ensures that the DSD and Perl code will be correctly found in the /cont/www/ directory. The same result can be achieved by setting the RAYAPP_DIRECTORY environment variable without specifying Alias:

        <LocationMatch /sub/.*\.(html|xml)$>
                SetEnv RAYAPP_DIRECTORY /cont/www
                SetHandler perl-script
                PerlResponseHandler RayApp::mod_perl
        </LocationMatch>

Make sure that in this case you include all necessary directives here in the LocationMatch section. Without the Alias, no potential

        <Directory /cont/www/>
        ...
        </Directory>

sections will be taken into account.

There are some more environment variables that are recognized by RayApp:

RAYAPP_INPUT_MODULE

Specifies the name of a module whose handler function will be invoked for each request. It can be used to do any initial setup that is better done outside the code of individual Web applications, such as checking permitted parameters or connecting to database sources. The list of values returned by this handler will be passed to the application's handler. That way, the applications can be sure they will always get their $q, $r, $dbh values populated and ready.

RAYAPP_STYLE_PARAMS_MODULE

Specifies the name of a module whose handler function should return a hash of parameters that will be passed to the XSLT transformations.

RAYAPP_ERRORS_IN_BROWSER

When set to true (default is false), any internal parsing, execution or styling error will be shown in the output page, besides going to error_log.
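
As an illustration of RAYAPP_STYLE_PARAMS_MODULE, here is a minimal sketch of such a module. The module name StyleParams and the parameter names lang and base_url are assumptions made for this example; the sketch also assumes the handler may return a plain list of key/value pairs, which should be checked against the RayApp source.

```perl
package StyleParams;    # hypothetical module name, chosen for this example
use strict;
use warnings;

# Intended to be invoked when RAYAPP_STYLE_PARAMS_MODULE=StyleParams is set.
# Any arguments that might be passed in are ignored in this sketch.
sub handler {
    return (
        lang     => 'en',       # assumed stylesheet parameter
        base_url => '/sub/',    # assumed stylesheet parameter
    );
}

1;
```

A stylesheet would then declare matching top-level parameters to receive these values.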

CGI approach

You may not have mod_perl installed on your machine, or you may not want to use it in your Apache. In that case, RayApp can be invoked in the CGI manner. With the layout mentioned above, the configuration will be

        ScriptAliasMatch ^/sub/(.+)\.(html|xml)$ /cont/www/$1.pl
        <Location /sub/>
                SetEnv PERL5OPT -MRayApp::CGIWrapper
        </Location>

Essentially, any request for .html or .xml will be mapped to run the .pl application, with the RayApp::CGIWrapper helper module providing all the transformations behind the scenes. This layout assumes that the applications, with the .pl extension, are always next to the DSD files. In addition, the applications have to have the executable bit set and start with a correct #! line.

Alternatively, the rayapp_cgi_wrapper script (included in the RayApp distribution) can be used to run RayApp applications in CGI mode with the following configuration:

        ScriptAliasMatch ^/sub/(.+\.(html|xml))$        \
                        /usr/bin/rayapp_cgi_wrapper/$1
        <Location /sub/>
                SetEnv RAYAPP_DIRECTORY /cont/www
        </Location>

As in the mod_perl recipe above, RAYAPP_DIRECTORY has to be specified to correctly resolve the URI -> file translation. In this case, the applications need neither the executable bit nor the #! line.

The applications

With the Web server set up, you can write your first application in the RayApp manner. To start, a simplistic application that returns only two values will be enough.

First the DSD file, /cont/www/app.dsd:

        <?xml version="1.0"?>
        <root>
                <_param name="name"/>
                <name/>
                <time/>
        </root>

The application will accept one parameter, name, and will return a hash with two values, name and time. The code (/cont/www/app.pl) can be

        use CGI;
        sub handler {
                my $q = new CGI;
                return {
                        # scalar keeps the hash balanced when name is missing
                        name => scalar $q->param('name'),
                        time => time,
                };
        }
        1;

The application returns a hash with two elements. A request for

        http://server/sub/app.xml?name=Peter

should return

        <?xml version="1.0"?>
        <root>
                <name>Peter</name>
                <time>1075057209</time>
        </root>

Adding the /cont/www/app.xsl file with XSLT templates should be easy now.

Of course, you can also run the application on the command line, but you'll have to use the RayApp::CGIWrapper module, since your application (app.pl) only defines the handler function, nothing more:

        $ perl -MCGI=-debug -MRayApp::CGIWrapper app.dsd

The -MCGI=-debug option is here to make CGI read debugging input from standard input.

As using CGI and calling

        my $q = new CGI;

in each of your applications is a bit boring, you can create an initialization module, for example CGIInit.pm:

        package CGIInit;
        use CGI;
        sub handler {
                return (new CGI);
        }
        1;

The application code will change to (app.pl):

        sub handler {
                my $q = shift;
                return {
                        # scalar keeps the hash balanced when name is missing
                        name => scalar $q->param('name'),
                        time => time,
                };
        }
        1;

and setting RAYAPP_INPUT_MODULE=CGIInit on the command line or SetEnv RAYAPP_INPUT_MODULE CGIInit in the Apache configuration file will make sure all RayApp applications' handlers get the proper parameters. Database handles are another candidate for this centralized initialization.

DATA STRUCTURE DESCRIPTION (DSD)

The data structure description file is an XML file. Its elements either form the skeleton of the output XML and are copied to the output, or specify placeholders for application data, or declare input parameters that the application accepts.

Parameters

Parameters are denoted by the _param elements. They take the following attributes:

name

Name of the parameter. For example,

        <_param name="id"/>

specifies parameter id.

prefix

Prefix of the parameter. All parameters with this prefix will be allowed. Element

        <_param prefix="search-"/>

allows both search-23 and search-xx parameters.

multiple

By default, only one parameter of each name is allowed. However, specifying multiple="yes" makes it possible to call the application with multiple parameters of the same name:

        <_param name="id" multiple="yes"/>
        application.cgi?id=34;id=45

type

Simple type checking is possible. Available types are int or integer for integer values, num or number for numerical values, and the default string for generic string values.

Note that the type on parameters should only be used for input data that will never be entered directly by the user: either machine-to-machine communication, or values in HTML forms that come from menus or checkboxes. If you need to check that users specified their age as a number, use the type string and have the application code extract the correct data or respond with a request for corrected input.
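
The advice in the note can be sketched as follows: the parameter is declared with the default type string, and the application code decides whether the value is usable. The helper name parse_age, the three-digit limit, and the error and age keys (which the DSD would have to declare as placeholders) are assumptions made for this example.

```perl
use strict;
use warnings;

# Return the age as an integer, or undef when the input is not usable.
sub parse_age {
    my ($raw) = @_;
    return undef unless defined $raw;
    $raw =~ s/^\s+|\s+$//g;                     # tolerate surrounding whitespace
    return undef unless $raw =~ /^\d{1,3}$/;    # digits only, at most three
    return $raw + 0;
}

sub handler {
    my $q = shift;                              # CGI-compatible object
    my $age = parse_age(scalar $q->param('age'));
    return { error => 'Please enter your age as a whole number.' }
        unless defined $age;
    return { age => $age };
}

1;
```

Because the check lives in the application, a mistyped value leads to a friendly request for new input rather than a rejected request.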

Typerefs

Any child element with a typeref attribute is replaced by the document fragment specified by this attribute. An absolute or relative URL is allowed, possibly with fragment information after a # (hash) character. For example:

        <root>
                <invoice typeref="invoice.dsd#inv"/>
                <adress typeref="address.xml"/>
        </root>

Data placeholders

Any child element, any element with a type or multiple attribute, and any element named _data is a data placeholder that will have application data bound to it. The allowed attributes of placeholders are:

type

Type of the placeholder. Besides the scalar types, which are the same as for input parameters, hash or struct can be used to denote a nested structure.

mandatory

By default, no data needs to be returned by the application for the placeholder. When set to yes, the value will be required.

id

An element can be assigned a unique identifier, which can then be referenced by typeref from other parts of the same DSD or from remote DSDs.

multiple

When this attribute is specified, the value is expected to be an aggregate and either the current DSD element or its child is repeated for each value. The possible values are:

list

An array is expected as the value. The placeholder element will be repeated.

listelement

An array is expected; the child of the placeholder will be repeated for each of the array's elements.

hash

An associative array is expected and the placeholder element will be repeated for each of its values. The key of each value will be stored in an attribute id, or in the attribute named by the idattr attribute in the DSD.

hashelement

The same as hash, except that the child of the placeholder will be repeated.

idattr

Specifies the name of the attribute that will hold the keys of individual values for the multiple values hash and hashelement; the default is id.

hashorder

Order of elements for values bound using the multiple values hash or hashelement. Possible values are num, string, and (the default) natural.

cdata

When set to yes, the scalar content of this element will be output as a CDATA section.

Conditions

The non-placeholder elements can have one of the if, ifdef, ifnot or ifnotdef attributes that specify a top-level value (from the data hash) that will be checked for its presence or its value. If the condition is not met, the element with all its children will be removed from the output stream.

Attributes

By default, only the special DSD attributes are allowed. However, with the attribute attrs, a list of space-separated attribute names can be specified. These will be preserved on output.

With the attribute xattrs, attributes can be renamed. The value is a space-separated list of space-separated pairs of attribute names.

Application name

The root element of the DSD can hold an application attribute with a URL (file name) of the application which should provide the data for the DSD.

DESCRIPTION OF INTERNALS

In the previous parts we have seen how to use RayApp to write Web applications. Chances are that you will want to use the RayApp serializer in other, non-Web projects. This part describes the internals of the framework.

RayApp object

To work with RayApp and to have it process data structure description files, application data, and presentation transformations, you need a RayApp object first. Use the constructor new to create one:

        use RayApp ();
        my $rayapp = new RayApp;

The constructor takes a couple of optional parameters that affect RayApp's behaviour:

base

The base URI, used for all URI resolutions. By default, the current directory is used.

cache

When set to a true value, loaded DSDs and stylesheets will be cached. False by default.

ua_options

Options that will be sent to the LWP::UserAgent constructor. See the LWP documentation for the exact list.

A constructor call might look like

        my $rayapp = new RayApp (
                base => 'file:///path/sub/',
                cache => 1,
                ua_options => {
                        env_proxy => 1,
                        timeout => 30,
                        },
        );

Should the new call fail, the error message can be found in the $RayApp::errstr variable.

Once you have the RayApp object, use the load_dsd or load_dsd_string methods to load a data structure description (DSD). Parameters of these methods are as follows.

load_dsd

The only parameter is the URL of the DSD file. If you specify a relative URL, it will be resolved relative to the base URI of the RayApp object.

        my $dsd = $rayapp->load_dsd('invoice.dsd');
        my $dsd = $rayapp->load_dsd('file:///path/to/invoice.dsd');

load_dsd_string

For load_dsd_string, the DSD is specified as the sole parameter of the method call:

        my $dsd = $rayapp->load_dsd_string('<?xml version="1.0"?>
                <invoice>
                        <num type="int"/>
                        <data typeref="invoice_data.dsd"/>
                </invoice>
        ');

If load_dsd or load_dsd_string fails for whatever reason, it returns undef and the error message can be retrieved using the errstr method of RayApp:

        my $dsd = $rayapp->load_dsd('data.xml')
                or die $rayapp->errstr;

On success, these methods give you a RayApp::DSD object that accepts further method calls.

RayApp::DSD object

The incoming parameters of the CGI request can be checked against the _param specification included in the DSD, using the validate_parameters method. It is designed to seamlessly accept a hash (array) of parameters or a CGI / Apache::Request / Apache::RequestRec -compatible object, and fetch the parameters from it. The method returns true when all parameters match the DSD, false otherwise. On error, the errstr method of the RayApp::DSD object gives the reason.

        my $q = new CGI;
        if (not $dsd->validate_parameters($q)) {
                # ... $dsd->errstr
        }
        
        $dsd->validate_parameters('id' => 1, 'id' => 2,
                'name' => 'PC', 'search' => 'Search')
                or # ...
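
To make the _param rules concrete, the following is a simplified, stand-alone restatement of the name, prefix, multiple and type checks in plain Perl. It is an illustration only and not RayApp's implementation: the function name check_params, the spec hashes and the exact type patterns are assumptions made for this sketch.

```perl
use strict;
use warnings;

# Simplified patterns for the documented type names (an assumption; the
# checks performed by RayApp itself may differ in detail).
my $INT = qr/^[+-]?\d+$/;
my $NUM = qr/^[+-]?(?:\d+(?:\.\d*)?|\.\d+)$/;
my %TYPE_CHECK = (
    int    => $INT, integer => $INT,
    num    => $NUM, number  => $NUM,
    string => qr/^.*$/s,
);

# $specs: array of hashes mirroring _param attributes
#         (name or prefix, plus optional multiple and type).
# @pairs: flat list of parameter names and values.
# Returns an error string, or undef when every parameter matches.
sub check_params {
    my ($specs, @pairs) = @_;
    my %seen;
    while (@pairs) {
        my ($key, $value) = splice @pairs, 0, 2;
        my ($spec) = grep {
              defined $_->{name}   ? $_->{name} eq $key
            : defined $_->{prefix} ? index($key, $_->{prefix}) == 0
            :                        0
        } @$specs;
        return "unknown parameter '$key'" unless $spec;
        return "parameter '$key' specified more than once"
            if $seen{$key}++ and ($spec->{multiple} || '') ne 'yes';
        my $type  = $spec->{type} || 'string';
        my $check = $TYPE_CHECK{$type}
            or return "unknown type '$type'";
        return "parameter '$key' is not of type $type"
            unless defined $value and $value =~ $check;
    }
    return undef;
}
```

With specs for id (multiple, int), name and the prefix search-, the call check_params(\@specs, id => 1, id => 2, name => 'PC') reports no error, while a repeated name parameter or a non-numeric id does.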

From the DSD, the document type definition (DTD) can be derived, providing the DOCTYPE of the resulting data. Use the method get_dtd to receive the DTD as a string.

        my $dtdstring = $dsd->get_dtd;

The most important action that can be done with a RayApp::DSD object is serialization of data returned by the application, according to the DSD. The method serialize_data accepts a hash with data as its first argument, and optionally a second argument with options driving the serialization. The method returns the output XML string.

        my $xml = $dsd->serialize_data({
                id => 14,
                name => 'Peter'
                });

Alternatively, the method serialize_data_dom can be used, which behaves identically but returns the DOM instead of the string. That may be beneficial if the result is immediately postprocessed using Perl tools, saving one parse call.

The supported serialization options are:

RaiseError

By default it is true (1), resulting in an exception whenever a serialization error occurs. This behavior may be switched off by setting the parameter to zero. In that case the result is returned even if the data did not match the DSD exactly (which may lead to the output XML not matching its DOCTYPE). Use errstr to verify that the serialization was without errors.

        my $dom = $dsd->serialize_data_dom({
                people => [ { id => 2, name => 'Bob' },
                        { id => 31, name => 'Alice' } ]
                }, { RaiseError => 0 });
        if ($dsd->errstr) { # ...

doctype

This value will be used as a SYSTEM identifier of the DOCTYPE.

doctype_ext

The SYSTEM identifier will be derived from the URI of the DSD by changing extension to this string.

        my $xml = $dsd->serialize_data({}, { doctype_ext => '.dtd' });

The DOCTYPE will be included in the resulting XML only if one of the doctype or doctype_ext options is used.

validate

The result is serialized to XML and parsed back while being validated against the DTD derived from the DSD. Set this option to true to enable this extra safe behaviour.

        my $dom = $dsd->serialize_data_dom({
                numbers => [ 13.4, 3, 45 ],
                rows => $dbh->selectall_arrayref($sth)
                }, { validate => 1 });

Serialized data (the resulting XML) can be immediately postprocessed with the serialize_style or serialize_style_dom methods. They take the same arguments as serialize_data, but each additional argument is considered to be a URI of an XSLT stylesheet. The stylesheets will be applied to the output XML in the order in which they are specified.

        my $html = $dsd->serialize_style({
                found => { 1 => 'x', 45 => 67 }
                }, { RaiseError => 0 },
                'generic.xslt',
                'finetune.xslt',
                );

In scalar context, the result of the transformations is returned. In list context, the result is returned as the first element, followed by the media type (a.k.a. content type) and encoding (a.k.a. charset) of the output.
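
The three values returned in list context are what a CGI script needs to emit a response. The sketch below shows one way to combine them; the helper name cgi_response and the text/html fallback are assumptions made for this example, and the RayApp call itself is only shown in a comment.

```perl
use strict;
use warnings;

# Build a minimal CGI response from the body, media type and encoding,
# in the order the list-context result is described to have.
sub cgi_response {
    my ($body, $media_type, $encoding) = @_;
    # Fallback chosen for this sketch when no media type is reported.
    $media_type = 'text/html'
        unless defined $media_type and length $media_type;
    my $content_type = $media_type;
    $content_type .= "; charset=$encoding"
        if defined $encoding and length $encoding;
    return "Content-Type: $content_type\r\n\r\n" . $body;
}

# Typical use, assuming $dsd and $data exist as in the examples above:
# print cgi_response($dsd->serialize_style($data, {}, 'generic.xslt'));
```

Taking the header values from the transformation result keeps the response consistent with whatever output method the stylesheet declares.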

Executing application handlers

The RayApp object, besides access to the load_dsd* methods, provides methods for executing application handlers, either using the Apache::Registry style inside of the calling Perl/mod_perl environment, or using external CGI scripts.

The method execute_application_handler (and its reusing companion execute_application_handler_reuse) of the RayApp object takes a single parameter: a file/URL of the Perl handler, or a RayApp::DSD object. The application code is loaded (or reused) and its handler function is invoked. The data can then be passed directly to the serialize* methods of the RayApp::DSD object.

        $dsd = $rayapp->load_dsd($uri);
        my $data = $rayapp->execute_application_handler($dsd);
        # my $data = $rayapp->execute_application_handler('script.pm');
        $dsd->serialize_style($data, {}, 'stylesheet.xsl');

When the RayApp::DSD object is passed as an argument, the application name is derived in the standard way, from the application attribute of the root element of the DSD.

Any additional parameters to the execute_application* methods are passed on to the handler functions of the loaded application.

The application can also be invoked in a separate process, using the execute_application_process_storable method. The data of the application is then stored using the RayApp::CGIStorable module and transferred back to RayApp using the application's standard output handle.
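
The general idea of passing handler data between processes over standard output can be sketched with the core Storable module. This is not the RayApp::CGIStorable implementation, whose details are not described here; the function name run_in_child is an assumption, and the forking form of open used below is limited to Unix-like systems.

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw);

# Run $handler in a child process and return the data structure it produced.
# The child writes the frozen data to its standard output, which the parent
# reads through a pipe.
sub run_in_child {
    my ($handler) = @_;
    my $pid = open(my $from_child, '-|');   # forks; child's STDOUT is the pipe
    die "cannot fork: $!" unless defined $pid;
    if ($pid == 0) {
        binmode STDOUT;
        print nfreeze($handler->());        # handler must return a reference
        exit 0;
    }
    binmode $from_child;
    my $frozen = do { local $/; <$from_child> };    # read everything at once
    close $from_child;                              # also waits for the child
    die "child returned no data"
        unless defined $frozen and length $frozen;
    return thaw($frozen);
}
```

For example, run_in_child(sub { return { name => 'Peter' } }) gives the parent a copy of the hash built in the child. Running the application in its own process keeps its global state out of the caller.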

SEE ALSO

LWP::UserAgent(3), XML::LibXML(3)

AUTHOR

Copyright (c) Jan Pazdziora 2001--2004

VERSION

This documentation is believed to accurately describe RayApp version 1.146.