
NAME

App::CELL::Guide - Introduction to App::CELL (POD-only module)

VERSION

Version 0.153

SYNOPSIS

   $ perldoc App::CELL::Guide

INTRODUCTION

App::CELL is the Configuration, Error-handling, Localization, and Logging (CELL) framework for applications written in Perl. In the "APPROACH" section, this Guide describes the CELL approach to each of these four areas, separately. Then, in the "RATIONALE" section, it presents the author's reasons for bundling them together.

HISTORY

CELL was written by Smithfarm in 2013 and 2014, initially as part of the Dochazka project [[ link to SourceForge ]]. Due to its generic nature, it was spun off into a separate project.

APPROACH

This section presents CELL's approach to each of its four principal functions: "Configuration", "Error handling", "Localization", and "Logging".

Configuration

CELL provides the application developer and site administrator with a straightforward and powerful way to define configuration parameters as needed by the application.

Configuration parameters are placed in specially-named files within a directory referred to by CELL as the "site configuration directory", or "sitedir". CELL recognizes three types of configuration parameters (and, hence, three types of configuration files). These three types are called meta, core, and site parameters, respectively.

The first category, meta, consists of "mutable" parameters -- i.e. parameters that can be changed by the application program. These are similar to global/package variables.

core and site are the second and third categories, or "namespaces". These are used for storing immutable values. Though the values themselves are read-only, a given parameter FOOBAR can be "changed" by defining its default value in core and then setting the FOOBAR site parameter to a different value. In such a case, the site FOOBAR will take precedence over its core counterpart.
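The precedence rule can be illustrated with a minimal plain-Perl sketch. It does not use CELL itself: the %core and %site hashes and the resolve_param helper are hypothetical stand-ins for the two read-only namespaces, chosen only to show the lookup order.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for CELL's two read-only namespaces; in real use
# these values would come from configuration files in the sitedir.
my %core = ( FOOBAR => 'default value', BAZ => 42 );
my %site = ( FOOBAR => 'site override' );

# Resolve a parameter: a site value, if one exists, wins over the core default.
sub resolve_param {
    my ($name) = @_;
    return $site{$name} if exists $site{$name};
    return $core{$name};    # undef if the parameter is unknown
}

print resolve_param('FOOBAR'), "\n";    # site value takes precedence
print resolve_param('BAZ'),    "\n";    # no site value, so the core default is used
```

Neither hash is modified after it is defined, which mirrors the read-only nature of the core and site namespaces.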

Since the configuration files themselves are Perl modules, Perl is leveraged to parse them. Values can be any legal scalar value, so references to arrays, hashes, or subroutines can be used, as well as simple numbers and strings. For details, see "SITE CONFIGURATION DIRECTORY", App::CELL::Config and App::CELL::Load.

CELL's configuration logic is inspired by Request Tracker.

Error handling

To facilitate error handling and make the application's source code easier to read and understand, or at least mitigate its impenetrability, CELL provides the App::CELL::Status module, which enables functions in the application to return status objects if desired.

Status objects have the following principal attributes: level, code, args, and payload, which are given by the programmer when the status object is constructed, as well as attributes like text, lang, and caller, which are derived by CELL. In addition to the attributes, Status.pm also provides some useful methods for processing status objects.

In order to signify an error, subroutine foo_dis could for example do this:

    return $CELL->status(
        level => 'ERR',
        code => 'Gidget displacement %s out of range',
        args => [ $displacement ],
    );

Upon success, foo_dis could return an 'OK' status with the Gidget displacement value in the payload:

    return $CELL->ok( $displacement );

The calling function could check the return value like this:

    my $status = foo_dis();
    return $status if $status->not_ok;
    my $displacement = $status->payload;
    

For details, see App::CELL::Status and App::CELL::Message.
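To see the whole pattern run end to end without installing CELL, here is a sketch built on a hypothetical stand-in class, My::Status. It is not App::CELL::Status: the class, its sprintf-based text method, the 0..100 range check, and the doubled_displacement caller are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a status object; NOT App::CELL::Status.
package My::Status;

sub new {
    my ( $class, %args ) = @_;
    return bless {%args}, $class;
}
sub ok      { return $_[0]->{level} eq 'OK' }
sub not_ok  { return !$_[0]->ok }
sub payload { return $_[0]->{payload} }

# ASSUMPTION: the message text is the code used as a sprintf format.
sub text {
    my ($self) = @_;
    return sprintf( $self->{code}, @{ $self->{args} || [] } );
}

package main;

# Returns an ERR status for out-of-range input (range invented for this
# example), otherwise an OK status carrying the value as payload.
sub foo_dis {
    my ($displacement) = @_;
    return My::Status->new(
        level => 'ERR',
        code  => 'Gidget displacement %s out of range',
        args  => [$displacement],
    ) if $displacement < 0 || $displacement > 100;
    return My::Status->new( level => 'OK', payload => $displacement );
}

# The caller passes failures up unchanged and unwraps the payload on success.
sub doubled_displacement {
    my ($input) = @_;
    my $status = foo_dis($input);
    return $status if $status->not_ok;
    return My::Status->new( level => 'OK', payload => 2 * $status->payload );
}

my $result = doubled_displacement(500);
print $result->text, "\n" if $result->not_ok;
```

The point of the pattern is that every function in the call chain returns the same kind of object, so an error raised deep in the stack reaches the top-level caller intact.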

CELL's error-handling logic is inspired by brian d foy's article "Return error objects instead of throwing exceptions"

    L<http://www.effectiveperlprogramming.com/2011/10/return-error-objects-instead-of-throwing-exceptions/>

Localization

This CELL component, called "Localization", gives the programmer a way to encapsulate a "message" (in its simplest form, a string) within a message object and then use that object in various ways.

So, provided the necessary message files have been loaded, the programmer could do this:

    my $message = $CELL->message( code => 'FOOBAR' );
    print $message->text, "\n"; # message FOOBAR in the default language
    print $message->text( lang => 'de' ), "\n"; # same message, in German

Messages are loaded when CELL is initialized, from files in the site configuration directory. Each file contains messages in a particular language. For example, the file Dochazka_Message_en.conf contains messages relating to the Dochazka application, in the English language. To provide the same messages in German, the file would be copied to Dochazka_Message_de.conf and translated.

Since message objects are used by App::CELL::Status, it is natural for the programmer to put error messages, warnings, etc. in message files and refer to them by their codes.

App::CELL::Message could also be extended to provide methods for encrypting messages and/or converting them into various target formats (JSON, HTML, Morse code, etc.).

For details, see "MESSAGE CONFIGURATION" and App::CELL::Message.

Logging

For logging, CELL uses Log::Any and optionally extends it by adding the caller's filename and line number to each message logged.

Message and status objects have 'log' methods, of course, and by default all statuses (except 'OK') are logged upon creation.

Here's how to set up (and do) logging in the application:

    use App::CELL::Log qw( $log );
    $log->init( ident => 'AppFoo' );
    $log->debug( "Three lines into AppFoo" );

App::CELL::Log provides its own singleton, but since all method calls are passed to Log::Any, anyway, the App::CELL::Log singleton behaves just like its Log::Any counterpart. This is useful, e.g., for testing log messages:

    use Log::Any::Test;
    $log->contains_only_ok( "Three lines into AppFoo" );

To actually see your log messages, you have to do something like this:

    use Log::Any::Adapter ('File', $ENV{'HOME'} . '/tmp/CELLtest.log');

SITE CONFIGURATION DIRECTORY

Two directories

The site configuration directory, or "sitedir", is where all the application's configuration information (core, site, and meta parameters; message codes and texts) is stored.

CELL itself has an analogous configuration directory, called the "sharedir", where its own internal configuration defaults are stored. CELL's core parameters can be overridden by the application's site params.

During initialization, CELL recursively walks first the sharedir, and then the sitedir, looking through those directories and all their subdirectories for meta, core, site, and message configuration files.

The sharedir is part of the App::CELL distro and CELL's initialization routine finds it via a call to the dist_dir routine in the File::ShareDir module.

How CELL finds it

The sitedir must be created and populated with configuration files by the site administrator. CELL's initialization routine finds it by looking in three places:

1. sitedir parameter -- the initialization routine, $CELL->init, takes a sitedir parameter containing the full path to the sitedir. For portability, the path should be constructed using File::Spec (e.g. the catfile method) or similar.
2. enviro parameter -- if no valid sitedir parameter is given, init looks for a parameter called enviro containing the name of an environment variable that holds the sitedir path.
3. CELL_SITEDIR environment variable -- if no viable sitedir can be found by consulting the function call parameters, init looks in this literal environment variable.

For examples of how to call the init routine, see App::CELL.
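The three-step search described above can be sketched in plain Perl. The find_sitedir helper below is hypothetical and is not part of CELL. It also assumes that a "valid" or "viable" sitedir simply means an existing directory, which may be looser than the checks CELL actually performs.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the documented search order; the real logic
# lives inside CELL's initialization routine.
sub find_sitedir {
    my (%args) = @_;

    # 1. explicit 'sitedir' parameter
    return $args{sitedir}
        if defined $args{sitedir} && -d $args{sitedir};

    # 2. 'enviro' parameter naming an environment variable
    if ( defined $args{enviro} ) {
        my $path = $ENV{ $args{enviro} };
        return $path if defined $path && -d $path;
    }

    # 3. the literal CELL_SITEDIR environment variable
    my $fallback = $ENV{CELL_SITEDIR};
    return $fallback if defined $fallback && -d $fallback;

    return;    # no viable sitedir found
}

my $sitedir = find_sitedir( enviro => 'MYAPP_SITEDIR' );    # variable name is made up
print defined $sitedir ? "sitedir: $sitedir\n" : "no sitedir found\n";
```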

How to populate it

Once it finds a valid site configuration directory tree, CELL walks it, looking for files matching one of four regular expressions:

^.+_MetaConfig.pm$ (meta)
^.+_Config.pm$ (core)
^.+_SiteConfig.pm$ (site)
^.+_Message(_[^_]+){0,1}.conf$ (message)

Files with names that don't match any of the above regexes are ignored.
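As a rough illustration of how these patterns sort file names into types, the sketch below applies the four regular expressions exactly as listed above. The classify_file helper and the sample file names are hypothetical; CELL's own directory walker may be organized differently.

```perl
use strict;
use warnings;

# The four patterns, copied verbatim from the list above.
my @patterns = (
    [ meta    => qr/^.+_MetaConfig.pm$/ ],
    [ core    => qr/^.+_Config.pm$/ ],
    [ site    => qr/^.+_SiteConfig.pm$/ ],
    [ message => qr/^.+_Message(_[^_]+){0,1}.conf$/ ],
);

# Hypothetical helper: returns the file type, or undef if the file is ignored.
sub classify_file {
    my ($filename) = @_;
    for my $entry (@patterns) {
        my ( $type, $regex ) = @$entry;
        return $type if $filename =~ $regex;
    }
    return;
}

for my $file (
    qw( MyApp_MetaConfig.pm MyApp_Config.pm MyApp_SiteConfig.pm
        MyApp_Message_en.conf README.txt )
  )
{
    my $type = classify_file($file);
    printf "%-24s %s\n", $file, defined $type ? $type : '(ignored)';
}
```

Because each pattern requires an underscore immediately before its suffix, a name such as MyApp_MetaConfig.pm does not also match the core pattern.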

For the syntax of these files see CELL's own configuration files in the sharedir (config/ in the distro). All four types of configuration file are represented there, with comments.

HOW CONFIG PARAMS ARE INITIALIZED

All three categories of config params (meta, core, and site) are initialized by require-ing configuration files, which are actually simple Perl modules, in the site configuration directory (sitedir).

The sitedir path is determined using the following simple algorithm:

1. if sitedir argument given to $CELL->load, assume that is the sitedir path; done.
2. if enviro argument given to $CELL->load, assume that is the name of an environment variable containing the sitedir path; done.
3. look for an environment variable CELL_SITEDIR and if it contains a viable sitedir path, done; otherwise trigger a warning that there is no sitedir.

CELL's configuration parameters are modelled after those of Request Tracker. Configuration files are special Perl modules that are loaded at run-time. These modules consist of a series of calls to a set function (which resides in App::CELL::Config).

MESSAGE CONFIGURATION

Introduction

To an application programmer, localization may seem like a daunting proposition. All strings the application displays to users must be replaced by variable names. Then you have to figure out where to put all the strings, translate them into multiple languages, write a library (or find an existing one) to display the right string in the right language at the right time and place. What is more, the application must be configurable, so the language can be changed by the user or the site administrator.

All of this is a lot of work, particularly for already existing, non-localized applications, but even for new applications designed from the start to be localizable.

App::CELL's objective is to provide a simple, straightforward way to write and maintain localizable applications in Perl. Notice the key word "localizable" -- the application may not, and most likely will not, be localized in the initial stages of development, but that is the time when localization-related design decisions need to be made. App::CELL tries to take some of the guesswork out of those decisions.

Later, when it really is time for the application to be translated into one or more additional languages, this becomes a relatively simple matter of translating a bunch of text strings that are grouped together in one or more configuration files with syntax so trivial that no technical expertise is needed to work with them. (Often, the person translating the application is not herself technically inclined.)

Localization with App::CELL

All strings that may potentially need to be localized (even if we don't have them translated into other languages yet) are placed in message files under the site configuration directory. In order to be found and parsed by App::CELL, message files must meet some basic conditions:

1. file name format: AppName_Message_lang.conf
2. file location: anywhere under the site configuration directory
3. file contents: must be parsable

Format of message file names

At initialization time, App::CELL walks the site configuration directory tree looking for filenames that match certain regular expressions. The regular expression for message files is:

    ^.+_Message(_[^_]+){0,1}.conf$

In less-precise human terms, this means that the initialization routine looks for filenames consisting of at least three, but possibly four, components:

1. the application name (this can be anything)
2. followed by _Message
3. optionally followed by _languagetag where "languagetag" is a language tag (see "..link.." for details)
4. ending in .conf

Examples:

    CELL_Message.conf
    CELL_Message_en.conf
    CELL_Message_cs-CZ.conf
    DifferentApplication_Message.conf

Location of message files

As noted above, message files will be found as long as they are readable and located anywhere under the base site configuration directory. For details on how this base site configuration directory is searched for and determined, see "..link..".

How message files are parsed

Message files are parsed line-by-line. The parser routine is parse_message_file in the App::CELL::Load module. Lines beginning with a hash sign ('#') are ignored. The remaining lines are divided into "stanzas", which must be separated by one or more blank lines.

Stanzas are interpreted as follows: the first line of the stanza should contain a message code, which is simply a string. Any legal Perl scalar value can be used, as long as it doesn't contain white space. CELL itself uses ALL_CAPS strings starting with CELL_.

The remaining lines of the stanza are assumed to be the message text. Two caveats here:

1. In the configuration file, message text strings can be written on multiple lines
2. However, this is intended purely as a convenience for the application programmer. When parse_message_file encounters multiple lines of text, it simply concatenates them to form a single, long string.

For details, see the parse_message_file function in App::CELL::Load, as well as App::CELL's own message file(s) in the config/CELL directory of the App::CELL distro.
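The parsing rules above can be sketched as follows. This is a hypothetical re-implementation for illustration only, not the code of parse_message_file. In particular, the text above does not say how multiple lines are joined, so joining with a single space is an assumption, as are the function name and the sample message codes.

```perl
use strict;
use warnings;

# Hypothetical re-implementation of the documented rules, for illustration
# only; this is not the code of parse_message_file.
sub parse_message_text {
    my ($content) = @_;
    my %messages;
    my @stanza;

    # Store the stanza collected so far: the first line is the message code,
    # the remaining lines are the message text.
    my $flush = sub {
        return unless @stanza;
        my ( $code, @text ) = @stanza;

        # ASSUMPTION: continuation lines are joined with a single space.
        $messages{$code} = join ' ', @text;
        @stanza = ();
    };

    for my $line ( split /\n/, $content ) {
        next if $line =~ /^#/;       # comment lines are ignored
        if ( $line =~ /^\s*$/ ) {    # a blank line ends the current stanza
            $flush->();
            next;
        }
        $line =~ s/^\s+|\s+$//g;     # trim surrounding whitespace
        push @stanza, $line;
    }
    $flush->();    # the last stanza may not be followed by a blank line
    return \%messages;
}

my $sample = <<'END';
# sample message file
MYAPP_GREETING
Hello, world!

MYAPP_LONG
This text is written
on two lines.
END

my $parsed = parse_message_text($sample);
print "$_: $parsed->{$_}\n" for sort keys %$parsed;
```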

How the language is determined

Internally, each message text string is stored along with a language tag, which defines which language the message text is written in. The language tag is derived from the filename using a regular expression like this one:

    _Message_([^_]+).conf$

(The part in parentheses signifies the part between _Message_ and .conf -- this is stored in the language attribute of the message object.)

No sanity checks are conducted on the language tag. Whatever string the regular expression produces becomes the language tag for all messages in that file. If no language tag is found, CELL first looks for a config parameter called CELL_DEFAULT_LANGUAGE and, failing that, the hard-coded fallback value is en.

I'll repeat that, since it's important: CELL assumes that the message file names contain the relevant language tag. If the message file name is MyApp_Message_foo-bar.conf, then CELL will tag all messages in that file as being in the foo-bar language. Message files can also be named like this: MyApp_Message.conf, i.e. without a language tag. In this case, CELL will attempt to determine the default language from a site configuration parameter (CELL_DEFAULT_LANGUAGE). If this parameter is not set, then CELL will give up and assume that all message text strings are in English (language tag en -- CELL's author's native tongue).
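A minimal sketch of this derivation, using the regular expression shown above: the language_tag_for helper is hypothetical, and its second argument merely stands in for the CELL_DEFAULT_LANGUAGE parameter rather than reading CELL's configuration.

```perl
use strict;
use warnings;

# Hypothetical helper: derive the language tag from a message file name.
# $default stands in for the CELL_DEFAULT_LANGUAGE parameter, if it is set.
sub language_tag_for {
    my ( $filename, $default ) = @_;

    # Tag found in the file name: use it as-is, with no sanity checks.
    return $1 if $filename =~ /_Message_([^_]+).conf$/;

    # No tag in the file name: configured default, else hard-coded 'en'.
    return defined $default ? $default : 'en';
}

for my $file (qw( MyApp_Message_cs-CZ.conf MyApp_Message_foo-bar.conf MyApp_Message.conf )) {
    printf "%-28s %s\n", $file, language_tag_for($file);
}
```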

Language tags in general

See the W3C's "Language tags in HTML and XML" white paper for a detailed explanation of language tags:

    L<http://www.w3.org/International/articles/language-tags/>

And see here for a list of all language tags:

    L<http://www.langtag.net/registries/lsr-language.txt>

Note that you should use hyphens, and not underscores, to separate components within the language tag, i.e.:

    MyApp_Message_cs-CZ.conf   # correct
    MyApp_Message_cs_CZ.conf   # WRONG!!

Non-ASCII characters in config/message file names: may or may not work. Better to avoid them.

Normal usage

In normal usage, the programmer adds messages to the respective message files. After CELL initialization, these messages (or, more precisely, message code-language pairs) will be available to the programmer to use, either directly via App::CELL::Message->new or indirectly as status codes.

If a message code has text strings in multiple languages, these language variants can be obtained by specifying the Lang parameter to App::CELL::Message->new. If the Lang parameter is not specified, CELL will always try to use the default language (CELL_DEFAULT_LANGUAGE or English if that parameter has not been set).

STATUS OBJECTS

The most frequent case will be a status code of "OK" with no message (shown here with the optional "payload", which is whatever the function is supposed to return on success):

    # all green
    return App::CELL::Status->new( level => 'OK',
                                  payload => $my_return_value,
                                );

To ensure this is as simple as possible in cases when no return value (other than the simple fact of an OK status) is needed, we provide a special constructor method:

    # all green
    return App::CELL::Status->ok;

In most other cases, we will want the status message to be linked to the filename and line number where the new method was called. If so, we call the method like this:

    # relative to me
    App::CELL::Status->new( level => 'ERR', 
                           code => 'CODE1',
                           args => [ 'foo', 'bar' ],
                         );

It is also possible to report the caller's filename and line number:

    # relative to my caller
    App::CELL::Status->new( level => 'ERR', 
                           code => 'CODE1',
                           args => [ 'foo', 'bar' ],
                           caller => [ caller ],
                         );

It is also possible to pass a message object in lieu of code and args (this could be useful if we already have an appropriate message on hand):

    # with pre-existing message object
    App::CELL::Status->new( level => 'ERR', 
                           msg_obj => $my_msg,
                         );

Permitted levels are listed in the @permitted_levels package variable in App::CELL::Log.

COMPONENTS

App::CELL

This top-level module exports a singleton, $CELL, which is all the application programmer needs to gain access to CELL's key functions.

App::CELL::Config

This module provides CELL's Configuration functionality.

App::CELL::Guide

This guide.

App::CELL::Load

This module hides all the complexity of loading messages and config params from files in two directories: (1) the App::CELL distro sharedir containing App::CELL's own configuration, and (2) the site configuration directory, if present.

App::CELL::Log

Logging is accomplished by using and extending Log::Any.

App::CELL::Message

Localization is on the wish-list of many software projects. With CELL, the programmer can easily design and write an application to be localizable from the very beginning, without having to invest much effort.

App::CELL::Status

Provides CELL's error-handling functionality. Since status objects inherit from message objects, the application programmer can instruct CELL to generate localized status messages (errors, warnings, notices) if desired.

App::CELL::Test

Some routines used by CELL's test suite.

App::CELL::Util

Some generalized utility routines.

RATIONALE

In the author's experience, applications written for "users" (however that term may be defined) frequently need to:

1. be configurable by the user or site administrator
2. handle errors robustly, without hangs and crashes
3. potentially display messages in various languages
4. log various types of messages to syslog

Since these basic functions seem to work well together, CELL is designed to provide them in an integrated, well-documented, straightforward, and reusable package.