NAME

Activator::Config - provides a merged configuration to a script combining command line options, environment variables, and configuration files.

SYNOPSIS

  use Activator::Config;

  my $config = Activator::Config->get_config( \@ARGV);  # default realm
  my $config = Activator::Config->get_config( \@ARGV, $otherrealm);

  #### Get a hashref of command line arguments, and an arrayref of bareword arguments
  my ( $config, $args ) = Activator::Config->get_args( \@ARGV );

DESCRIPTION

This module allows a script or application to have a complex configuration combining options from command line, environment variables, and YAML configuration files.

For a script or application, one creates any number of YAML configuration files. These files will be deterministically merged into one hash. You can then pass this to an application or write it to file.

This module is not an options validator. It uses command line options as overrides to existing keys in configuration files and DOES NOT validate them. Unrecognized command line options are ignored, and @ARGV is modified to remove recognized options, leaving barewords and unrecognized options in place and in the same order for a real options validator (like Getopt::Long). If you do use another options module, make sure you call get_config() BEFORE you call its processor, so that @ARGV will be in an appropriate state.

Environment variables can be used to act as a default to command line options, and/or override any top level configuration file key which is a scalar.

This module is cool because:

  • You can very easily generate merged, complex configuration hierarchies that are context sensitive.

  • You can pass as complex a config as you like to any script or application, and override any scalar configuration option with your environment variables or from the command line.

  • It supports realms, allowing you to have default configurations for development, QA, production, or any number of arbitrary realms you desire. That is, with a simple command line flag, you can switch your configuration context.

Configuration Source Precedence

The precedence hierarchy for configuration from highest to lowest is:

  • command line options

  • environment variables

  • forced overrides from config files

  • merged settings from YAML configuration files
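
As a rough illustration of the list above (not the module's actual implementation), the precedence can be pictured as a flat hash merge in which later sources replace earlier ones. The four hashes and their contents are made up for this example, and only top level scalar keys are considered.

```perl
use strict;
use warnings;

# Hypothetical, already-parsed sources (flat, top level scalar keys only).
my %yaml      = ( db_name => 'cookbook', db_user => 'chef', verbose => 0 );
my %overrides = ( db_user => 'admin' );     # forced overrides from config files
my %env       = ( db_name => 'env_db' );    # from ACT_CONFIG_* variables
my %cli       = ( verbose => 2 );           # from command line options

# In a list assignment to a hash, a later pair replaces an earlier pair
# with the same key, so sources are listed from lowest to highest precedence.
my %config = ( %yaml, %overrides, %env, %cli );

printf "%s=%s\n", $_, $config{$_} for sort keys %config;
```

Listing the sources from lowest to highest precedence keeps the merge to a single expression; a key that appears in no higher source keeps its YAML value.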

COMMAND LINE ARGUMENTS

This module allows you to override configuration file settings from the command line. You can use long or short options in '-' or '--' notation and barewords in any order, and the arguments terminator '--' is recognized. Multiple flag arguments are also supported:

  #### turn on super verbosity. sets $config->{v} = 2
  myscript.pl -v -v

You can specify configured options at the command line for override:

  #### override the configuration file setting for 'foo'
  myscript.pl --foo=bar

Note that while YAML configuration (and this module) support deep structures for configuration, you can only override top level keys that are scalars using command line arguments and/or environment variables.

Reserved Arguments

There are a few reserved command line arguments:

 --skip_env        : ignore environment variables (EXCEPT $USER)
 --project=<>      : used to search for the <project>.yml file
 --realm=<>        : use <realm>.yml in config file processing and
                     consider all command line arguments to be in this realm
 --conf_path       : colon separated list of directories to search for config files

Project as a Bareword Argument

There are times where a script takes the project name as a required bareword argument. For these cases, require that project be the last argument, and pass a flag to "get_config()".

That is, when your script is called like this:

  myscript.pl --options <project>

get the config like this:

  Activator::Config->get_config( \@ARGV, undef, 1 );

The second argument to "get_config()" is the realm, so you pass undef (unless you know the realm you are looking for) to allow the command line options and environment variables to take effect.

ENVIRONMENT VARIABLES

Environment variables can be used to act as a default to command line options, and/or override any top level configuration file key which is a scalar. The expected format is ACT_CONFIG_[key]. Note that YAML is case sensitive, so the environment variables must match. Be especially wary of command shell sensitive characters in your YAML keys (like :~><|).

If you wish to override a key for only a particular realm, you can insert the realm into the env variable wrapped by double underscores:

 ACT_CONFIG_foo       - set 'foo' for default realm
 ACT_CONFIG__bar__foo - set 'foo' only for 'bar' realm
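
The naming convention above can be sketched as a small parser that splits a variable name into a realm and a key. This is only an illustration: parse_env_name and %fake_env are hypothetical names, and the module's own matching rules may differ.

```perl
use strict;
use warnings;

# Split an environment variable name into ( realm, key ) following the
# ACT_CONFIG_[key] and ACT_CONFIG__[realm]__[key] conventions.
# Returns an empty list for names that do not match.
# parse_env_name is a hypothetical helper, not part of Activator::Config.
sub parse_env_name {
    my ($name) = @_;
    return ( $1, $2 )        if $name =~ /^ACT_CONFIG__(.+?)__(.+)$/;
    return ( 'default', $1 ) if $name =~ /^ACT_CONFIG_(.+)$/;
    return;
}

# A stand-in for %ENV so the example does not depend on the real environment.
my %fake_env = (
    ACT_CONFIG_foo       => 'a',
    ACT_CONFIG__bar__foo => 'b',
    HOME                 => '/home/me',
);

my %by_realm;
for my $name ( sort keys %fake_env ) {
    my ( $realm, $key ) = parse_env_name($name) or next;    # skip unrelated names
    $by_realm{$realm}{$key} = $fake_env{$name};
}

print "default foo = $by_realm{default}{foo}\n";
print "bar foo     = $by_realm{bar}{foo}\n";
```

The realm form is tested first because a name such as ACT_CONFIG__bar__foo would otherwise also match the plain form with the key _bar__foo.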

The "Reserved Arguments" listed in the "COMMAND LINE ARGUMENTS" section also have corresponding environment variables with only skip_env being slightly different:

 ACT_CONFIG_skip_env     : set to 1 to skip, or 0 (or don't set it at all) to
                        not skip
 ACT_CONFIG_project      : same as command line argument
 ACT_CONFIG_realm        : same as command line argument
 ACT_CONFIG_conf_path    : same as command line argument

Automatically Imported Environment Variables

Since they tend to be generally useful, the following environment variables are automatically imported into your configuration:

  • HOME

  • USER

It is "FUTURE WORK" to make these cross-platform compatible.

CONFIGURATION FILES

Currently, you can put your YAML configuration files wherever you like, but you must set the conf_path key inside your configuration files, then set the environment variable ACT_CONFIG_conf_path or use the --conf_path option. The way this currently works is somewhat wonky, and it'll get fixed Real Soon Now.

This path behaves the same as a bash shell $PATH, in that you can set this to one or more colon separated fully qualified path values. Note that the leftmost path takes precedence when processing config files.
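
The $PATH-like lookup can be sketched as follows. find_in_conf_path is a hypothetical helper written for this illustration, and the temporary directories only stand in for real configuration directories; the module's own search code may differ.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Return the first existing file named $file in a colon separated
# $conf_path, or nothing if no directory contains it.
# find_in_conf_path is a hypothetical helper, not part of Activator::Config.
sub find_in_conf_path {
    my ( $conf_path, $file ) = @_;
    for my $dir ( split /:/, $conf_path ) {
        my $candidate = File::Spec->catfile( $dir, $file );
        return $candidate if -f $candidate;
    }
    return;
}

# Demo: two temporary directories that both contain an org.yml.
my @dirs = ( tempdir( CLEANUP => 1 ), tempdir( CLEANUP => 1 ) );
for my $dir (@dirs) {
    open my $fh, '>', File::Spec->catfile( $dir, 'org.yml' )
        or die "cannot write org.yml: $!";
    print {$fh} "default: {}\n";
    close $fh;
}

my $found = find_in_conf_path( join( ':', @dirs ), 'org.yml' );
print "using $found\n";    # the copy in the leftmost directory
```

Returning on the first hit is what gives the leftmost directory precedence when the same file name exists in several directories.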

Configuration File Hierarchy

In order to facilitate the varied ways in which software is developed, deployed, and used, the following list shows the supported configuration file hierarchy from highest precedence to lowest:

  $ENV{USER}.yml - user specific settings
  <realm>.yml    - realm specific settings and defaults
  <project>.yml  - project specific settings and defaults
  org.yml        - top level organization settings and defaults

It is up to the script using this module to define what project is, and up to the project to define what realms exist, which all could come from any of the command line options, environment variables or configuration files. All of the above files are optional and will be ignored if they don't exist.

Realm Configuration Files

This module supports the concept of realms to allow multiple similar configurations to override only the essential keys. This allows you to have a very large default project configuration file, and for each realm a very small configuration file overriding only the few keys that vary between realms (for example db connection, email defaults, apache settings, cookie domain).

A common configuration directory will have the following files:

  <user>.yml files
  qa.yml
  dev.yml
  prod.yml

Using the --realm option or the ACT_CONFIG_realm environment variable set to qa, dev or prod will cause the matching <realm>.yml file to be used during configuration file processing, in addition to any realm specific keys in any other config files being utilized.

CONFIGURATION FILE FORMAT

The format for configuration files is YAML. In addition to YAML's requirements, you must define top level realms within your YAML files.

When passing a realm to "get_config()" (or via the --realm command line argument), values for the realm take precedence over the default realm's values. For example, given YAML:

  default:
    key1: value1
  realm:
    key1: value2

Activator::Config->get_config( \@ARGV ) would return:

  $config = { key1 => 'value1' }

and Activator::Config->get_config( \@ARGV, 'realm' ) would return:

  $config = { key1 => 'value2' }
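
The realm lookup above can be approximated with a shallow hash merge. This is a sketch under assumptions: config_for_realm is a hypothetical helper, $yaml is the structure a YAML parser would produce for the example file, and nested values are not merged recursively here.

```perl
use strict;
use warnings;

# The example YAML, as the Perl structure a YAML parser would produce.
my $yaml = {
    default => { key1 => 'value1' },
    realm   => { key1 => 'value2' },
};

# Start from the default realm, then let the requested realm's keys win.
# config_for_realm is a hypothetical helper, not part of Activator::Config.
sub config_for_realm {
    my ( $data, $realm ) = @_;
    $realm = 'default' unless defined $realm;
    my %merged = ( %{ $data->{default} || {} }, %{ $data->{$realm} || {} } );
    return \%merged;
}

print "default: ", config_for_realm($yaml)->{key1}, "\n";
print "realm:   ", config_for_realm( $yaml, 'realm' )->{key1}, "\n";
```

A realm that does not define a key falls back to the default realm's value, which is what lets realm files stay small.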

Overrides Format

Sometimes it is desirable to override the generated value after merging several configuration files. The special realm overrides can be utilized in these cases, and will stomp any values that come from YAML configurations. For example, given YAML:

  default:
    name: David Davidson from Deluth, Delaware
  some_realm:
    name: Sally Samuelson from Showls, South Carolina
  other_realm:
    name: Ollie Oliver from Olive Branch, Oklahoma
  overrides:
    default:
      name: Ron Johnson from Ronson, Wisconson
    some_realm:
      name: Johnny Jammer, the Rhode Island Hammer

Activator::Config->get_config( \@ARGV ) would return:

  $config = { name => 'Ron Johnson from Ronson, Wisconson', }

Activator::Config->get_config( \@ARGV, 'some_realm' ) would return:

  $config = { name => 'Johnny Jammer, the Rhode Island Hammer' }

Activator::Config->get_config( \@ARGV, 'other_realm' ) would return:

  $config = { name => 'Ollie Oliver from Olive Branch, Oklahoma' }
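
One way to picture the three results above is a merge in which the matching entry of the overrides realm is applied last. This sketch only reproduces those three results; config_with_overrides is a hypothetical helper, and whether default overrides also propagate to other realms is not modeled here.

```perl
use strict;
use warnings;

# The example YAML as a Perl structure.
my $yaml = {
    default     => { name => 'David Davidson from Deluth, Delaware' },
    some_realm  => { name => 'Sally Samuelson from Showls, South Carolina' },
    other_realm => { name => 'Ollie Oliver from Olive Branch, Oklahoma' },
    overrides   => {
        default    => { name => 'Ron Johnson from Ronson, Wisconson' },
        some_realm => { name => 'Johnny Jammer, the Rhode Island Hammer' },
    },
};

# Merge default, then the realm, then that realm's overrides entry.
# config_with_overrides is a hypothetical helper, not part of Activator::Config.
sub config_with_overrides {
    my ( $data, $realm ) = @_;
    $realm = 'default' unless defined $realm;
    my $overrides = $data->{overrides} || {};
    my %merged = (
        %{ $data->{default}      || {} },
        %{ $data->{$realm}       || {} },
        %{ $overrides->{$realm}  || {} },    # applied last, so it always wins
    );
    return \%merged;
}

for my $realm ( undef, 'some_realm', 'other_realm' ) {
    my $label = defined $realm ? $realm : 'default';
    print "$label: ", config_with_overrides( $yaml, $realm )->{name}, "\n";
}
```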

How to NOT use realms

If you don't need realms for a particular config file (as is often the case with the <project>.yml file), use the special key act_config_no_realms. Example:

  act_config_no_realms:
  this_key: is in the default realm
  this_one: too
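
Conceptually, such a file behaves as if its keys had been written under the default realm. The following sketch shows that reading; normalize_realms is a hypothetical helper, and how the module treats the special key internally is an assumption.

```perl
use strict;
use warnings;

# If the parsed file carries the act_config_no_realms key, treat every
# other top level key as belonging to the default realm.
# normalize_realms is a hypothetical helper, not part of Activator::Config.
sub normalize_realms {
    my ($data) = @_;
    return $data unless exists $data->{act_config_no_realms};
    my %keys = %$data;                      # copy, leaving the input untouched
    delete $keys{act_config_no_realms};
    return { default => \%keys };
}

# The example file above, as a YAML parser would return it.
my $parsed = {
    act_config_no_realms => undef,
    this_key             => 'is in the default realm',
    this_one             => 'too',
};

my $normalized = normalize_realms($parsed);
print "$_: $normalized->{default}{$_}\n" for sort keys %{ $normalized->{default} };
```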

CONFIGURATION LOGIC SUMMARY

  • All configuration files are read and merged together with higher precedence configuration files overriding lower precedence on a realm by realm basis.

    If identically named files exist in the conf_path for any level (user, realm, project, organization), only the first discovered file is used. Put another way, the leftmost path in the conf_path takes precedence for any file name conflict.

  • The default realm is merged into each realm (realm's values taking precedence).

  • All default realm environment variables override all values for each realm (excepting the overrides realm).

  • All specific realm environment variables override that realm's values.

  • The default realm overrides section is used to override matching keys in each realm.

  • The specific realm overrides section is used to override matching keys in that realm.

  • Any command line options given override ALL matching keys for ALL realms.

  • # TODO: NOT YET IMPLEMENTED

    Perform variable substitution

METHODS

get_config()

Process command line arguments, environment variables and configuration files then return a hashref representing the merged configuration. Recognized configuration items are removed from @ARGV.

Usage: Activator::Config->get_config( \@ARGV, $realm, $project_is_arg );

$realm is optional (default is 'default'). If undefined, it will be determined from a command line option or environment variable.

$project_is_arg is optional. Use any true value for this argument if your script requires the project name as the last bareword argument.

Examples:

  #
  # get options for default realm
  #
  my $config = Activator::Config->get_config( \@ARGV );

  #
  # get options for 'some' realm, ignoring --realm and ACT_CONFIG_realm
  #
  my $config = Activator::Config->get_config( \@ARGV, 'some' );

  #
  # don't ignore --realm and ACT_CONFIG_realm, use $barewords[-1] (the
  # last bareword argument) as the project
  #
  Activator::Config->get_config( \@ARGV, undef, 1 );

See "get_args()" for a description of the way command line arguments are processed.

If called repeatedly, this sub does NOT reprocess \@ARGV. This allows you to make multiple calls to get a reference to the config for multiple realms if desired.

get_args()

Takes a reference to a list of command line arguments (usually \@ARGV) and returns a list consisting of an options hashref and a barewords arrayref. $argv_raw is not changed.

Usage: Activator::Config->get_args( $argv_raw )

  • Arguments can be barewords, '-' notation or '--' notation.

  • Any arguments after the arguments terminator symbol (a plain '--' argument) are returned as barewords. Bareword order of specification is maintained.

  • Values with spaces must be double-quoted, and can themselves contain quotes

      --mode="sliding out of control"
      --plan="pump the "brakes" vigorously"

  • Flag arguments are counted. That is, -v -v would set $config->{v} = 2.

  • Argument bundling is not supported.

Examples:

  @ARGV                | Value returned
 ----------------------+-----------------------------------------
  --arg                | $argv = { arg => 1 }
  --arg --arg          | $argv = { arg => 2 }
  --arg=val            | $argv = { arg => 'val' }
  --arg=val --arg=val2 | $argv = { arg => [ 'val', 'val2' ] }
  --arg="val val"      | $argv = { arg => 'val val' }

Returns array: ( $args_hashref, $barewords_arrayref )

Throws Activator::Exception::Config when an argument is invalid (which at this time is only when a bareword argument of '=' is detected).
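
The rules and table above can be approximated by a small stand-alone parser. This is a sketch, not the module's code: parse_args is a hypothetical name, it uses a plain die instead of Activator::Exception::Config, and it does not handle the same key being used both as a flag and with a value.

```perl
use strict;
use warnings;

# Parse a list of raw arguments into ( \%options, \@barewords ).
# parse_args is a hypothetical helper, not part of Activator::Config.
sub parse_args {
    my ($argv_raw) = @_;
    my ( %opts, @barewords );
    my @args = @$argv_raw;    # work on a copy so the caller's array is untouched

    while ( defined( my $arg = shift @args ) ) {
        if ( $arg eq '--' ) {    # terminator: everything after it is a bareword
            push @barewords, @args;
            last;
        }
        if ( $arg =~ /^--?([^-=][^=]*)(?:=(.*))?$/s ) {
            my ( $key, $value ) = ( $1, $2 );
            if ( !defined $value ) {
                $opts{$key}++;                       # flags are counted
            }
            else {
                $value =~ s/^"(.*)"$/$1/s;           # drop surrounding double quotes
                if    ( !exists $opts{$key} )          { $opts{$key} = $value }
                elsif ( ref $opts{$key} eq 'ARRAY' )   { push @{ $opts{$key} }, $value }
                else                                   { $opts{$key} = [ $opts{$key}, $value ] }
            }
            next;
        }
        die "invalid bareword argument '='\n" if $arg eq '=';
        push @barewords, $arg;
    }
    return ( \%opts, \@barewords );
}

my ( $opts, $barewords ) =
    parse_args( [ '-v', '-v', '--foo=bar', 'lookup', '--', 'beans' ] );
print "v=$opts->{v} foo=$opts->{foo} barewords=@$barewords\n";
```

Repeated valued options are collected into an arrayref on the second occurrence, which matches the --arg=val --arg=val2 row of the table.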

DEBUG MODE

Since this module is part of Activator, you can set your Activator::Log level to DEBUG to see how your $config is generated.

 #### TODO: in the future, there needs to be a 'lint' hash within the
 #### realm that says where every variable came from.

COOKBOOK

 #### TODO: these examples are probably complete baloney at this point.

This section gives some examples of how to utilize this module. Each section below (cleverly) assumes we are writing a Cookbook application that can fetch recipes from a database.

End User

Use Case: A user has a CPAN module that provides cookbook.pl to look up recipes from a database. The project installs these files:

  /etc/cookbook.d/org.yml
  /usr/lib/perl5/site-perl/Cookbook.pm
  /usr/bin/cookbook.pl

org.yml has the following data:

  ---
  default:
    db_name:   cookbook
    db_user:   chef
    db_passwd: southpark

The user can run the script as such:

  #### list recipes matching beans in the organization's public db
  #### using the public account
  cookbook.pl lookup beans

  #### lookup beans in user's db
  cookbook.pl --db_name=my_db  \
              --db_user=cookie \
              --db_passwd=cheflater  lookup beans

  #### user creates $HOME/$USER.yml
  cookbook.pl --conf_file=$HOME/$USER.yml lookup beans

  #### user creates $HOME/.cookbook.d
  cookbook.pl lookup beans

Simple Development

Use Case: developer is working on cookbook.pl. Project directory looks like:

  $HOME/src/Cookbook/lib/Cookbook.pm
  $HOME/src/Cookbook/bin/cookbook.pl
  $HOME/src/Cookbook/etc/cookbook.d/org.yml
  $HOME/src/Cookbook/.cookbook.d/$USER.yml

With these configurations:

  org.yml:
  ---
  default:
    db_name:   cookbook
    db_user:   chef
    db_passwd: southpark

  $USER.yml
  ---
  default:
    db_name:   $USER
    db_user:   $USER
    db_passwd: passwd
  staging:
    db_name:   staging
    db_user:   test
    db_passwd: test

  #### when developing, call the script like this to lookup bean
  #### recipes from the developer's personal db
  cd $HOME/src/Cookbook
  bin/cookbook.pl lookup beans

  #### To demo the project to someone else, developer creates a demo
  #### account, which has the environment variable ACT_CONFIG_realm set
  #### to 'staging'. demo user then uses the script as if it were
  #### installed, but connects to the staging database:
  cookbook.pl lookup beans

  #### if the developer wants to see what the demo user sees:
  cd $HOME/src/Cookbook
  bin/cookbook.pl --realm=staging lookup beans

TODO: complex development

Someday, we'll have a really neat example of all the goodness this module is capable of.

FUTURE WORK

  • Make sure that "Automatically Imported Environment Variables" are cross platform compatible.

  • Don't force the conf_path arg: default to something like ~/.activator so a user can have default settings. Further, activator.pl should support a configuration wizard for this file.

  • Clean up cookbook

SEE ALSO

 Activator::Exception
 Activator::Log

AUTHOR

Karim A. Nassar

COPYRIGHT

Copyright (c) 2007 Karim A. Nassar <karim.nassar@acm.org>

You may distribute under the terms of either the GNU General Public License or the Artistic License, as specified in the Perl README file.