NAME

Config::Parser - base class for configuration file parsers

DESCRIPTION

Config::Parser provides a framework for writing configuration file parsers. It is an intermediate layer between the abstract syntax tree (Config::AST) and the implementation of a parser for a particular configuration file format.

It takes a definition-by-example approach: the implementer creates a derived class that implements a parser on top of Config::Parser. Application writers put an example configuration file in the __DATA__ section of their application; this example defines the statements that are allowed in a valid configuration. The example is then processed by the parser implementation to create an instance of the parser, which is then used to process the actual configuration file.

Let's illustrate this with a practical example. Suppose you need a parser for a simple configuration file that consists of keyword/value pairs. In each pair, the keyword is separated from the value by an equals sign. Pairs are delimited by newlines. Leading and trailing whitespace on a line is ignored, as are empty lines. Comments begin with a hash sign and end with a newline.
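The format just described can be sketched in a few lines of standalone Perl. This is only an illustration of the grammar, not the way a Config::Parser subclass is written: the helper name parse_kv_text is hypothetical, it builds a plain list instead of a Config::AST tree, and it assumes a hash sign always starts a comment, even inside a value.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Config::Parser: parse keyword/value
# text and return a list of [keyword, value, line_number] triples.
sub parse_kv_text {
    my ($text) = @_;
    my @pairs;
    my $lineno = 0;
    for my $line (split /\n/, $text) {
        $lineno++;
        $line =~ s/#.*//;          # a comment runs from '#' to end of line
        $line =~ s/^\s+|\s+$//g;   # ignore leading and trailing whitespace
        next if $line eq '';       # ignore empty lines
        if ($line =~ /^([^=\s]+)\s*=\s*(.*)$/) {
            push @pairs, [ $1, $2, $lineno ];
        } else {
            die "line $lineno: expected 'keyword = value'\n";
        }
    }
    return @pairs;
}

my @pairs = parse_kv_text(<<'EOT');
# sample configuration
basedir = /var/lib/app
  mode = 0644

size = 10
EOT
printf "%s => %s (line %d)\n", @$_ for @pairs;
```

The line number is kept with each pair because a real parser needs it to report the location of a faulty statement.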

You create the class, say Config::Parser::KV, inherited from Config::Parser. The method parse in this class implements the actual parser.

The application writer decides which keywords are allowed in a valid configuration file and what their values are, and describes them in the __DATA__ section of their program (normally in a class derived from Config::Parser::KV), in the same format as the actual configuration file. For example:

  __DATA__
  basedir = STRING :mandatory
  mode = OCTAL
  size = NUMBER :array

This excerpt defines a configuration with three allowed statements. Uppercase values to the right of the equals sign are data types. Values starting with a colon are flags that define the semantics of the values. The basedir keyword takes a string as its argument and must be present in a valid configuration. The mode keyword expects an octal number as its argument. The size keyword takes a number. Multiple size statements are collapsed into an array.

To parse the actual configuration file, the programmer creates an instance of the Config::Parser::KV class, passing it the name of the file as its argument:

  $cf = new Config::Parser::KV($filename);

This call first parses the __DATA__ section and builds validation rules, then it parses the actual configuration from $filename. Finally, it applies the validation rules to the created syntax tree. If all rules pass, the configuration is correct and the constructor returns a valid object. Otherwise, it issues proper diagnostics and croaks.

Upon successful return, the $cf object is used to obtain the actual configuration values as needed.

Notice that syntax declarations in the __DATA__ section always follow the actual configuration file format. That is why we call this approach definition by example. For instance, the syntax definition for a configuration file in Apache-like format would look like

  __DATA__
  <section ANY>
     basedir STRING :mandatory
     mode OCTAL
     size NUMBER :array
  </section>

CONSTRUCTOR

$cfg = new Config::Parser(%hash)

Creates a new parser object. Keyword arguments are:

filename

Name of the file to parse. If supplied, the constructor will call the parse and commit methods automatically and will croak if the latter returns false. The parse method is given filename, line and fh keyword-value pairs (if present) as its arguments.

If not supplied, the caller is supposed to call both methods later.

line

Optional line where the configuration starts in filename. It is used to keep track of statement location in the file for correct diagnostics. If not supplied, 1 is assumed.

Valid only together with filename.

fh

File handle to read from. If it is not supplied, a new handle will be created by using open on the supplied filename.

Valid only together with filename.

lexicon

Dictionary of allowed configuration statements in the file. You will not need this parameter. It is listed here for completeness' sake. Refer to the Config::AST constructor for details.

USER HOOKS

These are the methods provided for implementers to do any implementation-specific tasks. Default implementations are empty placeholders.

$cfg->init

Called after creation of the base object, when parsing of the syntax definition has finished. Implementers can use it to do any implementation-specific initialization.

$cfg->mangle

Called after successful parsing. It can be used to modify the created source tree.

PARSER METHODS

The following two methods are inherited from Config::AST. They are called internally by the constructor if the file name is supplied.

$cfg->parse($filename, %opts)

Parses the configuration from $filename. Optional arguments are:

fh

File handle to read from. If it is not supplied, a new handle will be created by using open on the supplied filename.

line

Line to start numbering of lines from. It is used to keep track of statement location in the file for correct diagnostics. If not supplied, 1 is assumed.

$cfg->commit

Finalizes the syntax tree. Returns true on success, and false on errors.

SYNTAX DEFINITION

A syntax definition is a textual description of the statements allowed in a configuration file. It is written in the format of the configuration file itself and is parsed using the same object (a derivative of Config::Parser) that will be used later to parse the actual configuration.

Syntax definitions are gathered from the __DATA__ blocks of subclasses of Config::Parser.

In a syntax definition, the value of each statement consists of an optional data type followed by zero or more options delimited with whitespace.

Valid data types are:

STRING

String value.

NUMBER or DECIMAL

Decimal number.

OCTAL

Octal number.

HEX

Hex number.

BOOL or BOOLEAN

Boolean value. Allowed values are yes, true, on, t, and 1 for true, and no, false, off, nil, and 0 for false.

If the data type is omitted, no checking is performed unless specified otherwise by other options (see the :re and :check options below).

Options are special names prefixed with a colon. Option names correspond to the keys of a keyword lexicon value in Config::AST. An option can be followed by an equals sign and its value. If an option is used without arguments, the value 1 is implied.

Any word not recognized as an option or its value starts the default value.

Available options are described below:

:mandatory

Marks the statement as a mandatory one. If such a statement is missing from the configuration file, the parser action depends on whether the default value is supplied. If it is, the statement will be inserted in the parse tree with the default value. Otherwise, a diagnostic message will be printed and the constructor will return undef.

:default

Argument supplies the default value for this setting.

:array

If the value is 1, declares that the statement is an array. Multiple occurrences of the statement will be accumulated. They can be retrieved as a reference to an array when the parsing is finished.

:re = string

Defines a regular expression which the value must match in order to be accepted. This provides a more elaborate mechanism of checking than the data types. In fact, data types are converted to the appropriate :re options internally, for example OCTAL becomes :re = "^[0-7]+$". If data type and :re are used together, :re takes precedence.

:select = method

Argument is the name of a method to call in order to decide whether to apply this definition. The method will be called as

  $cfg->$method($node, @path)

where $node is the Config::AST::Node::Value object (use $node->value to obtain the actual value), and @path is its pathname.

:check = method

Argument is the name of a method which will be invoked after parsing the statement in order to verify its value. This provides the most flexible way of verification (the other two being the :re option and data type declaration). The method will be invoked as follows:

  $cfg->$method($valref, $prev_value, $locus)

where $valref is a reference to the value, and $prev_value is the value of the previous instance of this setting. The method must return true if the value is OK for that setting. In that case, it is allowed to modify the value referenced by $valref. If the value is erroneous, the method must issue an appropriate error message using $cfg->error, and return 0.

To specify options for a section, use the reserved keyword __options__. Its value is the list of options as described above. After processing, the keyword itself is removed from the lexicon.
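The relationship between data types and the :re option can be sketched as a lookup table of regular expressions. Only the OCTAL pattern is taken from the description of :re above; the other patterns, the table name %type_re, and the helper value_ok are assumptions made for illustration and do not reflect the module's internals.

```perl
use strict;
use warnings;

# Only the OCTAL pattern comes from the documentation; the other
# patterns are illustrative guesses, not the module's own expressions.
my %type_re = (
    OCTAL   => qr/^[0-7]+$/,
    NUMBER  => qr/^\d+$/,                    # assumption
    DECIMAL => qr/^\d+$/,                    # assumption
    HEX     => qr/^(?:0x)?[0-9a-fA-F]+$/,    # assumption
);

# Hypothetical helper: an explicit :re takes precedence over the
# pattern implied by the data type.
sub value_ok {
    my ($value, $type, $re) = @_;
    my $pattern = defined($re)   ? qr/$re/
                : defined($type) ? $type_re{$type}
                :                  undef;
    return 1 unless defined $pattern;   # nothing to check against
    return $value =~ $pattern ? 1 : 0;
}

print value_ok('0644', 'OCTAL'), "\n";
print value_ok('0689', 'OCTAL'), "\n";
print value_ok('0689', 'OCTAL', '^\d{4}$'), "\n";
```

In the last call the value is not a valid octal number, yet it is accepted because the explicit expression overrides the one implied by the type.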

OTHER METHODS

$cfg->check($valref, $prev, $locus)

This method implements syntax checking and translation for BOOLEAN data types. If $$valref is one of the valid boolean values (as described above), it translates it to 1 or 0, stores that value in $$valref, and returns 1. Otherwise, it emits an error message using $cfg->error and returns 0.
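The behaviour described for check can be approximated by the following standalone sketch. The package My::BoolCheck and its error method are hypothetical stand-ins that only collect messages; the real $cfg->error may take different arguments, and the sketch assumes that matching is case-sensitive.

```perl
use strict;
use warnings;

package My::BoolCheck;   # hypothetical stand-in for a parser class

# Words accepted as boolean values, mapped to their translation.
my %bool = (
    (map { $_ => 1 } qw(yes true on t 1)),
    (map { $_ => 0 } qw(no false off nil 0)),
);

sub new { return bless { errors => [] }, shift }

# Stand-in for $cfg->error: it only records the message.
sub error {
    my ($self, $msg) = @_;
    push @{ $self->{errors} }, $msg;
    return;
}

# Same calling convention as a :check method; $prev and $locus are unused.
sub check {
    my ($self, $valref, $prev, $locus) = @_;
    if (exists $bool{$$valref}) {
        $$valref = $bool{$$valref};   # translate in place to 1 or 0
        return 1;
    }
    $self->error("$$valref is not a valid boolean value");
    return 0;
}

package main;

my $cfg   = My::BoolCheck->new;
my $value = 'off';
my $ok    = $cfg->check(\$value);
print "$ok $value\n";
```

Because the method receives a reference, the caller sees the translated value after a successful check, which is what allows a :check method to normalize its input.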

SEE ALSO

Config::AST(3).