NAME

Argv - Provide an OO interface to an arg vector

SYNOPSIS

    use Argv;

    # A roundabout way of getting perl's version.
    my $pl = Argv->new(qw(perl -v));
    $pl->exec;

    # Run /bin/cat, showing how to provide "predigested" options.
    Argv->new('/bin/cat', [qw(-u -n)], @ARGV)->system;

    # A roundabout way of globbing.
    my $echo = Argv->new(qw(echo M*));
    $echo->glob;
    my $globbed = $echo->qx;
    print "'echo M*' globs to: $globbed";

    # A demonstration of the builtin xargs-like behavior.
    my @files = split(/\s+/, $globbed);
    my $ls = Argv->new(qw(ls -d -l), @files);
    $ls->parse(qw(d l));
    $ls->dbglevel(1);
    $ls->qxargs(1);
    my @long = $ls->qx;
    $ls->dbglevel(0);
    print @long;

    # A demonstration of how to use option sets in a wrapper program.
    @ARGV = qw(Who -a -y foo -r);       # hack up an @ARGV
    my $who = Argv->new(@ARGV);         # instantiate
    $who->dbglevel(1);                  # set verbosity
    $who->optset(qw(UNAME FOO WHO));    # define 3 option sets
    $who->parseUNAME(qw(a m n p));      # parse these to set UNAME
    $who->parseFOO(qw(y=s z));          # parse -y and -z to FOO
    $who->parseWHO('r');                # for the 'who' cmd
    warn "got -y flag in option set FOO\n" if $who->flagFOO('y');
    print Argv->new('uname', $who->optsUNAME)->qx;
    $who->prog(lc $who->prog);          # force $0 to lower case
    $who->exec(qw(WHO));                # exec the who cmd

More advanced examples can be lifted from the test script or the ./examples subdirectory.

RAISON D'ETRE

Argv presents an OO approach to command lines, allowing you to instantiate an 'argv object', manipulate it, and eventually execute it, e.g.:

    my $ls = Argv->new('ls', ['-l']);
    my $rc = $ls->system;       # or $ls->exec or $ls->qx

Which raises the immediate question - what value does this mumbo-jumbo add over Perl's native support such as:

    my $rc = system(qw(ls -l));

The answer comes in a few parts:

  • STRUCTURE

    First, Argv recognizes the underlying property of an arg vector, which is that it typically begins with a program name potentially followed by options, then operands. An Argv object factors a raw argv into these three groups, provides accessor methods to allow operations on each group independently, and can then put them back together for execution.

  • OPTION SETS

    Second, Argv encapsulates and extends Getopt::Long to allow parsing of the argv's options into different option sets. This is useful in the case of wrapper programs which may, for instance, need to parse out one set of flags to direct the behavior of the wrapper itself, extract a different set and pass them to program X, another for program Y, then exec program Z with the remainder. Doing this kind of thing on a basic @ARGV using indexing and splicing is do-able but leads to spaghetti-ish code.

  • EXTRA FEATURES

    The execution methods system, exec, and qx extend their Perl builtin analogues in a few ways, for example:

    1. An xargs-like capability without shell intervention.
    2. UNIX-like exec() behavior on Windows.
    3. Automatic quoting of system() on Win32 and qx() everywhere
    4. Automatic globbing (primarily for Windows)
    5. Automatic chomping.
    6. Pathname normalization.

All of these behaviors can be toggled, either as class or instance attributes. See EXECUTION ATTRIBUTES below.

DESCRIPTION

An Argv object treats a command line as 3 separate entities: the program, the options, and the args. The options may be further subdivided into user-defined option sets by use of the optset method. When one of the execution methods is called, the parts are reassembled into a single list and passed to the underlying Perl execution function.

Compare this with the way Perl works natively, keeping the 0th element of the argv in $0 and the rest in @ARGV.

By default there's one option set, known as the anonymous option set, whose name is the null string. All parsed options go there. The advanced user can define more option sets, parse options into them according to Getopt::Long-style descriptions, query or set the parsed values, and then reassemble them in any way desired at exec time. Declaring an option set automatically generates a set of methods for manipulating it (see below).

All argument-parsing within Argv is done via Getopt::Long.
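As a rough illustration of the option-set idea, the sketch below uses only Getopt::Long to peel two groups of flags off one argument list. It is not Argv's implementation: the helper name extract_set and the sample argument list are assumptions made up for this example, and it relies on Getopt::Long's pass_through configuration to leave unrecognized flags in place.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray :config pass_through);

# Hypothetical helper (not part of Argv): remove the options described by
# @descs from @$argv and return them as a hash. Because of pass_through,
# anything Getopt::Long does not recognize is left in @$argv.
sub extract_set {
    my ($argv, @descs) = @_;
    my %found;
    GetOptionsFromArray($argv, \%found, @descs);
    return %found;
}

my @argv  = qw(-a -y foo -r file1);              # made-up command line
my %foo   = extract_set(\@argv, qw(y=s z));      # one "option set"
my %uname = extract_set(\@argv, qw(a m n p));    # another one

print "FOO set:   y=$foo{y}\n";
print "UNAME set: ", join(' ', sort keys %uname), "\n";
print "remaining: @argv\n";
```

Each pass removes only the flags it knows about, so whatever is left can be handed to a third program unchanged.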

AUTOLOADING

Argv employs the same technique made famous by the Shell module to allow any command name to be used as a method. E.g.

        $obj->date->exec;

will run the 'date' command. Internally this is translated into

        $obj->argv('date')->exec;
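The general AUTOLOAD technique can be sketched in a few lines. The class Cmd and its line method below are hypothetical and exist only for this illustration; Argv's real internals may differ.

```perl
use strict;
use warnings;

package Cmd;    # hypothetical class, for illustration only

our $AUTOLOAD;

sub new {
    my $class = shift;
    return bless { argv => [@_] }, $class;
}

# Replace the stored command line and return the object for chaining.
sub argv {
    my $self = shift;
    $self->{argv} = [@_];
    return $self;
}

# Render the stored command line as a single string.
sub line {
    my $self = shift;
    return join ' ', @{ $self->{argv} };
}

# Any unknown method name is treated as a command name.
sub AUTOLOAD {
    my $self = shift;
    (my $name = $AUTOLOAD) =~ s/.*:://;    # strip the package qualifier
    return if $name eq 'DESTROY';          # object destruction is not a command
    return $self->argv($name, @_);
}

package main;

my $obj = Cmd->new;
print $obj->date('-u')->line, "\n";
```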

FUNCTIONAL INTERFACE

Because the extensions to system/exec/qx described here may be useful in writing portable programs, they're made available for export as traditional functions. Thus:

    use Argv qw(system exec qv);

will override the Perl builtins. There's no way to override the operator qx() so an alias qv() is provided.

CONSTRUCTOR

    my $obj = Argv->new(@list)

The @list is what will be parsed/executed/etc by subsequent method calls. During initial construction, the first element of the list is separated off as the program; the rest is lumped together as part of the args until and unless option parsing is done, in which case matched options are shifted into collectors for their various option sets. You can also create a "predigested" instance by passing any or all of the prog, opt, or arg parts as array refs. E.g.

    Argv->new([qw(cvs ci)], [qw(-l -n)], qw(file1 file2 file3));

Predigested options are placed in the default (anonymous) option set.

The constructor can be used as a class or instance method. In the latter case the new object is a deep (full) clone of its progenitor. In fact 'clone' is aliased to 'new', allowing clones to be created via:

        my $copy = $orig->clone;

The first argument to new() or clone() can be a hash-ref, which will be used to set execution attributes at construction time. I.e.:

    my $obj = Argv->new({autochomp => 1, stderr => 0}, @ARGV);

You may also choose to add the command line later:

    my $obj = Argv->new;
    $obj->prog('cat');
    $obj->args('/etc/motd');

Or

    my $obj = Argv->new({autochomp=>1});
    my $motd = $obj->argv(qw(cat /etc/motd))->qx;

Or (using the autoloading interface)

    my $motd = $obj->cat('/etc/motd')->qx;

METHODS

INSTANCE METHODS

  • prog()

    Returns or sets the name of the program (the "argv[0]"). This can be a list, e.g. qw(rcs co) or an array reference.

  • opts()

    Returns or sets the list of options. As above, it may be passed a list or an array reference. This is simply the member of the class of optsNAME() methods (see below) whose <NAME> is null; it's part of the predefined anonymous option set.

  • args()

    Returns or sets the list of operands (aka arguments). If called in a void context and without args, the effect is to set the list of operands to ().

  • optset(<list-of-set-names>);

    For each name NAME in the parameter list, an option set of that name is declared and 3 new methods are registered dynamically: parseNAME(), optsNAME(), and flagNAME(). These methods are described below. Note that the anonymous option set (see OPTION SETS) is predefined, so the methods parse(), opts(), and flag() are always available. Most users won't need to define any other sets. Option-set names are forced to upper case. E.g.:

        $obj->optset('FOO');

  • parseNAME(...option-descriptions...)

    Takes a list of option descriptions and uses Getopt::Long::GetOptions() to parse them out of the current argv and into option set NAME. The option descriptions supported by parseNAME() are exactly the same as those described for Getopt::Long, except that no linkage argument is allowed. E.g.:

        $obj->parseFOO(qw(file=s list=s@ verbose));

  • optsNAME()

    Returns or sets the list of options in the option set NAME.

  • flagNAME()

    Sets or gets the value of a flag in the appropriate optset, e.g.:

        print "blah blah blah\n" if $obj->flagFOO('verbose');
        $obj->flagFOO('verbose' => 1);

  • extract

    Takes an optset name and a list of option descs; creates the named optset, extracts any of the named options, places them in the specified optset, and returns them.

  • quote(@list)

    Protects the argument list against exposure to a shell by quoting each element. This method is invoked automatically by the system method on Windows platforms, where the underlying system primitive always uses a shell, and by the qx method on all platforms since it invokes a shell on all platforms.

    The automatic use of quote can be turned off via the autoquote method (see EXECUTION ATTRIBUTES).

    IMPORTANT: this method quotes its argument list in place. In other words, it may modify its arguments.

  • glob

    Expands the argument list using the Perl glob builtin. Primarily useful on Windows where the invoking shell does not do this for you.

    Automatic use of glob on Windows can be enabled via the autoglob method (vide infra).

  • dump

    A no-op except for printing the state of the invoking instance to stderr. Potentially useful for debugging in situations where access to perl -d is limited, e.g. across a socket connection or in a crontab. Invoked automatically at dbglevel=3.
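The dynamic registration of per-set methods described under optset can be approximated with closures installed into the symbol table. This is only a sketch under assumptions: the class OptSets is hypothetical, and only an optsNAME() accessor is generated here, not the parseNAME() or flagNAME() methods.

```perl
use strict;
use warnings;

package OptSets;    # hypothetical class, for illustration only

sub new {
    my $class = shift;
    return bless { sets => {} }, $class;
}

# Declare option sets; for each NAME install an optsNAME() accessor.
sub optset {
    my ($self, @names) = @_;
    for my $raw (@names) {
        my $name = uc $raw;                  # set names are forced to upper case
        $self->{sets}{$name} ||= [];
        next if $self->can("opts$name");     # accessor already installed
        my $accessor = sub {
            my $obj = shift;
            $obj->{sets}{$name} = [@_] if @_;
            return @{ $obj->{sets}{$name} || [] };
        };
        no strict 'refs';                    # allow a symbolic glob reference
        *{"OptSets::opts$name"} = $accessor;
    }
    return $self;
}

package main;

my $obj = OptSets->new->optset(qw(foo bar));
$obj->optsFOO(qw(-y val));
print join(' ', $obj->optsFOO), "\n";
```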

EXECUTION METHODS

The three methods below are direct analogues of the Perl builtins. They simply reassemble a command line from the prog, opts, and args parts according to the option-set rules described below and invoke their builtin equivalent on it.

  • system([<optset-list>])

    Reassembles the argv and invokes system(). Return value and value of $?, $!, etc. are just as described in perlfunc/"system".

    Arguments to this method determine which of the parsed option-sets will be used in the executed argv. If passed no arguments, $obj->system uses the value of the 'dfltsets' attribute as the list of desired sets. The default value of 'dfltsets' is the anonymous option set.

    An option set may be requested by passing its name (with an optional leading '+') or explicitly rejected by using its name with a leading '-'. Thus, given the existence of option sets ONE, TWO, and THREE, the following are legal:

        $obj->system;                       # use the anonymous set only
        $obj->system('+');                  # use all option sets
        $obj->system(qw(ONE THREE));        # use sets ONE and THREE

    The following sequence would also use sets ONE and THREE.

        $obj->dfltsets({ONE => 1, THREE => 1});
        $obj->system;

    while this would use all parsed options:

        $obj->dfltsets({'+' => 1});
        $obj->system;

    and this would set the default to none class-wide, and then use it:

        Argv->dfltsets({'-' => 1});
        $obj->system;

    By default the $obj->system method autoquotes its arguments iff the platform is Windows and the arguments are a list, because in this case a shell is always used. This behavior can be toggled with $obj->autoquote. Note: if and when Perl 5.6 fixes this "bug", Argv will be changed to examine the value of $].

  • exec()

    Similar to system above, but never returns. On Windows, if the execwait attribute is set, it blocks until the new process finishes, which is more UNIX-like than the exec implemented by the Windows C runtime library. This is actually implemented as

        exit $obj->system(LIST);

    and thus all system shell-quoting issues apply.

    Option sets are handled as described in system above.

  • qx()

    Same semantics as described in perlfunc/"qx" but has the capability to process only a set number of arguments at a time to avoid exceeding the shell's line-length limit. This value is settable with the qxargs method.

    Also, if autoquote is set the arguments are quoted to protect them against the platform-standard shell on all platforms.

    Option sets are handled as described in system above.
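The xargs-like batching that qxargs controls comes down to slicing the operand list into chunks and running the command once per chunk. The sketch below shows that idea in isolation; the helper in_batches is hypothetical, and a code reference stands in for the real command so the example runs anywhere.

```perl
use strict;
use warnings;

# Hypothetical helper (not Argv's implementation): call $code once per batch
# of at most $max operands and collect everything it returns. A $max of 0
# disables batching, and a short list is handled in a single call.
sub in_batches {
    my ($max, $code, @operands) = @_;
    return $code->(@operands) if $max <= 0 || @operands <= $max;
    my @results;
    while (my @batch = splice(@operands, 0, $max)) {
        push @results, $code->(@batch);
    }
    return @results;
}

# Stand-in for a real command: report how many operands each call received.
my @sizes = in_batches(3, sub { scalar @_ }, 'a' .. 'h');
print "batch sizes: @sizes\n";
```

Splitting is safe only when each batch is independent, which is why the same treatment is off by default for system.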

EXECUTION ATTRIBUTES

The behavior of the execution methods system, exec, and qx is governed by a set of execution attributes, which are in turn manipulated via a set of eponymous methods. These methods are auto-generated and thus share certain common characteristics:

  • Translucency

    They can all be invoked as class or instance methods. If used as an instance method the attribute is set only on that object; if used on the class it sets or gets the default for all instances which haven't overridden it. This is inspired by the section on translucent attributes in Tom Christiansen's perltootc tutorial.

  • Class Defaults

    Each attribute has a default which may be overridden with an environment variable named by prepending the upper-cased class name and an underscore to the attribute name, e.g. ARGV_QXARGS=256 or ARGV_STDERR=0.

  • Context Sensitivity

    The attribute value is always a scalar. If a value is passed it becomes the new value of the attribute and the object or class is returned. If no value is passed and there is a valid return context, the current value is returned. In a void context with no parameter, the attribute value is set to 1.

  • Stickiness

    A subtlety: if an execution attribute is set in a void context, that attribute is "sticky", i.e. it retains its state until explicitly changed. But if a new value is provided and the context is not void, the new value is temporary. It lasts until the next execution method (system, exec, or qx) invocation, after which the previous value is restored. This feature allows locutions like this:

            $obj->cmd('date')->stderr(1)->system;

    Assuming that the $obj object already exists and has a set of attributes, we can override one of them at execution time. More examples:

            $obj->stdout(1);          # set attribute, sticky
            $obj->stdout;             # same as above
            $foo = $obj->stdout;      # get attribute value
            $obj2 = $obj->stdout(1);  # set to 1 (temporary), return $obj

  • autochomp

    All data returned by the qx method is chomped first. Unset by default.

  • autofail

    When set, the program will exit immediately if the system or qx methods detect a nonzero status. Unset by default.

    Autofail may also be given a code-ref, in which case that function will be called upon error. This provides a basic "exception-handling" system:

        $obj->autofail(sub { print "caught an exception\n"; exit 17 });

    Any failed executions by $obj will call this handler. Alternatively, if the reference provided is an array-ref, the first element of that array is assumed to be a code-ref as above and the rest of the array is passed as args to the function on failure.

    If the reference is to a scalar, this scalar is incremented for each error as execution continues, e.g.

        my $rc = 0;
        $obj->autofail(\$rc);

  • envp

    Allows a different environment to be provided during execution of the object. This setting is in scope only for the child process and will not affect the environment of the current process. Takes a hashref:

        my %newenv = %ENV;
        $newenv{PATH} .= ':/usr/ucb';
        delete @newenv{qw(TERM LANG LD_LIBRARY_PATH)};
        $obj->envp(\%newenv);

    Subsequent invocations of $obj will add /usr/ucb to PATH and subtract TERM, LANG, and LD_LIBRARY_PATH.

  • syfail, qxfail

    Similar to autofail but apply only to system() or qx() respectively. Unset by default.

  • autoglob

    If set, the glob() function is applied to the operands ($obj->args) on Windows only. Unset by default.

  • autoquote

    If set, the operands are automatically quoted against shell expansion before system() on Windows and qx() on all platforms (since qx always invokes a shell, and system() always does so on Windows). Set by default.

  • dbglevel

    Sets the debug level. Level 0 (the default) means no debugging, level 1 prints each command before executing it, and higher levels offer progressively more output. All debug output goes to stderr.

  • dfltsets

    Sets and/or returns the default set of option sets to be used in building up the command line at execution time. The default-default is the anonymous option set. Note: this method takes a hash reference as its optional argument and returns a hash ref as well. The selected sets are represented by the hash keys; the values are meaningless.

  • execwait

    If set, $obj->exec on Windows blocks until the new process is finished for a more consistent UNIX-like behavior than the traditional Win32 Perl port. Perl just uses the Windows exec() routine, which runs the new process in the background. Set by default.

  • inpathnorm

    If set, normalizes pathnames to their native format just before executing. This is NOT set by default; even when set it's a no-op except on Windows, where it converts /x/y/z to \x\y\z.

  • outpathnorm

    If set, normalizes pathnames returned by the qx method from \-delimited to /-delimited. This is NOT set by default; even when set it's a no-op except on Windows.

  • noexec

    Analogous to the -n flag to make; prints what would be executed without executing anything.

  • qxargs

    You can set a maximum number of arguments to be processed at a time, allowing you to blithely invoke e.g. $obj->qx on a list of any size without fear of exceeding your shell's limits. A per-platform default is set; this method allows it to be changed. A value of 0 suppresses the behavior.

  • syxargs

    Analogous to qxargs but applies to system() and is turned off by default. The reason is that qx() is typically used to read data whereas system() is more often used to make stateful changes. Consider that "ls foo bar" produces the same result if broken up into "ls foo" and "ls bar" but the same cannot be said for "mv foo bar".

  • stdout

    Setting this attribute to 0, e.g.:

        $obj->stdout(0);

    causes STDOUT to be closed during invocation of any of the execution methods system, exec, and qx, and restored when they finish. A fancy (and portable) way of saying 1>/dev/null without needing a shell. A value of 2 is the equivalent of 1>&2.

  • stderr

    As above, for STDERR. A value of 1 is the equivalent of 2>&1:

        @alloutput = $obj->stderr(1)->qx;

  • quiet

    This attribute causes STDOUT to be closed during invocation of the system and exec (but not qx) execution methods. It will cause the application to run more quietly. This takes precedence over a redirection of STDOUT using the $obj->stdout method above.

  • attropts

    The above attributes can be set via method calls (e.g. $obj->dbglevel(1)) or environment variables (ARGV_DBGLEVEL=1). Use of the $obj->attropts method allows them to be parsed from the command line as well, e.g. myscript -/dbglevel 1. If invoked as a class method it causes options of the same names as the methods above to be parsed (and removed) from the current @ARGV and set as class attributes. As an instance method it parses and potentially depletes the current argument vector of that object, and sets instance attributes only. E.g.:

        Argv->attropts;

    would cause the script to parse the following command line:

        script -/noexec 1 -/dbglevel=2 -flag1 -flag2 arg1 arg2 arg3 ...

    so as to remove the -/noexec 1 -/dbglevel=2 and set the two class attrs. The -/ prefix is chosen to prevent conflicts with "real" flags. Abbreviations are allowed as long as they're unique within the set of -/ flags. As an instance method:

        $obj->attropts;

    it will parse the current value of $obj->args and run

        $obj->foo(1);

    for every instance of -/foo=1 found there.
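The translucency and class-default rules described earlier in this section can be sketched with a single hand-written accessor. Everything here is an assumption for illustration: the class Runner, the RUNNER_QXARGS variable, and the fallback value 64 are made up, and the temporary (non-sticky) behavior is not modeled.

```perl
use strict;
use warnings;

package Runner;    # hypothetical class, for illustration only

# Class-wide defaults. An environment variable named after the class
# (RUNNER_QXARGS) overrides the built-in value, mirroring ARGV_QXARGS.
my %Default = (
    qxargs => defined $ENV{RUNNER_QXARGS} ? $ENV{RUNNER_QXARGS} : 64,
);

sub new {
    my $class = shift;
    return bless {}, $class;
}

# Translucent accessor: an instance value wins if one has been set,
# otherwise the class default shows through.
sub qxargs {
    my $self  = shift;
    my $store = ref $self ? $self : \%Default;
    if (@_) {
        $store->{qxargs} = shift;
        return $self;                    # allow chaining
    }
    return exists $store->{qxargs} ? $store->{qxargs} : $Default{qxargs};
}

package main;

Runner->qxargs(128);                     # change the default for all instances
my $plain = Runner->new;                 # never overrides the attribute
my $tuned = Runner->new->qxargs(16);     # overrides it for itself only
print $plain->qxargs, " ", $tuned->qxargs, "\n";
```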

PORTING

This module is known to work on Solaris 2.5-8 and Windows 2000 SP2, and with perl 5.004_04 and 5.6. As these platforms are quite different, there should be no major portability issues, but please send bug reports or patches to the address below. Recent testing is with newer (5.6+) versions of Perl so some backporting may be necessary for older Perls.

AUTHOR

David Boyce <dsb@boyski.com>

COPYRIGHT

Copyright (c) 1999-2002 David Boyce. All rights reserved. This Perl program is free software; you may redistribute and/or modify it under the same terms as Perl itself.

SEE ALSO

perl(1), Getopt::Long(3), IPC::ChildSafe(3)