NAME

PerlX::bash - tighter integration between Perl and bash

VERSION

This document describes version 0.05 of PerlX::bash.

SYNOPSIS

    # put all instances of Firefox to sleep
    foreach (bash \lines => pgrep => 'firefox')
    {
        bash kill => -STOP => $_ or die("can't spawn `kill`!");
    }

    # count lines in $file
    my $num_lines;
    local $@;
    eval { $num_lines = bash \string => -e => wc => -l => $file };
    die("can't spawn `wc`!") if $@;

    # can capture actual exit status
    my $pattern = qr/.../;
    my $status = bash grep => -e => $pattern => $file, ">$tmpfile";
    die("`grep` had an error!") if $status == 2;

DESCRIPTION

There is one primary function, which is always exported: bash. This takes several arguments and passes them to your system's bash command (therefore, if your system has no bash--e.g. Windows--this module is useless to you). Since bash is a shell, it will run its arguments as a command, meaning that bash is functionally very similar to system. The primary advantages of bash over system are:

  • Actual bash syntax. The system command runs sh, and, even if sh on your system is just a symlink to bash, it will not respect the full bash syntax. For instance, this

        system("diff <(sort $file1) <(sort $file2)");

    will not work on your system (unless your system is super-special in some magical way), because this type of advanced bash syntax is backwards-incompatible with old Bourne shell syntax. However, this

        bash diff => "<(sort $file1)", "<(sort $file2)";

    works just fine.

  • Better return context. The return value of system is "backwards" because it returns the exit code of the command it ran, which is 0 if there were no errors, which is false, thus leading to confusing code like so:

        if (not system($cmd))
        {
            say "It worked!";
        }

    But bash returns true if the command succeeded, and false if it didn't ... in a boolean context. In other scalar contexts, it returns the numeric value of the exit code. If the command can't be run at all (it fails to launch, or it is killed by a signal), an exception is thrown instead, which can be handy if you're using the return value for something else (like capturing).

  • More capturing options. To capture the output of system, you would normally use backquotes, which return everything as a string. With PerlX::bash, you can capture output as a string, as an array of lines, or as an array of words. See "Run Modes".

  • Better quoting. With system, you either pass your arguments as separate arguments, in which case the shell is bypassed, or you pass them as one big string. This can make quoting challenging. With PerlX::bash, you never want to bypass bash (if you do, you should be using system instead). Thus, you can specify arguments separately and have things automatically quoted properly (hopefully) without you having to think about it too hard. See "Arguments". Of course, if you'd rather pass the whole command as one big string, you can do that too (see "Switches").

  • Access to (certain) bash switches. Some options to bash come in handy. The most important one is probably -e. With system, you can either use autodie ':all', or not. If you do, then all your commands throw an exception if they don't return success; if you don't, then none of them do. With PerlX::bash, you can just provide -e (or not) to individual commands to achieve the same effect on a more granular level. Other important switches include -c and -x.
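
To make the "Better return context" point above a bit more concrete, here is a minimal sketch of a value that is true on success in boolean context but still yields the raw exit code numerically. ExitStatusSketch is a made-up class that uses only the core overload pragma; it is an assumption for illustration and not necessarily how PerlX::bash does it internally.

```perl
use strict;
use warnings;

# Hypothetical class, for illustration only; not PerlX::bash's real internals.
package ExitStatusSketch {
    use overload
        'bool' => sub { ${ $_[0] } == 0 },   # true only when the exit code is 0
        '0+'   => sub { ${ $_[0] } },        # numeric context: the raw exit code
        '""'   => sub { ${ $_[0] } },        # string context: the same digits
        fallback => 1;

    sub new { my ($class, $code) = @_; return bless \$code, $class }
}

my $ok   = ExitStatusSketch->new(0);
my $fail = ExitStatusSketch->new(2);

print "success is true\n"   if $ok;
print "failure is false\n"  if not $fail;
print "failure code is 2\n" if $fail == 2;
```

The upshot is that "if (bash ...)" and "$status == 2" can both read naturally against the same return value.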

Run Modes

You can specify what you want done with the output of bash via several features collectively called "run modes." If you don't specify any run mode at all (which I sometimes call "just run it!" mode), then output goes wherever it would normally go: probably to your terminal, unless you've redirected it in the bash command itself.

Run modes are incompatible with each other, whether they're of the same type (e.g. two different capture modes) or different types (e.g. one capture mode and one filter mode). Specifying more than one run mode is a fatal error.

Capture Modes

Capture modes take the output of the bash command and return it for storage in a Perl variable. There are 3 basic capture modes, all of which are indicated by a backslashed argument.

String

To capture the entire output as one scalar string, use \string, like so:

    my $num_lines = bash \string => wc => -l => $file;

This is almost exactly like backquotes, except that the output is chomped for you.

Lines

To capture the output as a series of lines, use \lines instead:

    my @lines = bash \lines => git => log => qw< --oneline >, $file;

Individual lines are pre-chomped.

Words

If you'd rather have the output split on whitespace, try \words:

    my @words = bash \words => awk => '$1 == "foo" { print $3, $5 }', $file;

Specifically, the output is split on the equivalent of /[$ENV{IFS}]+/; if $IFS is not set in your environment, a default value of " \t\n" is used.
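
If you want to see what that split amounts to, here is a rough sketch in plain Perl. The split_words_sketch helper is hypothetical (it is not part of PerlX::bash); dropping empty leading fields and escaping the characters of $IFS with \Q...\E are my own assumptions about sensible behavior.

```perl
use strict;
use warnings;

# Hypothetical helper: split captured output roughly the way \words is
# documented to, falling back to " \t\n" when $IFS is unset (or empty).
sub split_words_sketch {
    my ($output) = @_;
    my $ifs = (defined $ENV{IFS} && length $ENV{IFS}) ? $ENV{IFS} : " \t\n";
    # \Q...\E keeps characters such as ']' or '^' from breaking the class.
    return grep { length } split /[\Q$ifs\E]+/, $output;
}

my @words = split_words_sketch("alpha  beta\tgamma\n");
print join(',', @words), "\n";
```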

Context

\string always returns a scalar. \lines and \words should generally be called in list context; in scalar context, they just return the first element of the list.

Filter Modes

If you write some code that looks like this:

    # print paragraph "1:" through paragraph "10:"
    say foreach grep { (/^(\d+):/ && $1 <= 10)../^$/ } bash \lines => 'my-script';

then it's going to do what you think: all the lines of output are filtered through your grep and you get just the lines you wanted. However, if my-script takes a long time to produce its output, this solution may not make you happy, because you get nothing at all until my-script has completely finished running. It would be nicer if you could get the output as it was produced, right?

Try this instead:

    # print paragraph "1:" through paragraph "10:"
    bash \lines => 'my-script |' => sub { say if (/^(\d+):/ && $1 <= 10)../^$/ };

You'll be much happier.

Technical details:

  • There are two filter modes: | and |&. The former runs each line of STDOUT through your filter function. The latter runs both STDOUT and STDERR through it.

  • In order to use a filter mode, your final argument must be a coderef, and your penultimate argument must either consist of, or end with, one of the two modes.

  • From the perspective of your filter sub, the incoming line is both $_ and $_[0]; use whichever you prefer.

  • Just as with \lines, each line is pre-chomped for you.
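
For the curious, the streaming idea behind the | filter mode can be approximated with core Perl alone. This is only a sketch under assumptions: filter_lines_sketch is a hypothetical helper, it handles STDOUT only (nothing like |&), and it is not meant to mirror the module's actual implementation.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of PerlX::bash: stream a command's STDOUT
# through a callback, one chomped line at a time, as the output arrives.
sub filter_lines_sketch {
    my ($command, $filter) = @_;
    open(my $pipe, '-|', 'bash', '-c', $command)
        or die "can't spawn bash: $!";
    while (my $line = <$pipe>) {
        chomp $line;
        local $_ = $line;      # the line is visible as $_ ...
        $filter->($line);      # ... and as $_[0]
    }
    close $pipe;
    return $? >> 8;            # exit code of the bash process
}

my @seen;
filter_lines_sketch(q{printf 'one\ntwo\nthree\n'}, sub { push @seen, $_ if /^t/ });
print "$_\n" for @seen;
```

Because each line is handed to the callback as soon as it is read, a slow command starts producing visible results right away instead of after it finishes.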

Arguments

No matter how many arguments you pass to bash, they will be turned into a single command string and run via bash -c. However, PerlX::bash tries to make intelligent guesses as to which of your arguments are meant to be treated as a single argument in the command line (and therefore might require quoting), and which aren't. Understanding the rules behind these guesses can help avoid surprises.

Basically, there are 3 rules:

  • Some things are always quoted. See "Autoquoting".

  • Some things are never quoted. Any argument that begins with a special character (see "Special Characters") is never quoted.

  • Some things are sometimes quoted. Any argument that contains a special character (see "Special Characters") is quoted, unless one of the following things is true:

    • It is the only argument left after processing capture modes and filters, and it has whitespace in it. In other words, this:

          bash "echo foo; echo bar";

      is the same as this:

          bash -c => "echo foo; echo bar";

      on the grounds that that's most likely what you meant. (You weren't really trying to generate an "echo foo; echo bar: command not found" error, were you?) Basically, if it looks like it would make a lovely command line as is, we don't mess with it.

    • It looks like a redirection. While the majority of redirections do begin with a special char, sometimes they start with a number; all the following strings would qualify as "looking like a redirection," despite not beginning with a special char:

      • 2>something (standard redirection with fileno)

      • 2>&1 (redirection from fileno to fileno)

      • 4<<<$SOMEVAR (here string)

      Note that some redirection syntax may be bash-version-specific, but the decision on whether to quote or not does not take the bash version into account.

  • If an argument falls into multiple categories, the first matching category (according to the order above) wins. Thus, a filename object (which is always quoted) that begins with a special character (meaning it would never be quoted) is quoted. An argument that both begins with a special character (never quoted) and contains a special character later in its string (quoted) is not quoted.

The reason that arguments which begin with a special character are treated differently (oppositely, even) from other arguments containing special characters is to avoid quoting things such as redirections. So, for instance:

    bash echo => "foo", ">bar";

is the equivalent of:

    system('bash', '-c', q[echo foo >bar]);

whereas:

    bash echo => "foo", "ba>r";

is the equivalent of:

    system('bash', '-c', q[echo foo 'ba>r']);

Mostly this does what you want. For when it doesn't, see "Quoting Details".

Autoquoting

An autoquoting rule is a reference to a sub that takes a single argument and returns true or false. Autoquoting rules are tried, one at a time, until one of them returns true, at which point the argument is quoted. If none of them return true, autoquoting does not apply.

PerlX::bash starts with a short list of autoquoting rules:

  • A reference to a regex is stringified and quoted.

  • Any blessed object whose class has a basename method is considered to be a filename and quoted. This covers Path::Class, Path::Tiny, Path::Class::Tiny, and probably many others.

You can also add your own autoquoting rules (feature not yet implemented).
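
As a rough illustration of "a list of subs tried one at a time," here is a sketch modeled on the two built-in rules. Everything named here (@autoquote_rules, autoquoted_sketch, FakePath) is hypothetical and exists only for this example; the module's real rule list and internals may look different.

```perl
use strict;
use warnings;
use Scalar::Util qw< blessed >;

# Hypothetical rule list modeled on the two built-in rules described above.
my @autoquote_rules = (
    sub { ref($_[0]) eq 'Regexp' },                         # qr// references
    sub { blessed($_[0]) && $_[0]->can('basename') },       # filename-like objects
);

# Try each rule in order; the first one that returns true wins.
sub autoquoted_sketch {
    my ($arg) = @_;
    for my $rule (@autoquote_rules) {
        return 1 if $rule->($arg);
    }
    return 0;
}

# Stand-in for Path::Tiny and friends, so the example has no dependencies.
package FakePath { sub new { bless {}, shift } sub basename { 'file.txt' } }

for my $arg (qr/fo+/, FakePath->new, 'plain string') {
    print(autoquoted_sketch($arg) ? "autoquoted\n" : "not autoquoted\n");
}
```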

Special Characters

For purposes of determining whether to quote arguments, the most important characteristic is whether a string contains any special characters. Here's the character class of all characters considered "special" by bash:

    [\s\$'"\\#\[\]!<>|;{}()~&]

Note that space is a special character, as are both types of quotes and all four types of brackets, and backslash. Note that the list does not include = or the glob characters (* and ?), because you probably don't want those quoted under most circumstances.
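
Putting this character class together with the rules from "Arguments", the quoting decision for a plain string can be sketched roughly like so. The would_quote_sketch helper is hypothetical; it ignores autoquoting and the "only argument left" exception, its redirection test (/^\d+[<>]/) is my own approximation, and it assumes that an argument with no special characters at all is simply left alone.

```perl
use strict;
use warnings;

# The character class quoted in the text above.
my $special = qr/[\s\$'"\\#\[\]!<>|;{}()~&]/;

# Hypothetical helper: would a plain string argument be quoted?
# Autoquoting and the "only argument left" exception are not modeled.
sub would_quote_sketch {
    my ($arg) = @_;
    return 0 if $arg =~ /^$special/;     # begins with a special char: never quoted
    return 0 if $arg =~ /^\d+[<>]/;      # looks like a redirection: not quoted
    return 1 if $arg =~ $special;        # contains a special char: quoted
    return 0;                            # assumption: nothing special, left as is
}

printf "%-14s => %s\n", $_, would_quote_sketch($_) ? 'quoted' : 'as is'
    for '>bar', 'ba>r', '2>&1', '4<<<$SOMEVAR', 'foo';
```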

Quoting Details

If an argument is quoted, it is run through "shq", which means it is surrounded with single quotes, and any internal single quotes are appropriately escaped. This is similar to how `bash -x` does it when it prints command lines.

If an argument is not quoted but you wish it were, you can simply call shq yourself (but remember it is not exported by default):

    use PerlX::bash qw< bash shq >;
    bash echo => shq(">bar");   # to print ">bar"

If an argument is quoted but you wish it weren't, you need to fall back to passing the entire command as one big string. (The -c switch is not required, but it may be clearer.)

    # this echoes one line, not two:
    bash echo => "foo;echo bar";
    # this gives you two:
    bash -c => "echo foo;echo bar";
    # or just, you know, make the semi-colon a separate arg:
    bash echo => "foo", ';', echo => "bar";

Switches

Most single character switches are passed through to the spawned bash command, but some are handled by PerlX::bash directly.

-c

Just as with your system's bash, the -c switch means that the entire command will be sent as one big string. This completely disables all argument quoting (see "Arguments").

When using -c, it must be immediately followed by exactly one argument, which is neither undef nor the empty string (but "0" is okay, although not particularly useful). Otherwise it's a fatal error.

-e

Without the use of -e, any exit value from the command is considered acceptable. (Exceptions are still raised if the command fails to launch or is killed by a signal.) By using -e, exit values other than 0 cause exceptions.

    bash       diff => $file1, $file2; # just print diffs, if any
    bash -e => diff => $file1, $file2; # if there are diffs, print them, then throw exception

This mimics the bash -e behavior of the system bash.
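
If it helps to see those rules spelled out, here is a sketch of the same policy built on core system and the standard decoding of $?. The run_sketch helper is hypothetical and is not PerlX::bash's code; it just mirrors the behavior described above (launch failures and signals always die; non-zero exits die only when asked).

```perl
use strict;
use warnings;

# Hypothetical helper built on core system(), mirroring the documented rules:
# launch failures and signals always die; non-zero exits die only with $errexit.
sub run_sketch {
    my ($errexit, @command) = @_;
    system(@command);
    die "failed to launch: $!\n"                    if $? == -1;
    die sprintf("killed by signal %d\n", $? & 127)  if $? & 127;
    my $exit = $? >> 8;
    die "exited with status $exit\n"                if $errexit && $exit != 0;
    return $exit;
}

my @cmd = ($^X, '-e', 'exit 3');     # a child process that exits with status 3
print "without errexit: ", run_sketch(0, @cmd), "\n";
print "with errexit: ", (eval { run_sketch(1, @cmd); 1 } ? "ok\n" : "died: $@");
```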

FUNCTIONS

bash

Call your system's bash. See "DESCRIPTION" for full details.

shq

Manually quote something for use as a command-line argument to bash. The following steps are performed:

  • The argument is stringified, in case it is an object.

  • Any single quotes in the string are globally replaced with '\''.

  • The entire string is then enclosed in single quotes.

This should get the string to bash as you intended it; however, beware of arguments which are subsequently passed on to another shell (e.g. when your bash command is ssh). In those cases, extra quoting may be required, and you must provide that before calling shq.

Exported only on request.
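
The three steps listed above are simple enough to sketch in a few lines. Note that shq_sketch is a hypothetical re-implementation for illustration, not the module's own code; use the real shq in practice.

```perl
use strict;
use warnings;

# Hypothetical re-implementation of the three steps above, for illustration.
sub shq_sketch {
    my $string = "$_[0]";          # stringify, in case it is an object
    $string =~ s/'/'\\''/g;        # each ' becomes '\''
    return "'$string'";            # wrap the whole thing in single quotes
}

print shq_sketch("it's >here<"), "\n";    # prints 'it'\''s >here<'
```

The '\'' trick works because bash has no way to escape a single quote inside single quotes: you close the quoted string, add a backslash-escaped quote, and open a new quoted string.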

pwd

This is just an alias for "cwd" in Cwd. We use the pwd name because that's more comfortable for regular users of bash. Exported on request only, so just use Cwd instead if you prefer the more Perl-ish name.

head

tail

Perl functions that work much like the POSIX-standard head and tail utilities, but for array elements rather than lines of files. Exported only on request.

    # this code:          is the same as this code:
    head  3 => @list;   # @list[0..2]
    head -3 => @list;   # @list[0..$#list-3]
    tail -3 => @list;   # @list[@list-3..$#list]
    tail +3 => @list;   # @list[2..$#list]

Note that this is not only way easier to type and easier to understand when reading (and it possibly saves you a temporary variable); it can also be safer. When e.g. @list contains only 2 elements, several of the right-hand constructs will give you unexpected answers. However, head and tail always just return as many elements as they can, which is probably closer to what you were expecting:

    my @list = 1..2;
    @list[@list-3..$#list];  # (2, 1, 2) #!!!
    tail -3 => @list;        # (1, 2)

Their use really shines, however, when used in conjunction with bash \lines and some functional programming:

    my @top_3_numbered_lines = head 3 => grep /^\d/, bash \lines => 'my-script';
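
To pin down the semantics in the table above, here is a sketch of how such functions could be written. head_sketch and tail_sketch are hypothetical stand-ins, not the module's code. One assumption worth flagging: Perl cannot tell 3 from +3, so the sketch treats any positive count given to tail as "start at that element," and edge cases the text doesn't spell out (such as a count of 0) are my own guesses.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for head and tail; the real functions may differ
# in edge cases (e.g. a count of 0) that the text does not spell out.
sub head_sketch {
    my ($n, @list) = @_;
    my $count = $n >= 0 ? $n : @list + $n;     # negative: all but the last |$n|
    $count = 0             if $count < 0;      # clamp instead of wrapping around
    $count = scalar @list  if $count > @list;
    return @list[0 .. $count - 1];
}

sub tail_sketch {
    my ($n, @list) = @_;
    my $start = $n < 0 ? @list + $n : $n - 1;  # negative: last |$n|; positive: from element $n
    $start = 0 if $start < 0;                  # clamp instead of wrapping around
    return @list[$start .. $#list];
}

print join(',', head_sketch( 3, 1..5)), "\n";   # first three elements
print join(',', head_sketch(-3, 1..5)), "\n";   # all but the last three
print join(',', tail_sketch(-3, 1..5)), "\n";   # last three elements
print join(',', tail_sketch(-3, 1, 2)), "\n";   # short list: as many as there are
```

The clamping is what provides the safety described above: a too-large count never turns into a negative index that wraps around the array.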

STATUS

This module is no longer experimental, and is currently being used for production tasks. There will be no further sweeping changes to the interface, but some tweaking may be necessary as it sees more and more use. Documentation should be complete at this point; anything missing should be considered a bug and reported. I continue to welcome suggestions and contributions, and now recommend that you use this for any purpose you like, but perhaps just keep a close eye on it as it continues to mature.

SUPPORT

Perldoc

You can find documentation for this module with the perldoc command.

  perldoc PerlX::bash

Bugs / Feature Requests

This module is on GitHub. Feel free to fork and submit patches. Please note that I develop via TDD (Test-Driven Development), so a patch that includes a failing test is much more likely to get accepted (or at least likely to get accepted more quickly).

If you just want to report a problem or suggest a feature, that's okay too. You can create an issue on GitHub here: https://github.com/barefootcoder/perlx-bash/issues.

Source Code

https://github.com/barefootcoder/perlx-bash

  git clone https://github.com/barefootcoder/perlx-bash.git

AUTHOR

Buddy Burden <barefootcoder@gmail.com>

COPYRIGHT AND LICENSE

This software is Copyright (c) 2015-2020 by Buddy Burden.

This is free software, licensed under:

  The Artistic License 2.0 (GPL Compatible)