
NAME

IPC::Exe - Execute processes or Perl subroutines & string them via IPC. Think shell pipes.

SYNOPSIS

  use IPC::Exe qw(exe bg);

  my @pids = &{
         exe sub { "2>#" }, qw( ls  /tmp  a.txt ),
      bg exe qw( sort -r ),
         exe sub { print "2nd cmd: @_\n"; print "three> $_" while <STDIN> },
      bg exe 'sort',
         exe "cat", "-n",
         exe sub { print "six> $_" while <STDIN>; print "5th cmd: @_\n" },
  };

is like doing the following in a modern Unix shell:

  ls /tmp a.txt 2> /dev/null | { sort -r | [perlsub] | { sort | cat -n | [perlsub] } & } &

except that [perlsub] is really a perl child process with access to main program variables in scope.

DESCRIPTION

This module was written to provide a secure and highly flexible way to execute external programs with an intuitive syntax. In addition, more information is returned with each string of executions, such as the list of PIDs and the $? of the last external pipe process (see "RETURN VALUES"). Execution uses exec, and the shell is never invoked (except on non-Unix platforms, where system may be used).

The two exported subroutines perform all the heavy lifting of forking and executing processes. In particular, exe( ) implements the KID_TO_READ version of

  http://perldoc.perl.org/perlipc.html#Safe-Pipe-Opens

while bg( ) implements the double-fork technique illustrated at

  http://perldoc.perl.org/perlfaq8.html#How-do-I-start-a-process-in-the-background?
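
As a rough illustration of the first technique (not the module's actual source), a KID_TO_READ safe pipe open can be sketched in core Perl as follows. The helper name read_from_kid is hypothetical, and $^X (the running perl binary) stands in for an external program.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command without a shell and return
# its exit value followed by the lines it printed to STDOUT.
sub read_from_kid {
    my @cmd = @_;

    # implicit fork: the parent gets the child's PID, the child gets 0
    my $pid = open(my $kid, '-|');
    defined($pid) or die "cannot fork: $!";

    if ($pid == 0) {
        # child: STDOUT is connected to the parent's $kid handle
        exec { $cmd[0] } @cmd or die "cannot exec $cmd[0]: $!";
    }

    # parent: read everything the child prints
    my @lines = <$kid>;
    close($kid);    # waits for the child and sets $?
    return ($? >> 8, @lines);
}

my ($exitval, @out) = read_from_kid($^X, '-e', 'print "hello\n"; exit 3');
print "exit=$exitval out=@out";
```

Because exec is given a LIST, the arguments reach the program untouched by any shell, which is the property the module relies on.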

EXAMPLES

Let's dive right away into some examples. To begin:

  my $exit = system( "myprog $arg1 $arg2" );

can be replaced with

  my $exit = &{ exe 'myprog', $arg1, $arg2 };

exe( ) returns a LIST of PIDs, the last item of which is $? (as returned by the default &READER). To get the actual exit value, shift $? right by eight bits: $? >> 8.
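
The layout of $? can be unpacked with plain bit operations, as documented in perlvar. The sketch below uses core Perl only; decode_status is a hypothetical helper name, and $^X stands in for 'myprog'.

```perl
use strict;
use warnings;

# Hypothetical helper: split a wait status such as $? into its parts.
sub decode_status {
    my ($status) = @_;
    return (
        exitval => $status >> 8,             # value passed to exit()
        signal  => $status & 127,            # signal that ended the process, if any
        core    => ($status & 128) ? 1 : 0,  # true if a core dump was produced
    );
}

# $^X is the running perl binary, used here in place of 'myprog'
system($^X, '-e', 'exit 7');
my %st = decode_status($?);
print "exit value: $st{exitval}, signal: $st{signal}\n";
```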

Extending the previous example,

  my $exit = system( "myprog $arg1 $arg2 > out.txt" );

can be replaced with

  my $exit = &{ exe sub { open(STDOUT, '>', 'out.txt') or die }, 'myprog', $arg1, $arg2, };

The previous two examples will wait for 'myprog' to finish executing before continuing the main program.

Extending the previous example again,

  # cannot obtain $exit of 'myprog' because it is in background
  system( "myprog $arg1 $arg2 > out.txt &" );

can be replaced with

  # just add 'bg' before 'exe' in previous example
  my $bg_pid = &{ bg exe sub { open(STDOUT, '>', 'out.txt') or die }, 'myprog', $arg1, $arg2, };

Now, 'myprog' will be put in background and the main program will continue without waiting.

To monitor the exit value of a background process:

  my $bg_pid = &{
      bg sub {
             # same as 2nd previous example
             my ($pid) = &{
                 exe sub { open(STDOUT, '>', 'out.txt') or die }, 'myprog', $arg1, $arg2,
             };

             # check if exe() was successful
             defined($pid) or die("Failed to fork process in background");

             # handle exit value here
             print STDERR "background exit value: " . ($? >> 8) . "\n";
         }
  };

  # check if bg() was successful
  defined($bg_pid) or die("Failed to send process to background");

Instead of using backquotes or qx( ),

  # slurps entire STDOUT into memory
  my @stdout = (`$program @ARGV`);

  # handle STDOUT here
  for my $line (@stdout)
  {
      print "read_in> $line";
  }

we can read the STDOUT of one process with:

  my ($pid) = &{
      # execute $program with arguments
      exe $program, @ARGV,

      # handle STDOUT here
      sub {
          while (my $line = <STDIN>)
          {
              print "read_in> $line";
          }

          # close the pipe to set $? to the exit status of $program
          close($IPC::Exe::PIPE);
      },
  };

  # check if exe() was successful
  defined($pid) or die("Failed to fork process");

  # exit value of $program
  my $exitval = $? >> 8;

Perform tar copy of an entire directory:

  use Cwd qw(chdir);

  my @pids = &{
      exe sub { chdir $source_dir or die }, qw(/bin/tar  cf - .),
      exe sub { chdir $target_dir or die }, qw(/bin/tar xBf -),
  };

  # check if exe()'s were successful
  defined($pids[0]) && defined($pids[1])
      or die("Failed to fork processes");

  # was un-tar successful?
  my $error = pop(@pids);

Here is an elaborate example to pipe STDOUT of one process to the STDIN of another, consecutively:

  my @pids = &{
      # redirect STDERR to STDOUT
      exe sub { "2>&1" }, $program, @ARGV,

      # 'perl' receives STDOUT of $program via STDIN
      exe sub {
              my ($pid) = &{
                  exe qw(perl -e), 'print "read_in> $_" while <STDIN>; exit 123',
              };

              # check if exe() was successful
              defined($pid) or die("Failed to fork process");

              # handle exit value here
              print STDERR "in-between exit value: " . ($? >> 8) . "\n";

              # this is executed in child process
              # no need to return
          },

      # 'sort' receives STDOUT of 'perl'
      exe qw(sort -n),

      # [perlsub] receives STDOUT of 'sort'
      exe sub {
              # find out command of previous pipe process
              # if @_ is empty list, previous process was a [perlsub]
              my ($prog, @args) = @_;
              print STDERR "last_pipe> $prog @args\n"; # output: "last_pipe> sort -n"

              # print sorted, 'perl' filtered, output of $program
              print while <STDIN>;

              # find out exit value of previous 'sort' pipe process
              close($IPC::Exe::PIPE);
              warn("Bad exit for: @_\n") if $?;

              return $?;
          },
  };

  # check if exe()'s were successful
  defined($pids[0]) && defined($pids[1]) && defined($pids[2])
      or die("Failed to fork processes");

  # obtain exit value of last process on pipeline
  my $exitval = pop(@pids) >> 8;

Shown below is an example of how to capture STDERR and STDOUT after sending some input to STDIN of the child process:

  # reap child processes 'xargs' when done
  local $SIG{CHLD} = 'IGNORE';

  # like IPC::Open3, except filehandles are generated on-the-fly
  my ($pid, $TO_STDIN, $FROM_STDOUT, $FROM_STDERR) = &{
      exe +{ stdin => 1, stdout => 1, stderr => 1 }, qw(xargs ls -ld),
  };

  # check if exe() was successful
  defined($pid) or die("Failed to fork process");

  # ask 'xargs' to 'ls -ld' three files
  print $TO_STDIN "/bin\n";
  print $TO_STDIN "does_not_exist\n";
  print $TO_STDIN "/etc\n";

  # cause 'xargs' to flush its stdout
  close($TO_STDIN);

  # print captured outputs
  print "stderr> $_" while <$FROM_STDERR>;
  print "stdout> $_" while <$FROM_STDOUT>;

  # close filehandles
  close($FROM_STDOUT);
  close($FROM_STDERR);
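
For comparison, here is roughly the same interaction written with the core IPC::Open3 module. This is a sketch, not part of IPC::Exe: $^X (the running perl) stands in for 'xargs ls -ld', and the child is reaped with waitpid instead of relying on $SIG{CHLD}.

```perl
use strict;
use warnings;
use IPC::Open3;
use Symbol qw(gensym);

# open3() only autovivifies the first two handles; STDERR needs a glob
my $FROM_STDERR = gensym;

# $^X stands in for the external program
my $pid = open3(
    my $TO_STDIN, my $FROM_STDOUT, $FROM_STDERR,
    $^X, '-ne', 'print uc; warn "seen: $_"',
);

print {$TO_STDIN} "/bin\n";
close($TO_STDIN);               # signal end of input to the child

# reading one handle to EOF before the other is fine for small outputs;
# larger outputs may need select() to avoid blocking
my @stdout = <$FROM_STDOUT>;
my @stderr = <$FROM_STDERR>;

waitpid($pid, 0);               # reap the child explicitly
my $exitval = $? >> 8;

print "stdout> $_" for @stdout;
print "stderr> $_" for @stderr;
```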

Of course, more exe( ) calls may be chained together as needed:

  # reap child processes 'xargs' when done
  local $SIG{CHLD} = 'IGNORE';

  # like IPC::Open2, except filehandles are generated on-the-fly
  my ($pid1, $TO_STDIN, $pid2, $FROM_STDOUT) = &{
      exe +{ stdin  => 1 }, sub { "2>&1" }, qw(perl -ne), 'print STDERR "360.0 / $_"',
      exe +{ stdout => 1 }, qw(bc -l),
  };

  # check if exe()'s were successful
  defined($pid1) && defined($pid2)
      or die("Failed to fork processes");

  # ask 'bc -l' results of "360 divided by given inputs"
  print $TO_STDIN "$_\n" for 2 .. 8;

  # we redirect stderr of 'perl' to stdout
  #   which, in turn, is fed into stdin of 'bc'

  # print captured outputs
  print "360 / $_ = " . <$FROM_STDOUT> for 2 .. 8;

  # close filehandles
  close($TO_STDIN);
  close($FROM_STDOUT);

Important: On some non-Unix platforms, such as Win32, interactive processes (shown above) must be told explicitly when to quit; neither close($TO_STDIN) nor kill TERM => $pid can be relied upon.

SUBROUTINES

Both exe( ) and bg( ) are optionally exported. They each return CODE references that need to be called.

exe( )

  exe \%EXE_OPTIONS, &PREEXEC, LIST, &READER
  exe \%EXE_OPTIONS, &PREEXEC, &READER
  exe \%EXE_OPTIONS, &PREEXEC
  exe &READER

\%EXE_OPTIONS is an optional hash reference to instruct exe( ) to return STDIN / STDERR / STDOUT filehandle(s) of the executed child process. See "SETTING OPTIONS".

LIST is passed to exec( ) in the child process after the parent forks; the child's stdout is redirected to &READER's stdin.

&PREEXEC is called right before exec( ) in the child process, so we may reopen filehandles or do some child-only operations beforehand.

Optionally, &PREEXEC could return a LIST of strings to perform common filehandle redirections and/or binmode settings. The following are preset actions:

  "2>#"  or "2>null"   silence  stderr
   ">#"  or "1>null"   silence  stdout
  "2>&1"               redirect stderr to  stdout
  "1>&2" or ">&2"      redirect stdout to  stderr
  "1><2"               swap     stdout and stderr

  "0:crlf"             does binmode(STDIN, ":crlf");
  "1:raw" or "1:"      does binmode(STDOUT, ":raw");
  "2:utf8"             does binmode(STDERR, ":utf8");
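
To make the preset strings concrete, the table below maps a few of them to what are assumed to be equivalent core-Perl calls. The %action hash is hypothetical and is not how the module is implemented; it only shows the kind of operation each string stands for.

```perl
use strict;
use warnings;
use File::Spec;

my $null = File::Spec->devnull;

# hypothetical lookup table: preset string => assumed equivalent action
my %action = (
    '2>#'    => sub { open(STDERR, '>', $null)     or die "2>#: $!" },
    '>#'     => sub { open(STDOUT, '>', $null)     or die ">#: $!" },
    '2>&1'   => sub { open(STDERR, '>&', \*STDOUT) or die "2>&1: $!" },
    '1>&2'   => sub { open(STDOUT, '>&', \*STDERR) or die "1>&2: $!" },
    '1:raw'  => sub { binmode(STDOUT, ':raw')      or die "1:raw: $!" },
    '2:utf8' => sub { binmode(STDERR, ':utf8')     or die "2:utf8: $!" },
);

# demonstrate "2>&1" in a forked child whose STDOUT is a pipe to the parent
my $pid = open(my $kid, '-|');
defined($pid) or die "cannot fork: $!";

if ($pid == 0) {
    $action{'2>&1'}->();          # STDERR now goes where STDOUT goes
    print STDERR "to stderr\n";
    exit 0;
}

my $got = <$kid>;
close($kid);
print "captured: $got";
```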

&READER is called with LIST as its arguments. LIST corresponds to the arguments passed in-between &PREEXEC and &READER.

If exe( )'s are chained, the next exe( ) in line serves as &READER, which in turn calls the next &PREEXEC, LIST, etc.

&PREEXEC is called with arguments passed to the CODE reference returned by exe( ).

&READER is always called in the parent process.

&PREEXEC is always called in the child process.

&PREEXEC and &READER are very similar and may be treated the same.

It is important to note that the actions and return value of &PREEXEC matter, as they may be used to redirect filehandles before the child running &PREEXEC becomes the exec'ed process.

close( $IPC::Exe::PIPE ) in &READER to set exit status $? of previous process executing on the pipe.

If LIST is not provided, &PREEXEC will still be called.

If &PREEXEC is not provided, LIST will still exec.

If &READER is not provided, it defaults to

  sub { print while <STDIN>; close($IPC::Exe::PIPE); return $? }

exe( &READER ) returns &READER.

exe( ) returns an empty list.

bg( )

  bg \%BG_OPTIONS, &BACKGROUND
  bg &BACKGROUND

\%BG_OPTIONS is an optional hash reference to instruct bg( ) to wait a certain amount of time for &PREEXEC to complete (for non-Unix platforms only). See "SETTING OPTIONS".

&BACKGROUND is called after it is sent to the init process.

If &BACKGROUND is not a CODE reference, return an empty list upon execution.

bg( ) returns an empty list.
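
The double-fork idea behind "sent to the init process" can be sketched in core Perl as follows. This is an assumption-laden illustration for Unix-like systems, not the module's source: run_in_background is a hypothetical helper, and a pipe is used here to hand the grandchild's PID back to the caller.

```perl
use strict;
use warnings;
use POSIX qw(_exit);

# Hypothetical helper: run $code in a grandchild that is adopted by init.
# Returns the grandchild's PID, or an empty list on failure.
sub run_in_background {
    my ($code) = @_;

    pipe(my $reader, my $writer) or return;

    my $pid = fork();
    defined($pid) or return;

    if ($pid == 0) {
        # first child: fork once more, report the PID, then exit at once
        close($reader);
        my $gpid = fork();
        defined($gpid) or _exit(1);

        if ($gpid == 0) {
            # grandchild: no longer a child of the original process
            close($writer);
            $code->();
            _exit(0);
        }

        print {$writer} "$gpid\n";
        close($writer);
        _exit(0);
    }

    # parent: collect the grandchild's PID and reap the first child
    close($writer);
    my $bg_pid = <$reader>;
    close($reader);
    waitpid($pid, 0);

    defined($bg_pid) or return;
    chomp($bg_pid);
    return $bg_pid;
}

my ($bg_pid) = run_in_background(sub { select(undef, undef, undef, 0.2) });
defined($bg_pid) or die("Failed to send process to background");
print "background PID: $bg_pid\n";
```

Since the first child exits immediately, the parent's waitpid returns quickly and the grandchild never becomes a zombie of the main program.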

This experimental feature is not enabled by default:

  • If sending &BACKGROUND to the init process fails, bg( ) can fall back to calling &BACKGROUND in the parent or child process if $IPC::Exe::bg_fallback is true. To enable the fallback feature, set

      $IPC::Exe::bg_fallback = 1;

SETTING OPTIONS

exe( )

\%EXE_OPTIONS is a hash reference that can be provided as the first argument to exe( ) to control returned values. It may be used to return STDIN / STDERR / STDOUT filehandle(s) of the child process to emulate IPC::Open2 and IPC::Open3 behavior.

The default values are:

  %EXE_OPTIONS = (
      stdin       => 0,
      stdout      => 0,
      stderr      => 0,
      autoflush   => 1,
      binmode_io  => undef,
      exec        => 0,  # Win32 option
  );

These are the effects of setting the following options:

stdin => 1

Return a WRITEHANDLE to STDIN of the child process. The filehandle will be set to autoflush on write if $EXE_OPTIONS{autoflush} is true.

stdout => 1

Return a READHANDLE from STDOUT of the child process, so output to stdout may be captured. When this option is set and &READER is not provided, the default &READER subroutine will NOT be called.

stderr => 1

Return a READHANDLE from STDERR of the child process, so output to stderr may be captured.

autoflush => 0

Disable autoflush on the WRITEHANDLE to STDIN of the child process. This option only has effect when $EXE_OPTIONS{stdin} is true.

binmode_io => ":raw", ":crlf", ":bytes", ":encoding(utf8)", etc.

Set binmode of STDIN and STDOUT of the child process for layer $EXE_OPTIONS{binmode_io}. This is automatically done for subsequently chained exe( )cutions. To stop this, set to an empty string "" or another layer to bring a different mode into effect.

exec => 1

NOTE: This only applies to non-Unix platforms.

Use exec instead of system when executing programs. This is set automatically when $EXE_OPTIONS{stdin} is or was true in a previous exe( ).

With exec, the parent thread does not wait for the child to finish, so programs that wait for STDIN do not block. This is useful for achieving IPC::Open3 behavior, where programs wait for further input.

With (default) system, the parent thread waits for the child to finish and collects the exit status. If the child fails to execute the program, the parent stops and returns an empty list. This is the way to detect breaks in the chain of exe( )cutions.

bg( )

NOTE: This only applies to non-Unix platforms.

\%BG_OPTIONS is a hash reference that can be provided as the first argument to bg( ) to set wait time (in seconds) before relinquishing control back to the parent thread. See "CAVEAT" for reasons why this is necessary.

The default value is:

  %BG_OPTIONS = (
      wait => 2,  # Win32 option
  );

RETURN VALUES

When exe( ) and bg( ) statements are chained, calling the single returned CODE reference sets off the chain of executions. This returns a LIST in which each element corresponds to each exe( ) or bg( ) call.

exe( )

  • When exe( ) executes an external process, the PID for that process is returned, or an EMPTY LIST if exe( ) failed in any operation prior to forking. If an EMPTY LIST is returned, the chain of execution stops there and the next &READER is not called, guaranteeing the final return LIST to be truncated at that point. Failure after forking causes die( ) to be called.

  • When exe( ) executes a &READER subroutine, the subroutine's return value is returned. If there is no explicit &READER, the implicit default &READER subroutine is called instead:

      sub { print while <STDIN>; close($IPC::Exe::PIPE); return $? }

    It returns $?, which is the status of the last pipe process close. This allows code to be written like:

      my $exit = &{ exe 'myprog', $myarg };

  • When non-default \%EXE_OPTIONS are specified, exe( ) returns additional filehandles in the following LIST:

      (
          $PID,                # undef if exec failed
          $STDIN_WRITEHANDLE,  # only if $EXE_OPTIONS{stdin}  is true
          $STDOUT_READHANDLE,  # only if $EXE_OPTIONS{stdout} is true
          $STDERR_READHANDLE,  # only if $EXE_OPTIONS{stderr} is true
      )

    The positional LIST form return allows code to be written like:

      my ($pid, $TO_STDIN, $FROM_STDOUT) = &{
          exe +{ stdin => 1, stdout => 1 }, '/usr/bin/bc'
      };

    Note: It is necessary to disambiguate \%EXE_OPTIONS (also \%BG_OPTIONS) as a hash reference by including a unary + before the opening curly bracket:

      +{ stdin => 1, autoflush => 0 }
      +{ wait => 2.5 }
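
The effect of the unary + can be seen with core Perl alone. In this sketch, first_arg_kind is a hypothetical helper that merely reports what it received; the map line is the well-known case from perlfunc where the same disambiguation is needed.

```perl
use strict;
use warnings;

# hypothetical helper: reports what kind of value arrived first in @_
sub first_arg_kind { return ref($_[0]) || 'plain scalar' }

# unary + forces the braces to be parsed as an anonymous hash
my $kind = first_arg_kind +{ wait => 2.5 }, 'prog';

# same idea with map EXPR, LIST: each element becomes a hash reference
my @recs = map +{ id => $_ }, 1 .. 3;

print "$kind, ", scalar(@recs), " records\n";
```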

bg( )

Calling the CODE reference returned by bg( ) returns the PID of the background process, or an EMPTY LIST if bg( ) failed in any operation prior to forking. Failure after forking causes die( ) to be called.

ERROR CHECKING

To determine if either exe( ) or bg( ) was successful until the point of forking, check whether the returned $PID is defined.

See "EXAMPLES" for examples on error checking.

WARNING: This may get slightly complicated for chained exe( )'s when non-default \%EXE_OPTIONS cause the positions of $PID in the overall returned LIST to be non-uniform (caveat emptor). Remember, the chain of executions is doing a lot for just a single CODE call, so due diligence is required for error checking.

A minimum count of items (PIDs and/or filehandles) can be expected in the returned LIST to determine whether forks were initiated for the entire exe( ) / bg( ) chain.

Failures after forking are reported with die( ). To handle these errors, use eval.
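
The checks described in this section (eval around the call, a minimum item count, and a defined first PID) could be combined along these lines. This is only a sketch: run_chain is a hypothetical helper, and the anonymous subs stand in for the CODE reference that a real exe( ) / bg( ) chain would return.

```perl
use strict;
use warnings;

# $chain stands in for the CODE reference returned by exe()/bg();
# $min_items is how many items a fully forked chain should return.
sub run_chain {
    my ($chain, $min_items) = @_;

    my @ret;
    my $ok = eval { @ret = $chain->(); 1 };

    if (!$ok) {
        chomp(my $err = $@);
        return "died: $err";
    }
    return "chain stopped early" if @ret < $min_items;
    return "PID undefined"       if !defined $ret[0];
    return "ok";
}

print run_chain(sub { die "exec failed\n" }, 2), "\n";
print run_chain(sub { () },                  2), "\n";
print run_chain(sub { (1234, 0) },           2), "\n";
```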

SYNTAX

It is highly recommended to avoid unnecessary parentheses when using exe( ) and bg( ).

IPC::Exe relies on Perl's LIST parsing magic in order to provide the clean intuitive syntax.

As a guide, the following syntax should be used:

  my @pids = &{                                          # call CODE reference
      [ bg ] exe [ sub { ... }, ] $prog1, $arg1, @ARGV,  # end line with comma
             exe [ sub { ... }, ] $prog2, $arg2, $arg3,  # end line with comma
      [ bg ] exe sub { ... },                            # this bg() acts on last exe() only
             sub { ... },
  };

where brackets [ ]'s denote optional syntax.

Note that Perl sees

  my @pids = &{
      bg exe $prog1, $arg1, @ARGV,
      bg exe sub { "2>#" }, $prog2, $arg2, $arg3,
         exe sub { 123 },
         sub { 456 },
  };

as

  my @pids = &{
      bg( exe( $prog1, $arg1, @ARGV,
              bg( exe( sub { "2>#" }, $prog2, $arg2, $arg3,
                      exe( sub { 123 },
                           sub { 456 }
                      )
                  )
              )
          )
      );
  };
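
The same right-to-left nesting applies to any Perl list operator called without parentheses. A toy sub (hypothetical, unrelated to the module) makes the grouping visible:

```perl
use strict;
use warnings;

# toy list operator: wraps its arguments in parentheses
sub wrap { return '(' . join(' ', @_) . ')' }

# each call without parentheses swallows everything to its right
my $nested = wrap 'a', 'b', wrap 'c', wrap 'd', 'e';

print "$nested\n";    # prints "(a b (c (d e)))"
```

Each wrap receives the result of the calls to its right as its trailing argument, just as each exe( ) receives the next exe( ) as its &READER.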

CAVEAT

This module is targeted for Unix environments, using techniques described in perlipc and perlfaq8. Development is done on FreeBSD, Linux, and Win32 platforms. It may not work well on other non-Unix systems, let alone Win32.

Some care was taken to rely on Perl's Win32 threaded implementation of fork( ). To get things to work almost like Unix, redirections of filehandles have to be performed in a certain order. More specifically: let's say STDOUT of a child process (read: thread) needs to be redirected elsewhere (anywhere, it doesn't matter). It is important that the parent process (read: thread) does not use STDOUT until after the child is exec'ed. At the point after exec, the parent must restore STDOUT to a previously dup'ed original and may then proceed along as usual. If this order is violated, deadlocks may occur, often manifesting as an apparent stall in execution when the parent tries to use STDOUT.
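
The dup-then-restore order described above follows the standard idiom from perlfunc's open documentation. The sketch below shows that idiom in core Perl on any platform; it is not the module's Win32 code, and it uses system with $^X (the running perl) and a temporary file purely for demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
close($tmp_fh);

# 1. keep a duplicate of the original STDOUT
open(my $saved_out, '>&', \*STDOUT) or die "cannot dup STDOUT: $!";

# 2. redirect STDOUT; the parent must not print to it from here on
open(STDOUT, '>', $tmp_name) or die "cannot redirect STDOUT: $!";

# 3. start the child, which inherits the redirected STDOUT
system($^X, '-e', 'print "from child\n"');

# 4. restore the original STDOUT before the parent uses it again
open(STDOUT, '>&', $saved_out) or die "cannot restore STDOUT: $!";
close($saved_out);

open(my $in, '<', $tmp_name) or die "cannot read $tmp_name: $!";
my $captured = <$in>;
close($in);

print "captured: $captured";
```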

On Win32, bg( ) unfortunately has to substantially rely on timer code to wait for &PREEXEC to complete in order to work properly with exe( ). The example shown below illustrates that bg( ) has to wait at least until $program is exec'ed. Hence, $wait_time > $work_time must hold true and this requires a priori knowledge of how long &PREEXEC will take.

  &{
      bg +{ wait => $wait_time }, exe sub { sleep($work_time) }, $program
  };

This essentially renders bg &BACKGROUND useless if &BACKGROUND does not exec any programs (Win32).

In summary: (on Win32)

  • Only use bg( ) to exec programs into the background.

  • Keep &PREEXEC as short-running as possible. Or make sure $BG_OPTIONS{wait} time is longer.

  • No &PREEXEC (or code running in parallel thread) == no problems.

Some useful information:

  http://perldoc.perl.org/perlfork.html#CAVEATS-AND-LIMITATIONS
  http://www.nntp.perl.org/group/perl.perl5.porters/2003/11/msg85488.html
  http://www.nntp.perl.org/group/perl.perl5.porters/2003/08/msg80311.html
  http://www.perlmonks.org/?node_id=684859
  http://www.perlmonks.org/?node_id=225577
  http://www.perlmonks.org/?node_id=742363

DEPENDENCIES

Perl v5.6.0+ is required.

The following modules are required:

Extra module required for non-Unix platforms:

AUTHOR

Gerald Lai <glai at cpan dot org>