KOBAYASHI, Hiroaki

NAME

intro_runnable_module - A brief introduction to the Runnable-Module design pattern

INTRO

This document briefly describes a programming idiom (or design pattern, maybe) which I call Runnable-Module. It does not seem to be well known in the perl community, although I can find some articles about it. The same pattern can also be found in other dynamic scripting languages such as python. I'm not sure whether it already has a better name. If you know of one, please tell me.

Note: The original, Japanese version of this document is available as a post on my blog: https://hkoba.hatenablog.com/entry/2017/09/06/185029

Runnable module - What, Why and How

Runnable-module is a programming idiom which allows you to write a single file that works both as a standalone program and as a library file (loadable via use or require). In perl, it is usually implemented with an unless caller block.

What is unless caller?

Have you ever read perl code which ends with something like the following?

    unless (caller) {
      ...Some interesting code...
    }

This unless (caller) {...} guards the ... portion of the code so that it runs only when this program file is executed directly. When the same script is eval()ed as a string or require()d as a module, the ... portion is not executed. I learned this idiom in the context of Tk, on comp.lang.perl.tk IIRC, and used it like this post. At that time, it was used like:

    MainLoop unless caller;

and achieved the following tricks:

  • When this script is executed directly, run Tk::MainLoop() so that GUI drawing and the event loop start correctly.

  • Otherwise (i.e. when it is eval()ed from the clipboard and/or loaded via do "script"), do nothing.
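The difference between the two cases can be observed with a small experiment. The following is only a sketch: the file name guard_demo.pl and the helper run_perl() are hypothetical names made up for this illustration. It writes a throwaway script into a temporary directory, then starts it once directly and once via require.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: run the current perl with the given arguments
# and return everything it printed to STDOUT.
sub run_perl {
    my @args = @_;
    open my $ph, '-|', $^X, @args or die "Can't run $^X: $!";
    local $/;                      # slurp the whole output at once
    my $out = <$ph>;
    close $ph;
    return $out;
}

# A throwaway script (hypothetical name) that reports how it was started.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/guard_demo.pl";
open my $fh, '>', $file or die "Can't write $file: $!";
print {$fh} <<'END';
if (caller) { print "loaded as a library\n" }
else        { print "run as a program\n" }
1;
END
close $fh or die "Can't close $file: $!";

# Same as: perl guard_demo.pl
my $direct   = run_perl($file);
# Same as: perl -e 'require $ARGV[0];' guard_demo.pl
my $required = run_perl('-e', 'require $ARGV[0];', $file);

print "direct:   $direct";    # direct:   run as a program
print "required: $required";  # required: loaded as a library
```

At the top level of a file, caller returns nothing when the file is the program being run, and returns the loading frame when the file is pulled in by require, do or eval.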

Let's write MyScript.pm instead of myscript.pl

This unless caller idiom is useful not only in Tk scripts but also in ordinary perl scripting, because it enables you to write a dual-purpose script: your script can be used as a module and also as a standalone script.

To achieve this, all you need to do is:

1 Name your script like MyScript.pm instead of myscript.pl.
2 Do chmod a+x MyScript.pm from the shell.
3 Add the shebang #!/usr/bin/env perl as its first line.
4 Add a package MyScript; declaration, and also 1; at the end of the script.

And finally, you can write some code guarded by an unless (caller) {...} block for standalone mode. Here is a typical skeleton of such a script.

    #!/usr/bin/env perl
    package MyScript;
    
    ...
    
    unless (caller) {
       my @opts;
       # Consume leading name=value arguments as constructor options. XXX:minimum!
       push @opts, split /=/, shift(@ARGV), 2 while @ARGV and $ARGV[0] =~ /=/;
       my $app = MyScript->new(@opts);
       $app->main(@ARGV);
    }
    
    1;

Now your script has become a Runnable-Module. You can use this MyScript.pm not only as a CLI tool (don't forget chmod a+x ;-) but also as a module, and call its internal functions/methods freely.

    # Invoke as a command and execute MyScript->new(x=>100,y=>100)->main('foo','bar')
    % ./MyScript.pm x=100 y=100 foo bar
    
    # Use as a module, instantiate and call method foo
    % perl -I. -MMyScript -le 'print MyScript->new->foo'
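To try the two invocations above, the ... part of the skeleton has to be filled with something. The following is one possible filling, given only as a sketch: the bodies of new(), foo() and main() are hypothetical and are not taken from any real MyScript.pm.

```perl
#!/usr/bin/env perl
package MyScript;
use strict;
use warnings;

# Hypothetical constructor: keeps name => value pairs as object fields.
sub new {
    my ($class, %opts) = @_;
    return bless {%opts}, $class;
}

# Hypothetical method, handy to call from another program or a one-liner.
sub foo {
    my ($self) = @_;
    return "x=" . ($self->{x} // 'undef');
}

# Hypothetical entry point for standalone mode.
sub main {
    my ($self, @args) = @_;
    print $self->foo, " args=@args\n";
}

unless (caller) {
    my @opts;
    # Leading name=value arguments become constructor options.
    push @opts, split /=/, shift(@ARGV), 2 while @ARGV and $ARGV[0] =~ /=/;
    MyScript->new(@opts)->main(@ARGV);
}

1;
```

With this filling, ./MyScript.pm x=100 y=100 foo bar prints x=100 args=foo bar, and the one-liner prints x=undef because no options are passed to new().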

Dispatching subcommands to methods turns your script into multi-role editor

In the above example, the unless (caller) {...} block is hard-wired to call MyScript->new->main. But you can write more useful behavior here, similar to typical CLI programs with subcommands (e.g. git), like the following:

  • Take a series of GNU-style long options (--name=value) and use them as arguments of new().

    • If an option is given by name only (--name), treat it as --name=1. For example, --debug is treated as --debug=1.

  • After that, treat the next remaining argument as a subcommand name and dispatch it to the corresponding method.

Typical CLI usage would look like the following:

    # Parse some text files and load them into an SQLite DB
    % ./MyScript.pm  --dbname=foo.db  import journal.tsv
    
    # Search and list something from above DB
    % ./MyScript.pm  --dbname=foo.db  list_accounts

To achieve the above behavior, we can write the unless (caller) {...} block like the following (assuming parse_opts() is given somewhere else):

    unless (caller) {
       my @opts = parse_opts(\@ARGV);

       my $self = __PACKAGE__->new(@opts);

       my $cmd = shift @ARGV || "help";

       my $method = "cmd_$cmd"; # Map $cmd to a method cmd_$cmd

       $self->can($method) or die "No such subcommand: $cmd";

       $self->$method(@ARGV);
    }

Then we can define the subcommands in the previous example simply as sub cmd_import and sub cmd_list_accounts. No special effort is required.

Note: The above code dispatches the given subcommand argument $cmd to a method named cmd_$cmd. This is because import() is a special name for perl itself: use calls it automatically when a module is loaded. See "use" in perlfunc.
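The cmd_ mapping can be tried out with a self-contained sketch like the one below. Everything in it is assumed for illustration only: the subcommand bodies merely return strings instead of touching a real SQLite DB, dbname is hard-coded instead of coming from parse_opts(), and the dispatch logic is wrapped in a hypothetical method dispatch() so that it can be called without touching @ARGV.

```perl
#!/usr/bin/env perl
package MyScript;
use strict;
use warnings;

sub new {
    my ($class, %opts) = @_;
    return bless {%opts}, $class;
}

# Stand-ins for real work: a real tool would talk to the DB here.
sub cmd_import {
    my ($self, @files) = @_;
    return "would import @files into $self->{dbname}";
}

sub cmd_list_accounts {
    my ($self) = @_;
    return "would list accounts in $self->{dbname}";
}

sub cmd_help {
    return "subcommands: help import list_accounts";
}

# Same logic as the unless (caller) block above, wrapped in a method.
sub dispatch {
    my ($self, @argv) = @_;
    my $cmd    = shift(@argv) || "help";
    my $method = "cmd_$cmd";       # "import" maps to cmd_import, not import()
    $self->can($method) or die "No such subcommand: $cmd\n";
    return $self->$method(@argv);
}

unless (caller) {
    my $self = __PACKAGE__->new(dbname => 'foo.db');   # hard-coded for the sketch
    print $self->dispatch(@ARGV), "\n";
}

1;
```

Here ./MyScript.pm import journal.tsv prints would import journal.tsv into foo.db, while ./MyScript.pm new dies with No such subcommand: new, because only cmd_-prefixed methods are reachable.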

Subcommand dispatcher can be extended to aid Exploratory programming

In the previous example, the subcommand dispatcher was intentionally restricted to invoke only specifically named methods like cmd_.... Such a restriction is useful to hide specific methods (like import) and can also be useful to provide an official list of subcommands.

But this subcommand dispatcher can be extended to do more important jobs in programming, especially for bottom-up style Exploratory programming in unknown/uncertain problem domains.

In bottom-up style programming, the programmer starts by writing small pieces of code and testing them from a REPL (Read-Eval-Print Loop) one by one. Those pieces are composed, tested, renamed, rewritten and/or discarded, and tested again and again, until he/she gets something practically useful.

Unfortunately, perl doesn't have a good REPL in its core. And even if you use some REPL library, dynamic code redefinition from a REPL works against use strict and use warnings, which are a MUST in modern perl programming.

Fortunately, IMHO, the most important property of REPL-based development can be brought to other languages without a REPL, because that property is the short turn-around time for testing every single piece of a bottom-up construction.

In other words, if you can test almost every interesting method within seconds from the shell's command line, and compose them without creating a new file in an editor, your shell becomes the REPL for your Exploratory programming.

To achieve this, we can extend the subcommand dispatcher to handle methods other than cmd_.... It must emit return values to STDOUT. Since return values may contain undef, [..], {..} and so on, we must use some kind of serializer such as Data::Dumper or JSON. The following is a minimal starting point for such a subcommand dispatcher:

    use Data::Dumper;
    
    unless (caller) {
       my @opts = parse_opts(\@ARGV);
    
       my $self = __PACKAGE__->new(@opts);
       
       my $cmd = shift @ARGV || "help";
    
       # If there is a method named "cmd_$cmd", invoke it.
       if (my $sub = $self->can("cmd_$cmd")) {
    
         $sub->($self, @ARGV);
       }
       # Otherwise, if there is a method named $cmd, invoke it and dump
       # the result as a development aid.
       elsif ($sub = $self->can($cmd)) {
    
         my @res = $sub->($self, @ARGV);
    
         print Data::Dumper->new(\@res)->Dump;
       } 
       else {
         die "No such subcommand: $cmd";
       }
    }
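To see what this looks like from the shell, here is a runnable sketch of the same two-step lookup. The class MyCalc and its methods add(), cmd_help() and run() are hypothetical, and option parsing is left out to keep it short.

```perl
#!/usr/bin/env perl
package MyCalc;   # hypothetical example class
use strict;
use warnings;
use Data::Dumper;

sub new { my ($class, %opts) = @_; return bless {%opts}, $class }

# An ordinary helper method with no cmd_ wrapper.
sub add {
    my ($self, @nums) = @_;
    my $sum = 0;
    $sum += $_ for @nums;
    return $sum;
}

# An official subcommand, which prints its own output.
sub cmd_help { print "usage: MyCalc.pm METHOD ARGS...\n" }

# The two-step lookup from above; returns the text to print.
sub run {
    my ($self, @argv) = @_;
    my $cmd = shift(@argv) || "help";
    if (my $sub = $self->can("cmd_$cmd")) {
        $sub->($self, @argv);
        return '';
    }
    elsif ($sub = $self->can($cmd)) {
        my @res = $sub->($self, @argv);
        return Data::Dumper->new(\@res)->Dump;
    }
    die "No such subcommand: $cmd\n";
}

print __PACKAGE__->new->run(@ARGV) unless caller;

1;
```

With this file saved as an executable MyCalc.pm, ./MyCalc.pm add 2 3 prints $VAR1 = 5; without any wrapper having been written for add().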

You may want to extend the above code to make it more useful, for example:

  • Change the exit code when @res is falsy.

  • Change the output serializer to JSON.

  • Change the argument parser to convert [..] and {...} arguments automatically with "decode_json" in JSON too. This lets you compose your favorite methods, which take and return structured objects/arrays, with each other.
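The three points above could be sketched as follows. This is not the actual code of any existing module: decode_args() and emit_results() are hypothetical helper names, "falsy" is interpreted here as "no true value in @res", and the core module JSON::PP is used so that the sketch has no CPAN dependency.

```perl
use strict;
use warnings;
use JSON::PP ();

# canonical: sorted hash keys; allow_nonref: plain scalars are accepted too.
my $json = JSON::PP->new->canonical->allow_nonref;

# Hypothetical helper: decode arguments which look like JSON arrays or
# objects, and pass everything else through as plain strings.
sub decode_args {
    my @out;
    for my $arg (@_) {
        push @out, $arg =~ /^[\[{]/ ? $json->decode($arg) : $arg;
    }
    return @out;
}

# Hypothetical helper: print each return value as one line of JSON and
# return a suggested exit code (0 if any value is true, 1 otherwise).
sub emit_results {
    my (@res) = @_;
    print $json->encode($_), "\n" for @res;
    return (grep { $_ } @res) ? 0 : 1;
}

my @args = decode_args('[1,2,3]', '{"name":"foo"}', 'plain');
my $code = emit_results(@args);
# Prints:
#   [1,2,3]
#   {"name":"foo"}
#   "plain"
```

In the dispatcher, the method would then be called as $sub->($self, decode_args(@ARGV)), and the branch would end with exit emit_results(@res).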

This is the backstory of MOP4Import::Base::CLI_JSON. Thank you for reading!

APPENDIX

Sample implementation of parse_opts()

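    sub parse_opts {
      my ($list, $result) = @_;
      $result //= [];
      # Consume leading "--name" and "--name=value" items (a single "-" also works).
      while (@$list and my ($n, $v) = $list->[0]
             =~ m{^--$ | ^(?:--? ([\w:\-\.]+) (?: =(.*))?)$}xs) {
        shift @$list;
        last unless defined $n;      # A bare "--" ends option parsing.
        push @$result, $n, $v // 1;  # "--name" alone means name => 1.
      }
      wantarray ? @$result : $result;
    }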
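A short usage example may help to see what parse_opts() returns and what it leaves behind. The function is repeated here only to make the example self-contained; the argument list is made up for the illustration.

```perl
use strict;
use warnings;

sub parse_opts {
  my ($list, $result) = @_;
  $result //= [];
  while (@$list and my ($n, $v) = $list->[0]
         =~ m{^--$ | ^(?:--? ([\w:\-\.]+) (?: =(.*))?)$}xs) {
    shift @$list;
    last unless defined $n;
    push @$result, $n, $v // 1;
  }
  wantarray ? @$result : $result;
}

# Options come first; parsing stops at the first non-option argument.
my @argv = ('--dbname=foo.db', '--debug', 'import', 'journal.tsv');
my @opts = parse_opts(\@argv);

print "opts: @opts\n";   # opts: dbname foo.db debug 1
print "rest: @argv\n";   # rest: import journal.tsv
```

Note that the array passed by reference is modified in place, so the caller finds the subcommand and its arguments in what remains.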