KOBAYASHI, Hiroaki

NAME

whyfields -- or Modern use of fields.pm and %FIELDS.

DESCRIPTION

Here I try to explain why %FIELDS is still useful, and propose an alternative to fields.pm.

fields.pm -- old story.

fields lets you extend use strict style typo checking to fields (also called slots, instance variables or member variables, as you like).

    use strict;
    package Cat {

       use fields qw/name birth_year/; # field declaration

       sub new {
          my Cat $self = fields::new(shift); # Typed declaration: my TYPE $var
          $self->{name}       = shift; # Checked!
          $self->{birth_year} = shift  # Checked!
             // $self->_this_year;
          $self;
       }

       sub age {
          my Cat $self = shift;
          return $self->_this_year
                    - $self->{birth_year}; # Checked!
       }

       sub _this_year {
          (localtime)[5] + 1900;
       }
    };

    my @cats = map {Cat->new($_, 2010)} qw/Tuxie Petunia Daisy/;

    foreach my Cat $cat (@cats) {

       print $cat->{name}, ": ", $cat->age, "\n";

       # print $cat->{namae}, "\n"; # Compilation error!
    }

The above program defines a class Cat with members {name} and {birth_year}. It also defines a constructor new and a method age, which computes the cat's age from birth_year.

In the above program, the variables $self and $cat are declared with the type annotation Cat, as in my Cat $self, so that perl can detect typos in member names at compile time (e.g. with perl -wc).

Since this typo check can be applied as soon as the program becomes syntactically correct, you can run it at a very early stage of development (even when you have no unit tests!). And if you integrate this check into your editor's file-save hook (using flycheck and/or App::perlminlint), you can detect typos right after every save.
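
To see the check in action without a separate perl -wc run, here is a minimal sketch that compiles a snippet with a misspelled field through string eval and prints the resulting error. The use of eval is only an assumption of this example, chosen so that the failure can be shown in a single script; it is not something fields.pm requires.

```perl
use strict;
use warnings;

package Cat;
use fields qw/name birth_year/;

package main;

# The misspelled key is rejected while the string is being compiled;
# the anonymous sub is never called.
my $sub = eval q{
    sub { my Cat $cat = shift; return $cat->{namae} }
};
my $error = $@;

print defined $sub ? "compiled\n" : "compile error: $error";
```

Changing namae back to name makes the same snippet compile.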

Why most people do not use fields.pm?

Today, use strict is a well-known best practice in perl programming. If fields is useful too, why is it rarely used?

Here are my guesses.

Type annotation becomes too long for real world apps

In real-world applications, most class names are very long, like MyProject::SomeModule::SomeClass. You wouldn't love to write the following:

   my MyProject::SomeModule::SomeClass $obj = ...;

I have seen some code use __PACKAGE__ as follows, but it is still long.

  my __PACKAGE__ $obj = ...;

fields.pm doesn't generate accessors/constructor

In OOP, encapsulation (of internals) is an important topic. Directly accessing $cat->{birth_year} from outside of Cat is simply a violation of OO philosophy.

So we need accessors and a constructor. But fields.pm does nothing about them, so you need another accessor generator like Class::Accessor anyway.

fields.pm was limited to single inheritance.

fields.pm was introduced around perl 5.005. At that time, objects were actually based on ARRAY (so-called pseudo-hashes), so only single inheritance was supported.

Then, after the release of perl 5.009, fields::new started returning a real HASH. But the above restriction was kept in fields.pm for backward compatibility.

A few tips you should know about fields.

So, you might think Class::Accessor::Fast or Moo... is the final answer. But wait, do they check field typos for you at compile time? I don't think so. (If I'm wrong, please let me know.) Also, using accessors in internal code can slow down your code (remember that a sub call in perl is considerably slower than a simple hash access).
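
If you want to measure the accessor overhead on your own machine, a rough sketch using the core Benchmark module could look like this. The class and its hand-written getter are stand-ins made up for this example, and the numbers will vary with perl version and hardware, so treat the result as an indication only.

```perl
use 5.014;
use strict;
use warnings;
use Benchmark qw/cmpthese/;

package Cat {
    sub new  { my ($class, %opts) = @_; bless {%opts}, $class }
    sub name { $_[0]->{name} }    # hand-written getter
}

my $cat = Cat->new(name => 'Tuxie');

# Run each variant for about one CPU second and print a comparison table.
cmpthese(-1, {
    accessor => sub { my $n = $cat->name },
    direct   => sub { my $n = $cat->{name} },
});
```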

In my humble opinion, compile-time typo checking is still a strong point of perl5 over other lightweight languages (LL) like ruby, python and php. I hope more perl mongers care about this.

Anyway, I want to introduce some facts about fields so that you could XXX:

fields works even for unblessed HASH!

Since the typo check by fields and type annotation (my Cat $cat) is executed at compile time, the actual value in the variable is not limited to instances of the annotated class (Cat). Even an unblessed HASH can be used.

For example, you can check PSGI $env statically as follows (this is a shorthand version of MOP4Import::PSGIEnv):

   use strict;
   use 5.012;
   {
      package Env;
      use fields qw/REQUEST_METHOD psgi.version/; # and so on...
   };

   return sub {
      (my Env $env) = @_;

      if ($env->{REQUEST_METHOD} eq 'GET') { # Checked!
         return [200, header(), ["You used 'GET'"]];
      }
      elsif ($env->{REQUEST_METHOD} eq 'POST') { # Checked!
         return [200, header(), ["You used 'POST'"]];
      }
      else {
         return [200, header()
                , ["Unsupported method $env->{REQUEST_METHOD}\n", "psgi.version="
                   , join(".", @{$env->{'psgi.version'}})]]; # Checked too!
      }
   };

   sub header {
      ["Content-type", "text/plain"]
   }
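
The same idea can be tried without any PSGI server. The following is a minimal sketch: the describe function and the sample hash are made up for this example, and the field list is shortened.

```perl
use strict;
use warnings;

{
    package Env;
    use fields qw/REQUEST_METHOD PATH_INFO/;
}

sub describe {
    (my Env $env) = @_;    # a plain hash reference, never blessed
    return "$env->{REQUEST_METHOD} $env->{PATH_INFO}";    # both keys are checked
}

my $plain = {REQUEST_METHOD => 'GET', PATH_INFO => '/cats'};
print describe($plain), "\n";     # prints "GET /cats"
print ref($plain), "\n";          # prints "HASH"
```

The annotation only tells the compiler which %FIELDS to consult; nothing is verified about the value at run time.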

constant sub can be used for my TYPE slot.

In fact, you can shorten the type annotation by using a type alias (a constant sub which returns the class name). So, you can rewrite the following:

   my MyProject::SomeModule::Purchase $obj = ...;

into:

   sub Purchase () {'MyProject::SomeModule::Purchase'}

   ...

   my Purchase $obj = ...;

Some of you may find the above acceptable to write.

As a positive side effect, such a type alias can also be used for object instantiation. It also allows subclasses to override the actual class.

     ...
     # Subclass can override ->Purchase().
     my Purchase $obj = $self->Purchase->new(...);
     ...
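
Here is a self-contained sketch of that override. The classes GiftPurchase, Shop and GiftShop are hypothetical names introduced only for this example.

```perl
use 5.014;
use strict;
use warnings;

package MyProject::SomeModule::Purchase {
    use fields qw/item/;
    sub new {
        my MyProject::SomeModule::Purchase $self = fields::new(shift);
        $self->{item} = shift;
        $self;
    }
}

package MyProject::SomeModule::GiftPurchase {    # hypothetical subclass
    our @ISA = ('MyProject::SomeModule::Purchase');
}

package Shop {                                   # hypothetical user class
    sub Purchase () {'MyProject::SomeModule::Purchase'}
    sub new { bless {}, shift }
    sub buy {
        my ($self, $item) = @_;
        # The alias is both the type annotation and the class to instantiate.
        my Purchase $obj = $self->Purchase->new($item);
        $obj;
    }
}

package GiftShop {                               # overrides only the alias
    our @ISA = ('Shop');
    sub Purchase () {'MyProject::SomeModule::GiftPurchase'}
}

print ref(Shop->new->buy('toy')), "\n";      # MyProject::SomeModule::Purchase
print ref(GiftShop->new->buy('toy')), "\n";  # MyProject::SomeModule::GiftPurchase
```

Note that the annotation in Shop::buy still refers to the base class, so the compile-time check uses the fields of MyProject::SomeModule::Purchase even when a subclass instance is created.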

values of %FIELDS can be anything now.

Actually, fields is an abstraction layer over perl's core interface %FIELDS.

  package Cat;
  use fields qw/name birth_year/;

The above code roughly does the following:

  package Cat;
  BEGIN {
    $FIELDS{name} = 1;
    $FIELDS{birth_year} = 2;
  }

Then perl's compiler can check typos for variables which have a type annotation like my Cat $cat. When the compiler finds a field access like $cat->{name}, it looks up %Cat::FIELDS and checks whether $Cat::FIELDS{name} exists. If it exists, the field access is valid. If you write a wrong field name like $cat->{namae}, you will get a compilation error like No such class field "namae" in variable $cat of type Cat.

Interestingly, the above story does not mention the values of %FIELDS. Actually, the values are only used by fields.pm (to enforce the single inheritance restriction). Perl's core itself does not care about them.

I think this means we can write our own alternative to fields.pm using %FIELDS, at our own risk. So, let's start the experiment!
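
As a first experiment, the sketch below fills %FIELDS by hand, without loading fields.pm, and stores arbitrary metadata as the values. The Point class and its doc entries are made up for this example.

```perl
use strict;
use warnings;

package Point;
our %FIELDS;
BEGIN {
    # No fields.pm here: the values are arbitrary metadata of our own.
    %FIELDS = (
        x => {doc => 'horizontal position'},
        y => {doc => 'vertical position'},
    );
}

package main;

sub norm2 {
    (my Point $p) = @_;
    return $p->{x} ** 2 + $p->{y} ** 2;    # both keys are checked
}
print norm2({x => 3, y => 4}), "\n";       # prints 25

# An undeclared key is still rejected when the snippet is compiled.
my $ok = eval q{ sub { my Point $p = shift; $p->{z} }; 1 };
my $error = $@;
print "rejected: $error" unless $ok;
```

The hash must be filled inside BEGIN (or by an import method) so that it exists before the code using the annotation is compiled.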

(Proposed) Modern use of fields and strict.

Based on the above discussion, here I propose an alternative style of using fields (actually %FIELDS) to obtain more typo checking at compile time, like use strict. I hope this can be incorporated into your coding effortlessly.

Divide and conquer.

First, we should divide and conquer our problem. In this case, I want to divide it at the "border of encapsulation".

Outside of a class definition (that is, on the user side of the class), direct access to fields is nonsense. But inside the class definition, it doesn't matter. So, use fields for internal code.

   my $foo = Foo->new(width => 8, height => 3);
   $foo->{width} * $foo->{height};  # Evil!

   package Foo {
     use fields qw/width height/;
     sub area {
       my Foo $self = shift;
       $self->{width} * $self->{height};  # No problem.
     }
   };

my MY $obj

Secondly, I propose the shorthand name MY as the default type alias. Then every method argument declaration starts like (my MY $self, ...) = @_. This is short enough to adopt, especially if you are already familiar with use strict (it's only 3 extra characters!). To define this alias, just write sub MY () {__PACKAGE__} at the beginning of each package.

  package MyApp::Model::Company::LongLongProductName {
    sub MY () {__PACKAGE__};
    use fields qw/price/;

    sub add_price {
      (my MY $self, my $val) = @_;
      $self->{price} += $val;
      $self
    }
  };

I propose this because good naming is not so easy, and compile-time checking should be available before you finally settle on a good name (I think!). Of course, if you have a good name and it is stable (it will not change for a long enough time), use it instead.

configure + accessor generator

As for accessors, let's generate them from the field specifications. To support accessor generation better, we also need a base class that provides a consistent constructor.

Describing such an implementation in full is out of the scope of this document. Instead, I will sketch one design outline which I have used frequently for over a decade. It is rooted in the "Perl/Tk" and Tcl/Tk widget API.

   # User code. (no fields check)
   my $obj = Foo->new(width => 8, height => 3);

   print $obj->width * $obj->height; # => 24

   $obj->configure(height => 3, width => 3);

   print $obj->width * $obj->height; # => 9

  • Public members start with [A-Za-z]. Others are private.

  • In this style, I generate only getters from fields declaration.

  • If you want to write a complex getter, name your private field starting with '_'.

       sub dbh {
         (my Foo $foo) = @_;
         $foo->{_dbh} //= do {
            DBI->connect($foo->{user}, $foo->{password}, ...);
         };
       }

  • For setters, I define a general-purpose setter configure in the base class. It eventually calls the matching onconfigure_... hook if one exists.

       sub onconfigure_file {
         (my Foo $foo, my $fn) = @_;
         $foo->{string} = read_file($fn);
       }

  • To set default values, define and call a hook like after_new (there may be a better name, though).

       sub after_new {
         (my Foo $foo) = @_;
         $foo->{name}       //= "(A cat not yet named)";
         $foo->{birth_year} //= $foo->default_birth_year;
       }
       sub default_birth_year {
         _this_year();
       }

Here is a sample implementation of the above.

Note: the code below does not take care of subclassing. For real work, please consult the internals of MOP4Import::Declare and MOP4Import::Base::Configure.

    use strict;
    use 5.014; # package NAME {...} blocks need perl 5.14 or later
    package MyProject::Object { sub MY () {__PACKAGE__}
       use Carp;
       use fields qw//; # Note. No fields could cause a problem.
       sub new {
         my MY $self = fields::new(shift);
         $self->configure(@_) if @_;
         $self->after_new;
         $self
       }
       sub after_new {}
    
       sub configure {
          my MY $self = shift;
          my (@task);
          my $fields = _fields_hash($self);
          my @params = @_ == 1 && ref $_[0] eq 'HASH' ? %{$_[0]} : @_;
          while (my ($name, $value) = splice @params, 0, 2) {
            unless (defined $name) {
              croak "Undefined key for configure";
            }
            unless ($name =~ /^[A-Za-z]\w*$/) {
              croak "Invalid key for configure $name";
            }
            if (my $sub = $self->can("onconfigure_$name")) {
              push @task, [$sub, $value];
            } elsif (not exists $fields->{$name}) {
              confess "Unknown configure key: $name";
            } else {
              $self->{$name} = $value;
            }
          }
          $$_[0]->($self, $$_[1]) for @task;
          $self;
       }
    
       sub _fields_hash {
         my ($obj) = @_;
         my $sym = _globref($obj, 'FIELDS');
         unless (*{$sym}{HASH}) {
           *$sym = {};
         }
         *{$sym}{HASH};
       }
       sub _globref {
         my ($thing, $name) = @_;
         my $class = ref $thing || $thing;
         no strict 'refs';
         \*{join("::", $class, defined $name ? $name : ())};
       }
    
       # Poorman's MOP4Import::Declare.
       sub import {
          my ($myPack, @decls) = @_;
          my $callpack = caller;
          *{_globref($callpack, 'ISA')} = [$myPack];
          foreach my $decl (@decls) {
             my ($pragma, @args) = @$decl;
             $myPack->can("declare_$pragma")->($myPack, $callpack, @args);
          }
       }
    
       sub declare_fields {
         my ($myPack, $callpack, @names) = @_;
         my $fields = _fields_hash($callpack);
         foreach my $name (@names) {
           $fields->{$name} = 1; # or something more informative.
           *{_globref($callpack, $name)} = sub { $_[0]->{$name} };
         }
       }
    };
    1;

Here is user code for the above base class.

    package MyProject::Product; sub MY () {__PACKAGE__}
    use MyProject::Object [fields => qw/name price/];

    1;
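
To try the hooks in a single file, here is a condensed, self-contained sketch of the same design. Demo::Object and Demo::Product are hypothetical names for this example; the base class is deliberately simplified (plain bless, fields declared through a BEGIN block instead of import) and is not the MOP4Import implementation.

```perl
use 5.014;
use strict;
use warnings;

package Demo::Object {                 # hypothetical condensed base class
    use Carp;
    sub new {
        my $class = shift;
        my $self = bless {}, $class;
        $self->configure(@_);
        $self->after_new;
        $self;
    }
    sub after_new {}
    sub configure {
        my ($self, %opts) = @_;
        no strict 'refs';
        my $fields = \%{ref($self) . '::FIELDS'};
        foreach my $name (sort keys %opts) {
            if (my $hook = $self->can("onconfigure_$name")) {
                $hook->($self, $opts{$name});        # custom setter
            } elsif (exists $fields->{$name}) {
                $self->{$name} = $opts{$name};       # plain field
            } else {
                croak "Unknown configure key: $name";
            }
        }
        $self;
    }
    # Fill %FIELDS of $class and generate one getter per field.
    sub declare_fields {
        my ($pack, $class, @names) = @_;
        no strict 'refs';
        push @{"${class}::ISA"}, $pack;
        foreach my $name (@names) {
            ${"${class}::FIELDS"}{$name} = 1;
            *{"${class}::$name"} = sub { $_[0]->{$name} };
        }
    }
}

package Demo::Product {
    sub MY () {__PACKAGE__}
    BEGIN { Demo::Object->declare_fields(MY, qw/name price/) }

    sub onconfigure_cents {            # setter hook for a virtual key
        (my MY $self, my $cents) = @_;
        $self->{price} = $cents / 100;
    }
    sub after_new {                    # default values
        (my MY $self) = @_;
        $self->{name} //= '(unnamed)';
    }
}

my $p = Demo::Product->new(cents => 250);
print $p->name, " costs ", $p->price, "\n";    # (unnamed) costs 2.5
$p->configure(name => 'Catnip', price => 3);
print $p->name, " costs ", $p->price, "\n";    # Catnip costs 3
```

Inside Demo::Product the typed my MY $self makes $self->{price} and $self->{name} subject to the compile-time check, while user code only touches new, configure and the generated getters.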