Symbol::Opaque - ML-ish data constructor pattern matching


    use Symbol::Opaque;

    BEGIN {
        defsym('foo');   # define the constructor "foo"
        defsym('bar');   # define the constructor "bar"
    }

    if ( foo(my $x) << foo(4) ) {    # bind foo(4) into foo($x)
        # $x is now 4
    }

    if ( foo(13, bar(my $x)) << foo(13, bar("baz")) ) {
        # $x is now "baz"
    }

    if ( foo(my $x) << bar(42) ) {
        # not executed: foo(X) doesn't match bar(42)
    }


This module allows the creation of data constructors, which can then be conditionally unified as in Haskell or ML. When you use the binding operator << between two structures, this module tries to bind any free variables on the left in order to make the structures the same. For example:

    foo(my $x) << foo(14)           # true, $x becomes 14

This will make $x equal 14, and then the operator will return true. Sometimes it is impossible to make them the same, and in that case no variables are changed and the operator returns false. For instance:

    foo(my $x, 10) << foo(20, 21)   # impossible: false, $x is undef

This makes it possible to write cascades of tests on a value:

    my $y = foo(20, 21);
    if (foo("hello", my $x) << $y) {
        # not executed: "hello" doesn't match 20
    }
    elsif (foo(my $x, 21) << $y) {
        # this gets executed: $x is 20
    }
    else {
        die "No match";
    }

(Yes, Perl lets you declare the same variable twice in the same cascade -- just not in the same condition.)

Before you can do this, though, you have to tell Perl that foo is such a data constructor. This is done with the exported defsym routine. It is advisable that you do this in a BEGIN block, so that the execution path doesn't have to reach it for it to be defined:

    BEGIN {
        defsym('foo');   # foo() is a data constructor
    }

If two different modules both declare a 'foo' symbol, they are considered the same. This isn't dangerous because the only thing that can ever differ between two symbols is their name: there is no "implementation" defined.

The unification performed is unidirectional: you can only have free variables on the left side.

The unification performed is nonlinear: you can mention the same free variable more than once:

    my $x;   # we must declare first when there is more than one mention
    foo($x, $x) << foo(4, 4);  # true; $x = 4
    foo($x, $x) << foo(4, 5);  # false
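
The all-or-nothing behavior and the repeated-variable rule can be modeled in plain Perl. The following is only an illustrative sketch, not how Symbol::Opaque is implemented: match_args is a hypothetical helper, free variables are represented as references to scalars, and all other values are compared with eq. Bindings are collected first and written back only if the whole match succeeds:

```perl
use strict;
use warnings;

# Hypothetical model, not part of Symbol::Opaque.
# $pattern: array ref of plain values or scalar refs (stand-ins for free variables)
# $values:  array ref of plain values to match against
sub match_args {
    my ($pattern, $values) = @_;
    return 0 unless @$pattern == @$values;

    my %pending;   # stringified slot address => [slot ref, tentative value]
    for my $i (0 .. $#$pattern) {
        my ($p, $v) = ($pattern->[$i], $values->[$i]);
        if (ref $p eq 'SCALAR') {
            my $seen = $pending{$p};
            if ($seen) {
                # Same variable mentioned again: it must agree with the first mention.
                return 0 unless $seen->[1] eq $v;
            }
            elsif (defined $$p) {
                # Already holds a value, so it is not free: compare instead of binding.
                return 0 unless $$p eq $v;
            }
            else {
                $pending{$p} = [$p, $v];
            }
        }
        else {
            return 0 unless $p eq $v;
        }
    }

    # Every position matched: commit the bindings.
    ${ $_->[0] } = $_->[1] for values %pending;
    return 1;
}

my $x;
my $first       = match_args([\$x, \$x], [4, 5]);   # fails: 4 and 5 differ
my $still_undef = defined $x ? 0 : 1;               # nothing was written to $x
my $second      = match_args([\$x, \$x], [4, 4]);   # succeeds and sets $x to 4
print "first=$first still_undef=$still_undef second=$second x=$x\n";
```

Deferring the writes until the end is what keeps a failed match from leaving variables half-bound in this model.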

Unification of arrays is performed by comparing them elementwise, just like the arguments of a structure.

Unification of hashes works as follows: every key in the target (left) hash must also be present in the source (right) hash, and the corresponding values must unify. However, the source hash may have keys that the target hash does not, and the two hashes will still unify. This lets you support "property lists" and unify against structures that have certain properties.
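
The array and hash rules can be sketched as a recursive comparison in plain Perl. This is a simplified model rather than the module's own code: structure_matches is a hypothetical helper, it performs no variable binding, it compares leaf values with eq, and it assumes that arrays must have the same length to match:

```perl
use strict;
use warnings;

# Hypothetical model, not part of Symbol::Opaque.
# Returns 1 if $target (left) matches $source (right), 0 otherwise.
sub structure_matches {
    my ($target, $source) = @_;

    if (ref $target eq 'ARRAY') {
        # Arrays: same length (an assumption here), compared element by element.
        return 0 unless ref $source eq 'ARRAY' && @$target == @$source;
        for my $i (0 .. $#$target) {
            return 0 unless structure_matches($target->[$i], $source->[$i]);
        }
        return 1;
    }

    if (ref $target eq 'HASH') {
        # Hashes: every target key must exist in the source and its value must match.
        # Extra keys in the source are ignored.
        return 0 unless ref $source eq 'HASH';
        for my $key (keys %$target) {
            return 0 unless exists $source->{$key};
            return 0 unless structure_matches($target->{$key}, $source->{$key});
        }
        return 1;
    }

    # Leaves: plain string comparison.
    return 0 if ref $source;
    return $target eq $source ? 1 : 0;
}

my $source = { name => 'widget', tags => ['a', 'b'], size => 3 };

my $ok_subset  = structure_matches({ name => 'widget' }, $source);   # subset of keys
my $bad_value  = structure_matches({ name => 'gadget' }, $source);   # value differs
my $missing    = structure_matches({ weight => 1 }, $source);        # key absent in source
my $bad_length = structure_matches({ tags => ['a'] }, $source);      # array lengths differ
print "$ok_subset $bad_value $missing $bad_length\n";                # prints "1 0 0 0"
```

Only the target's keys drive the comparison, which is why a source with additional properties still matches.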

A variable is considered free if it is writable (this is true of all variables that you'll pass in), undefined, and in the top level of a constructor. That is:

    foo([1, my $x]) << foo([1,2])

Will not unify $x, since it is not directly in a data constructor. To get around this, you can explicitly mark variables as free with the free function:

    foo([1, free my $x]) << foo([1,2])  # success: $x == 2

Sometimes you have a situation where you're unifying against a structure, and you want something to be in a position, but you don't care what it is. The _ marker is used in this case:

    foo([1, _]) << foo([1, 2])   # success: no bindings




Luke Palmer <lrpalmer at gmail dot com>