The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

SQL::Interpolate::Macro - Macros and SQL filters for SQL::Interpolate

SYNOPSIS

 use SQL::Interpolate qw(:all);
 use SQL::Interpolate::Macro qw(:all);

 # Macros that assist in SQL building
 sql_interp 'SELECT * FROM mytable WHERE',
     sql_and( sql_if($blue,    q( color = "blue"   )),
              sql_if($shape, sql('shape =', \$shape)) ),
     'LIMIT 10';

 # Macros and fitlers that perform automatic table joining.
 # First specify database layout:
 my $interp = SQL::Interpolate->new(sql_rel_filter(
     sales_order      => {name => qr/([S-T])/, key => ['so_nbr']},
     part             => {name => qr/([p-r])/, key => ['part_nbr']},
     sales_order_line => {name => qr/([S-T])([p-r])/,
                          key  => ['so_nbr', 'part_nbr']}
 ))->make_sql_interp();
 # Then do queries:
 ($sql, @bind) = $interp->('
     SELECT * FROM REL(S), REL(Sp), REL(p)
     WHERE LINK(S,Sp,p) AND S = 123
 ');
 # RESULT:
 #   $sql = 'SELECT *
 #           FROM  sales_order as S, sales_order_line as Sp, part as p
 #           WHERE S.so_nbr = Sp.so_nbr AND Sp.part_nbr = p.part_nbr AND
 #                 S.so_nbr = 123'
  

DESCRIPTION

This module provides macro and filter capabilities to further simplify the construction of SQL queries using SQL::Interpolate. Various macros and filters are included as well as a framework for writing your own.

Macros are objects derived from SQL::Interpolate::Macro and which expand to other interpolation list elements (strings, variable references, macro objects, etc.) before interpolation. Macros may also exist as a convenience as "stringified macros" within strings (e.g. "WHERE LINK(AB,BC) AND x=y"), and these are expanded into real macro objects (e.g. 'WHERE ', link('AB','BC'), ' AND x=y'). Also, if enabled, source filtering internally converts sql// quotes into macro objects.

Macro expansion is performed by the sql_flatten() function, which is called internally by sql_interp() if macros are enabled. The process can be recursive since the expansion of a macro may contain other (e.g. nested) macros.

An SQL filter is an object derived from SQL::Interpolate::SQLFilter and is used by sql_flatten() to assist in macro expansion and/or filtering of SQL text. The filtering on SQL text can occur before and/or after the SQL fragments are interpolated into a single string.

Motivation

One use of this module is to handle trivial cases uniformly. For example,

  my @conditions = ...;  # list of SQL conditional statements.
  sql_interp "x=2 OR", sql_and(@conditions);

If @conditions > 0, this expands to

  'x=2 OR', '(', $conditions[0], 'AND', $conditions[1], 'AND', ...., ')'

If @conditions == 0, this expands correctly to

  "x=2 AND", "1=1"
  # equivalent to "x=2"

since

  sql_and()  returns  "1=1"  # analogous to x^0 = 1
  sql_or()   returns  "1=0"  # analogous to x*0 = 0

Also, though a minor point, say you didn't match the parenthesis correctly in your interpolation list:

  "WHERE thid IN", \@thids,
  "AND ( action IN", \@admin_aids,
         "OR action IN", ['post', 'reply', 'edit']

This error would be caught at runtime, but it would be caught (earlier) at compile time if written as such:

  "WHERE thid IN", \@thids,
  "AND", sql("action IN", \@admin_aids,
             "OR action IN", ['post', 'reply', 'edit']
             # -- syntax error by compiler

Built-in Macros and Filters

This module comes with various macros and SQL filters already implemented. See the sections below.

Writing your own macros

Here is a trivial example of writing your own macro-like function without using this framework at all.

  sub myparen {
      return '(', @_, ')';
  }

Example usage:

  my ($sql, @bind) = sql_interp "WHERE", my_paren("x=", \$x);
  # RESULT:
  #   $sql = "WHERE (x=?)";
  #   @bind = $x;

This simple expansion is not always sufficient though. In the following cases, macro expansion should not occur at the time Perl calls the macro function but rather later during the execution of sql_interp().

(1) if the macro expansion needs data that is available only to an instance of SQL::Interpolate (e.g. a database handle that is interrogated to generate database-dependent SQL) or

(2) to support recursive macros properly, e.g.

  sql_interp sql_and(sql_or('x=y', 'y=z'), 'z=w')
  # Don't expand to
  #   sql_interp sql_and('(', 'x=y', 'y=z', ')', 'z=w')
  # and then to
  #   sql_interp '(', 'AND', 'x=y', 'AND', 'y=z', 'AND', ')', 'AND', 'z=w'
  # but rahter expand to
  #   sql_interp '(', sql_or('x=y', 'y=z'), 'AND', 'z=w', ')'
  # and then to
  #   sql_interp '(', '(', 'x=y', 'OR', 'y=z', ')', 'AND', 'z=w', ')'

Notice how the expansion in the last example must be done outside-to-in rather inside-to-out.

The framework can be used as follows.

  # helper function for simpler syntax
  sub myparen { return MyParen->new(@_); }
  
  # Macro class
  package MyParen;
  use base 'SQL::Interpolate::Macro';
  
  sub new {
      my ($class, @elements) = @_;
      my $self = bless [ @elements ], $class;
      return $self;
  }
  
  sub expand {
      my ($self, $interp) = @_;
      my $dbh = $interp->{dbh};
      if (defined($dbh) && $dbh->{Driver}->{Name} eq 'funnydatabase') {
          return ('leftparen', @$self, 'rightparen');
      }
      else {
          return ('(', @$self, ')');
      }
  }
  1;

Here, myparen() returns a macro object, which can be placed onto an interpolation list. During the flattening phase, sql_interp() calls sql_flatten() which calls the expand() method on the macro object, and this method must return the interpolation list that the macro will expand to. If the interpolation list macro contains macros itself, an sql_interp will further expand these. See also the source code of the macros provided in this module for more complex examples.

Writing your own SQL filters

The following is an example that expands on the previous example but supports a stringified version of the macro in the SQL literals.

  # example usage:
  my $interp = SQL::Interpolate->new(myparen_filter())->make_closure();
  ($sql, @bind) = $interp->("SELECT * FROM mytable WHERE MYPAREN(x=y)");
  # Equivalent to:
  #   sql_interp "SELECT * FROM mytable WHERE ", 
  
  # helper function for simpler syntax
  sub myparen_filter {
      return MyParenFilter->new(@_);
  }
  
  # SQL Filter class
  package MyParenFilter;
  use base 'SQL::Interpolate::SQLFilter';
  
  # Filter a single SQL string fragment (during expansion)
  sub filter_text_fragment {
      my ($self, $sql) = @_;
      my @items;
      pos($sql) = 0;
      my $pos0 = pos($sql);
      until ($sql =~ /\G$/gc) {
          my $pos1 = pos($sql);
          if ($sql =~ m{\G \b MYPAREN \( (.*?) \)}xsgc) {
              push @items, substr($sql, $pos0, $pos1 - $pos0)
                  if $pos1 != $pos0;
              $pos0 = pos($sql);
              push @items, MyParen->new($1);
          }
          else {
              $sql =~ m{\G.}xsgc;  # more efficiently: \G.[^P]*
          }
      }
      my $pos1 = pos($sql);
      push @items, substr($sql, $pos0, $pos1 - $pos0) if $pos1 != $pos0;
      return @items;
  }
  1;

Your SQL filter may optionally have a few other methods that will be called by sql_interp if they exist. See "SQL::Interpolate::Macro Methods" for details.

INTERFACE

Main functions

sql_flatten
 @list_out = sql_flatten(@list_in);          # functional
 @list_out = $interp->sql_flatten(@list_in); # OO

Fully expands all macros in an interpolation list such that only strings, variables references, and sql() contains are left (no macros). Any macros in inside variable references are expanded to the other types of elements.

 my @list = sql_flatten sql/SELECT * FROM mytable where x=$x/;
 # OUTPUT: @list = ('SELECT * FROM mytable where x=', \$x);

This function takes the same type of input as sql_interp, and, in fact, sql_interp uses it to preprocess input. This function is called internally by sql_interp() if you use SQL::Interpolate::Macro. Therefore you would rarely need to call it directly.

Builtin Filters

sql_rel_filter

The filter can simplify table joins by aliasing your tables and keys using special naming conventions and by writing the SQL expressions that link tables together via keys. It can be particularly useful to represent recursive data structures in a relational database, where the table join requirements are complicated.

This allows one to write

  my $dbx = DBIx::Interpolate->new(sql_rel_filter(
      sales_order      => {name => qr/^([S-T])$/, key => ['so_nbr']},
      part             => {name => qr/^([p-r])$/, key => ['part_nbr']},
      sales_order_line => {name => qr/^([S-T])([p-r])$/,
                           key => ['so_nbr', 'part_nbr']}
  ));
  $dbx->selectall_arrayref('
      SELECT * FROM REL(S), REL(Sp), REL(p)
      WHERE LINK(S,Sp,p) AND S = 123
  ');

instead of

  $dbh->selectall_arrayref('
      SELECT *
      FROM  sales_order        as S,
            sales_order_line   as Sp,
            part               as p
      WHERE S.so_nbr = Sp.so_nbr AND Sp.part_nbr = p.part_nbr AND
            S.so_nbr = 123
  ');

or

  $dbh->selectall_arrayref('
      SELECT *
      FROM  sales_order      as S  JOIN
            sales_order_line as Sp USING (so_nbr) JOIN
            part             as p  USING (part_nbr)
  ');

The above example prints part information for all the line items on sales order #123.

The table naming convention is that the names resemble the entities related by the tables, and this is a natural way to write queries. In the above example the entities are sales order (S) and part (p), which are related by the table represented by the juxtaposition "Sp". The LINK(S,Sp,p) macro expands by equating the keys in the given relations that identity the same entities according to the naming convention:

 "S" and "Sp" share S --> "S.so_nbr = Sp.so_nbr"
 "Sp" and "p" share p --> "Sp.part_nbr = p.part_nbr"

Also, any entity written alone in an expression (e.g. "S" in "S = 123" above) is considered to be shorthand for primary key in a relation representing that entity:

 "S = 123" --> "S.so_nbr = 123"

The sql_rel_filter() function above describes how the naming conventions map to your database schema. The meaning in this example is as follows. sales order entities can be represented by the names "S" and "T". Part entities can be represented by the names "p", "q", and "r". The juxtaposition of one of each entity name represents a row in the sales_order_line relation that relates these two entities. The entity names in the matches must be surrounded by capturing parenthesis (). The names in the key list correspond respectively to the entities captured in the match, and these names are the names of the primary or foreign keys (in the current relation) that identify the entities. For example, "Sp" represents a row in the sales_order_line table, with represents a relationship between a sales order entity (S) and part entity (p). The key for S is given by Sp.so_nbr, and the key for p is given by Sp.part_nbr.

Also, consider a product structure that is represented in the database where parts can contain (subcomponent) parts that can in turn contain other (subcomponent) parts:

  sql_rel_filter(...
      part_part => {
          name => qr/^([p-r])([p-r])$/, key => ['part_nbr1', 'part_nbr2']}
  )

Now, it is possible to use two entities in the same class in the same relation:

 SELECT * FROM REL(pq), REL(qs) WHERE LINK(pq,qs)

Table linking is not limited to SELECT statements either:

 UPDATE REL(Sp), REL(p)
 SET p.color = 'blue'
 WHERE LINK(Sp,p) AND S = $sonbr

The SQL may contain these macros:

  REL(...) - identify relation and entities on it.
     This is converted into an SQL::Interpolate::Rel object.

  LINK(X,Y,Z,...) forms a Boolean expression linking the provided
    relations.  There must be at least one parameter (typically two).
     This is converted into an SQL::Interpolate::Link object.

 $sql_in : string - input string

 $sql_out : string - output string

The utility of automatic table linking is probably best shown by a real-world example. See Meset::MessageBoard in the "SEE ALSO" section, which use recursive data structures:

  sql_rel_filter({
      messageset => {name => qr/^[A-Z]$/, key => 'msid'},
      message    => {name => qr/^[m-p]$/, key => 'mid'},
      messageset_message => {foreign => {
          msid => 'messageset',
          mid  => 'message'
      }},
      messageset_messageset => {foreign => {
          msid_1 => 'messageset',
          msid_2 => 'messageset'
      }},
  });

This expands simplified SQL such as

  SELECT * FROM REL(Am), REL(m)
  WHERE  A = ? AND LINK(Am, m)

into standard SQL:

  SELECT * FROM messageset_message as Am, message as m
  WHERE  Am.msid = ? AND Am.mid = m.mid

Builtin Macros

sql_and
  $macro = sql_and(@predicates);

Creates a macro object (SQL::Interpolate::And) representing a Boolean AND over a list of predicates.

  ($sql, @bind) = sql_interp sql_and(
      'x=y', 'y=z or z=w', sql('z=', \3))

Generates

  ('((x=y) AND (y=z or z=w) AND (z=?))', 3)

If the @predicates list is empty, a '1=1' is returned.

  ($sql, @bind) = sql_interp 'x=y AND', sql_and(@predicates);
  # Result: $sql = 'x=y AND 1=1';

Predicates are surrounded by parenthesis if possibly needed.

sql_or
  $macro = sql_or(@predicates);

Creates a macro object (SQL::Interpolate::Or) representing a Boolean OR over a list of predicates.

  ($sql, @bind) = sql_interp sql_or('x=y', 'y=z', sql_paren('z=', \3))

Generates

  ('(x=y OR y=z OR 'z=?)', 3)

If the @predicates list is empty, a '1=0' is returned.

  ($sql, @bind) = sql_interp 'x=y OR', sql_or(@predicates);
  # Generates $sql = 'x=y OR 1=0';

Predicates are surrounded by parenthesis if possibly needed.

sql_paren
  $macro = sql_paren(@objs)

Creates a macro object (SQL::Interpolate::Paren) representing a list of interpolation objects to be surrounded by parenthesis.

  ($sql, @bind) = sql_interp sql_paren('x=y AND z=', \3)
  # Generates ($sql, @bind) = ('(x=y and z=?)', 3);

Typically, the size of the interpolation list should not be zero.

sql_if
  $macro = sql_if($condition, $value_if_true)

Creates a macro object (SQL::Interpolate::If) that expands to the given value if the condition is true, else if expands to the empty list ().

  ($sql, @bind) = sql_interp
      sql_and(
          sql_if($blue,  q(color = "blue")),
          sql_if($shape, sql_paren('shape =', \$shape)) )
  # Generates one of
  #   q((color = "blue" AND shape = ?)), $shape
  #   q((color = "blue"))
  #   q((shape = ?)), $shape
  #   1=1, $shape

sql_if is similar to

  $condition ? $value_if_true : ()

except that it is a macro and therefore is evaluated at the time that sql_interp() is processed.

sql_rel
  $macro = sql_rel($alias)

Creates a macro object (SQL::Interpolate::Rel) that expands to a table and alias definition based on the database description given in sql_rel_filter().

See sql_rel_filter() above for details.

  $macro = sql_link(@aliases)

Creates a macro object (SQL::Interpolate::Link) that expands to a table join condition definition based on the database description given in sql_rel_filter().

See the sql_rel_filter() above for details.

SQL::Interpolate::SQL::Filter Methods

The following methods are implemented by SQL filter objects (i.e. objects derived from SQL::Interpolate::SQLFilter). Many of these methods are called by sql_interp() during interpolation, and many of these methods are optional.

init

Initializes (or reinitializes) the macro object. This is called by sql_interp() each time sql_interp() begins processing to reset any state. This method is optional.

  sub init {
      my ($self) = @_;
      ... reinitialize self
  }
filter_text_fragment

Filter a single SQL string fragment. This is called by sql_flatten() during expansion. This method is optional.

  sub filter_text_fragment {
      my ($self, $sql) = @_;
      if ($sql =~ /.../) { # match
          ... expand and return new list of interpolation entities
          return (...);
      }
      else {
          return $sql; # return original string (unmodified)
      }
  }

One use of filter_text_fragment is to expand any macros embedded inside strings.

  "SELECT REL(AB), REL(BC) ..."

expands to

  "SELECT ", rel('AB'), ', ', rel('BC'), ' ...'
filter_text

Filter SQL string. This is called by sql_interp() after macro expansion. This method is optional.

  sub filter_text {
      my ($self, $sql) = @_;
      ...transform $sql...
      return $sql;
  }

Compare this to filter_text_fragment(), which is processed earlier.

SQL::Interpolate::Macro Methods

The following methods are implemented by macro objects (i.e. objects derived from SQL::Interpolate::Macro).

expand

Expands macro to an expanded interpolation list. This is called by sql_flatten() during macro expansion.

  sub expand {
      my ($self, $interp) = @_;
      ...expand self to interpolation list @list, possibly
         referring to $interp and $filter...
      return @list;
  }

Exports and Use Parameters

EXPORTS
 use SQL::Interpolate::Macro qw(:all);

':all' exports these functions: sql_and, sql_flatten, sql_if, sql_link, sql_or, sql_paren, sql_rel, sql_rel_filter .

DEPENDENCIES

This module depends on SQL::Interpolate but otherwise has no major dependencies.

DESIGN NOTES

Limitations

This macro facilities are still a bit under development, so interfaces could change and may particularly affect you if you are writing your own macros.

The utility of macros over just plain SQL has been questioned. A healthy balance can probably be made: use macros only when they are elucidate rather than obscure and when they improve robustness and simplicity of the syntax.

Proposed Enhancements

REL(AB,BC) could be expanded into an "x as AB JOIN y as BC on condition" or "x as AB JOIN y as BC USING(...)" Do all major databases support this syntax? The juxtaposition of the JOIN and the linking condition could eliminate the need for the separate LINK(...) macro:

  SELECT * FROM REL(AB,BC) WHERE A = ?

Other table join improvements

 - support multi-part keys?
 - support optional automatic inclusion of LINK(...).

CONTRIBUTORS

David Manura (http://math2.org/david)--author. Feedback incorporated from Mark Stosberg on table linking, SQL LIMIT, and things.

LEGAL

Copyright (c) 2004-2005, David Manura. This module is free software. It may be used, redistributed and/or modified under the same terms as Perl itself. See http://www.perl.com/perl/misc/Artistic.html.

SEE ALSO

Other modules in this distribution: SQL::Interpolate, SQL::Interpolate::Filter, DBIx::Interpolate.

Full example code of automatic table linking - Meset::MessageBoard in Meset (http://math2.org/meset).

Dependencies: DBI.

Related modules: SQL::Abstract, DBIx::Abstract, Class::DBI.