NAME

SQL::Statement::Functions - built-in & user-defined SQL functions

SYNOPSIS

 SELECT Func(args);
 SELECT * FROM Func(args);
 SELECT * FROM x WHERE Funcs(args);
 SELECT * FROM x WHERE y < Funcs(args);

DESCRIPTION

This module contains the built-in functions for SQL::Parser and SQL::Statement. All of the functions are also available in any DBDs that subclass those modules (e.g. DBD::CSV, DBD::DBM, DBD::File, DBD::AnyData, DBD::Excel, etc.).

This documentation covers built-in functions and also explains how to create your own functions to supplement the built-in ones. It's easy. If you create one that is generally useful, see below for how to submit it to become a built-in function.

Function syntax

When using SQL::Statement/SQL::Parser directly to parse SQL, functions (either built-in or user-defined) may occur anywhere in a SQL statement that values, column names, table names, or predicates may occur. When using the modules through a DBD or in any other context in which the SQL is both parsed and executed, functions can occur in the same places except that they can not occur in the column selection clause of a SELECT statement that contains a FROM clause.

 # valid for both parsing and executing

     SELECT MyFunc(args);
     SELECT * FROM MyFunc(args);
     SELECT * FROM x WHERE MyFuncs(args);
     SELECT * FROM x WHERE y < MyFuncs(args);

 # valid only for parsing (won't work from a DBD)

     SELECT MyFunc(args) FROM x WHERE y;

User-Defined Functions

Loading User-Defined Functions

In addition to the built-in functions, you can create any number of your own user-defined functions (UDFs). In order to use a UDF in a script, you first have to create a perl subroutine (see below), then you need to make the function available to your database handle with the CREATE FUNCTION or LOAD commands:

 # load a single function "foo" from a subroutine
 # named "foo" in the current package

      $dbh->do(" CREATE FUNCTION foo EXTERNAL ");

 # load a single function "foo" from a subroutine
 # named "bar" in the current package

      $dbh->do(" CREATE FUNCTION foo EXTERNAL NAME bar");


 # load a single function "foo" from a subroutine named "foo"
 # in another package

      $dbh->do(' CREATE FUNCTION foo EXTERNAL NAME "Bar::Baz::foo" ');

 # load all the functions in another package

      $dbh->do(' LOAD "Bar::Baz" ');

Functions themselves should follow SQL identifier naming rules. Subroutines loaded with CREATE FUNCTION can have any valid perl subroutine name. Subroutines loaded with LOAD must have names that start with SQL_FUNCTION_ followed by the actual function name. For example:

 package Qux::Quimble;
 sub SQL_FUNCTION_FOO { ... }
 sub SQL_FUNCTION_BAR { ... }
 sub some_other_perl_subroutine_not_a_function { ... }
 1;

 # in another package
 $dbh->do("LOAD Qux::Quimble");

 # This loads FOO and BAR as SQL functions.
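
To make the naming convention concrete, here is a small sketch that lists which subroutines in a package carry the SQL_FUNCTION_ prefix. The helper sql_function_names is invented for this example and only illustrates the convention; it is not how LOAD itself is implemented.

```perl
use strict;
use warnings;

package Qux::Quimble;
sub SQL_FUNCTION_FOO { 'foo' }
sub SQL_FUNCTION_BAR { 'bar' }
sub some_other_perl_subroutine_not_a_function { 'ignored' }

package main;

# Hypothetical helper: return the SQL function names a package would expose
# under the SQL_FUNCTION_ naming convention (prefix stripped, sorted).
sub sql_function_names {
    my ($package) = @_;
    no strict 'refs';
    my @names;
    for my $symbol ( sort keys %{"${package}::"} ) {
        next unless $symbol =~ /^SQL_FUNCTION_(\w+)$/;
        my $name = $1;
        # Only count symbols that really are defined subroutines.
        push @names, $name if defined &{"${package}::$symbol"};
    }
    return @names;
}

print join( ', ', sql_function_names('Qux::Quimble') ), "\n";    # BAR, FOO
```

The subroutine without the prefix is skipped, which matches the description above.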

Creating User-Defined Functions

User-defined functions (UDFs) are perl subroutines that return values appropriate to the context of the function in a SQL statement. For example, the built-in CURRENT_TIME returns a string value and therefore may be used anywhere in a SQL statement that a string value can. Here's the entire perl code for the function:

 # CURRENT_TIME
 #
 # arguments : none
 # returns   : string containing current time as hh::mm::ss
 #
 sub SQL_FUNCTION_CURRENT_TIME {
     sprintf "%02s::%02s::%02s",(localtime)[2,1,0]
 }

More complex functions can make use of a number of arguments always passed to functions automatically. Functions always receive these values in @_:

 sub FOO {
     my($self,$sth,$rowhash,@params) = @_;
 }

The first argument, $self, is whatever class the function is defined in. It is not generally useful unless you have an entire module to support the function.

The second argument, $sth, is the active statement handle of the current statement. Like all active statement handles, it contains the current database handle in the {Database} attribute, so you can have access to the database handle in any function:

 sub FOO {
     my($self,$sth,$rowhash,@params) = @_;
     my $dbh = $sth->{Database};
     # $dbh->do( ...), etc.
 }

In practice you will probably want to use $sth->{Database} directly rather than making a local copy, for example $sth->{Database}->do(...).

The third argument, $rowhash, is a reference to a hash containing the key/value pairs for the current database row the SQL is searching. This isn't relevant for something like CURRENT_TIME which isn't based on a SQL search, but here's an example of a (rather useless) UDF using $rowhash that just joins the values for the entire row with a colon:

 sub COLON_JOIN {
     my($self,$sth,$rowhash,@params) = @_;
     return join ':', values %$rowhash;
 }
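
Perl does not guarantee any particular order for the values of a hash, so the joined string from a function like COLON_JOIN can differ from run to run. The sketch below sorts by key to get stable output. The name COLON_JOIN_SORTED and the sample row are invented for this example, and the exact form of the keys in a real $rowhash is not shown here.

```perl
use strict;
use warnings;

# Variant of COLON_JOIN that sorts by column name so the result is stable.
# Undefined (NULL) values are rendered as empty strings.
sub COLON_JOIN_SORTED {
    my ( $self, $sth, $rowhash, @params ) = @_;
    return join ':', map { $rowhash->{$_} // '' } sort keys %$rowhash;
}

# Call it directly, passing undef for the class and the statement handle.
my %row = ( id => 7, name => 'Sue', dept => 'Chem' );
print COLON_JOIN_SORTED( undef, undef, \%row ), "\n";    # Chem:7:Sue
```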

The remaining arguments, @params, are the arguments users pass to the function, either directly or with placeholders. Here is another silly example, which just returns the product of the first two arguments passed to it:

 sub MULTIPLY {
     my($self,$sth,$rowhash,@params) = @_;
     return $params[0] * $params[1];
 }

 # first make the function available
 #
 $dbh->do("CREATE FUNCTION MULTIPLY");

 # then multiply col3 in each row times seven
 #
 my $sth=$dbh->prepare("SELECT col1 FROM tbl1 WHERE col2 = MULTIPLY(col3,7)");
 $sth->execute;
 #
 # or
 #
 my $sth=$dbh->prepare("SELECT col1 FROM tbl1 WHERE col2 = MULTIPLY(col3,?)");
 $sth->execute(7);
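
Because a UDF is an ordinary subroutine, it can be tried out without a database handle by passing the arguments in the order described above. The following sketch also adds some defensive checks; returning undef for missing or non-numeric operands is an assumption made for this example, not documented behavior.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# MULTIPLY with defensive checks on the user-supplied parameters.
sub MULTIPLY {
    my ( $self, $sth, $rowhash, @params ) = @_;
    return undef if @params < 2;
    return undef if grep { !looks_like_number($_) } @params[ 0, 1 ];
    return $params[0] * $params[1];
}

# Simulate the call: class, statement handle, row hash, then the parameters.
print MULTIPLY( undef, undef, {}, 6, 7 ), "\n";    # 42
```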

Creating In-Memory Tables with functions

A function can return almost anything, as long as it is an appropriate return for the context the function will be used in. In the special case of table-returning functions, the function should return a reference to an array of array references with the first row being the column names and the remaining rows the data. For example:

1. create a function that returns an AoA,

  sub Japh {[
      [qw( id word   )],
      [qw( 1 Hacker  )],
      [qw( 2 Perl    )],
      [qw( 3 Another )],
      [qw( 4 Just    )],
  ]}

2. make your database handle aware of the function

  $dbh->do("CREATE FUNCTION Japh");

3. Access the data in the AoA from SQL

  $sth = $dbh->prepare("SELECT word FROM Japh ORDER BY id DESC");

Or here's an example that does a join on two in-memory tables:

  sub Prof  {[ [qw(pid pname)],[qw(1 Sue )],[qw(2 Bob)],[qw(3 Tom )] ]}
  sub Class {[ [qw(pid cname)],[qw(1 Chem)],[qw(2 Bio)],[qw(2 Math)] ]}
  $dbh->do("CREATE FUNCTION $_") for qw(Prof Class);
  $sth = $dbh->prepare("SELECT * FROM Prof NATURAL JOIN Class");

The "Prof" and "Class" functions return tables which can be used like any SQL table.
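
Before registering a table-returning function, it can help to check that it really returns the AoA shape described above. The helper check_table_aoa below is invented for this sketch; requiring every data row to have the same number of fields as the header row is an assumption of the example, not a documented rule.

```perl
use strict;
use warnings;

# Hypothetical helper: return a description of the first problem found,
# or undef if the value looks like a header row followed by data rows.
sub check_table_aoa {
    my ($aoa) = @_;
    return 'not an array reference' unless ref $aoa eq 'ARRAY';
    return 'missing header row'     unless @$aoa && ref $aoa->[0] eq 'ARRAY';
    my $width = @{ $aoa->[0] };
    for my $i ( 1 .. $#$aoa ) {
        my $row = $aoa->[$i];
        return "row $i is not an array reference" unless ref $row eq 'ARRAY';
        return "row $i has " . @$row . " fields, expected $width"
            unless @$row == $width;
    }
    return;
}

sub Prof { [ [qw(pid pname)], [qw(1 Sue)], [qw(2 Bob)], [qw(3 Tom)] ] }

my $problem = check_table_aoa( Prof() );
print defined $problem ? "bad table: $problem\n" : "table looks fine\n";
```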

More complex functions might do something like scrape an RSS feed, or search a file system and put the results in an AoA. For example, to search a directory with SQL:

 sub Dir {
     my($self,$sth,$rowhash,$dir)=@_;
     opendir(D, $dir) or die "'$dir':$!";
     my @files = readdir D;
     my $data = [[qw(fileName fileExt)]];
     for (@files) {
         my($fn,$ext) = /^(.*)(\.[^\.]+)$/;
         push @$data, [$fn,$ext];
     }
     return $data;
 }
 $dbh->do("CREATE FUNCTION Dir");
 printf "%s\n", join'   ',@{ $dbh->selectcol_arrayref("
     SELECT fileName FROM Dir('./') WHERE fileExt = '.pl'
 ")};

Obviously, that function could be expanded with File::Find and/or stat to provide more information and it could be made to accept a list of directories rather than a single directory.
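
As a rough sketch of that expansion, the variant below accepts a list of directories and adds a size column taken from stat. The name DirList and the column names dirName and fileSize are choices made for this example.

```perl
use strict;
use warnings;
use File::Spec;

# Table-returning function that scans several directories.
sub DirList {
    my ( $self, $sth, $rowhash, @dirs ) = @_;
    my $data = [ [qw(dirName fileName fileExt fileSize)] ];
    for my $dir (@dirs) {
        opendir( my $dh, $dir ) or die "'$dir': $!";
        for my $entry ( sort readdir $dh ) {
            my $path = File::Spec->catfile( $dir, $entry );
            next unless -f $path;    # skip ".", ".." and subdirectories
            # Split at the last dot; names without one get an empty extension.
            my ( $fn, $ext ) =
                $entry =~ /^(.+)(\.[^.]+)$/ ? ( $1, $2 ) : ( $entry, '' );
            push @$data, [ $dir, $fn, $ext, ( stat $path )[7] ];
        }
        closedir $dh;
    }
    return $data;
}

# Direct call for a quick check: the first row is the header, the rest are files.
my $table = DirList( undef, undef, undef, '.' );
printf "%d file(s) found\n", $#$table;
```

Once registered with CREATE FUNCTION, it would be queried like Dir above, with the directories passed as arguments.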

Table-Returning functions are a way to turn *anything* that can be modeled as an AoA into a DBI data source.

Built-in Functions

Aggregate Functions

min, max, avg, sum, count

Aggregate functions are handled elsewhere; see SQL::Parser for documentation.

Date and Time Functions

current_date, current_time, current_timestamp

CURRENT_DATE

 # purpose   : find current date
 # arguments : none
 # returns   : string containing current date as yyyy-mm-dd

CURRENT_TIME

 # purpose   : find current time
 # arguments : none
 # returns   : string containing current time as hh::mm::ss

CURRENT_TIMESTAMP

 # purpose   : find current date and time
 # arguments : none
 # returns   : string containing current timestamp as yyyy-mm-dd hh::mm::ss

String Functions

char_length, coalesce (nvl), concat, decode, lower, position, regex, replace (substitute), soundex, substring, trim, upper

CHAR_LENGTH

 # purpose   : find length in characters of a string
 # arguments : a string
 # returns   : a number - the length of the string in characters

LOWER & UPPER

 # purpose   : lower-case or upper-case a string
 # arguments : a string
 # returns   : the string lower or upper cased

POSITION

 # purpose   : find first position of a substring in a string
 # arguments : a substring and a string possibly containing the substring
 # returns   : a number - the index of the substring in the string
 #             or 0 if the substring doesn't occur in the string
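
In plain Perl the described behavior corresponds roughly to index() plus one. This sketch assumes positions are counted from 1, which is what makes 0 usable as "not found". The name position_of is made up.

```perl
use strict;
use warnings;

# Rough Perl equivalent of POSITION as described above.
sub position_of {
    my ( $substring, $string ) = @_;
    # index() is 0-based and returns -1 on failure, so adding 1 gives
    # a 1-based position and 0 for "not found".
    return index( $string, $substring ) + 1;
}

print position_of( 'bar', 'foobar' ), "\n";    # 4
print position_of( 'baz', 'foobar' ), "\n";    # 0
```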

REGEX

 # purpose   : test if a string matches a perl regular expression
 # arguments : a string and a regex to match the string against
 # returns   : boolean value of the regex match
 #
 # example   : ... WHERE REGEX(col3,'/^fun/i') ... matches rows
 #             in which col3 starts with "fun", ignoring case
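
The second argument is a string that looks like a perl match operator. One way such a string could be taken apart and applied is sketched below. The helper regex_match is hypothetical, handles only the i, m, s and x flags, and is not taken from the module's own code.

```perl
use strict;
use warnings;

# Hypothetical helper: apply a '/pattern/flags' specification to a string.
sub regex_match {
    my ( $string, $spec ) = @_;
    my ( $pattern, $flags ) = $spec =~ m{^/(.*)/([imsx]*)$}s
        or die "expected '/pattern/flags', got '$spec'";
    # Flags are turned into an inline modifier group such as (?i).
    my $re = length $flags ? qr/(?$flags)$pattern/ : qr/$pattern/;
    return $string =~ $re ? 1 : 0;
}

print regex_match( 'Funny',  '/^fun/i' ), "\n";    # 1
print regex_match( 'no fun', '/^fun/i' ), "\n";    # 0
```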

SOUNDEX

 # purpose   : test if two strings have matching soundex codes
 # arguments : two strings
 # returns   : true if the strings share the same soundex code
 #
 # example   : ... WHERE SOUNDEX(col3,'fun') ... matches rows
 #             in which col3 is a soundex match for "fun"

CONCAT

 # purpose   : concatenate 1 or more strings into a single string;
 #             an alternative to the '||' operator
 # arguments : 1 or more strings
 # returns   : the concatenated string
 #
 # example   : SELECT CONCAT(first_string, 'this string', ' that string')
 #              returns "<value-of-first-string>this string that string"
 # note      : if any argument evaluates to NULL, the returned value is NULL

COALESCE aka NVL

 # purpose   : return the first non-NULL value from a list
 # arguments : 1 or more expressions
 # returns   : the first expression (reading left to right)
 #             which is not NULL; returns NULL if all are NULL
 #
 # example   : SELECT COALESCE(NULL, some_null_column, 'not null')
 #              returns 'not null'
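
The NULL handling of CONCAT and COALESCE can be illustrated in plain Perl, with NULL modelled as undef (which is how DBI represents NULL values). The names concat_sql and coalesce_sql are made up, and the subs are a sketch of the described semantics rather than the built-in code.

```perl
use strict;
use warnings;

# CONCAT: any NULL argument makes the whole result NULL.
sub concat_sql {
    my @strings = @_;
    return undef if grep { !defined $_ } @strings;
    return join '', @strings;
}

# COALESCE: the first non-NULL argument wins; NULL if there is none.
sub coalesce_sql {
    for my $value (@_) {
        return $value if defined $value;
    }
    return undef;
}

print concat_sql( 'abc', 'this string', ' that string' ), "\n";
print coalesce_sql( undef, undef, 'not null' ), "\n";    # not null
```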

DECODE

 # purpose   : compare the first argument against
 #             succeeding arguments at position 1 + 2N
 #             (N = 0 to (# of arguments - 2)/2), and if equal,
 #             return the value of the argument at 1 + 2N + 1; if no
 #             arguments are equal, the last argument value is returned
 # arguments : 4 or more expressions, must be even # of arguments
 # returns   : the value of the argument at 1 + 2N + 1 if argument 1 + 2N
 #             is equal to argument1; else the last argument value
 #
 # example   : SELECT DECODE(some_column,
 #                    'first value', 'first value matched',
 #                    '2nd value', '2nd value matched',
 #                    'no value matched'
 #                    )
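
The pairing of arguments is easier to see in code. The sketch below walks the compare/result pairs and falls back to the last argument. The name decode_sql is made up, and comparing with string equality (eq) is an assumption of this example.

```perl
use strict;
use warnings;

# Sketch of the DECODE semantics described above, with NULL modelled as undef.
sub decode_sql {
    my ( $value, @rest ) = @_;
    my $default = pop @rest;    # the last argument is the fallback value
    while ( @rest >= 2 ) {
        my ( $candidate, $result ) = splice @rest, 0, 2;
        return $result
            if defined $value && defined $candidate && $value eq $candidate;
    }
    return $default;
}

print decode_sql(
    '2nd value',
    'first value', 'first value matched',
    '2nd value',   '2nd value matched',
    'no value matched'
), "\n";    # 2nd value matched
```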

REPLACE, SUBSTITUTE

 # purpose   : perform perl substitution on input string
 # arguments : a string and a substitute pattern string
 # returns   : the result of the substitute operation
 #
 # example   : ... WHERE REPLACE(col3,'s/fun(\w+)nier/$1/ig') ... replaces
 #             all instances of /fun(\w+)nier/ in col3 with the string
 #             between 'fun' and 'nier'

SUBSTRING

  SUBSTRING( string FROM start_pos [FOR length] )

Returns the substring starting at start_pos and extending for "length" characters, or until the end of the string if no "length" is supplied. Examples:

  SUBSTRING( 'foobar' FROM 4 )       # returns "bar"

  SUBSTRING( 'foobar' FROM 4 FOR 2)  # returns "ba"

Note: The SUBSTRING function is implemented in SQL::Parser and SQL::Statement and, at the current time, cannot be overridden.
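
For comparison, the two examples correspond to perl's substr with the start position shifted by one, since SQL counts from 1 and substr counts from 0. The helper substring_sql is made up for this sketch and does not handle start positions below 1.

```perl
use strict;
use warnings;

# Plain-Perl counterpart of SUBSTRING( string FROM start_pos [FOR length] ).
sub substring_sql {
    my ( $string, $start, $length ) = @_;
    return defined $length
        ? substr( $string, $start - 1, $length )
        : substr( $string, $start - 1 );
}

print substring_sql( 'foobar', 4 ),    "\n";    # bar
print substring_sql( 'foobar', 4, 2 ), "\n";    # ba
```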

TRIM

  TRIM ( [ [LEADING|TRAILING|BOTH] ['trim_char'] FROM ] string )

Removes all occurrences of <trim_char> from the front, back, or both sides of a string.

 BOTH is the default if neither LEADING nor TRAILING is specified.

 Space is the default if no trim_char is specified.

 Examples:

 TRIM( string )
   trims leading and trailing spaces from string

 TRIM( LEADING FROM str )
   trims leading spaces from string

 TRIM( 'x' FROM str )
   trims leading and trailing x's from string

Note: The TRIM function is implemented in SQL::Parser and SQL::Statement and, at the current time, cannot be overridden.
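
The defaults described above (BOTH and a space) can be sketched in plain Perl as follows. The helper trim_sql and its argument order are invented for this illustration, and it is not the parser's implementation.

```perl
use strict;
use warnings;

# Plain-Perl counterpart of TRIM: $side is LEADING, TRAILING or BOTH.
sub trim_sql {
    my ( $string, $side, $char ) = @_;
    $side = 'BOTH' unless defined $side;    # default side
    $char = ' '    unless defined $char;    # default trim character
    my $quoted = quotemeta $char;           # treat the character literally
    $string =~ s/^(?:$quoted)+//  if $side eq 'LEADING'  || $side eq 'BOTH';
    $string =~ s/(?:$quoted)+\z// if $side eq 'TRAILING' || $side eq 'BOTH';
    return $string;
}

print '[', trim_sql('  padded  '),              "]\n";    # [padded]
print '[', trim_sql( '  padded  ', 'LEADING' ), "]\n";    # [padded  ]
print '[', trim_sql( 'xxhixx', 'BOTH', 'x' ),   "]\n";    # [hi]
```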

Special Utility Functions

IMPORT()

 CREATE TABLE foo AS IMPORT(?)    ,{},$external_executed_sth
 CREATE TABLE foo AS IMPORT(?)    ,{},$AoA

Submitting built-in functions

There are a few built-in functions in SQL::Statement::Functions. If you make a generally useful UDF, why not submit it to me and have it (and your name) included with the built-in functions? Please follow the format shown in the module, including a description of the arguments and return values for the function as well as an example. Send them to me at jzucker AT cpan.org with a subject line containing "built-in UDF".

Thanks in advance :-).

ACKNOWLEDGEMENTS

Dean Arnold supplied DECODE, COALESCE, REPLACE, many thanks!

AUTHOR & COPYRIGHT

Copyright (c) 2005 by Jeff Zucker: jzuckerATcpan.org

Copyright (c) 2009,2010 by Jens Rehsack: rehsackATcpan.org

All rights reserved.

The module may be freely distributed under the same terms as Perl itself using either the "GPL License" or the "Artistic License" as specified in the Perl README file.