NAME

HTML::ReportWriter - Simple OO interface to generate pageable, sortable HTML tabular reports

SYNOPSIS

 #!/usr/bin/perl -w

 use strict;
 use HTML::ReportWriter;
 use CGI;
 use Template;
 use DBI;

 my $dbh = DBI->connect('DBI:mysql:foo', 'bar', 'baz');

 # The simplest possible method of calling RW...
 my $report = HTML::ReportWriter->new({
                DBH => $dbh,
                DEFAULT_SORT => 'name',
                SQL_FRAGMENT => 'FROM person AS p, addresses AS a WHERE a.person_id = p.id',
                COLUMNS => [ 'name', 'address1', 'address2', 'city', 'state', 'zip' ],
 });

 $report->draw();

DESCRIPTION

This module generates an HTML tabular report. The first row of the table is the header, which will contain column names, and if the columns are sortable, the name will link back to the cgi, and will allow for changing of the sort. Below the table of results is a paging table, which shows the current page, along with n other pages of the total result set, and includes links to the first, previous, next and last pages in the result set.

HTML::Reportwriter also supports column grouping, allowing for reports to display datasets like the following:

 +------------+---------------+-------+
 |            |               |   1   |
 |            |    test       +-------+
 |   foo      |               |   2   |
 |            +---------------+-------+
 |            |    test2      |   3   |
 +------------+---------------+-------+

METHODS

new($options)

Accepts the same arguments as HTML::ReportWriter::PagingAndSorting, plus the following:

DBH A database handle that has connected to the database that the report is to be run against.
SQL_FRAGMENT An SQL fragment starting from the FROM clause, continued through the end of the where clause. In the case of MySQL and/or other databases that support them, GROUP BY and HAVING clauses may also be added to the SQL fragment.
COLUMNS Column definitions for what is to be selected. A column definition consists of one of two formats, either a simple one-element array reference or an array reference of hash references containing the following four elements:
 get - the string used in the get variable to determine the sorted column
 sql - the sql statement that will select the data from the database
 display - What should be displayed in the column's table header
 sortable - whether or not a sorting link should be generated for the column
 order - (optional) sql that will be used to order by the specified column. If not present, then the value of sql is used
 group - (optional) true or false. If true, the column will be grouped after the results are retrieved.
 hide_column - (optional) true or false. If true, the column will not be drawn to the screen.  This is useful if you
                want to pull data for some other column's draw_func.
 draw_func - (optional) code ref, may not be used if group is true. code ref to a function which takes two parameters.
             param 1: the data from the column being rendered
             param 2: the hashref containing the data from the rest of the row
             returns: the string that should be displayed as the cell contents when that column is drawn

These definitions can be arbitrarily complex. For example:

 COLUMNS => [
     {
         get => 'username',
         sql => 'jp.username',
         display => 'Username',
         sortable => 1,
         group => 1,
     },
     {
         get => 'date',
         sql => 'DATE_FORMAT(l.created, \'%m/%e/%Y\') AS date',
         display => 'Date',
         sortable => 1,
         order => 'l.created',
     },
     {
         get => 'type',
         sql => "IF(l.deleted = 'yes', 'delete', 'add') AS type",
         display => 'Type',
         sortable => 1,
     },
     {
         sql => "l.id",
         hide_column => 1,
     },
     {
         get => 'successful',
         sql => "l.successful",
         display => 'Successful',
         sortable => 1,
         draw_func => sub { my ($data, $cols) = @_; my $lid = $cols->{'id'}; my $color = ($data eq 'no' ? 'red' : ($data eq 'yes' ? 'green' : 'black')); return "<a style=\"color: $color;\" href=\"#\" onClick=\"popup('/cgi-bin/reports/message.cgi?id=$lid'); return false;\">$data</a>"; },
     },
 ]

and

 COLUMNS => [ 'name', 'address', 'age' ]

are both valid definitions. Additionally, you can combine scalar and hashref-filled arrayrefs, like

 COLUMNS => [
     'name',
     'age',
     {
         get => 'birthday',
         sql => 'DATE_FORMAT(birthday, \'%m/%e/%Y\') AS birthday',
         display => 'Birthday',
         sortable => 1,
         order => 'birthday',
     },
 ]

If you are going to use complex sql structures in a column definiton (for example, the DATE_FORMAT and IF statements above), it is STRONGLY recommended that you use a column alias (for example, the 'AS date' in the date column example) in order to ensure proper functionality. This module has not been tested with unaliased complex sql column definitions.

NOTE: If you use formatting that would change a numeric-type column into a string-type column (for example the date columns above), you should use the order attribute to ensure proper ordering. For example using DATE_FORMAT as shown above results in the integer-style date column being treated as a string (20041010120000 becomes '10-10-2004'), which would cause '10-10-2004' to sort before '10-02-2004'. draw_func is intended to provide you with a simple alternative to things like DATE_FORMAT -- you can now do the formatting outside the SQL.

DEBUG Will cause useful debugging messages to be printed using the warn facility. default: 0
COLUMN_SORT_DEFAULT If the simplified version of the COLUMNS definition is used (COLUMNS => [ 'foo', 'bar' ]), then this variable determines whether the table header will allow sorting of any columns. It is global in scope; that is, either every column is sortable, or every column is not. If the hashref method is used to define columns, this variable will be ignored.
MYSQL_MAJOR_VERSION Currently either 3, 4 or 5. Determines which method of determining num_results is used. In MySQL 4 a new method was added which makes the process much more efficient. Defaults to 4 since it's been the stable release for well over a year.
CGI_OBJECT A handle to a CGI object. Since it is very unlikely that a report will ever be just a static report with no user interaction, it is assumed that the coder will want to instantiate their own CGI object in order to allow the user to interact with the report. Use of this argument will prevent needless creation of additional CGI objects.
PAGE_TITLE The title of the current page. Defaults to "HTML::ReportWriter v${VERSION} generated report".
CSS The CSS style applied to the page. Can be an external stylesheet reference or an inline style. Has a default inline style that I won't waste space listing here.
HTML_HEADER The first thing that will appear in the body of the HTML document. Unrestricted, can be anything at all. I recommend placing self-referential forms here to allow the user to interact with the report (for example, setting date ranges). See EXAMPLES below for ideas.
LANGUAGE_TOKENS A list of language tokens that can be used to replace the default English text.
EXPORT_VARIABLE Defaults to 'exportto'. This variable, if passed in via a GET or POST parameter, and set equal to a supported export method, will cause the report to be exported. Set EXPORT_VARIABLE to '' to disable exporting. Currently supported formats:
  • excel: Reports are exported without paging. Sorting is maintained. Grouping of results is disabled.

REPORT_TABLE_WIDTH Default: 800. The width of the table containing the report output.
DOCTYPE Default: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> Set to '' if you do not want to have a doctype declared.

The return of this function is a reference to the object. Calling draw after the object's initialization will draw the page.

Note with regards to DEFAULT_SORT: the string used to specify the default sort must match the get parameter of the COLUMNS definition if you use a hashref COLUMN definition.

draw()

Renders the page. This function takes one optional parameter, i<disable_output>. If the results are not being exported to a non-HTML format, and if i<disable_output> evaluates to false then this function writes the HTTP header and the page text to STDOUT (the only behaviour prior to version 1.5.0). If the parameter evaluates to true, the function does not print anything to STDOUT. In either of those cases, the text of the HTML page is returned by draw().

If the results are being exported (to e.g. Excel), then the output will always be sent to STDOUT regardless of the value of disable_output, and the function will return success or failure of the export.

export_to_excel()
draw_row()

This function is magical. One day, when I am feeling brave, I will comment it. What ever happened to the nice textbook recursive functions they taught in CS class?

It is internal to HTML::ReportWriter, and shouldn't need to be overridden. If you want to customize the look and feel of the report, you can use the draw_func option to the columns to override the appearance of a given element, and you can customize the stylesheet at creation-time (see new()).

group_results()

Internal method -- groups a result set if necessary.

get_results()

Internal method -- generates the result set used by draw().

A NOTE ABOUT GROUPING

Grouping is done in code after the database returns its dataset. Grouping breaks sortability (or vice-versa depending on your point of view) to a small degree, and for large datasets grouping can produce unexpected results across pages.

In particular, if a group of 6 rows of data spans two pages (say 3 results on page 1 and 3 results on page 2), then only three results will show on page 1. For this reason, it is recommended on pages that have highly-constrained result sets (1 page of results), or where paging is disabled.

Additionally, it is strongly recommended that authors using this module consider which columns really need to be sortable when using grouping. Based on limited testing and experience, if sorting is necessary, I recommend only allowing sorting on the grouped columns. I also recommend making the default sort == the first grouped column.

When you sort by an ungrouped column, the grouped columns are still grouped, despite the fact that they may appear adjacent. Order of the rows is preserved. For example, assume the following dataset was sorted by the third column and grouped by the first:

 test       foo      5
 test2      quux     4
 test       bar      3
 test2      bat      2
 test       baz      1

Output would be:

 +-------+------+---+
 |       | foo  | 5 |
 |       +------+---+
 | test  | bar  | 3 |
 |       +------+---+
 |       | baz  | 1 |
 +-------+------+---+
 |       | quux | 4 |
 | test2 +------+---+
 |       | bat  | 2 |
 +-------+------+---+

One restriction on grouped columns is that they must appear at the beginning of the column list:

 COLUMNS => [
     {
         get => 'username',
         sql => 'jp.username',
         display => 'Username',
         sortable => 1,
         group => 1,
     },
     {
         get => 'date',
         sql => 'DATE_FORMAT(l.created, \'%m/%e/%Y\') AS date',
         display => 'Date',
         sortable => 1,
         order => 'l.created',
     },
 ]

is correct, whereas:

 COLUMNS => [
     {
         get => 'username',
         sql => 'jp.username',
         display => 'Username',
         sortable => 1,
     },
     {
         get => 'date',
         sql => 'DATE_FORMAT(l.created, \'%m/%e/%Y\') AS date',
         display => 'Date',
         sortable => 1,
         order => 'l.created',
         group => 1,
     },
 ]

will result in die() being called.

USING HTML::ReportWriter WITH OTHER DATASOURCES

While it is my hope that I will eventually be able to add any datasource that someone could want to this package, I know that in the meantime, some people are left hanging. As a result of the recent separation of data retrieval into the get_results() method, you can now add support for any data retrieval method you wish: either subclass HTML::ReportWriter and override get_results, or modify your created object like

 $report->{'get_results'} = \&data_retrieval_func();

If you override the function, your new function should take no parameters, rely on $self to provide necessary paging and sorting information, and return an arrayref consisting of hashrefs containing { column => value } pairs for each column that is passed in to new using COLUMNS. I haven't tested this, but see no reason why it should not work.

Please do not rely on this as anything more than a temporary fix -- I expect that the internals of the module will change somewhat dramatically when I finally decide on a method of abstracting data retrieval. Suggestions welcome on how to abstract data retrieval.

Update: overriding this function will cause breakage unless either (1) Your function is modified to take exports into account, or (2) You disable export functionality in your report.

TODO

  • write tests for the module

  • support for other databases (help greatly appreciated)

  • implement export feature supporting export to PDF for the results

  • overrides for shading behaviour and line drawing between cells.

  • I am debating the addition of support for query prefixes, like DISTINCT (which should appear before the list of data to be selected). This could also be used for extensions like MySQL's SQL_NO_CACHE directives. Feedback welcome.

EXAMPLES

Data: the following statements can be reloaded into a mysql database and the resulting tables can be used in conjunction with the scripts below to demo the features of the reportwriter. The data should be imported into a DB named testing, and have a root user with a blank password, or else you need to edit the DBI->connect statements below. I'd recommend never having a blank root password, personally.

 CREATE TABLE log (
       userid int(10) default NULL,
       activity char(24) default NULL,
       created timestamp(14) NOT NULL
     ) TYPE=MyISAM;

 INSERT INTO log VALUES (1,'test',20050124064258);
 INSERT INTO log VALUES (1,'test2',20050124064301);
 INSERT INTO log VALUES (1,'test3',20050124064303);
 INSERT INTO log VALUES (1,'test4',20050124064305);
 INSERT INTO log VALUES (2,'test4',20050124064308);
 INSERT INTO log VALUES (3,'test5',20050124064313);
 INSERT INTO log VALUES (2,'test6',20050124064317);
 INSERT INTO log VALUES (2,'test7',20050124064319);

 CREATE TABLE people (
           name char(25) default NULL,
           age int(2) default NULL,
           birthday timestamp(14) NOT NULL,
           phone_number char(10) default NULL,
           active int(1) default '1'
     ) TYPE=MyISAM;

 INSERT INTO people VALUES ('Barney Rubble',20,19700101000000,'9724443456',1);
 INSERT INTO people VALUES ('Fred Flintstone',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('wilma',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('dino',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('bambam',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('george',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('elroy',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('judy',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('jane',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('disco',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('tango',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('foxtrot',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('waltz',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('swing',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('i',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('am',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('not',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('feeling',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('creative',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('right',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('now',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('so',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('the',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('sample',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('data',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('sucks.',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('sorry',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('about',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES ('that!',25,19700401000000,'9725551212',1);
 INSERT INTO people VALUES (':)',25,19700401000000,'9725551212',1);

 CREATE TABLE user (
           id int(10) default NULL,
           name char(15) default NULL,
           groupid int(1) default NULL
     ) TYPE=MyISAM;

 INSERT INTO user VALUES (1,'zeus',3);
 INSERT INTO user VALUES (2,'apollo',2);
 INSERT INTO user VALUES (3,'mercury',3);

Example 1, a simple non-interactive report, like one that might be used to show phonebook entries:

 #!/usr/bin/perl -w

 use strict;
 use HTML::ReportWriter;
 use DBI;

 my $dbh = DBI->connect('DBI:mysql:host=localhost:database=testing', 'root', '');

 my $sql_fragment = 'FROM people WHERE active = 1';

 my $report = HTML::ReportWriter->new({
         DBH => $dbh,
         SQL_FRAGMENT => $sql_fragment,
         DEFAULT_SORT => 'birthday',
         COLUMNS => [
            'name',
            'age',
            'birthday',
            {
                get => 'phone',
                sql => 'phone_number',
                display => 'Phone Number',
                sortable => 0,
            },
         ],
 });

 $report->draw();

Example 2, an interactive report allowing the user to select data within a date range.

 #!/usr/bin/perl -w

 use strict;
 use CGI;
 use HTML::ReportWriter;
 use DBI;

 my $dbh = DBI->connect('DBI:mysql:host=localhost:database=testing', 'root', '');
 my $co = CGI->new();

 # set defaults if there is not a setting for date1 or date2
 my $date1 = $co->param('date1') || '20050101000000';
 my $date2 = $co->param('date2') || '20050201000000';

 my $sql_fragment = 'FROM log AS l, user AS u WHERE l.userid = u.id AND u.groupid = '
                  . $dbh->quote(3) . ' AND l.created BETWEEN '
                  . $dbh->quote($date1) . ' AND ' . $dbh->quote($date2);

 my $report = HTML::ReportWriter->new({
         DBH => $dbh,
         CGI_OBJECT => $co,
         SQL_FRAGMENT => $sql_fragment,
         DEFAULT_SORT => 'date',
         HTML_HEADER => '<form method="get"><table><tr><td colspan="3">Show results from:</td></tr><tr>
                         <td><input type="text" name="date1" value="' . $date1 . '" /></td>
                         <td>&nbsp;&nbsp;to&nbsp;&nbsp;</td>
                         <td><input type="text" name="date2" value="' . $date2 . '" /></td></tr>
                         <tr><td colspan="3" align="center"><input type="submit" /></td></tr></table></form>',
         PAGE_TITLE => 'Log Activity for Group ' . 'foo',
         COLUMNS => [
             'name',
             'activity',
             {
                 get => 'date',
                 sql => 'DATE_FORMAT(l.created, \'%m/%e/%Y\') AS date',
                 display => 'Date',
                 sortable => 1,
             },
         ],
 });

 $report->draw();

Caveats for Example 2:

  • By using the short form of the column definitions, you are asserting that there is only one column named 'name' and one column named 'activity' in both the log and user tables combined. You'd get an SQL error otherwise for having an ambiguous column reference.

  • Assumption is that the user enters the date in as a MySQL timestamp in the form. I got lazy as I was writing this example. Also, the form would probably not look great, because the table is not formatted, nor does it have an alignment on the page -- the report would be centered and the form left-justified. Making things pretty is left as an exercise for the reader.

BUGS

None are known about at this time.

Please report any additional bugs discovered to the author.

SEE ALSO

This module relies on DBI, Template and CGI. The paging/sorting module also relies on POSIX and List::MoreUtils. Exporting to Excel uses Spreadsheet::SimpleExcel.

AUTHOR

Shane Allen <opiate@gmail.com>

ACKNOWLEDGEMENTS

  • PagingAndSorting was developed during my employ at HRsmart, Inc. http://www.hrsmart.com and its public release was graciously approved.

  • Robert Egert was an early adopter, and made signifigant contributions in the form of suggestions and bug reports.

  • Mark Stosberg made several contributions including documentation corrections and PostgreSQL support.

  • Steven Mackenzie contributed a patch to add support for SQLite.

COPYRIGHT

Copyright 2004, Shane Allen. All rights reserved.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.