NAME

Text::Table::Manifold - Render tables in manifold formats

Synopsis

This is scripts/synopsis.pl:

        #!/usr/bin/env perl

        use strict;
        use utf8;
        use warnings;
        use warnings qw(FATAL utf8); # Fatalize encoding glitches.
        use open     qw(:std :utf8); # Undeclared streams in UTF-8.

        use Text::Table::Manifold ':constants';

        # -----------

        # Set parameters with new().

        my($table) = Text::Table::Manifold -> new
        (
                alignment =>
                [
                        align_left,
                        align_center,
                        align_right,
                        align_center,
                ]
        );

        $table -> headers(['Homepage', 'Country', 'Name', 'Metadata']);
        $table -> data(
        [
                ['http://savage.net.au/',   'Australia', 'Ron Savage',    undef],
                ['https://duckduckgo.com/', 'Earth',     'Mr. S. Engine', ''],
        ]);

        # Note: Save the data, since render() may update it.

        my(@data) = @{$table -> data};

        # Set parameters with methods.

        $table -> empty(empty_as_text);
        $table -> format(format_internal_boxed);
        $table -> undef(undef_as_text);

        # Set parameters with render().

        print "Format: format_internal_boxed: \n";
        print join("\n", @{$table -> render(padding => 1)}), "\n";
        print "\n";

        $table -> headers(['One', 'Two', 'Three']);
        $table -> data(
        [
                ['Reichwaldstraße', 'Böhme', 'ʎ ʏ ʐ ʑ ʒ ʓ ʙ ʚ'],
                ['ΔΔΔΔΔΔΔΔΔΔ', 'Πηληϊάδεω Ἀχιλῆος', 'A snowman: ☃'],
                ['Two ticks: ✔✔', undef, '<table><tr><td>TBA</td></tr></table>'],
        ]);

        # Save the data, since render() may update it.

        @data = @{$table -> data};

        $table -> empty(empty_as_minus);
        $table -> format(format_internal_boxed);
        $table -> undef(undef_as_text);
        $table -> padding(2);

        print "Format: format_internal_boxed: \n";
        print join("\n", @{$table -> render}), "\n";
        print "\n";

        # Restore the saved data.

        $table -> data([@data]);

        # Etc.

This is data/synopsis.log, the output of synopsis.pl:

        Format: format_internal_boxed:
        +-------------------------+-----------+---------------+----------+
        | Homepage                |  Country  |          Name | Metadata |
        +-------------------------+-----------+---------------+----------+
        | http://savage.net.au/   | Australia |    Ron Savage |  undef   |
        | https://duckduckgo.com/ |   Earth   | Mr. S. Engine |  empty   |
        +-------------------------+-----------+---------------+----------+

        Format: format_internal_boxed:
        +-------------------+---------------------+----------------------------------------+
        |  One              |         Two         |                                 Three  |
        +-------------------+---------------------+----------------------------------------+
        |  Reichwaldstraße  |        Böhme        |                       ʎ ʏ ʐ ʑ ʒ ʓ ʙ ʚ  |
        |  ΔΔΔΔΔΔΔΔΔΔ       |  Πηληϊάδεω Ἀχιλῆος  |                          A snowman: ☃  |
        |  Two ticks: ✔✔    |        undef        |  <table><tr><td>TBA</td></tr></table>  |
        +-------------------+---------------------+----------------------------------------+

The latter table renders perfectly in Firefox, but not in Chrome (as of 2015-01-31).

Description

Outputs tables in any one of several supported types.

Features:

o Generic interface to all supported table formats
o Separately specify header/data/footer rows
o Separately include/exclude header/data/footer rows
o Align cell values

Each column has its own alignment option: left, center or right.

For internally generated HTML, this is done with a CSS div within each td, not with the obsolete td align attribute.

Decimal points cannot be aligned yet, as discussed in the "TODO".

o Escape HTML entities or URIs

But not both at the same time!

o Extend short header/data/footer rows with empty strings or undef

Auto-extension results in all rows being the same length.

This takes place before the transformation, if any, mentioned next.

o Transform cell values which are empty strings and undef
o Pad cell values
o Handle UTF8
o Return the table as an arrayref of lines or as a string

The arrayref is returned by "render([%hash])", and the string by "render_as_string([%hash])".

When returning a string by calling render_as_string() (which calls render()), you can specify how the lines in the arrayref are joined.

In the same way the format parameter discussed just below controls the output, the join parameter controls the join.

The format of the output is controlled by the format parameter to new(), or by the parameter to the "format([$format])" method, or by the value of the format key in the hash passed to "render([%hash])" and "render_as_string([%hash])", and must be one of these imported constants:

o format_internal_boxed

All headers, footers and table data are surrounded by ASCII characters.

The rendering is done internally.

See scripts/internal.boxed.pl and output file data/internal.boxed.log.

o format_internal_github

Render as GitHub-flavoured Markdown.

The rendering is done internally.

See scripts/internal.github.pl and output file data/internal.github.log.

o format_internal_html

Render as an HTML table. You can use the "pass_thru([$hashref])" method to set options for the HTML table.

The rendering is done internally.

See scripts/internal.html.pl and output file data/internal.html.log.

o format_html_table

Passes the data to HTML::Table. You can use the "pass_thru([$hashref])" method to set options for the HTML::Table object constructor.

Warning: You must use Text::Table::Manifold's data() method, or the data parameter to new(), and not the -data option to HTML::Table. This is because the module processes the data before calling the HTML::Table constructor.

o format_text_csv

Passes the data to Text::CSV. You can use the "pass_thru([$hashref])" method to set options for the Text::CSV object constructor.

See scripts/text.csv.pl and output file data/text.csv.log.

o format_text_unicodebox_table

Passes the data to Text::UnicodeBox::Table. You can use the "pass_thru([$hashref])" method to set options for the Text::UnicodeBox::Table object constructor.

See scripts/text.unicodebox.table.pl and output file data/text.unicodebox.table.log.

See also scripts/synopsis.pl, and the output data/synopsis.log.

Distributions

This module is available as a Unix-style distro (*.tgz).

See http://savage.net.au/Perl-modules/html/installing-a-module.html for help on unpacking and installing distros.

Installation

Install Text::Table::Manifold as you would any Perl module:

Run:

        cpanm Text::Table::Manifold

or run:

        sudo cpan Text::Table::Manifold

or unpack the distro, and then either:

        perl Build.PL
        ./Build
        ./Build test
        sudo ./Build install

or:

        perl Makefile.PL
        make (or dmake or nmake)
        make test
        make install

Constructor and Initialization

new() is called as my($table) = Text::Table::Manifold -> new(k1 => v1, k2 => v2, ...).

It returns a new object of type Text::Table::Manifold.

Details of all parameters are explained in the "FAQ".

Key-value pairs accepted in the parameter list (see corresponding methods for details [e.g. "data([$arrayref])"]):

o alignment => $arrayref of imported constants

This specifies alignment per column. There should be one array element per column of data. The $arrayref will be auto-extended if necessary, using the constant align_center.

Alignment applies equally to every cell in the column.

A value for this parameter is optional.

Default: align_center for every column.

o data => $arrayref_of_arrayrefs

This specifies the table of cell values.

An arrayref of arrayrefs; each inner arrayref is a row of data.

The # of elements in each alignment/header/data/footer row does not have to be the same. See the extend* parameters for more. Auto-extension results in all rows being the same length.

A value for this parameter is optional.

Default: [].

o empty => An imported constant

This specifies how to transform cell values which are the empty string. See also the undef parameter.

The empty parameter is activated after the extend* parameters have been applied.

A value for this parameter is optional.

Default: empty_as_empty. I.e. do not transform.

o escape => An imported constant

This specifies escaping of either HTML entities or URIs.

A value for this parameter is optional.

Default: escape_nothing. I.e. do not transform.

o extend_data => An imported constant

The 2 constants available allow you to specify how short data rows are extended. Then, after extension, the transformations specified by the parameters empty and undef are applied.

A value for this parameter is optional.

Default: extend_with_empty. I.e. extend short data rows with the empty string.

o extend_footers => An imported constant

The 2 constants available allow you to specify how short footer rows are extended. Then, after extension, the transformations specified by the parameters empty and undef are applied.

A value for this parameter is optional.

Default: extend_with_empty. I.e. extend short footer rows with the empty string.

o extend_headers => An imported constant

The 2 constants available allow you to specify how short header rows are extended. Then, after extension, the transformations specified by the parameters empty and undef are applied.

A value for this parameter is optional.

Default: extend_with_empty. I.e. extend short header rows with the empty string.

o footers => $arrayref

These are the column footers. See also the headers option.

The # of elements in each header/data/footer row does not have to be the same. See the extend* parameters for more.

A value for this parameter is optional.

Default: [].

o format => An imported constant

This specifies which format to output from the rendering methods.

A value for this parameter is optional.

Default: format_internal_boxed.

o headers => $arrayref

These are the column headers. See also the footers option.

The # of elements in each header/data/footer row does not have to be the same. See the extend* parameters for more.

A value for this parameter is optional.

Default: [].

o include => An imported constant

Controls whether header/data/footer rows are included in the output.

There are three constants available, and any of them can be combined with '|', the bitwise OR operator.

A value for this parameter is optional.

Default: include_headers | include_data.

o join => $string

"render_as_string([%hash])" uses $hash{join}, or $self -> join, in Perl's join($join, @$arrayref) to join the elements of the arrayref returned by internally calling "render([%hash])".

render() ignores the join key in the hash.

A value for this parameter is optional.

Default: ''.

o padding => $integer

This integer is the # of spaces added to each side of the cell value, after the alignment parameter has been applied.

A value for this parameter is optional.

Default: 0.

o pass_thru => $hashref

A hashref of values to pass thru to another object.

The keys in this $hashref control what parameters are passed to rendering routines.

A value for this parameter is optional.

Default: {}.

o undef => An imported constant

This specifies how to transform cell values which are undef. See also the empty parameter.

The undef parameter is activated after the extend* parameters have been applied.

A value for this parameter is optional.

Default: undef_as_undef. I.e. do not transform.

Methods

See the "FAQ" for details of all importable constants mentioned here.

And remember, all methods listed here which are parameters to "new([%hash])", are also parameters to both "render([%hash])" and "render_as_string([%hash])".

alignment([$arrayref])

Here, the [] indicate an optional parameter.

Returns the alignment as an arrayref of constants, one per column.

There should be one element in $arrayref for each column of data. If the $arrayref is too short, align_center is the default for the missing alignments.

The alignments in $arrayref may cause spaces to be added to one or both sides of a cell value.

Alignment applies equally to every cell in the column.

This happens before any spaces specified by "padding([$integer])" are added.

See the "FAQ#What are the constants for alignment?" for legal values for the alignments (per column).

alignment is a parameter to "new([%hash])". See "Constructor and Initialization".

data([$arrayref])

Here, the [] indicate an optional parameter.

Returns the data as an arrayref. Each element in this arrayref is an arrayref of one row of data.

The structure of $arrayref, if provided, must match the description in the previous line.

Rows do not need to have the same number of elements.

Use Perl's undef or '' (the empty string) for missing values.

See "empty([$empty])" and "undef([$undef])" for how '' and undef are handled.

See "extend_data([$extend])" for how to extend short data rows, or let the code auto-extend them.

data is a parameter to "new([%hash])". See "Constructor and Initialization".

empty([$empty])

Here, the [] indicate an optional parameter.

Returns the option specifying how empty cell values ('') are being dealt with.

$empty controls how empty strings in cells are rendered.

See the "FAQ#What are the constants for handling cell values which are empty strings?" for legal values for $empty.

See also "undef([$undef])".

empty is a parameter to "new([%hash])". See "Constructor and Initialization".

escape([$escape])

Here, the [] indicate an optional parameter.

Returns the option specifying how HTML entities and URIs are being dealt with.

$escape controls how either HTML entities or URIs are rendered.

See the "FAQ#What are the constants for escaping HTML entities and URIs?" for legal values for $escape.

escape is a parameter to "new([%hash])". See "Constructor and Initialization".

extend_data([$extend])

Here, the [] indicate an optional parameter.

Returns the option specifying how short data rows are extended.

If the # of elements in a data row is shorter than the longest row, $extend specifies how to extend those short rows.

See the "FAQ#What are the constants for extending short rows?" for legal values for $extend.

extend_data is a parameter to "new([%hash])". See "Constructor and Initialization".

extend_footers([$extend])

Here, the [] indicate an optional parameter.

Returns the option specifying how short footer rows are extended.

If the # of elements in a footer row is shorter than the longest row, $extend specifies how to extend those short rows.

See the "FAQ#What are the constants for extending short rows?" for legal values for $extend.

extend_footers is a parameter to "new([%hash])". See "Constructor and Initialization".

extend_headers([$extend])

Here, the [] indicate an optional parameter.

Returns the option specifying how short header rows are extended.

If the # of elements in a header row is shorter than the longest row, $extend specifies how to extend those short rows.

See the "FAQ#What are the constants for extending short rows?" for legal values for $extend.

extend_headers is a parameter to "new([%hash])". See "Constructor and Initialization".

footers([$arrayref])

Here, the [] indicate an optional parameter.

Returns the footers as an arrayref of strings.

$arrayref, if provided, must be an arrayref of strings.

See "extend_footers([$extend])" for how to extend a short footer row, or let the code auto-extend it.

footers is a parameter to "new([%hash])". See "Constructor and Initialization".

format([$format])

Here, the [] indicate an optional parameter.

Returns the format as a constant (actually an integer).

See the "FAQ#What are the constants for formatting?" for legal values for $format.

format is a parameter to "new([%hash])". See "Constructor and Initialization".

format_as_internal_boxed()

Called by "render([%hash])".

format_as_internal_github()

Called by "render([%hash])".

format_as_internal_html()

Called by "render([%hash])".

format_as_html_table()

Called by "render([%hash])".

format_as_text_csv()

Called by "render([%hash])".

format_as_text_unicodebox_table()

Called by "render([%hash])".

headers([$arrayref])

Here, the [] indicate an optional parameter.

Returns the headers as an arrayref of strings.

$arrayref, if provided, must be an arrayref of strings.

See "extend_headers([$extend])" for how to extend a short header row, or let the code auto-extend it.

headers is a parameter to "new([%hash])". See "Constructor and Initialization".

include([$include])

Here, the [] indicate an optional parameter.

Returns the option specifying if header/data/footer rows are included in the output.

See the "FAQ#What are the constants for including/excluding rows in the output?" for legal values for $include.

include is a parameter to "new([%hash])". See "Constructor and Initialization".

join([$join])

Here, the [] indicate an optional parameter.

Returns the string used to join lines in the table when you call "render_as_string([%hash])".

$join is the parameter passed to the Perl function join() by render_as_string().

Further, you can use the key join in %hash to pass a value directly to "render_as_string([%hash])".

new([%hash])

The constructor. See "Constructor and Initialization" for details of the parameter list.

Note: "render([%hash])" and "render_as_string([%hash])" support the same options as new().

padding([$integer])

Here, the [] indicate an optional parameter.

Returns the padding as an integer.

Padding is the # of spaces to add to both sides of the cell value after it has been aligned.

padding is a parameter to "new([%hash])". See "Constructor and Initialization".
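
The order described above (align first, then pad) can be sketched in plain Perl. This is only an illustration, not the module's code: align_then_pad() is a hypothetical helper, the constants are local stand-ins with the values listed in the "FAQ", and sending the odd spare space to the right when centering is an assumption read off the synopsis output.

```perl
use strict;
use warnings;

# Local stand-ins for the imported alignment constants (values as in the FAQ).
use constant
{
    align_left   => 0,
    align_center => 1,
    align_right  => 2,
};

# Hypothetical helper: align $value within $width, then add $padding spaces
# to each side. When centering leaves an odd number of spare spaces, the extra
# one goes on the right (an assumption based on the synopsis output).
sub align_then_pad
{
    my($value, $width, $alignment, $padding) = @_;
    my($spare) = $width - length($value);
    $spare     = 0 if ($spare < 0);
    my($left)  = int($spare / 2); # The align_center case.
    $left      = 0      if ($alignment == align_left);
    $left      = $spare if ($alignment == align_right);
    my($right) = $spare - $left;
    my($pad)   = ' ' x $padding;

    return $pad . (' ' x $left) . $value . (' ' x $right) . $pad;
}

print '|', align_then_pad('undef',      8,  align_center, 1), "|\n";
print '|', align_then_pad('Ron Savage', 13, align_right,  1), "|\n";
print '|', align_then_pad('One',        15, align_left,   2), "|\n";
```

The widths 8, 13 and 15 are the widest cells of the corresponding columns in the synopsis tables, so the three printed cells can be compared with data/synopsis.log.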

pass_thru([$hashref])

Here, the [] indicate an optional parameter.

Returns the hashref previously provided.

See "FAQ#What is the format of the $hashref used in the call to pass_thru()?" for details.

See scripts/html.table.pl, scripts/internal.html.pl and scripts/text.csv.pl for sample code where it is used in various ways.

pass_thru is a parameter to "new([%hash])". See "Constructor and Initialization".

render([%hash])

Here, the [] indicate an optional parameter.

Returns an arrayref, where each element is 1 line of the output table. These lines do not have "\n" or any other line terminator added by this module.

It's up to you how to handle the output. The simplest thing is to just do:

        print join("\n", @{$table -> render}), "\n";

Note: render() supports the same options as "new([%hash])".

render() ignores the join key in the hash.

See also "render_as_string([%hash])".

render_as_string([%hash])

Here, the [] indicate an optional parameter.

Returns the rendered data as a string.

render_as_string() uses the value of $hash{join}, or the result of calling $self -> join, in Perl's join($join, @$arrayref) to join the elements of the arrayref returned by internally calling "render([%hash])".

Note: render_as_string() supports the same options as "new([%hash])", and passes them all to "render([%hash])".

See also "render([%hash])".
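
The joining step can be sketched without the module. Here lines_as_string() is a hypothetical stand-in for render_as_string(), $lines plays the role of the arrayref returned by render(), and $default_join plays the role of $self -> join. Choosing between the two with defined() is an assumption; the module may decide differently.

```perl
use strict;
use warnings;

# Hypothetical stand-in for render_as_string(). Prefers $hash{join} when it is
# defined, and otherwise falls back to $default_join.
sub lines_as_string
{
    my($lines, $default_join, %hash) = @_;
    my($join) = defined($hash{join}) ? $hash{join} : $default_join;

    return join($join, @$lines);
}

my($lines) = ['+---+', '| A |', '+---+'];

# With the documented default of '', the lines run together.
print lines_as_string($lines, ''), "\n";

# Passing join => "\n" gives one table line per output line.
print lines_as_string($lines, '', join => "\n"), "\n";
```

Since the documented default for join is the empty string, callers who want a printable table will probably pass join => "\n" explicitly.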

undef([$undef])

Here, the [] indicate an optional parameter.

Returns the option specifying how undef cell values are being dealt with.

$undef controls how undefs in cells are rendered.

See the "FAQ#What are the constants for handling cell values which are undef?" for legal values for $undef.

See also "empty([$empty])".

undef is a parameter to "new([%hash])". See "Constructor and Initialization".

widths()

Returns an arrayref of the width of each column, after the data is cleaned and rectified, but before it has been aligned or padded.
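
As a rough illustration of what such widths look like, this plain-Perl sketch takes the widest cell per column over rows that have already been extended and transformed. column_widths() is a hypothetical helper, not a method of this module, and it measures cells with Perl's length(), which is an assumption about how the module counts characters.

```perl
use strict;
use warnings;

# Hypothetical helper: the widest cell in each column, over all rows given.
# Assumes every row has the same number of cells and that no cell is undef.
sub column_widths
{
    my(@rows) = @_;
    my(@width);

    for my $row (@rows)
    {
        for my $i (0 .. $#$row)
        {
            my($length) = length($$row[$i]);
            $width[$i]  = $length if (! defined($width[$i]) || $length > $width[$i]);
        }
    }

    return [@width];
}

# The first synopsis table, after undef and '' became 'undef' and 'empty'.
my($widths) = column_widths
(
    ['Homepage',                'Country',   'Name',          'Metadata'],
    ['http://savage.net.au/',   'Australia', 'Ron Savage',    'undef'],
    ['https://duckduckgo.com/', 'Earth',     'Mr. S. Engine', 'empty'],
);

print join(', ', @$widths), "\n"; # 23, 9, 13, 8
```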

FAQ

Note: See "TODO" for what has not been implemented yet.

How are imported constants used?

Firstly, you must import them with:

        use Text::Table::Manifold ':constants';

Then you can use them in the constructor:

        my($table) = Text::Table::Manifold -> new(empty => empty_as_text);

And/or you can use them in method calls:

        $table -> format(format_internal_boxed);

See scripts/synopsis.pl for various use cases.

Note how sample code uses the names of the constants. The integer values listed below are just FYI.

What are the constants for alignment?

The parameters, one per column, to "alignment([$arrayref])" must be one of the following:

o align_left => 0
o align_center => 1

So-spelt. Not 'centre'.

o align_right => 2

Alignment applies equally to every cell in a column.

What are the constants for handling cell values which are empty strings?

The parameter to "empty([$empty])" must be one of the following:

o empty_as_empty => 0

Do nothing. This is the default.

o empty_as_minus => 1

Convert empty cell values to '-'.

o empty_as_text => 2

Convert empty cell values to the text string 'empty'.

o empty_as_undef => 3

Convert empty cell values to undef.

See also "undef([$undef])".

Warning: This updates the original data!
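
If the untouched values are needed after rendering, one option is to copy the data first. Copying only the outer array still shares the inner row arrayrefs, so whether that is enough depends on whether cells are changed in place, which is not stated here. The sketch below shows the difference in plain Perl, using dclone() from the core Storable module. It makes no Text::Table::Manifold calls, and the assignment merely simulates a transformation.

```perl
use strict;
use warnings;

use Storable 'dclone';

my($data) =
[
    ['Ron Savage',    ''],
    ['Mr. S. Engine', undef],
];

my(@shallow) = @$data;        # New outer array, same inner row arrayrefs.
my($deep)    = dclone($data); # Independent copy of every row.

# Simulate an in-place transformation of an empty string to 'empty'.
$$data[0][1] = 'empty';

print "Shallow copy sees: '$shallow[0][1]'\n"; # 'empty'
print "Deep copy sees:    '$$deep[0][1]'\n";   # ''
```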

What are the constants for escaping HTML entities and URIs?

The parameter to "escape([$escape])" must be one of the following:

o escape_nothing => 0

This is the default.

o escape_html => 1

Use HTML::Entities::Interpolate to escape HTML entities. HTML::Entities::Interpolate cannot be loaded at runtime, and so is always needed.

o escape_uri => 2

Use URI::Escape's uri_escape() function to escape URIs. URI::Escape is loaded at runtime if needed.

Warning: This updates the original data!

What are the constants for extending short rows?

The parameters to "extend_data([$extend])", "extend_footers([$extend])" and "extend_headers([$extend])", must be one of the following:

o extend_with_empty => 0

Short header/data/footer rows are extended with the empty string.

Later, the values discussed under "FAQ#What are the constants for handling cell values which are empty strings?" will be applied.

o extend_with_undef => 1

Short header/data/footer rows are extended with undef.

Later, the values discussed under "FAQ#What are the constants for handling cell values which are undef?" will be applied.

See also "empty([$empty])" and "undef([$undef])".

Warning: This updates the original data!
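
A minimal sketch of auto-extension, in plain Perl and without the module: extend_rows() is a hypothetical helper, and the two constants are local stand-ins with the values listed above. It extends rows in place, in line with the warning.

```perl
use strict;
use warnings;

use List::Util 'max';

# Local stand-ins for the imported constants, using the values listed above.
use constant
{
    extend_with_empty => 0,
    extend_with_undef => 1,
};

# Hypothetical helper: extend every row, in place, to the length of the
# longest row. Assumes at least one row is passed in.
sub extend_rows
{
    my($extend, @rows) = @_;
    my($longest)       = max(map { scalar(@$_) } @rows);
    my($filler)        = '';
    $filler            = undef if ($extend == extend_with_undef);

    for my $row (@rows)
    {
        push @$row, $filler while (@$row < $longest);
    }
}

my($headers) = ['One', 'Two', 'Three'];
my($data)    = [ ['a'], ['b', 'c'] ];

extend_rows(extend_with_empty, $headers, @$data);

print scalar(@$_), "\n" for ($headers, @$data); # 3, three times
```

The appended cells are then subject to the empty or undef transformation, as described above.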

What are the constants for formatting?

The parameter to "format([$format])" must be one of the following:

o format_internal_boxed => 0

Render internally.

o format_text_csv => 1

Text::CSV is loaded at runtime if this option is used.

o format_internal_github => 2

Render internally.

o format_internal_html => 3

Render internally.

o format_html_table => 4

HTML::Table is loaded at runtime if this option is used.

o format_text_unicodebox_table => 5

Text::UnicodeBox::Table is loaded at runtime if this option is used.

What are the constants for including/excluding rows in the output?

The parameter to "include([$include])" must be one or more of the following:

o include_data => 1

Data rows are included in the output.

o include_footers => 2

Footer rows are included in the output.

o include_headers => 4

Header rows are included in the output.
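
Because the three values are distinct powers of two, they can be combined with '|' and tested with '&'. The sketch below redefines the constants locally, with the values listed above, so that it runs without the module; in real code they come from the ':constants' import. describe_include() is a hypothetical helper.

```perl
use strict;
use warnings;

# Local stand-ins for the imported constants, using the values listed above.
use constant
{
    include_data    => 1,
    include_footers => 2,
    include_headers => 4,
};

# Hypothetical helper: name the row types which a given $include value selects.
sub describe_include
{
    my($include) = @_;
    my(@selected);

    push @selected, 'headers' if ($include & include_headers);
    push @selected, 'data'    if ($include & include_data);
    push @selected, 'footers' if ($include & include_footers);

    return join(', ', @selected);
}

my($default) = include_headers | include_data; # The documented default, i.e. 5.

print describe_include($default), "\n";                   # headers, data
print describe_include($default | include_footers), "\n"; # headers, data, footers
```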

What is the format of the $hashref used in the call to pass_thru()?

It takes these (key => value) pairs:

o new => $hashref
o For internal rendering of HTML

$$hashref{table} is used to specify parameters for the table tag.

Currently, table is the only tag supported by this mechanism.

o When using HTML::Table, for external rendering of HTML

$hashref is passed to the HTML::Table constructor.

o When using Text::CSV, for external rendering of CSV

$hashref is passed to the Text::CSV constructor.

o When using Text::UnicodeBox::Table, for external rendering of boxes

$hashref is passed to the Text::UnicodeBox::Table constructor.

See html.table.pl, internal.html.pl and text.csv.pl, all in the scripts/ directory.

What are the constants for handling cell values which are undef?

The parameter to "undef([$undef])" must be one of the following:

o undef_as_empty => 0

Convert undef cell values to the empty string ('').

o undef_as_minus => 1

Convert undef cell values to '-'.

o undef_as_text => 2

Convert undef cell values to the text string 'undef'.

o undef_as_undef => 3

Do nothing.

This is the default.

See also "empty([$empty])".

Warning: This updates the original data!
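
The mapping above can be sketched in plain Perl. transform_undef() is a hypothetical helper, not the module's code, and the constants are local stand-ins with the values listed above. It changes cells in place, which is what the warning describes.

```perl
use strict;
use warnings;

# Local stand-ins for the imported constants, using the values listed above.
use constant
{
    undef_as_empty => 0,
    undef_as_minus => 1,
    undef_as_text  => 2,
    undef_as_undef => 3,
};

# Replacement text per option. Commas are used rather than '=>', so that the
# constant names are not quoted as strings. undef_as_undef has no entry.
my(%replacement) =
(
    undef_as_empty, '',
    undef_as_minus, '-',
    undef_as_text,  'undef',
);

# Hypothetical helper: transform undef cells in place.
sub transform_undef
{
    my($option, $rows) = @_;

    return if (! exists $replacement{$option});

    for my $row (@$rows)
    {
        for my $cell (@$row)
        {
            $cell = $replacement{$option} if (! defined $cell);
        }
    }
}

my($rows) = [ ['Two ticks', undef, 'TBA'] ];

transform_undef(undef_as_text, $rows);

print join(' | ', @{$$rows[0]}), "\n"; # Two ticks | undef | TBA
```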

Will you extend the program to support other external renderers?

Possibly, but only if the extension matches the spirit of this module, which is roughly: Keep it simple, and provide just enough options but not too many. In other words, there is no point in passing a huge number of options to an external class when you can use that class directly anyway.

I've looked a number of times at PDF::Table, for example, but it is just a little bit too complex. Similarly, Text::ANSITable has too many methods.

See also "TODO".

How do I run author tests?

This runs both standard and author tests:

        shell> perl Build.PL; ./Build; ./Build authortest

TODO

o Fancy alignment of real numbers

It makes sense to right-justify integers, but in the rest of the table you probably want to left-justify strings.

Then, vertically aligning decimal points (whatever they are in your locale) is another complexity.

See Text::ASCIITable and Text::Table.

o Embedded newlines

Cell values could be split at each "\n" character, to find the widest line within the cell. That would then be used as the cell's width.

For Unicode, this is complex. See http://www.unicode.org/versions/Unicode7.0.0/ch04.pdf, and especially p 192, for 'Line break' controls. Also, the Unicode line breaking algorithm is documented in http://www.unicode.org/reports/tr14/.

Perl modules and other links relevant to this topic are listed under "See Also#Line Breaking".

o Nested tables

This really requires the implementation of embedded newline analysis, as per the previous point.

o Pass-thru class support

The problem is the mixture of options required to drive classes.

o Sorting the rows, or individual columns

See Data::Table and HTML::Table.

o Color support

See Text::ANSITable.

o Subtotal support

Maybe one day. I did see a subtotal feature in a module while researching this, but I can't find it any more.

See Data::Table. It has grouping features.

See Also

Table Rendering

Any::Renderer

Data::Formatter::Text

Data::Tab

Data::Table

Data::Tabulate

Gapp::TableMap

HTML::Table

HTML::Tabulate

LaTeX::Table

PDF::Table

PDF::TableX

PDF::Report::Table

Table::Simple

Term::TablePrint

Text::ANSITable

Text::ASCIITable

Text::CSV

Text::FormatTable

Text::MarkdownTable

Text::SimpleTable

Text::Table

Text::Table::Tiny

Text::TabularDisplay

Text::Tabulate

Text::UnicodeBox

Text::UnicodeBox::Table

Text::UnicodeTable::Simple

Tie::Array::CSV

Line Breaking

Text::Format

Text::LineFold

Text::NWrap

Text::Wrap

Text::WrapI18N

Unicode::LineBreak

Unicode Line Breaking Algorithm: http://www.unicode.org/reports/tr14/

Machine-Readable Change Log

The file Changes was converted into Changelog.ini by Module::Metadata::Changes.

Version Numbers

Version numbers < 1.00 represent development versions. From 1.00 up, they are production versions.

Repository

https://github.com/ronsavage/Text-Table-Manifold

Support

Bugs should be reported via the GitHub issue tracker at

https://github.com/ronsavage/Text-Table-Manifold/issues

Author

Text::Table::Manifold was written by Ron Savage <ron@savage.net.au> in 2015.

Marpa's homepage: http://savage.net.au/Marpa.html.

My homepage: http://savage.net.au/.

Copyright

Australian copyright (c) 2014, Ron Savage.

        All Programs of mine are 'OSI Certified Open Source Software';
        you can redistribute them and/or modify them under the terms of
        The Perl Artistic License, a copy of which is available at:
        https://perldoc.perl.org/perlartistic.html.