The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Excel-Template-XLSX - Create Excel .xlsx files starting from (one or more) template(s).

SYNOPSIS

   use Excel::Template::XLSX;
   my ($self, $workbook) = Excel::Template::XLSX->new('perl.xlsx', 'template1.xlsx', 'template2.xlsx', ...);
   $self->parse_template();
   
   # Add a worksheet, ... and anything else you would do with Excel::Writer::XLSX
   $worksheet = $workbook->add_worksheet();

   # Although Excel::Writer::XLSX says the workbook will automatically get 
   # closed during global destruction.  This wrapper around Excel::Writer::XLSX may
   # mess this up, and it is better to specifically close your workbook when you are done.
   $workbook->close();

DESCRIPTION

This module is a companion to Excel::Writer::XLSX(EWX), or if you prefer, a wrapper to that module. It uses EWX as a base class. It opens an existing spreadsheet file (.xlsx format), and also creates a new EWX object. As it parses the template file(s), it calls EWX methods to re-create the template contents in the EWX object.

When parsing is complete, the workbook object is left open for the calling perl script to add additional content.

The purpose of this module is to separate the roles of content/presentation vs programming in an Excel document, in much the same way that HTML templating engines work. A user who is knowledgeable in Excel can create an Excel file for use as a template, without requiring the skill set of Perl or Excel::Writer::XLSX. Conversely, the Perl programmer who is creating dynamic content does not need design skills to layout the presentation in the template.

WHAT IT CAN DO

   Cell Values (strings, numbers, dates, ... )
   Cell Formulas
   Cell Hyperlinks
   
   Cell Formatting (borders, shading, fonts, font sizes, colors)
   
   Column Widths
   Row Widths
   
   Headers and Footers
   

Simple template variables (via callback) See "template_callback"

WHAT IT CAN NOT DO

   Excel::Template::Excel can not modify Excel files in place!  It is not
   intended to.  Since the parser directly adds content to the EWX workbook object
   as the contents are parsed, both the template, and the output file must
   be open at the same time.  
   
   It may be possible to open the output file to a file handle, and 
   after parsing of the template is complete, write the contents of file
   over the template.  The author has not tried this.
   
   It is not the design of this module to faithfully re-create the entire
   contents of the template file in the EWX output.  If you are using this
   module to rewrite Excel files, you are on your own.

   These items are completely dropped from the output file:

   Images in the Sheet
   Images in Headers/Footers
   Charts
   Shapes
   Themes (gradients, fonts, fills, styles)
   macros
   modules (vba code)

   And probably other things.  See the tests (t directory of the distribution) 
   for examples of what does work.

SUBROUTINES AND METHODS

__podhead

Dummy subroutine to allow me to hide this pod documentation when using code folding in the editor.

new

Creates a new Excel::Template::XLSX object, and also creates a new Excel::Writer::XLSX object. A workbook object is created for the output file.

Returns the Template object, and the workbook object. Workbook object is also available as $self->{EWX}; If the caller is only expecting a single return value, then just the $self object is returned.

parse_template

Parses common elements of the Spreadsheet, such as themes, styles, and strings. These are stored in the main object ($self).

Finds each sheet in the workbook, and initiates parsing of each sheet.

Properties for the created workbook are set from the first template that has properties. Properties in subsequent workbooks are ignored.

template_callback

Place holder method for a callback routine to modify the content of the template before being written to the output spreadsheet.

This callback is activated for all shared string (both plain and rich text strings), and also for header/footer text.

The callback is supplied with the two parameters: The object name (since this is a method), and the text to be processed. This is passed as a reference to single scalar.

This method is called numerous times during processing (e.g. once for each unique string in the spreadsheet, so the user is advised to keep it efficient.

This callback approach does not force any particular templating system on the user. They are free to use whatever system they choose.

Note that templating can only do simple scalars. Complex templating (if-then- else, loops, etc) do not make sense in that the callback is supplied with the contents of a single cell. Having said that, remember that the full power of Excel::Writer::XLSX is available to the user to modify the template after it is processed.

   # A snippet of code to replace [% template %] in the 
   # template spreadsheet with 'Output'

   my ($self, $wbk) = Excel::Template::XLSX->new($output_xlsx, $template_xlsx);

   use Template::Tiny;
   my $template = Template::Tiny->new( TRIM => 1 );
   $self->{template_callback} = sub {
      my ($self, $textref) = @_;
      $template->process($textref, { template => 'Output' }, $textref );
   };

   $self->parse_template();

_apply_tint

Applies tinting to a color object, if the tint attribute is encountered in parsing.

_base_path_for

Manipulates the path to a member in the zip file, to find the associated rels file.

_cell_to_row_col

Converts an A1 style cell reference to a row and column index.

_color

Parses color element (rgb, index, theme, and tint)

_extract_files

Called by parse_template to fetch the xml strings from the zip file. XML strings are parsed, except for worksheets. Individual worksheets are parsed separately.

_parse_alignment

Parses horizontal and vertical cell alignments in a sheet.

_parse_borders

Parses cell border and diagonal border styles. Called from _parse_styles. Returns an array of border styles, each one as a hash.

_parse_fills

Parses styles for cell fills (pattern, foreground and background colors. horizontal and horizontal and vertical cell alignments in a sheet.

Gradients are parsed, but since EWX does not support gradients, a pattern is substituted.

_parse_fonts

Parses font information (font name, size, super/sub scripts, alignment colors, underline, bold, italic, and strikeout attributes).

_parse_numbers

Parses styles for cell number formats (financial, decimal, exponential, date-time, ...)

_parse_protection

Parses locked and hidden attributes for a cell. These are only useful if the worksheet is locked.

This module does not lock the workbook or the worksheet.

_parse_shared_strings

Parses the shared strings file. Excel does not directly store string values with the cell, but stores an index into the shared strings table instead, to save memory, if a string value is referenced more than once. Shared strings also contain formatting if multiple formats are applied within a cell (See write_rich_string in EWX.

_parse_sheet

Parses an individual worksheet. This is done in two passes. See _parse_sheet_pass1 and _parse_sheet_pass2 for what elements are parsed. This is necessary because the parse order of XML::Twig callbacks are in the wrong order for some sheet information (header/footer information, hyperlinks, and merged cells).

_parse_sheet_pass1

Parses some elements in a worksheet ( pageMargins, headerFooter, hyperlinks, pageSetup, Merged Cells, Sheet Formatting Row and Column heights, Sheet selection, and Tab Color)

_parse_sheet_pass2

Parses cell contents (first by row, then by column). Cells can contain inline strings, string references, direct string values, formulas, and hyperlinks. Each cell may also contain formatting information. The format is in an index to formatting for borders, shading, alignment, font, and number formats.

_parse_styles

Parses style information. Parses number formats directly. Calls subroutines to parse fonts, fills, and borders, alignment, and protection.

Finally, parses Cell Xfs elements to Combine fonts, borders, number formats, alignment, patterns, into a single format specification.

Calls EWX add_formats to create a format, and stores the format information in a FORMAT array within the object.

_parse_themes

Parses theme information. Some color settings are referenced by an index to the theme.

_parse_xml

Low level subroutine to parse an entire member of a zip file. Used for small files, such as xxx.xml.rels, where the entire file is parsed.

For larger files, XML::Twig::twig_handlers are used.

_rels_for

Returns the .rels file name for a sibling workbook or worksheet.

zzpodtail

Dummy subroutine to allow me to hide pod documentation when using code folding in the editor.

INSTALLATION

   Install with CPAN
   
      cpan Excel::Template::XLSX

   or, use the standard Unix style installation.
 
   Unzip and untar the module as follows:
 
      tar -zxvf Excel::Template::XLSX-nnn.tar.gz
 
   The module can be installed using the standard Perl procedure:
 
      perl Makefile.PL
      make
      make test
      make install    # As sudo/root

BUGS

Large spreadsheets may cause segfaults on perl 5.14 and earlier

This module internally uses XML::Twig, which makes it potentially subject to Bug #71636 for XML-Twig: Segfault with medium-sized document on perl versions 5.14 and below (the underlying bug with perl weak references was fixed in perl 5.15.5). The larger and more complex the spreadsheet, the more likely to be affected, but the actual size at which it segfaults is platform dependent. On a 64-bit perl with 7.6gb memory, it was seen on spreadsheets about 300mb and above. You can work around this adding XML::Twig::_set_weakrefs(0) to your code before parsing the spreadsheet, although this may have other consequences such as memory leaks.

Please report any bugs to GitHub Issues at https://github.com/davidsclarke/Excel-Template-XLSX/issues.

SUPPORT

You can find this documentation for this module with the perldoc command.

    perldoc Excel::Template::XLSX

You can also look for information at:

DEBUGGING TIPS

Using the Perl debugger gets complicated because of XML::Twig. The objects created by XML::Twig are HUGE. Also, stepping through the code often results in exceeding a stack depth of >100. The author found it helpful to take advantage of the simplify() method in XML::Twig when using the debugger 'x' command to examine variables.

   x $node->simplify()
   

Also, it is helpful to use the 'c' command to jump over XML::Twig subroutine calls and callbacks.

BUGS

Please report any bugs or feature requests to the author.

TO DO

   Worksheet Activation
   Table Formatting/Styles
   Calculation Mode
   

REPOSITORY

The Excel::Template::XLSX source code is hosted on github: http://github.com/davidsclarke/Excel-Template-xlsx.

SEE ALSO

   Excel::Writer::XLSX
   
   This module does not provide much documentation on the capabilites of methods
   for creating Excel content.  The documentation provided with EWX is excellent,
   and also has numerous examples included.

   Spreadsheet::ParseXLSX
   
   Although this module does not use Spreadsheet::ParseXLSX, the parsing and 
   comments regarding issues involved with parsing spreadsheets came from this module.

   XML::Twig and Archive::Zip
   
   Excel .xlsx files are zippped .xml files.  These two modules are used to 
   unzip the .xlsx file, extract the members, and parse the relative portions
   of the .xml files inside.

ACKNOWLEDGEMENTS

This module leverages the methods in Excel::Writer::XLSX, maintained by John McNamara to recreate the template.

The parser was developed using Spreadsheet::ParseXLSX as a starting point, maintained by Jesse Luehrs. This parser calls methods in EWX directly when a token is resolved rather than building up an object representing the parsed content.

LICENSE AND COPYRIGHT

Either the Perl Artistic Licence http://dev.perl.org/licenses/artistic.html or the GPL http://www.opensource.org/licenses/gpl-license.php.

AUTHOR

   David Clarke dclarke@cpan.org