NAME

MsOffice::Word::Template - treat a Word document as a Template Toolkit document

SYNOPSIS

  my $template = MsOffice::Word::Template->new($filename);
  my $new_doc  = $template->process(\%data);
  $new_doc->save_as($path_for_new_doc);

DESCRIPTION

Purpose

This module treats a Microsoft Word document as a template for generating other documents. The idea is similar to the "mail merge" functionality in Word, but with much richer possibilities, because the whole power of the Perl Template Toolkit can be exploited, for example for

  • dealing with complex, nested data structures

  • using control directives like IF, FOREACH, CALL, etc.

To distinguish templating directives from regular Word content, use the Word highlighting function:

  • fragments highlighted in yellow are interpreted as GET directives, i.e. the data content will be inserted at that point in the document, keeping the current formatting properties (bold, italic, font, etc.).

  • fragments highlighted in green are interpreted as Template Toolkit directives that do not directly generate content, like IF, FOREACH, etc. The Word formatting around such directives is discarded, including the enclosing context (paragraph or table row), in order to avoid empty paragraphs or empty rows in the resulting document.

Status

This first release is a proof of concept. Some simple templates have been successfully tried; however it is likely that a number of improvements will have to be made before this system can be used at large scale in production. If you use this module, please keep me informed of your difficulties, tricks, suggestions, etc.

METHODS

Constructor

new

  my $template = MsOffice::Word::Template->new($filename);
  # or : my $template = MsOffice::Word::Template->new($surgeon);   # an instance of MsOffice::Word::Surgeon
  # or : my $template = MsOffice::Word::Template->new(surgeon => $surgeon, %options);

Possible options are:

surgeon

an instance of MsOffice::Word::Surgeon

data_color

the Word highlight color for marking GET directives (default : yellow)

directive_color

the Word highlight color for marking other directives (default : green)

template_config

hashref of configuration options to be passed to "new" in Template -- see Template::Manual::Config
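
As an illustration of how these options combine, here is a minimal sketch. The file names, the data, and the choice of cyan and magenta are invented for the example. The call MsOffice::Word::Surgeon->new(docx => ...) and the assumption that colors are given by their Word highlight names should be checked against the documentation of MsOffice::Word::Surgeon. The modules are loaded with require only when the template file exists, so that the sketch can be run without one.

```perl
use strict;
use warnings;

my $template_path = 'letter_template.docx';   # hypothetical file name

# Options described above; the values are illustrative only.
my %options = (
    data_color      => 'cyan',      # GET directives are highlighted in cyan
    directive_color => 'magenta',   # other directives are highlighted in magenta
    # Passed through to Template->new; VARIABLES predefines template variables.
    template_config => { VARIABLES => { company => 'ACME Ltd' } },
);

if (-e $template_path) {
    require MsOffice::Word::Surgeon;
    require MsOffice::Word::Template;

    # Assumption: Surgeon accepts the document path through the "docx" argument.
    my $surgeon  = MsOffice::Word::Surgeon->new(docx => $template_path);
    my $template = MsOffice::Word::Template->new(surgeon => $surgeon, %options);

    my $new_doc = $template->process({ recipient => 'Jane Doe' });
    $new_doc->save_as('letter_jane.docx');
}
```

Changing the colors can be useful when a document already uses yellow or green highlighting for ordinary content.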

Using the template

process

  my $new_doc = $template->process(\%data);
  $new_doc->save_as($path_for_new_doc);

Process the template on a given data tree, and return a new document (actually, a new instance of MsOffice::Word::Surgeon). That document can then be saved in a file using "save_as" in MsOffice::Word::Surgeon.

WRITING TEMPLATES

A template is just a regular Word document, in which the highlighted fragments represent templating instructions.

The "holes" to be filled must be highlighted in yellow. Fill these zones with the names of the variables whose values should appear there. Names of variables can be paths into a complex data structure, with dots separating the levels, like foo.3.bar.-1 -- see "GET" in Template::Manual::Directive and Template::Manual::Variables.

Thanks to the Template::AutoFilter module, the built-in html filter of the Template Toolkit is automatically applied, so that ampersand characters and angle brackets are replaced by the corresponding HTML entities (otherwise the resulting XML would be incorrect and could not be opened by Microsoft Word).
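
To make the relation between yellow zones and the data tree concrete, here is a small sketch. All keys, values and file names are hypothetical; the comments list what each yellow fragment would resolve to under the rules described above.

```perl
use strict;
use warnings;

# Hypothetical data tree; every key is invented for illustration.
my %data = (
    customer => {
        name    => 'Smith & Sons',   # the "&" is escaped by the automatic html filter
        address => { city => 'Geneva' },
    },
    orders => [
        { ref => 'A-001', total => 120 },
        { ref => 'A-002', total => 75  },
    ],
);

# Text of a yellow fragment      value inserted in the document
#   customer.name                Smith & Sons
#   customer.address.city        Geneva
#   orders.0.ref                 A-001
#   orders.1.total               75

my $template_path = 'order_template.docx';   # hypothetical file name
if (-e $template_path) {
    require MsOffice::Word::Template;
    my $template = MsOffice::Word::Template->new($template_path);
    my $new_doc  = $template->process(\%data);
    $new_doc->save_as('order_smith.docx');
}
```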

Control directives like IF, FOREACH, etc. must be highlighted in green. When seeing a green zone, the system will remove markup for the surrounding text, run and paragraph nodes. If this occurs within a table, the markup for the current row is also removed. Without this mechanism, the final result would contain an empty paragraph or an empty row for each templating directive.
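
The following sketch shows how a loop over table rows might be laid out. The table layout is described in comments; it assumes that a green fragment contains the bare directive text, without [% ... %] delimiters, which should be verified against a working template. The data keys are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical Word table (G = green highlight, Y = yellow highlight):
#
#   | Item                      | Price          |   plain header row
#   | G: FOREACH item IN items  |                |   row removed from the output
#   | Y: item.label             | Y: item.price  |   row repeated once per item
#   | G: END                    |                |   row removed from the output

my %data = (
    items => [
        { label => 'Keyboard', price => '49.00'  },
        { label => 'Monitor',  price => '189.00' },
    ],
);

# If the green rows are dropped as described, the generated table should
# contain the header row plus one row per item.
my $expected_rows = 1 + @{ $data{items} };
print "expected table rows: $expected_rows\n";
```

The hash can then be passed to process as \%data, exactly as in the SYNOPSIS.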

As a consequence of this distinction between yellow and green highlights, templating zones cannot mix GET directives with other directives: a GET directive within a green zone would generate output outside of the regular XML flow (paragraph nodes, run nodes and text nodes), and therefore Microsoft Word would report an error when trying to open such content. There is a workaround, however: GET directives within a green zone will work if they also generate the appropriate markup for paragraphs, runs and text nodes; but in that case you must also apply the "none" filter from Template::AutoFilter so that angle brackets in XML markup do not get translated into HTML entities.
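
One possible way to apply this workaround is to build the markup in Perl and pass it as data. The sketch below is only an illustration: wordml_paragraph is a hypothetical helper, not part of this module, and the key notes_xml is invented. Since the "none" filter bypasses the automatic html filter, the helper escapes the text itself.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap plain text in a minimal WordprocessingML paragraph
# (paragraph node, run node, text node), escaping XML special characters.
sub wordml_paragraph {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;');
    (my $escaped = $text) =~ s/([&<>])/$entity{$1}/g;
    return qq{<w:p><w:r><w:t xml:space="preserve">$escaped</w:t></w:r></w:p>};
}

# One complete paragraph per note, concatenated into a single string.
my %data = (
    notes_xml => join('', map { wordml_paragraph($_) } 'R&D budget', 'Total < 1000'),
);

print $data{notes_xml}, "\n";
```

Under these assumptions, a green zone in the template would contain something like GET notes_xml | none. Inside a table, the generated markup would presumably also need to supply the row and cell nodes, since the markup of the current row is removed.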

AUTHOR

Laurent Dami, <dami AT cpan DOT org>

COPYRIGHT AND LICENSE

Copyright 2020 by Laurent Dami.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.