HTML::FormatPS - Format HTML as PostScript
    use HTML::TreeBuilder;
    $tree = HTML::TreeBuilder->new->parse_file("test.html");

    use HTML::FormatPS;
    $formatter = HTML::FormatPS->new(
        FontFamily => 'Helvetica',
        PaperSize  => 'Letter',
    );
    print $formatter->format($tree);
Or, for short:
    use HTML::FormatPS;
    print HTML::FormatPS->format_file(
        "test.html",
        'FontFamily' => 'Helvetica',
        'PaperSize'  => 'Letter',
    );
HTML::FormatPS is a formatter that outputs PostScript code. Formatting of HTML tables and forms is not implemented.
You may specify the following parameters when constructing the formatter object (or when calling format_file or format_string):
PaperSize - The paper size to format for. The value can be one of: A3, A4, A5, B4, B5, Letter, Legal, Executive, Tabloid, Statement, Folio, 10x14, Quarto. The default is "A4".
PaperWidth - The width of the paper, in points. Setting PaperSize also sets this value.
PaperHeight - The height of the paper, in points. Setting PaperSize also sets this value.
LeftMargin - The left margin, in points.
RightMargin - The right margin, in points.
HorizontalMargin - Sets both the left and right margins at the same time. The default value is 4 cm.
TopMargin - The top margin, in points.
BottomMargin - The bottom margin, in points.
VerticalMargin - Sets both the top and bottom margins at the same time. The default value is 2 cm.
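Because the individual margin options are given in points while the defaults are quoted in centimetres, a small conversion helper can be handy. The following is a minimal sketch, assuming the usual PostScript convention of 72 points per inch; cm_to_points is a hypothetical helper, not part of HTML::FormatPS.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::FormatPS): convert centimetres
# to points, assuming 72 points per inch and 2.54 cm per inch.
sub cm_to_points {
    my ($cm) = @_;
    return $cm * 72 / 2.54;
}

# The documented defaults, expressed in points.
printf "4 cm = %.1f points\n", cm_to_points(4);
printf "2 cm = %.1f points\n", cm_to_points(2);

# Options that could be passed to the constructor, here with a
# 3 cm left margin and a 2 cm right margin (illustrative values).
my %options = (
    LeftMargin  => cm_to_points(3),
    RightMargin => cm_to_points(2),
);
printf "%s => %.1f\n", $_, $options{$_} for sort keys %options;
```

The resulting hash could then be passed along with any other options, for example as HTML::FormatPS->new(%options).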
PageNo - This parameter determines whether page numbers are put on the pages. The default value is true, so you have to set it to 0 to suppress page numbers. (The "No" in "PageNo" means number/numero!)
FontFamily - This parameter specifies which family of fonts to use for the formatting. Legal values are "Courier", "Helvetica" and "Times". The default is "Times".
FontScale - This is a scaling factor for all the font sizes. The default value is 1. For example, if you want everything to be almost three times as large, you could set this to 2.7. If you want things just a bit smaller than normal, you could set it to 0.92.
Leading - This option (pronounced "ledding", not "leeding") controls how much space is placed between lines. It is a factor of the font size used for that line. The default is 0.1, so between two 12-point lines there will be 1.2 points of space.
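The arithmetic behind the line spacing can be sketched as follows. This only illustrates the stated rule (gap = font size times the leading factor); line_gap is a hypothetical helper and the module's internal calculation may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical helper: space between lines, computed as the font size
# of the line multiplied by the leading factor.
sub line_gap {
    my ($font_size, $leading) = @_;
    return $font_size * $leading;
}

# The default factor of 0.1 on a 12-point line.
printf "%.1f points\n", line_gap(12, 0.1);

# A looser, illustrative factor of 0.25 on the same line.
printf "%.1f points\n", line_gap(12, 0.25);
```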
StartPage - If PageNo is on, StartPage controls what the page number of the first page will be. By default, it is 1. So if you set this to 87, the first page would say "87" on it, the next "88", and so on.
NoProlog - If this option is set to a true value, HTML::FormatPS will not emit the PostScript prolog before the document. By default, this is off, meaning that HTML::FormatPS will emit the prolog. This option is of interest only to advanced users.
NoTrailer - If this option is set to a true value, HTML::FormatPS will not emit the PostScript trailer at the end of the document. By default, this is off, meaning that HTML::FormatPS will emit the bit of PostScript that ends the document. This option is of interest only to advanced users.
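Several of these options can be combined in one call. The sketch below uses format_string with an inline HTML fragment so that no input file is needed. The HTML content and the option values are illustrative only, and the option names FontScale and Leading are assumed to be as given in the module's published documentation.

```perl
use strict;
use warnings;
use HTML::FormatPS;

# Illustrative HTML; any fragment without tables or forms would do.
my $html = '<h1>Report</h1><p>Body text.</p>';

# Options are passed after the content, as with format_file.
my $ps = HTML::FormatPS->format_string(
    $html,
    PaperSize  => 'Letter',
    FontFamily => 'Helvetica',
    FontScale  => 0.92,    # slightly smaller than normal
    Leading    => 0.2,     # a little more space between lines
    PageNo     => 1,       # keep page numbers on
    StartPage  => 87,      # first page is numbered 87
);

print $ps;
```

Redirecting the output to a file (for example with a shell redirect) gives a PostScript document that can be sent to a printer or viewer.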
my $formatter = FormatterClass->new( option1 => value1, option2 => value2, ... );
This creates a new formatter object with the given options.
Output is in ISO Latin-1 (ISO 8859-1) format. The underlying HTML parsers now tend to work in Unicode (Perl native) code points. There is an impedance mismatch between the two, which may cause problems with characters in the HTML that fall outside Latin-1.
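One way to anticipate such problems is to scan the decoded text for code points above U+00FF before formatting. The following is a minimal sketch; non_latin1_chars is a hypothetical helper, not part of HTML::FormatPS, and it says nothing about how the formatter itself treats such characters.

```perl
use strict;
use warnings;

# Hypothetical helper: return the distinct code points in $text that
# lie above U+00FF and so cannot be represented in ISO Latin-1.
sub non_latin1_chars {
    my ($text) = @_;
    my %seen;
    return grep { !$seen{$_}++ }
           map  { sprintf 'U+%04X', ord($_) }
           grep { ord($_) > 0xFF }
           split //, $text;
}

# Illustrative text: e-acute fits in Latin-1, but the em dash and the
# curly quotation marks do not.
my $text = "caf\x{e9} \x{2014} \x{201c}quoted\x{201d}";
my @outside = non_latin1_chars($text);
print join(', ', @outside), "\n";
```

Characters reported this way could be replaced with Latin-1 approximations before the HTML is handed to the formatter.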
Support for some more character styles, notably including: strike-through, underlining, superscript, and subscript.
Support for Unicode.
Support for Win-1252 encoding, since that's what most people mean when they use characters in the range 0x80-0x9F in HTML.
And, if it's ever even reasonably possible, support for tables.
I would welcome email from people who can help me out or advise me on the above.
Nigel Metheringham <email@example.com>
Sean M Burke <firstname.lastname@example.org>
Gisle Aas <gisle@ActiveState.com>
This software is copyright (c) 2016 by Nigel Metheringham, 2002-2005 Sean M Burke, 1999-2002 Gisle Aas.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.