NAME

HTML::Hyphenate - class for inserting soft hyphens into HTML.

VERSION

This is version 0.05.

SYNOPSIS

    use HTML::Hyphenate;

    $hyphenator = HTML::Hyphenate->new();
    $html_with_soft_hyphens = $hyphenator->hyphenated($html);

    $hyphenator->html($html);
    $hyphenator->style($style); # czech or german

    $hyphenator->min_length(10);
    $hyphenator->min_pre(2);
    $hyphenator->min_post(2);
    $hyphenator->output_xml(1);
    $hyphenator->default_lang('en-us');
    $hyphenator->default_included(1);
    $hyphenator->classes_included(['shy']);
    $hyphenator->classes_excluded(['noshy']);

DESCRIPTION

Most HTML rendering engines used in web browsers don't figure out by themselves how to hyphenate words when needed, but we can tell them how they might do it by inserting soft hyphens into the words.
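
To make the idea of a soft hyphen concrete, here is a minimal sketch that inserts the HTML entity `&shy;` at given positions in a word. The helper name `insert_soft_hyphens` and the hand-picked offsets are assumptions for illustration only; the module itself derives the break points from TeX hyphenation patterns.

```perl
use strict;
use warnings;

# Hypothetical helper: insert the soft hyphen entity at the given
# character offsets of a word.
sub insert_soft_hyphens {
    my ( $word, @offsets ) = @_;

    # Work from right to left so the earlier offsets stay valid.
    for my $offset ( sort { $b <=> $a } @offsets ) {
        substr $word, $offset, 0, '&shy;';
    }
    return $word;
}

print insert_soft_hyphens( 'hyphenation', 2, 6 ), "\n";
# prints "hy&shy;phen&shy;ation"
```

A browser that supports soft hyphens renders the word unbroken unless it has to wrap, in which case it may break at one of the marked positions and show a visible hyphen there.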

SUBROUTINES/METHODS

HTML::Hyphenate->new()

Constructs a new HTML::Hyphenate object.

$hyphenator->hyphenated()

Returns the HTML including the soft hyphens.

$hyphenator->html();

Gets or sets the HTML to hyphenate.

$hyphenator->style();

Gets or sets the pattern style that TeX::Hyphen uses. Can be czech or german.

$hyphenator->min_length();

Gets or sets the minimum word length required for having soft hyphens inserted. Defaults to 10 characters.

$hyphenator->min_pre(2);

Gets or sets the minimum number of characters in a word preserved before the first soft hyphen. Defaults to 2 characters.

$hyphenator->min_post(2);

Gets or sets the minimum number of characters in a word preserved after the last soft hyphen. Defaults to 2 characters.
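
The three thresholds above can be pictured as a filter on candidate break positions. The following is a sketch under stated assumptions, not the module's actual implementation: the function name `allowed_breaks`, the candidate offsets and the exact comparison rules are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical filter: keep only the break offsets allowed by the
# thresholds. The option keys mirror the accessor names; the rule
# itself is an assumption about how such limits are usually applied.
sub allowed_breaks {
    my ( $word, $breaks, %opt ) = @_;
    my $length = length $word;

    # Words shorter than min_length get no soft hyphens at all.
    return () if $length < $opt{min_length};

    # Keep offsets that leave enough characters on both sides.
    return grep {
        $_ >= $opt{min_pre} && $length - $_ >= $opt{min_post}
    } @{$breaks};
}

my @breaks = allowed_breaks(
    'hyphenation', [ 2, 6 ],
    min_length => 10,
    min_pre    => 3,
    min_post   => 2,
);
print "@breaks\n";
# prints "6"
```

With `min_pre` raised to 3, the break after "hy" is dropped because it would leave only two characters before the first soft hyphen.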

$hyphenator->output_xml(1);

Gets or sets whether HTML::TreeBuilder outputs the result in XML mode instead of HTML mode.

$hyphenator->default_lang('en-us');

Gets or sets the default pattern to use when no language can be derived from the HTML.

$hyphenator->default_included();

Gets or sets whether soft hyphens should be included in the whole tree by default. Turning this off makes it possible to insert soft hyphens only in parts of the HTML having specific class names.

$hyphenator->classes_included();

Gets or sets a reference to an array of class names. Elements with one of these classes will have soft hyphens inserted.

$hyphenator->classes_excluded();

Gets or sets a reference to an array of class names. Elements with one of these classes will not have soft hyphens inserted.
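
As a usage sketch, the three inclusion settings can be combined to limit hyphenation to marked-up parts of a page. The input HTML is made up for illustration, and the exact output depends on the installed patterns and on HTML::TreeBuilder, so none is shown here. The module is loaded at run time so the sketch still runs where HTML::Hyphenate is not installed.

```perl
use strict;
use warnings;

# Hypothetical input: only the paragraph with class "shy" is meant to
# receive soft hyphens.
my $html = <<'HTML';
<p class="shy">Internationalization considerations</p>
<p>Internationalization considerations</p>
HTML

if ( eval { require HTML::Hyphenate; 1 } ) {
    my $hyphenator = HTML::Hyphenate->new();
    $hyphenator->default_lang('en-us');

    # Exclude everything by default, then opt in by class name.
    $hyphenator->default_included(0);
    $hyphenator->classes_included( ['shy'] );

    print $hyphenator->hyphenated($html);
}
else {
    print "HTML::Hyphenate is not installed\n";
}
```

The reverse setup, `default_included(1)` together with `classes_excluded(['noshy'])`, would presumably hyphenate everything except the elements marked with that class.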

DEPENDENCIES

Class::Meta::Express, Class::Meta::Type, HTML::Entities, HTML::Hyphenate::TypeDef, HTML::TreeBuilder, Log::Log4perl, Readonly, Set::Scalar, TeX::Hyphen, TeX::Hyphen::Pattern, Test::More, Test::NoWarnings

INCOMPATIBILITIES

This module has the same limits as TeX::Hyphen, TeX::Hyphen::Pattern and HTML::TreeBuilder.

DIAGNOSTICS

This module uses Log::Log4perl for logging.

  • It warns when a language encountered in the HTML is not supported by TeX::Hyphen::Pattern.

BUGS AND LIMITATIONS

  • The empty subclass test fails. This is probably a Class::Meta::Express issue. The empty subclass can't be empty; it needs at least:

        use Class::Meta::Express;
    
        class {
            ctor 'new';
        };

  • Perfect hyphenation can be more complicated than just inserting a hyphen somewhere in a word, and sometimes requires semantics to get it right. For example, cafeetje should be hyphenated as cafe-tje and not cafee-tje, and buurtje can be hyphenated as buur-tje or buurt-je, depending on its meaning. While HTML could provide a bit more context (mainly the language being used) than plain text to handle these issues, the initial purpose of this module is to let HTML rendering engines that support soft hyphens break long words over multiple lines to avoid unwanted overflow.

  • The hyphenation is no better than what TeX::Hyphen and its hyphenation patterns provide.

  • The round trip from HTML source via HTML::Tree to HTML source might introduce changes to the source. For example, accented characters might be transformed to their HTML entity equivalents.
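
The kind of change meant in the last item can be illustrated with HTML::Entities, one of the listed dependencies. This is only a sketch of the effect on a single string; whether and how a particular document changes during the round trip depends on HTML::TreeBuilder.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities decode_entities);

# "cafe" with a literal e-acute (U+00E9) as its last character.
my $original = "caf\x{e9}";

# Encoding turns the accented character into a named entity.
my $encoded = encode_entities($original);

# Decoding restores the original character.
my $decoded = decode_entities($encoded);

print "$encoded\n";
# prints "caf&eacute;"
print $decoded eq $original ? "same text after decoding\n" : "text differs\n";
# prints "same text after decoding"
```

The source differs byte for byte, but a browser renders both forms as the same text.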

CONFIGURATION AND ENVIRONMENT

The output is generated by HTML::TreeBuilder and can be either HTML or XML.

AUTHOR

Roland van Ipenburg <ipenburg@xs4all.nl>

LICENSE AND COPYRIGHT

Copyright (C) 2010 by Roland van Ipenburg

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.10.0 or, at your option, any later version of Perl 5 you may have available.

DISCLAIMER OF WARRANTY

BECAUSE THIS SOFTWARE IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE SOFTWARE, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE SOFTWARE IS WITH YOU. SHOULD THE SOFTWARE PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR, OR CORRECTION.

IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE SOFTWARE AS PERMITTED BY THE ABOVE LICENSE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE SOFTWARE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE SOFTWARE TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.