Unicode::Wrap - Unicode Line Breaking
    use Unicode::Wrap;

    # Object-oriented interface
    my $wrapper = Unicode::Wrap->new( line_length => 75 );
    my @lines = $wrapper->break_lines($long_string);
    my $text  = $wrapper->wrap(" ", "", $long_string);
    $text     = $wrapper->rewrap("", "", $text);

    # Functional interface
    use Unicode::Wrap qw/ break_lines wrap /;
    $Unicode::Wrap::columns = 75;
    my @lines = break_lines($long_string);
    my $text  = wrap(" ", "", $long_string);
This module implements UAX#14: Line Breaking Properties. It walks through a text string, classifies each character, and computes a length for each one. When a line would exceed the configured length, it is broken at a permitted break opportunity. Some Text::Wrap-style functions are also provided for simple text wrapping.
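The loop described above can be sketched in a few lines of plain Perl. This is a deliberately simplified illustration, not the module's implementation: simple_break_lines is a hypothetical helper, it treats only the position after a space as a break opportunity (UAX#14 defines many more classes and rules), and it assumes every character has length 1 unless a callback says otherwise.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Unicode::Wrap: a greedy breaker that
# treats only the position after a space as a break opportunity.
sub simple_break_lines {
    my ($text, $line_length, $width_of) = @_;
    $width_of ||= sub { 1 };    # default: every character has length 1

    my @lines;
    my $line       = '';    # characters collected for the current line
    my $line_width = 0;     # accumulated length of $line
    my $last_break = 0;     # offset in $line of the last opportunity (0 = none)

    for my $char (split //, $text) {
        my $w = $width_of->($char);
        if ($line_width + $w > $line_length && $last_break > 0) {
            # The line would become too long: emit everything up to the
            # last opportunity and carry the remainder over.
            push @lines, substr($line, 0, $last_break);
            $line       = substr($line, $last_break);
            $line_width = 0;
            $line_width += $width_of->($_) for split //, $line;
            $last_break = 0;
        }
        $line       .= $char;
        $line_width += $w;
        $last_break  = length $line if $char eq ' ';
    }
    push @lines, $line if length $line;
    return @lines;
}

my @lines = simple_break_lines("the quick brown fox jumps", 10);
print "[$_]\n" for @lines;
# [the quick ]
# [brown fox ]
# [jumps]
```

Note that a run with no break opportunity is left as one overlong line here, which mirrors the behaviour described for the case where no emergency break is configured.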
The following methods are available:
new constructs a new wrapping object. It accepts the following parameters:
line_length specifies the length of a line, in whatever units you want to use.
emergency_break: if this is set, and there are no breaking opportunities before line_length is reached, an 'emergency' break will be inserted at this position. Generally this should be set to line_length (or to 1, since it is not used until line_length is reached anyway).
If emergency_break is not set, no emergency breaks will be inserted, which can produce very long lines when no line-breaking opportunity exists.
A further parameter supplies your own 'length' implementation as a coderef. It will be passed the character in question and the classification of that character, and it should return the length of the character in your chosen unit.
This may also contain a simple hashref, keyed on the character, with values consisting of the length of that character.
In theory, this could be used to estimate the number of pixels each character would consume, using a variable-width font. You could then wrap based on the number of pixels and not just the number of characters.
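A minimal sketch of such a length callback follows. The pixel widths are made up for illustration, char_width and string_width are hypothetical helpers, and the constructor key is assumed to be named length (the description quotes 'length' but does not confirm the key). The constructor call is guarded so the sketch also runs where Unicode::Wrap is not installed.

```perl
use strict;
use warnings;

# Hypothetical advance widths in pixels for a proportional font;
# the numbers are invented for this example.
my %pixel_width = (i => 3, l => 3, m => 11, w => 10, ' ' => 4);

# Receives the character and its line-breaking class, returns its length.
sub char_width {
    my ($char, $class) = @_;
    return $pixel_width{$char} // 7;    # 7px fallback for unlisted characters
}

# Sum the callback over a string to see what a wrapper would measure.
sub string_width {
    my ($string) = @_;
    my $total = 0;
    $total += char_width($_, undef) for split //, $string;
    return $total;
}

print string_width("will"), "\n";    # 10 + 3 + 3 + 3 = 19

# Assumption: the option key is "length". Only runs if the module loads.
if (eval { require Unicode::Wrap; 1 }) {
    my $wrapper = Unicode::Wrap->new(
        line_length => 300,             # now measured in pixels
        length      => \&char_width,
    );
    print "$_\n" for $wrapper->break_lines("a will to wrap in pixels " x 4);
}
```

Since a plain hashref keyed on the character is also accepted, passing \%pixel_width directly would presumably work too, though characters missing from the table would then have no fallback width.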
classify overrides the module's default classification method. Set it either to a hashref of direct mappings, or to a coderef, which will be called with @_ = ($self, $code) to determine the line breaking classification of that character. The coderef can return undef to defer to the default classification system for that lookup.
The following may be called either as object methods or as functions:
break_lines breaks $text up into individual lines. Newlines are preserved but none are added. Use this if you need an implementation of UAX#14 that just breaks lines up without reassembling them into a text string.
wrap takes a chunk of text, normalizes the newlines (but preserves them), and attempts to wrap it per UAX#14 in the style of Text::Wrap. Unlike Text::Wrap, it can wrap only one chunk of text at a time.
rewrap does the same thing as wrap, except that newlines are normalized to spaces before wrapping. Use it when you already have a paragraph of text that you want to re-wrap.
classify returns the line breaking classification of the character passed.
    print classify("a");          # AL
    print $self->classify("5");   # NU, unless $self->{classify} overrides
Unicode Standard Annex #14: Line Breaking Properties
David NESTING <david@fastolfe.net>
Copyright (c) 2003 David Nesting. All Rights Reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.