Gábor Egressy

NAME

Text::Format - Various subroutines to manipulate text.

SYNOPSIS

    use Text::Format;

    $text = Text::Format->new( {
        columns        => 72, # format, paragraphs, center
        tabstop        =>  8, # expand, unexpand, center
        firstIndent    =>  4, # format, paragraphs
        bodyIndent     =>  0, # format, paragraphs
        rightFill      =>  0, # format, paragraphs
        rightAlign     =>  0, # format, paragraphs
        leftMargin     =>  0, # format, paragraphs, center
        rightMargin    =>  0, # format, paragraphs, center
        extraSpace     =>  0, # format, paragraphs
        abbrevs        => {}, # format, paragraphs
        text           => [], # all
        hangingIndent  =>  0, # format, paragraphs
        hangingText    => [], # format, paragraphs
        noBreak        =>  0, # format, paragraphs
        noBreakRegex   => {}, # format, paragraphs
    } ); # these are the default values

    %abbr = (foo => 1, bar => 1);
    $text->abbrevs(\%abbr);
    $text->abbrevs();
    $text->abbrevs({foo => 1,bar => 1});
    $text->abbrevs(qw/foo bar/);
    $text->text(\@text);

    $text->columns(132);
    $text->tabstop(4);
    $text->extraSpace(1);
    $text->firstIndent(8);
    $text->bodyIndent(4);
    $text->config({tabstop => 4,firstIndent => 0});
    $text->rightFill(0);
    $text->rightAlign(0);

DESCRIPTION

The format routine will format text under all circumstances, even if the width is not enough to contain the longest words. Text::Wrap will die under these circumstances (although I am told this is fixed), which isn't quite desirable in my opinion. If columns is set to a small number and the words plus the leading whitespace are longer than that, then there will be a single word on each line. This lets you make a simple word list, which can be indented or right aligned. The module may croak if you try to subvert it.

The general setup is explained by the diagram below.

                              columns
<------------------------------------------------------------>
<----------><------><---------------------------><----------->
 leftMargin  indent  text is formatted into here  rightMargin

indent is firstIndent or bodyIndent
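
The diagram implies a simple width calculation for the area the text is formatted into. The following is a minimal sketch of that arithmetic only; `text_width` is a hypothetical helper written for illustration, not a method of Text::Format.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Text::Format): the width left for
# text once both margins and the current indent are subtracted.
sub text_width {
    my (%o) = @_;
    return $o{columns} - $o{leftMargin} - $o{indent} - $o{rightMargin};
}

# Using the default values from the SYNOPSIS, first line of a paragraph:
my $width = text_width(
    columns     => 72,
    leftMargin  => 0,
    indent      => 4,    # firstIndent
    rightMargin => 0,
);
print "$width\n";    # prints 68
```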

format @ARRAY || \@ARRAY || [<FILEHANDLE>] || NOTHING

Does basic formatting of text into a paragraph, with the indent for the first line and for the body set separately. You can specify the tab size, columns (the total width of the text), right fill with spaces, right align, and the right and left margins. Strips all leading and trailing whitespace before proceeding. The text is first split into words and then reassembled.

paragraphs @ARRAY || \@ARRAY || [<FILEHANDLE>] || NOTHING

Treats each element of text as a paragraph. If the indents for the first line and the body are the same, the paragraphs are separated by a single empty line; otherwise they follow one under the other. If hanging indent is set, a single empty line separates each paragraph as well. Calls format to do the actual formatting.

center @ARRAY || NOTHING

Centers a list of strings in @ARRAY or the internal text. Empty lines appear as, you guessed it, empty lines. Strips all leading and trailing whitespace before proceeding. Left margin and right margin can be set.
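
As a rough illustration of what centering within the margins involves, here is a sketch under stated assumptions. `center_line` is a hypothetical helper, and both the rounding down of the padding and the handling of overlong lines are assumptions, not taken from the module's source.

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's code): center one line within
# columns, after removing the margins from the available width.
sub center_line {
    my ($line, %o) = @_;
    $line =~ s/^\s+|\s+$//g;          # strip leading and trailing whitespace
    return "\n" if $line eq '';       # empty lines stay empty
    my $width = $o{columns} - $o{leftMargin} - $o{rightMargin};
    my $pad   = int(($width - length($line)) / 2);   # assumption: round down
    $pad = 0 if $pad < 0;             # assumption: overlong lines get no padding
    return ' ' x ($o{leftMargin} + $pad) . $line . "\n";
}

print center_line('  hello world ', columns => 21, leftMargin => 0, rightMargin => 0);
```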

expand @ARRAY || NOTHING

Expands each tab in @ARRAY or the internal text to tabstop spaces. Does not modify the internal text; it just passes back the modified text.

unexpand @ARRAY || NOTHING

Turns each run of tabstop spaces in @ARRAY or the internal text into a tab. Does not modify the internal text; it just passes back the modified text.
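
The two conversions described for expand and unexpand can be sketched in plain Perl as follows. `expand_tabs` and `unexpand_spaces` are hypothetical names, and the code follows the wording of the descriptions above rather than the module's actual implementation, which may differ.

```perl
use strict;
use warnings;

# Hypothetical helpers mirroring the descriptions of expand and unexpand.
# Both return modified copies and leave their input untouched.
sub expand_tabs {
    my ($tabstop, @lines) = @_;
    return map { (my $s = $_) =~ s/\t/' ' x $tabstop/ge; $s } @lines;
}

sub unexpand_spaces {
    my ($tabstop, @lines) = @_;
    my $run = ' ' x $tabstop;         # a run of exactly tabstop spaces
    return map { (my $s = $_) =~ s/\Q$run\E/\t/g; $s } @lines;
}

my @expanded = expand_tabs(4, "\thello world\n");
print @expanded;                      # four spaces, then "hello world"
print unexpand_spaces(4, @expanded);  # back to a leading tab
```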

new \%HASH || NOTHING

Instantiates the object. If you pass a reference to a hash, or an anonymous hash, it is used to set attributes.

config \%HASH

Allows the configuration of all object attributes at once. Returns the object prior to configuration. You can use it to make a clone of your object before you change attributes.

columns NUMBER || NOTHING

Set width of text or retrieve width. This is total width and includes indentation and the right and left margins.

tabstop NUMBER || NOTHING

Set or retrieve the tabstop size. Only used by expand, unexpand and center.

firstIndent NUMBER || NOTHING

Set or get indent for the first line of paragraph.

bodyIndent NUMBER || NOTHING

Set or get indent for the body of paragraph.

leftMargin NUMBER || NOTHING

Set or get width of left margin.

rightMargin NUMBER || NOTHING

Set or get width of right margin.

rightFill 0 || 1 || NOTHING

Set right fill to true or retrieve its value.

rightAlign 0 || 1 || NOTHING

Set right align to true or retrieve its value.

text \@ARRAY || NOTHING

Pass in a reference to your text that you want the routines to manipulate. Returns the text held in the object.

hangingIndent 0 || 1 || NOTHING

Use hanging indents in front of a paragraph. Returns the current value of the attribute.

hangingText \@ARRAY || NOTHING

The text that will be displayed in front of each paragraph. If you call format, only the first element is used; if you call paragraphs, it cycles through all of them. If you have more paragraphs than elements in your array, the elements are reused starting from the first one. Pass a reference to your array.
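
The cycling behaviour can be pictured as a wrap-around index. The sketch below assumes that reuse simply starts again from the first element; `label_for` and the sample data are hypothetical and only illustrate the idea.

```perl
use strict;
use warnings;

# Hypothetical helper: pick the hanging text for paragraph number $i,
# wrapping around to the first label when the labels run out.
sub label_for {
    my ($i, @labels) = @_;
    return $labels[ $i % @labels ];
}

my @hanging = ('1.', '2.');                    # sample labels (assumed)
my @paras   = ('first', 'second', 'third');    # sample paragraphs (assumed)

for my $i (0 .. $#paras) {
    print label_for($i, @hanging), " $paras[$i]\n";
}
# The third paragraph reuses the label '1.'
```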

noBreak 0 || 1 || NOTHING

Set whether you want to use the non-breaking space feature.

noBreakRegex \%HASH || NOTHING

Pass in a reference to a hash holding the regexes on which not to break. Returns the hash. For example,

    {'^Mrs?\.$' => '^\S+$','^\S+$' => '^(?:S|J)r\.$'}

will not break names such as Mr. Jones, Mrs. Jones, Jones Jr.

The breaking algorithm is simple. If there should not be a break at the current end of sentence, it backtracks until it finds two words between which breaking is allowed. If no two such words are found, the end of sentence is broken anyhow. If there is a single word on the current line, no backtrack is done and the word is stuck on the end.
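
To make the hash format concrete, the sketch below tests whether a break would be allowed between two adjacent words. It assumes, based on the example hash above, that each key is matched against the word before the break and each value against the word after it. `break_allowed` is a hypothetical helper, not part of the module, and it does not model the backtracking step.

```perl
use strict;
use warnings;

# The example hash from above: left-hand regex => right-hand regex.
my %no_break = (
    '^Mrs?\.$' => '^\S+$',
    '^\S+$'    => '^(?:S|J)r\.$',
);

# Hypothetical helper: returns 0 if some pair of regexes forbids a
# break between $left and $right, 1 otherwise.
sub break_allowed {
    my ($regexes, $left, $right) = @_;
    for my $before (keys %$regexes) {
        my $after = $regexes->{$before};
        return 0 if $left =~ /$before/ && $right =~ /$after/;
    }
    return 1;
}

print break_allowed(\%no_break, 'Mr.',   'Jones'), "\n";   # 0: keep together
print break_allowed(\%no_break, 'Jones', 'Jr.'),   "\n";   # 0: keep together
print break_allowed(\%no_break, 'hello', 'world'), "\n";   # 1: may break
```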

extraSpace 0 || 1 || NOTHING

Add extra space after the end of a sentence. Normally format adds 1 space after the end of a sentence; if this is set to 1, then 2 spaces are used. Abbreviations are not followed by two spaces. There are a few internal abbreviations, and you can add your own to the object with abbrevs.
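
The spacing rule can be illustrated with a small stand-alone sketch. Everything here is an assumption made for illustration: the sample abbreviation list, the test for a sentence end (a word ending in `.`, `?` or `!`), and the helper name `join_words`. The module's own detection logic may differ.

```perl
use strict;
use warnings;

# Sample abbreviations, stored without the trailing period (assumed form).
my %abbrev = map { $_ => 1 } qw(Mr Mrs Dr);

# Hypothetical helper: join words with one space, or two after a
# sentence end when $extra is true and the word is not an abbreviation.
sub join_words {
    my ($extra, @words) = @_;
    my $out = '';
    for my $i (0 .. $#words) {
        my $word = $words[$i];
        $out .= $word;
        next if $i == $#words;                 # nothing after the last word
        (my $stem = $word) =~ s/\.$//;
        my $sentence_end = $word =~ /[.?!]$/ && !$abbrev{$stem};
        $out .= ($extra && $sentence_end) ? '  ' : ' ';
    }
    return $out;
}

print join_words(1, qw(Dr. Smith left. He returned.)), "\n";
# prints "Dr. Smith left.  He returned."
```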

abbrevs \%HASH || @ARRAY || NOTHING

Add to the current abbreviations. Takes a reference to a hash or a list of abbreviations; if called a second time, the original reference is removed. Returns the current INTERNAL abbreviations.

EXAMPLE

    use Text::Format;

    $text = Text::Format->new;
    $text->rightFill(1);
    $text->columns(65);
    $text->tabstop(4);
    print $text->format("a line to format to an indented regular
            paragraph using 65 character wide display");
    print $text->paragraphs("paragraph one","paragraph two");
    print $text->center("hello world","nifty line 2");
    print $text->expand("\t\thello world\n","hmm,\twell\n");
    print $text->unexpand("    hello world\n","    hmm");
    $text->config({columns => 132, tabstop => 4});

    $text = Text::Format->new();
    print $text->format(@text);
    print $text->paragraphs(@text);
    print $text->center(@text);
    print $text->format([<FILEHANDLE>]);
    print $text->paragraphs([<FILEHANDLE>]);
    print $text->expand(@text);
    print $text->unexpand(@text);

    $text = Text::Format->new
        ({tabstop => 4,bodyIndent => 4,text => \@text});
    print $text->format();
    print $text->paragraphs();
    print $text->center();
    print $text->expand();
    print $text->unexpand();

    print Text::Format->new({columns => 95})->format(@text);

BUGS

Line length can exceed columns specified if columns is set to a small number and long words plus leading whitespace exceed column length specified. Actually I see this as a feature since it can be used to make up a nice wordlist.

AUTHOR

Gabor Egressy gabor@vmunix.com

Copyright (c) 1998 Gabor Egressy. All rights reserved. All wrongs reversed. This program is free software; you can redistribute and/or modify it under the same terms as Perl itself.

ACKNOWLEDGEMENTS

Tom Phoenix found a bug in the code for two spaces at end of sentence and provided a code fragment for a better solution, plus some preliminary suggestions on design

Brad Appleton suggestion and explanation of hanging indents, suggestion for non-breaking whitespace, general suggestions with regard to interface design

Byron Brummer suggestion for better interface design and object design, code for better implementation of getting abbreviations

TODO