NAME

HTML::SummaryBasic - Basic summary info from HTML.

SYNOPSIS

        use HTML::SummaryBasic;
        my $p = HTML::SummaryBasic->new({
                PATH => "input.html",
                # or HTML => '<html>...</html>',
                NOT_AVAILABLE => undef,
        });
        foreach (keys %{$p->{SUMMARY}}){
                warn "$_ ... $p->{SUMMARY}->{$_}\n";
        }

DEPENDENCIES

        use HTML::TokeParser;
        use HTML::HeadParser;

DESCRIPTION

From a file or string of HTML, creates a hash of useful summary information from meta and body elements of an HTML document.

GLOBAL VARIABLE

$NOT_AVAILABLE

Value used for empty fields. The default is [Not Available]. It may be overridden by supplying the constructor with a field of the same name. See "THE SUMMARY STRUCTURE".

CONSTRUCTOR (new)

Accepts a hash-like structure with the following fields:

HTML or PATH

A reference to a scalar containing HTML, or a plain string that is the path to an HTML file to process.

SUMMARY

Filled after get_summary is called (see "METHOD get_summary" and "THE SUMMARY STRUCTURE").

FIELDS

An array of meta tag names whose content value should be placed into the respective slots of the SUMMARY field after get_summary has been called.
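
The FIELDS behaviour described above can be sketched with HTML::HeadParser, which exposes each <meta name="..."> tag as a header named X-Meta-.... This is only an illustration, not the module's own code: the @fields list, the sample HTML, and the upper-cased slot names are assumptions made for the example.

```perl
use strict;
use warnings;
use HTML::HeadParser;

# Hypothetical stand-ins for the constructor's FIELDS and NOT_AVAILABLE values.
my @fields        = qw(keywords generator);
my $not_available = '[Not Available]';

my $html = <<'HTML';
<html><head>
<title>Example page</title>
<meta name="keywords" content="perl, html, summary">
</head><body><p>Hello.</p></body></html>
HTML

# HTML::HeadParser stores <meta name="foo" content="..."> as header X-Meta-Foo.
# Header lookups are case-insensitive.
my $head = HTML::HeadParser->new;
$head->parse($html);

my %summary;
for my $name (@fields) {
    my $value = $head->header("X-Meta-$name");
    # Assumption: slots are keyed by the upper-cased meta name.
    $summary{ uc $name } = defined $value ? $value : $not_available;
}

print "$_: $summary{$_}\n" for sort keys %summary;
```

A meta tag that is absent from the document (generator in this sample) falls back to the not-available value, mirroring the role of $NOT_AVAILABLE.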

THE SUMMARY STRUCTURE

A field of the object which is a hash, with key/values as follows:

AUTHOR

HTML meta tag X-META-AUTHOR.

TITLE

Text of the element of the same name.

DESCRIPTION

Content of the meta tag named X-META-DESCRIPTION.

LAST_MODIFIED_META, LAST_MODIFIED_FILE

Time of the last modification of the file: respectively, according to any meta tag of the same name with an X-META- prefix and, failing that, according to the file system.

CREATED_META, CREATED_FILE

As above, but relating to the creation date of the file.

FIRST_PARA

The first HTML p element of the document.

HEADLINE

The first h1 tag; failing that, the first h2; failing that, the value of $NOT_AVAILABLE.

PLUS...

Any meta-fields specified in the FIELDS field.
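
As a rough sketch of how the body-derived slots (TITLE, FIRST_PARA, HEADLINE) could be filled, the following uses HTML::TokeParser directly. It is not the module's implementation; the helper first_text, the sample HTML, and the literal default string are assumptions made for the example.

```perl
use strict;
use warnings;
use HTML::TokeParser;

my $not_available = '[Not Available]';   # assumed default, as documented above

my $html = <<'HTML';
<html><head><title>Example page</title></head>
<body>
<h2>Section heading</h2>
<p>First <b>paragraph</b> of text.</p>
<p>Second paragraph.</p>
</body></html>
HTML

# Hypothetical helper: trimmed text of the first <$tag> element,
# or undef if the document has no such element.
sub first_text {
    my ($source, $tag) = @_;
    my $parser = HTML::TokeParser->new(\$source)
        or die "Cannot create parser";
    return unless $parser->get_tag($tag);
    # With an end-tag argument, nested inline tags are skipped.
    return $parser->get_trimmed_text("/$tag");
}

my %summary = (
    TITLE      => scalar first_text($html, 'title'),
    FIRST_PARA => scalar first_text($html, 'p'),
    # HEADLINE: the first h1; failing that, the first h2.
    HEADLINE   => scalar( first_text($html, 'h1') // first_text($html, 'h2') ),
);

# Anything still undefined gets the not-available value.
for my $key (keys %summary) {
    $summary{$key} = $not_available unless defined $summary{$key};
}

print "$_: $summary{$_}\n" for sort keys %summary;
```

The sample document has no h1, so the headline falls through to the first h2; removing the h2 as well would leave HEADLINE set to the not-available value.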

TODO

Maybe work on URI as well as file paths.

SEE ALSO

HTML::TokeParser, HTML::HeadParser.

AUTHOR

Lee Goddard (LGoddard@CPAN.org)

COPYRIGHT

Copyright 2000-2001 Lee Goddard.

This library is free software; you may use, redistribute, or modify it under the same terms as Perl itself.
