Patrick Hochstenbach
and 2 contributors

NAME

Catmandu::Importer::HTML - An HTML importer

SYNOPSIS

    # From the command line
    $ catmandu convert HTML to YAML < ex/test.html

    # From Perl
    use Catmandu;

    my $importer = Catmandu->importer('HTML', file => 'ex/test.html');

    my $n = $importer->each(sub {
        my $hashref = $_[0];
        # ...
    });

DESCRIPTION

This Catmandu::Importer converts HTML input into Catmandu items, parsing it with HTML::TokeParser.

CONFIGURATION

file

Read input from a local file given by its path. Alternatively a scalar reference can be passed to read from a string.
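The scalar-reference form can be sketched as follows. This is a minimal illustration that assumes Catmandu and Catmandu::Importer::HTML are installed; the HTML string is a made-up sample, and the shape of the imported items is deliberately not inspected here.

```perl
use strict;
use warnings;
use Catmandu;

# Hypothetical inline document, used here instead of a file on disk.
my $html = '<html><body><p>Hello</p></body></html>';

# Passing a reference to the string makes the importer read from memory.
my $importer = Catmandu->importer('HTML', file => \$html);

# each() calls the callback once per imported item and returns
# the number of items it visited.
my $count = $importer->each(sub {
    my $item = $_[0];    # a hash reference, as in the SYNOPSIS
});

print "imported $count item(s)\n";
```

Reading from a string this way is mostly useful in tests or when the HTML has already been fetched by other code.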

fh

Read input from an IO::Handle. If not specified, Catmandu::Util::io is used to create the input stream from the file argument or by using STDIN.

encoding

Binmode of the input stream fh. Set to :utf8 by default.

fix

An ARRAY of one or more fixes or file scripts to be applied to imported items.
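As a rough sketch of the fix option, the example below applies a single inline fix to every imported item. It assumes the core Catmandu fix add_field is available; the field name "source" and its value "inline" are illustrative choices, not part of this module.

```perl
use strict;
use warnings;
use Catmandu;

# Hypothetical inline document, read through a scalar reference.
my $html = '<html><body><p>Hello</p></body></html>';

# The fix list is an array reference; each entry is a fix expression
# or the path of a fix script.
my $importer = Catmandu->importer(
    'HTML',
    file => \$html,
    fix  => ['add_field(source, "inline")'],
);

# first() returns the first item, or undef if nothing was imported.
my $item = $importer->first;

if (defined $item) {
    print "source: $item->{source}\n";
}
```

Because the fixes run inside the importer, any code consuming the items sees them already transformed.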

METHODS

Every Catmandu::Importer is a Catmandu::Iterable, so all of that role's methods are inherited.
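To illustrate the inherited methods, here is a small sketch that uses to_array from Catmandu::Iterable to collect all items in one pass. The sample HTML is hypothetical, and the example only lists the top-level keys of each item rather than assuming a particular record layout.

```perl
use strict;
use warnings;
use Catmandu;

# Hypothetical inline document, read through a scalar reference.
my $html = '<html><body><p>Hello</p></body></html>';

my $importer = Catmandu->importer('HTML', file => \$html);

# to_array() drains the importer and returns an array reference of items.
my $items = $importer->to_array;

printf "%d item(s)\n", scalar @$items;

# Show the top-level keys of every imported hash reference.
for my $item (@$items) {
    print join(', ', sort keys %$item), "\n";
}
```

Since the importer reads from a stream, it is safest to consume it once and keep the resulting array if the items are needed again.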

SEE ALSO

Catmandu::Importer, HTML::TokeParser