NAME

AI::MicroStructure::RemoteList - Retrieval of a remote source for a structure

SYNOPSIS

    package AI::MicroStructure::contributors;
    use strict;
    use AI::MicroStructure::List;
    our @ISA = qw( AI::MicroStructure::List );

    # data regarding the remote source
    our %Remote = (
        source =>
            'http://search.cpan.org/dist/AI-MicroStructure/CONTRIBUTORS',
        extract => sub {
            my $content = shift;
            my @items   =
                map { AI::MicroStructure::RemoteList::tr_nonword($_) }
                map { AI::MicroStructure::RemoteList::tr_accent($_) }
                $content =~ /^\* (.*?)\s*$/gm;
            return @items;
        },
    );

    __PACKAGE__->init();

    1;

    # and the usual documentation and list definition

DESCRIPTION

This base class adds the capability to fetch a fresh list of items from a remote source to any structure that requires it.

To be able to fetch remote items, an AI::MicroStructure structure must define the package hash variable %Remote with the appropriate keys.

The keys are:

source

The URL where the data is available. The content will be passed to the extract subroutine.

Because of the various ways the data can be made available on the web and used in AI::MicroStructure, this scheme has evolved to support several cases:

Single source URL:

    source => $url

Multiple source URLs:

    source => [ $url1, $url2, ... ]

For structures with categories, it's possible to attach a URL for each category:

    source => {
        category1 => $url1,
        category2 => $url2,
        ...
    }

When the source is an array or a hash reference, one more case is supported: source data that can only be obtained via a POST request. The source should then be provided as either:

    source => [
        [ $url1 => $data1 ],
        [ $url2 => $data2 ],
        ...
    ]

or

    source => {
        category1 => [ $url1 => $data1 ],
        category2 => [ $url2 => $data2 ],
        ...
    }

It is possible to mix POST and GET URLs:

    source => [
        $url1,                  # GET
        [ $url2 => $data2 ],    # POST
        ...
    ]

or

    source => {
        category1 => $url1,                  # GET
        category2 => [ $url2 => $data2 ],    # POST
        ...
    }

This means that even if there is only one source, a POST request must be provided as a list of a single item:

    source => [ [ $url => $data ] ]
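
The accepted forms can be reduced to one shape before any request is made. The following is a minimal sketch, not the module's actual code: normalize_entry and normalize_source are hypothetical helpers, and the example.com URLs and POST data are placeholders. It assumes that a plain string means GET and an array reference means POST, as described above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AI::MicroStructure): turn one entry into
# [ METHOD, URL, DATA ]. A plain string is a GET, an array ref is a POST.
sub normalize_entry {
    my ($entry) = @_;
    return ref $entry eq 'ARRAY'
        ? [ POST => $entry->[0], $entry->[1] ]
        : [ GET  => $entry, undef ];
}

# Accept a scalar, an array ref or a hash ref and return a flat list
# of normalized entries (hash values are taken in sorted key order).
sub normalize_source {
    my ($source) = @_;
    my @entries =
          ref $source eq 'ARRAY' ? @$source
        : ref $source eq 'HASH'  ? ( map { $source->{$_} } sort keys %$source )
        :                          ($source);
    return map { normalize_entry($_) } @entries;
}

my @requests = normalize_source(
    [
        'http://example.com/list',                       # GET
        [ 'http://example.com/search' => 'q=items' ],    # POST
    ]
);
printf "%-4s %s\n", $_->[0], $_->[1] for @requests;
```

In this sketch a top-level [ $url => $data ] would be read as two GET entries, which illustrates why a single POST source has to be wrapped in an outer list.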

extract

A reference to a subroutine that extracts a list of items from a string. The string is meant to be the content available at the URL stored in the source key.

The coderef may receive an optional parameter corresponding to the name of the category (useful if the coderef must behave differently depending on the category).
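
As an illustration of a category-aware coderef, here is a small sketch. The category name 'csv' and both input formats are made up for the example; only the calling convention (content first, optional category second) comes from the description above.

```perl
use strict;
use warnings;

# Sketch of an extract coderef that looks at the optional category.
my $extract = sub {
    my ( $content, $category ) = @_;

    # comma-separated data for the hypothetical 'csv' category
    if ( defined $category && $category eq 'csv' ) {
        ( my $line = $content ) =~ s/^\s+|\s+$//g;    # trim both ends
        return grep { length } split /\s*,\s*/, $line;
    }

    # default: one item per "* item" line, as in the SYNOPSIS
    return $content =~ /^\* (.*?)\s*$/gm;
};

my @bullets = $extract->("* foo\n* bar \n");
my @csv     = $extract->( "foo, bar,baz\n", 'csv' );
print "@bullets\n";
print "@csv\n";
```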

LWP::Simple is used to download the remote data.

All existing AI::MicroStructure behaviours (AI::MicroStructure::List and AI::MicroStructure::Locale) are subclasses of AI::MicroStructure::RemoteList.

METHODS

As an ancestor, this class adds the following methods to an AI::MicroStructure structure:

remote_list()

Returns the list of items available at the remote source, or an empty list in case of error.
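
The "empty list in case of error" contract can be sketched as follows. This is not the module's implementation: fetch_items is a hypothetical helper, and the fetcher is passed in as a coderef so that the example runs without network access. Any coderef that returns the content of a URL, or undef on failure, would do (the get function of LWP::Simple behaves this way).

```perl
use strict;
use warnings;

# Hypothetical helper: fetch every URL, extract the items, and give up
# with an empty list as soon as one download fails.
sub fetch_items {
    my ( $fetch, $extract, @urls ) = @_;
    my @items;
    for my $url (@urls) {
        my $content = $fetch->($url);
        return unless defined $content;    # empty list in list context
        push @items, $extract->($content);
    }
    return @items;
}

# A stand-in fetcher backed by a hash instead of the network.
my %fake_web = ( 'http://example.com/ok' => "* foo\n* bar\n" );
my $fetch    = sub { $fake_web{ $_[0] } };
my $extract  = sub { $_[0] =~ /^\* (.*?)\s*$/gm };

my @ok     = fetch_items( $fetch, $extract, 'http://example.com/ok' );
my @failed = fetch_items( $fetch, $extract, 'http://example.com/missing' );
print scalar(@ok), " items, ", scalar(@failed), " after a failure\n";
```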

has_remotelist()

Return a boolean indicating if the source key is defined (and therefore if the structure actually has a remote list).

source()

Return the data structure containing the source URLs. This can be quite different depending on the class: a single scalar (URL), an array reference (list of URLs) or a hash reference (each value being either a scalar or an array reference) for structures that are subclasses of AI::MicroStructure::MultiList.

sources( [ $category ] )

Return the list of source URLs. The $category parameter can be used to select the sources for a sub-category of the structure (in the case of AI::MicroStructure::MultiList).

$category can be an array reference containing a list of categories.
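
A rough sketch of this kind of selection, under stated assumptions: select_urls is a hypothetical helper, the %source data is invented, and returning only the URL of a POST entry is a guess about the behaviour rather than something the documentation confirms.

```perl
use strict;
use warnings;

# Hypothetical per-category source data, for illustration only.
my %source = (
    fruits => 'http://example.com/fruits',
    colors => [ 'http://example.com/colors' => 'format=text' ],    # POST
);

# Return the URLs for one category, several categories, or all of them
# (sorted by category name) when no category is given.
sub select_urls {
    my ( $source, $category ) = @_;
    my @categories =
          !defined $category       ? ( sort keys %$source )
        : ref $category eq 'ARRAY' ? @$category
        :                            ($category);
    return map { ref $_ ? $_->[0] : $_ } @{$source}{@categories};
}

print join( "\n", select_urls( \%source, 'fruits' ) ), "\n";
print join( "\n", select_urls( \%source, [ 'colors', 'fruits' ] ) ), "\n";
```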

extract( $content )

Return a list of items from the $content string. $content is expected to be the content available at the URL given by source().

TRANSFORMATION SUBROUTINES

The AI::MicroStructure::RemoteList class also provides a few helper subroutines that simplify the normalisation of items:

tr_nonword( $str )

Return a copy of $str with all non-word characters turned into underscores (_).
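
The effect can be approximated with a transliteration. This is a sketch, not the module's source: my_tr_nonword is a stand-in name, and treating "word character" as ASCII letters, digits and the underscore is an assumption.

```perl
use strict;
use warnings;

# Stand-in for illustration; the real tr_nonword may differ in detail.
sub my_tr_nonword {
    my ($str) = @_;
    $str =~ tr/a-zA-Z0-9_/_/c;    # c: complement the search list
    return $str;                  # the caller's string is left untouched
}

print my_tr_nonword('Jean-Luc Picard'), "\n";    # prints Jean_Luc_Picard
```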

tr_accent( $str )

Return a copy of $str with all iso-8859-1 accented characters turned into basic ASCII characters.
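
A comparable sketch for accents, again with a stand-in name (my_tr_accent) and covering only a handful of iso-8859-1 code points chosen for the example. The real subroutine presumably handles the full set.

```perl
use strict;
use warnings;

# Stand-in for illustration: maps a few iso-8859-1 accented letters
# (given by their byte values) to unaccented ASCII letters.
sub my_tr_accent {
    my ($str) = @_;

    #         a-grave a-acute a-circ e-grave e-acute e-circ n-tilde u-uml
    $str =~ tr/\xE0\xE1\xE2\xE8\xE9\xEA\xF1\xFC/aaaeeenu/;
    return $str;
}

print my_tr_accent("Ren\xE9"), "\n";    # prints Rene
```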

tr_utf8_basic( $str )

Return a copy of $str with some of the utf-8 accented characters turned into basic ASCII characters. This is very crude, but I didn't want to bother and depend on the proper module to do that.

AUTHOR

santex <santex@cpan.org>

SEE ALSO

AI::MicroStructure, AI::MicroStructure::List, AI::MicroStructure::Locale.

COPYRIGHT

Copyright 2009-2016 Hagen Geissler, All Rights Reserved.

LICENSE

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.