NAME

Data::AnyXfer::Elastic::Importer

SYNOPSIS

    use Data::AnyXfer::Elastic::Importer;

    my $importer = Data::AnyXfer::Elastic::Importer->new(
        logger => Data::AnyXfer::Elastic::Logger->new,
    );

    my $datafile = Data::AnyXfer::Elastic::Import::DataFile->new(
        file => 'my/project.datafile',
    );

    # Put this live on the relevant clusters
    my $response = $importer->deploy( datafile => $datafile );

    # Alternatively, you can deploy in steps;
    # see the source of the deploy method.

DESCRIPTION

The Elasticsearch Importer takes a datafile and streams its contents into an index on one or more Elasticsearch clusters. This process is known as playing the datafile. It creates an index with the mappings and settings defined in the datafile, and it will create indexes on multiple clusters depending on the silo given. Once the index has been created it can be finalised, which makes the index live by switching over the alias.

ATTRIBUTES

logger

Logs events and errors to file. An instance of Data::AnyXfer::Elastic::Logger.

bulk_max_count

Perl number. Defaults to 500. The maximum number of items which will be sent by the bulk helper before a flush is performed.
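
The flush-at-a-threshold behaviour described above can be illustrated with a small, self-contained sketch. This is not the module's implementation or the real bulk helper; make_buffer, max_count and on_flush are hypothetical names used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a bulk helper: buffers items and hands them
# to a callback whenever the buffer reaches max_count.
sub make_buffer {
    my (%args) = @_;
    my $max_count = $args{max_count} // 500;    # mirrors the documented default
    my $on_flush  = $args{on_flush};
    die "max_count must be at least 1\n" if $max_count < 1;

    my @pending;

    # Send everything that is buffered, at most max_count items per batch.
    my $flush = sub {
        while (@pending) {
            $on_flush->( [ splice @pending, 0, $max_count ] );
        }
        return;
    };

    # Buffer one item and flush once the threshold is reached.
    my $add = sub {
        my ($item) = @_;
        push @pending, $item;
        $flush->() if @pending >= $max_count;
        return;
    };

    return ( $add, $flush );
}

my @batches;
my ( $add, $flush ) = make_buffer(
    max_count => 3,
    on_flush  => sub { push @batches, scalar @{ $_[0] } },
);

$add->($_) for 1 .. 7;
$flush->();    # send whatever is left over

# @batches is now (3, 3, 1)
```

A smaller threshold means more, smaller requests; a larger one means fewer requests that each carry more data.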

wait_count_timeout

Perl number. Defaults to 10. The maximum number of seconds to wait after indexing for the number of visible documents in the index to reach the expected count before treating the import as a failure.
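
As a rough illustration of this kind of check, the sketch below polls a document count until it reaches the expected value or a deadline passes. It is an assumption about the general technique, not the module's code; wait_for_count, get_count and interval are hypothetical names.

```perl
use strict;
use warnings;
use Time::HiRes qw( sleep time );

# Poll until the visible document count reaches the expected value.
# get_count is a hypothetical callback returning the current count.
sub wait_for_count {
    my (%args) = @_;
    my $expected  = $args{expected};
    my $timeout   = $args{timeout}  // 10;     # seconds, mirrors the documented default
    my $interval  = $args{interval} // 0.5;    # assumed polling interval
    my $get_count = $args{get_count};

    my $deadline = time() + $timeout;
    while (1) {
        return 1 if $get_count->() >= $expected;    # count reached
        return 0 if time() >= $deadline;            # gave up: treat as a failure
        sleep $interval;
    }
}

# Simulate documents becoming visible one poll at a time.
my $visible = 0;
my $reached = wait_for_count(
    expected  => 3,
    timeout   => 2,
    interval  => 0.01,
    get_count => sub { return ++$visible },
);

# Simulate an index that never catches up.
my $timed_out = wait_for_count(
    expected  => 3,
    timeout   => 0.05,
    interval  => 0.01,
    get_count => sub { return 0 },
);
```

Here $reached is true and $timed_out is false.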

delete_before_create

Boolean. Defaults to 0. When true, the importer instance will attempt to delete each index before creating it during "execute".

document_id_field

String. Optional.

Allows you to specify a field on each document whose value will also be supplied to Elasticsearch as the document's _id.
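
The sketch below shows one way a document field could be copied into the id of a bulk index item while remaining in the document itself. The id/source hash shape is believed to match what the Search::Elasticsearch bulk helper accepts for index actions; bulk_item_for is a hypothetical helper, and dying on a missing field is an assumption rather than documented behaviour.

```perl
use strict;
use warnings;

# Build a bulk index item for a document, optionally taking the id
# from one of the document's own fields.
sub bulk_item_for {
    my ( $doc, $id_field ) = @_;

    my %item = ( source => $doc );    # the field stays in the document too

    if ( defined $id_field ) {
        # Assumption: a document without the field is treated as an error.
        die "document has no '$id_field' field\n"
            unless defined $doc->{$id_field};
        $item{id} = $doc->{$id_field};
    }

    return \%item;
}

my $item = bulk_item_for( { ref_no => 'A1', title => 'Example' }, 'ref_no' );
# $item->{id} is 'A1', and $item->{source}{ref_no} is still 'A1'
```

Without an id field, no id is set and Elasticsearch generates one for the document.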

METHODS

deploy

This method "plays" the datafile: it streams the data from the datafile into Elasticsearch. It creates a unique index name based on the datafile's timestamp and assigns the mappings, settings etc. The datafile's documents are then streamed into the index via the bulk helper. Finally, it swaps the aliases, making the index 'live'.

    my $response = $importer->deploy(
        datafile    => $datafile,        # required
        silo        => 'public_data',    # optional
        no_finalise => 1,                # optional, does not call finalise
    );

datafile

A required Data::AnyXfer::Elastic::Import::DataFile object that defines the content and configuration of an Elasticsearch index.

silo

An optional string that overrides the silo defined in the $datafile index info.

no_finalise

An optional boolean. When true, finalise is not called at the end of a successful deployment of the datafile to all intended nodes and clusters. By default, finalise is run.

Useful in situations where you need to delay switching the data over, or co-ordinate it with some other action.

execute

Use deploy where you can. If you have to do the work as several steps (import index, do something else, switch aliases), see the source of the deploy method.

    my $elastic = Data::AnyXfer::Elastic->new;
    my @clients = $elastic->all_clients_for($silo);

    foreach my $client ( @clients ) {

        $importer->execute(
            datafile         => $datafile,     # required
            elasticsearch    => $client,       # required
        );

    }

This method takes a datafile and plays it into Elasticsearch. It differs from deploy in that it does not automatically finalise the index. It returns the number of documents played on successful execution, or undef on error. When supplied, the elasticsearch argument must be a Search::Elasticsearch object generated from Data::AnyXfer::Elastic. If it is not provided, a client will be generated from the datafile's silo configuration.
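
A stepwise deployment built on the documented return value of execute might be sketched as follows. This is a sketch under assumptions, not the body of deploy: deploy_in_steps and its before_switch callback are hypothetical, and only the execute, cleanup and finalise methods documented here are called.

```perl
use strict;
use warnings;

# Play a datafile onto each client, run an optional intermediate step,
# then switch the aliases. Returns 1 on success and 0 on failure.
sub deploy_in_steps {
    my (%args) = @_;
    my ( $importer, $datafile, $clients, $before_switch )
        = @args{qw( importer datafile clients before_switch )};

    for my $client ( @{$clients} ) {
        my $count = $importer->execute(
            datafile      => $datafile,
            elasticsearch => $client,
        );
        next if defined $count;

        # execute returned undef: remove the indexes created so far.
        # cleanup is only valid because finalise has not been called yet.
        $importer->cleanup;
        return 0;
    }

    # Hypothetical hook for the "do something else" step.
    $before_switch->() if $before_switch;

    $importer->finalise;
    return 1;
}
```

With a real importer this would be called as deploy_in_steps( importer => $importer, datafile => $datafile, clients => \@clients ), and the errors method could be consulted whenever it returns 0.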

finalise

    $importer->finalise;

This method finalises deployment by switching aliases for each datafile executed. It adds the alias to the new index and removes any previous associations in the same operation. See the source of deploy before using this directly.
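
For readers unfamiliar with alias switching, the sketch below builds a request body in the shape used by the Elasticsearch aliases API, where remove and add actions are applied together. It is an illustration only and not necessarily what this module sends; alias_switch_body and the index and alias names are hypothetical.

```perl
use strict;
use warnings;

# Build an aliases request body that moves an alias from any old
# indexes to a new one in a single set of actions.
sub alias_switch_body {
    my (%args) = @_;
    my $alias = $args{alias};

    # One remove action per index that currently carries the alias.
    my @actions = map { +{ remove => { index => $_, alias => $alias } } }
        @{ $args{old_indexes} || [] };

    # Then attach the alias to the new index.
    push @actions, { add => { index => $args{new_index}, alias => $alias } };

    return { actions => \@actions };
}

my $body = alias_switch_body(
    alias       => 'properties',
    new_index   => 'properties_20190102',
    old_indexes => ['properties_20190101'],
);
```

A body like this could be passed as the body argument of the update_aliases call on a Search::Elasticsearch client's indices namespace.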

errors

    my $errors = $importer->errors;

Returns the errors that have occurred.

cleanup

    $importer->cleanup;

This method removes all indexes the importer has created and empties the cache. It cannot be called once finalise() has been called.

COPYRIGHT

This software is copyright (c) 2019, Anthony Lucas.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.
