Elasticsearch - The official client for Elasticsearch
version 1.00
    use Elasticsearch;

    # Connect to localhost:9200:
    my $e = Elasticsearch->new();

    # Round-robin between two nodes:
    my $e = Elasticsearch->new(
        nodes => [ 'search1:9200', 'search2:9200' ]
    );

    # Connect to cluster at search1:9200, sniff all nodes and round-robin between them:
    my $e = Elasticsearch->new(
        nodes    => 'search1:9200',
        cxn_pool => 'Sniff'
    );

    # Index a document:
    $e->index(
        index => 'my_app',
        type  => 'blog_post',
        id    => 1,
        body  => {
            title   => 'Elasticsearch clients',
            content => 'Interesting content...',
            date    => '2013-09-24'
        }
    );

    # Get the document:
    my $doc = $e->get(
        index => 'my_app',
        type  => 'blog_post',
        id    => 1
    );

    # Search:
    my $results = $e->search(
        index => 'my_app',
        body  => {
            query => {
                match => { title => 'elasticsearch' }
            }
        }
    );

    # Cluster requests:
    $info       = $e->cluster->info;
    $health     = $e->cluster->health;
    $node_stats = $e->cluster->node_stats;

    # Index requests:
    $e->indices->create( index => 'my_index' );
    $e->indices->delete( index => 'my_index' );
Elasticsearch is the official Perl client for Elasticsearch, supported by elasticsearch.com. Elasticsearch itself is a flexible and powerful open source, distributed real-time search and analytics engine for the cloud. You can read more about it on elasticsearch.org.
This version of the client supports the Elasticsearch 1.0 branch by default, which is not backwards compatible with the 0.90 branch.
If you need to talk to a version of Elasticsearch before 1.0.0, please use Elasticsearch::Client::0_90::Direct as follows:
$es = Elasticsearch->new( client => '0_90::Direct' );
The greatest deception men suffer is from their own opinions.
Leonardo da Vinci
All of us have opinions, especially when it comes to designing APIs. Unfortunately, the opinions of programmers seldom coincide. The intention of this client, and of the officially supported clients available for other languages, is to provide robust support for the full native Elasticsearch API with as few opinions as possible: you should be able to read the Elasticsearch reference documentation and understand how to use this client, or any of the other official clients.
Should you decide that you want to customize the API, then this client provides the basis for your code. It does the hard stuff for you, allowing you to build on top of it.
This client provides:
Full support for all Elasticsearch APIs
HTTP backend (currently synchronous only; AnyEvent support will be added later)
Robust networking support which handles load balancing, failure detection and failover
Good defaults
Helper utilities for more complex operations, such as bulk indexing, scrolled searches and reindexing.
Logging support via Log::Any
Compatibility with the official clients for Python, Ruby, PHP and JavaScript
Easy extensibility
You can download the latest version of Elasticsearch from http://www.elasticsearch.org/download. See the installation instructions for details. You will need a recent version of Java installed, preferably Java 7 from Sun.
The "new()" method returns a new client which can be used to run requests against the Elasticsearch cluster.
    use Elasticsearch;
    my $e = Elasticsearch->new( %params );
The most important arguments to "new()" are the following:
nodes
The nodes parameter tells the client which Elasticsearch nodes it should talk to. It can be a single node or multiple nodes; if not specified, it defaults to localhost:9200:
    # default: localhost:9200
    $e = Elasticsearch->new();

    # single
    $e = Elasticsearch->new( nodes => 'search_1:9200' );

    # multiple
    $e = Elasticsearch->new(
        nodes => [
            'search_1:9200',
            'search_2:9200'
        ]
    );
Each node can be a URL including a scheme, host, port, path and userinfo (for authentication). For instance, this would be a valid node:
https://username:password@search.domain.com:443/prefix/path
See "node" in Elasticsearch::Role::Cxn::HTTP for more on node specification.
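To make the parts of a node specification concrete, here is a minimal sketch in plain Perl that splits a node string into scheme, userinfo, host, port and path. The parse_node() helper is hypothetical and written only for illustration; it is not the parser used by Elasticsearch::Role::Cxn::HTTP, and it assumes plain HTTP when no scheme is given.

```perl
use strict;
use warnings;

# Hypothetical helper: split a node string into its parts. Illustration
# only, not the parser used by Elasticsearch::Role::Cxn::HTTP.
sub parse_node {
    my ($node) = @_;

    # Assumption: treat a bare 'host:port' as plain HTTP
    $node = "http://$node" unless $node =~ m{^https?://}i;

    my ( $scheme, $userinfo, $host, $port, $path ) = $node =~ m{
        ^(https?)://         # scheme
        (?:([^/\@]+)\@)?     # optional userinfo
        ([^/:]+)             # host
        (?::(\d+))?          # optional port
        (/.*)?               # optional path
        $
    }xi or die "Invalid node: $node";

    return {
        scheme   => lc $scheme,
        userinfo => $userinfo // '',
        host     => $host,
        port     => $port,          # undef when no port was given
        path     => $path // '',
    };
}

my $node = parse_node(
    'https://username:password@search.domain.com:443/prefix/path'
);
printf "%s | %s | %s | %s | %s\n",
    @{$node}{qw(scheme userinfo host port path)};
```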
cxn_pool
The CxnPool modules manage connections to nodes in the Elasticsearch cluster. They handle the load balancing between nodes and failover when nodes fail. Which CxnPool you should use depends on where your cluster is. There are three choices:
Static
    $e = Elasticsearch->new(
        cxn_pool => 'Static',    # default
        nodes    => [
            'search1.domain.com:9200',
            'search2.domain.com:9200'
        ],
    );
The Static connection pool, which is the default, should be used when you don't have direct access to the Elasticsearch cluster, eg when you are accessing the cluster through a proxy. See Elasticsearch::CxnPool::Static for more.
Sniff
    $e = Elasticsearch->new(
        cxn_pool => 'Sniff',
        nodes    => [
            'search1:9200',
            'search2:9200'
        ],
    );
The Sniff connection pool should be used when you do have direct access to the Elasticsearch cluster, eg when your web servers and Elasticsearch servers are on the same network. The nodes that you specify are used to discover the cluster, which is then sniffed to find the current list of live nodes that the cluster knows about. See Elasticsearch::CxnPool::Sniff.
Static::NoPing
    $e = Elasticsearch->new(
        cxn_pool => 'Static::NoPing',
        nodes    => [
            'proxy1.domain.com:80',
            'proxy2.domain.com:80'
        ],
    );
The Static::NoPing connection pool should be used when your access to a remote cluster is so limited that you cannot ping individual nodes with a HEAD / request.
See Elasticsearch::CxnPool::Static::NoPing for more.
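The load balancing and failover that a connection pool provides can be pictured with a toy round-robin loop that skips nodes marked as failed. This is only a sketch of the general idea in plain Perl: the next_node() function and the %dead bookkeeping are hypothetical, and the real CxnPool classes are not implemented this way.

```perl
use strict;
use warnings;

# Toy round-robin pool: an illustration of the idea only, not how the
# Elasticsearch::CxnPool classes are implemented.
my @nodes = ( 'search1:9200', 'search2:9200', 'search3:9200' );
my %dead;        # nodes currently marked as failed
my $next = 0;    # counter used to pick the next node in turn

sub next_node {
    # Try each node at most once per call, starting where we left off
    for ( 1 .. @nodes ) {
        my $node = $nodes[ $next++ % @nodes ];
        return $node unless $dead{$node};
    }
    die "No live nodes available";
}

$dead{'search2:9200'} = 1;    # simulate a failed node
print next_node(), "\n" for 1 .. 4;
```

With search2 marked as failed, requests alternate between search1 and search3; removing the entry from %dead would bring the node back into rotation.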
trace_to
For debugging purposes, it is useful to be able to dump the actual HTTP requests which are sent to the cluster, and the response that is received. This can be enabled with the trace_to parameter, as follows:
    # To STDERR
    $e = Elasticsearch->new(
        trace_to => 'Stderr'
    );

    # To a file
    $e = Elasticsearch->new(
        trace_to => [ 'File', '/path/to/filename' ]
    );
Logging is handled by Log::Any. See Elasticsearch::Logger::LogAny for more information.
Other arguments are explained in the respective module docs.
When you create a new instance of Elasticsearch, it returns a client object, which can be used for running requests.
    use Elasticsearch;
    my $e = Elasticsearch->new( %params );

    # create an index
    $e->indices->create( index => 'my_index' );

    # index a document
    $e->index(
        index => 'my_index',
        type  => 'blog_post',
        id    => 1,
        body  => {
            title   => 'Elasticsearch clients',
            content => 'Interesting content...',
            date    => '2013-09-24'
        }
    );
See Elasticsearch::Client::Direct for more details about the requests that can be run.
Each chunk of functionality is handled by a different module, which can be specified in the call to new() as shown in cxn_pool above. For instance, the following will use the Elasticsearch::CxnPool::Sniff module for the connection pool.
$e = Elasticsearch->new( cxn_pool => 'Sniff' );
Custom modules can be named with the appropriate prefix, eg Elasticsearch::CxnPool::, or by prefixing the full class name with +:
$e = Elasticsearch->new( cxn_pool => '+My::Custom::CxnClass' );
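The naming convention can be sketched in a few lines of plain Perl. The resolve_class() helper below is hypothetical and exists only to illustrate the rule described above; the client's own class loading may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical helper illustrating the naming convention; the client's own
# class loader may differ in detail.
sub resolve_class {
    my ( $prefix, $name ) = @_;

    # A leading '+' marks a full class name: strip it and use the rest as is
    return $name if $name =~ s/^\+//;

    # Otherwise prepend the standard namespace for this kind of module
    return "Elasticsearch::${prefix}::${name}";
}

print resolve_class( 'CxnPool', 'Sniff' ),                 "\n";
print resolve_class( 'CxnPool', '+My::Custom::CxnClass' ), "\n";
```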
The modules that you can override are specified with the following arguments to "new()":
client
The class to use for the client functionality, which provides methods that can be called to execute requests, such as search(), index() or delete(). The client parses the user's requests and passes them to the "transport" class to be executed. See:
Elasticsearch::Client::Direct (default, for 1.0 branch)
Elasticsearch::Client::0_90::Direct (for 0.90 branch)
Elasticsearch::Client::Compat (for migration from the old ElasticSearch module)
transport
The Transport class accepts a parsed request from the "client" class, fetches a "cxn" from its "cxn_pool" and tries to execute the request, retrying after failure where appropriate. See:
Elasticsearch::Transport
cxn
The class which handles raw requests to Elasticsearch nodes. See:
Elasticsearch::Cxn::LWP (default)
Elasticsearch::Cxn::HTTPTiny
Elasticsearch::Cxn::NetCurl
cxn_factory
The class which the "cxn_pool" uses to create new "cxn" objects. See:
Elasticsearch::Cxn::Factory
cxn_pool
The class to use for the connection pool functionality. It calls the "cxn_factory" class to create new "cxn" objects when appropriate. See:
Elasticsearch::CxnPool::Static (default)
Elasticsearch::CxnPool::Sniff
Elasticsearch::CxnPool::Static::NoPing
logger
The class to use for logging events and tracing HTTP requests/responses. See:
Elasticsearch::Logger::LogAny
serializer
The class to use for serializing request bodies and deserializing response bodies. See:
Elasticsearch::Serializer::JSON
See Elasticsearch::Compat, which allows you to run your old ElasticSearch code with the new Elasticsearch module.
The Elasticsearch API is pretty similar to the old ElasticSearch API, but there are a few differences. The most notable are:
nodes vs servers
When instantiating a new Elasticsearch instance, use nodes instead of servers:
$e = Elasticsearch->new( nodes => [ 'search1:9200', 'search2:9200' ] );
no_refresh
By default, the new client does not sniff the cluster to discover nodes. To enable sniffing, use:
$e = Elasticsearch->new( cxn_pool => 'Sniff', nodes => [ 'search1:9200', 'search2:9200' ] );
To disable sniffing (the equivalent of setting no_refresh to true), use the default Static connection pool, either by omitting cxn_pool or by setting it to 'Static'.
Request parameters
In the old client, you could specify query string and body parameters at the same level, eg:
$e->search( search_type => 'count', query => { match_all => {} } );
In the new client, body parameters should be passed in a body element:
$e->search( search_type => 'count', body => { query => { match_all => {} } } );
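When migrating many calls, the move into a body element can be automated. The following is a minimal sketch in plain Perl: the to_new_style() helper is hypothetical and not part of the client, and the list of keys that it treats as body parameters is an assumption that would need to be extended for real code.

```perl
use strict;
use warnings;

# Hypothetical migration helper, not part of the client. The keys listed
# here as body parameters are an assumption; extend the list as required.
my %BODY_KEY = map { $_ => 1 } qw(query facets);

sub to_new_style {
    my (%old) = @_;
    my ( %new, %body );

    # Body parameters move under 'body'; everything else stays at top level
    for my $key ( keys %old ) {
        if   ( $BODY_KEY{$key} ) { $body{$key} = $old{$key} }
        else                     { $new{$key}  = $old{$key} }
    }

    $new{body} = \%body if %body;
    return %new;
}

my %params = to_new_style(
    search_type => 'count',
    query       => { match_all => {} },
);
print join( ', ', sort keys %params ), "\n";
```

The resulting %params could then be passed to search() in place of the old argument list.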
trace_calls
The new client uses Log::Any for event logging and request tracing. To trace requests/responses in curl format, do:
    # To STDERR
    $e = Elasticsearch->new(
        trace_to => 'Stderr'
    );

    # To a file
    $e = Elasticsearch->new(
        trace_to => [ 'File', '/path/to/file.log' ]
    );
The old API integrated ElasticSearch::SearchBuilder for an SQL::Abstract style of writing queries and filters in Elasticsearch. This integration does not exist in the new client, but will be added in a future module.
Bulk methods and scrolled_search()
Bulk indexing has changed a lot in the new client. The helper methods, eg bulk_index() and reindex() have been removed from the main client, and the bulk() method itself now simply returns the response from Elasticsearch. It doesn't interfere with processing at all.
These helper methods have been replaced by the Elasticsearch::Bulk class. Similarly, scrolled_search() has been replaced by the Elasticsearch::Scroll class.
Async support
Add async support using Promises for AnyEvent and perhaps Mojo.
New frontend
Add a new client with a less verbose interface, similar to that of ElasticSearch, and with ElasticSearch::SearchBuilder integration.
This is a stable API but this implementation is new. Watch this space for new releases.
If you have any suggestions for improvements, or find any bugs, please report them to http://github.com/elasticsearch/elasticsearch-perl/issues. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
You can find documentation for this module with the perldoc command.
perldoc Elasticsearch
You can also look for information at:
GitHub
http://github.com/elasticsearch/elasticsearch-perl
CPAN Ratings
http://cpanratings.perl.org/d/Elasticsearch
Search MetaCPAN
https://metacpan.org/module/Elasticsearch
IRC
The #elasticsearch channel on irc.freenode.net.
Mailing list
The main Elasticsearch mailing list.
The full test suite requires a live Elasticsearch node to run, and should be run as follows:
    perl Makefile.PL
    ES=localhost:9200 make test
TESTS RUN IN THIS WAY ARE DESTRUCTIVE! DO NOT RUN AGAINST A CLUSTER WITH DATA YOU WANT TO KEEP!
You can change the Cxn class which is used by setting the ES_CXN environment variable:
ES_CXN=HTTPTiny ES=localhost:9200 make test
Clinton Gormley <drtech@cpan.org>
This software is Copyright (c) 2014 by Elasticsearch BV.
This is free software, licensed under:
The Apache License, Version 2.0, January 2004
To install Elasticsearch, copy and paste the appropriate command into your terminal.
cpanm
cpanm Elasticsearch
CPAN shell
    perl -MCPAN -e shell
    install Elasticsearch
For more information on module installation, please visit the detailed CPAN module installation guide.