ElasticSearch::Transport - Base class for communicating with ElasticSearch
ElasticSearch::Transport is a base class for the modules which communicate with the ElasticSearch server.
It handles failover to the next node in case the current node closes the connection.
All requests are distributed round-robin across all live servers as returned by /_cluster/nodes. The server list is shuffled when it is retrieved, so that separate instances do not all make their first request to the same server.
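The shuffle-then-rotate idea can be sketched in a few lines. This is only an illustration, not the module's own code: make_round_robin is a hypothetical helper and the server addresses are made up.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Hypothetical helper (not part of ElasticSearch::Transport): returns a
# closure that hands out servers in round-robin order.
sub make_round_robin {
    my @servers = shuffle(@_);    # randomise the starting order once
    my $i       = 0;
    return sub {
        return unless @servers;
        my $server = $servers[ $i % @servers ];
        $i++;
        return $server;
    };
}

# Made-up addresses, purely for illustration.
my $next = make_round_robin(qw(10.0.0.1:9200 10.0.0.2:9200 10.0.0.3:9200));

my @first_cycle  = map { $next->() } 1 .. 3;
my @second_cycle = map { $next->() } 1 .. 3;

print "first cycle:  @first_cycle\n";
print "second cycle: @second_cycle\n";
```

Each process ends up with its own random starting order, but within a process every server is visited once per cycle.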
On the first request and every max_requests after that (default 10,000), the list of live nodes is automatically refreshed. This can be disabled by setting max_requests to 0.
Regardless of the max_requests setting, a list of live nodes will still be retrieved on the first request. This may not be desirable behaviour if, for instance, you are connecting to remote servers which use internal IP addresses, or which don't allow remote nodes() requests.
If you want to disable this behaviour completely, set no_refresh to 1, in which case the transport module will round-robin through the servers list only. Failed nodes will be removed from the list, but added back in after every max_requests requests or when all nodes have failed.
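A minimal sketch of that bookkeeping, assuming a plain array for the live list. The names mark_failed and next_server here are hypothetical stand-ins, not the module's methods, and the max_requests counter is left out for brevity.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the transport's server bookkeeping.
my @seed = qw(10.0.0.1:9200 10.0.0.2:9200);    # servers passed to new()
my @live = @seed;

# Drop a failed node; if nothing is left, fall back to the full seed list.
sub mark_failed {
    my ($server) = @_;
    @live = grep { $_ ne $server } @live;
    @live = @seed unless @live;
    return;
}

# Rotate the live list so that each call returns the next node.
sub next_server {
    my $server = shift @live;
    push @live, $server;
    return $server;
}

mark_failed('10.0.0.1:9200');
my $after_one = next_server();    # only 10.0.0.2:9200 is left

mark_failed('10.0.0.2:9200');     # every node has failed: seed list restored
my @after_all = @live;

print "after one failure: $after_one\n";
print "after all failed:  @after_all\n";
```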
The HTTP clients check that the POST body content length is not greater than max_content_length, which defaults to 104,857,600 bytes (100MB), the default that is configured in Elasticsearch. From version 0.19.12, when no_refresh is set to false, the HTTP transport clients will auto-detect the minimum max_content_length from the cluster.
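As an illustration of such a check (check_content_length is a hypothetical helper and the error wording is an assumption, not the module's actual message), the body is measured in bytes after encoding rather than in characters:

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

my $max_content_length = 104_857_600;    # 100MB, the default quoted above

# Hypothetical helper: die if the encoded body exceeds the limit,
# otherwise return the body size in bytes.
sub check_content_length {
    my ( $body, $max ) = @_;
    my $bytes = length( encode_utf8($body) );    # count bytes, not characters
    die "Content length ($bytes) greater than max_content_length ($max)\n"
        if $bytes > $max;
    return $bytes;
}

my $ok = check_content_length( '{"foo":"bar"}', $max_content_length );

# A 2,000 byte body against a 1,000 byte limit is rejected.
my $err = eval { check_content_length( 'x' x 2_000, 1_000 ); 1 } ? '' : $@;

print "accepted body of $ok bytes\n";
print "rejected: $err";
```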
Currently, the available backends are:
http (default)
Uses LWP to communicate using HTTP. See ElasticSearch::Transport::HTTP
httplite
Uses HTTP::Lite to communicate using HTTP. See ElasticSearch::Transport::HTTPLite
httptiny
Uses HTTP::Tiny to communicate using HTTP. See ElasticSearch::Transport::HTTPTiny
curl
Uses WWW::Curl and thus libcurl to communicate using HTTP. See ElasticSearch::Transport::Curl
aehttp
Uses AnyEvent::HTTP to communicate asynchronously using HTTP. See ElasticSearch::Transport::AEHTTP
aecurl
Uses AnyEvent::Curl::Multi (and thus libcurl) to communicate asynchronously using HTTP. See ElasticSearch::Transport::AECurl
thrift
Uses thrift to communicate using a compact binary protocol over sockets. See ElasticSearch::Transport::Thrift. You need to have the transport-thrift plugin installed on your ElasticSearch server for this to work.
You shouldn't need to talk to the transport modules directly - everything happens via the main ElasticSearch class.
    use ElasticSearch;

    my $e = ElasticSearch->new(
        servers            => 'search.foo.com:9200',
        transport          => 'httplite',
        timeout            => '10',
        no_refresh         => 0 | 1,
        deflate            => 0 | 1,
        max_content_length => 104_857_600,
    );

    my $t = $e->transport;

    $t->max_requests(5)             # refresh_servers every 5 requests
    $t->protocol                    # eg 'http'
    $t->next_server                 # next node to use
    $t->current_server              # eg '127.0.0.1:9200' ie last used node
    $t->default_servers             # seed servers passed in to new()

    $t->servers                     # eg ['192.168.1.1:9200','192.168.1.2:9200']
    $t->servers(@servers);          # set new 'live' list

    $t->refresh_servers             # refresh list of live nodes

    $t->clear_clients               # clear all open clients

    $t->no_refresh(0|1)             # don't retrieve the live node list
                                    # instead, use just the nodes specified

    $t->deflate(0|1);               # should ES deflate its responses
                                    # useful if ES is on a remote network.
                                    # ES needs compression enabled with
                                    #    http.compression: true

    $t->max_content_length(1000);   # set the max HTTP body content length

    $t->register('foo',$class)      # register new Transport backend
Although the thrift interface has the right buzzwords (binary, compact, sockets), the generated Perl code is very slow. Until that is improved, I recommend one of the http backends instead.
The HTTP backends in increasing order of speed are:
http - LWP based
httplite - HTTP::Lite based, about 30% faster than http
httptiny - HTTP::Tiny based, about 1% faster than httplite
curl - WWW::Curl based, about 60% faster than httptiny!
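To see how those step-wise figures compound, here is a small calculation. It assumes each percentage is a multiplicative gain over the backend listed before it, which the text does not state explicitly, so treat the result as a rough guide only.

```perl
use strict;
use warnings;

# Step-wise gains quoted above, each relative to the previous backend.
my @steps = (
    [ http     => 0 ],
    [ httplite => 30 ],
    [ httptiny => 1 ],
    [ curl     => 60 ],
);

my $relative = 1;    # speed relative to the LWP-based http backend
my %speed;
for my $step (@steps) {
    my ( $name, $percent ) = @$step;
    $relative *= 1 + $percent / 100;
    $speed{$name} = $relative;
    printf "%-9s %.2fx\n", $name, $relative;
}
```

Under that assumption, curl comes out at roughly twice the speed of the default http backend.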
See also: http://www.elasticsearch.org/guide/reference/modules/http.html and http://www.elasticsearch.org/guide/reference/modules/thrift.html
If you want to add a new transport backend, then these are the methods that your subclass should override:
$t->init($params)
By default, a no-op. Receives a HASH ref with the parameters passed in to new(), less servers, transport and timeout.
Any parameters specific to your module should be deleted from $params.
$json = $t->send_request($server,$params)

    where $params = {
        method => 'GET',
        cmd    => '/_cluster',
        qs     => { pretty => 1 },
        data   => '{ "foo": "bar"}',
    }
This must be overridden in the subclass - it is the method called to actually talk to the server.
See ElasticSearch::Transport::HTTP for an example implementation.
$t->protocol
This must return the protocol in use, eg "http" or "thrift". It is used to extract the list of bound addresses from ElasticSearch, eg http_address or thrift_address.
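The following sketch shows how the protocol name might be turned into a lookup key. Both the shape of the nodes structure and the inet[/host:port] address format are assumptions based on older ElasticSearch releases, and live_servers is a hypothetical helper, not a method of this class.

```perl
use strict;
use warnings;

# Assumed shape of a nodes response; the real structure may differ by version.
my $nodes = {
    nodes => {
        node_a => {
            http_address   => 'inet[/192.168.1.1:9200]',
            thrift_address => 'inet[/192.168.1.1:9500]',
        },
        node_b => { http_address => 'inet[/192.168.1.2:9200]' },
    },
};

# Hypothetical helper: collect "host:port" strings for the given protocol.
sub live_servers {
    my ( $nodes, $protocol ) = @_;
    my $key = $protocol . '_address';    # eg http_address or thrift_address
    my @servers;
    for my $id ( sort keys %{ $nodes->{nodes} } ) {
        my $address = $nodes->{nodes}{$id}{$key} or next;
        push @servers, $1 if $address =~ m{/([^/\]]+:\d+)\]};
    }
    return @servers;
}

print join( ', ', live_servers( $nodes, 'http' ) ),   "\n";
print join( ', ', live_servers( $nodes, 'thrift' ) ), "\n";
```

Nodes that do not expose an address for the requested protocol are simply skipped.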
$client = $t->client($server)
Returns the client object used in "send_request()". The server param will look like "192.168.5.1:9200". It should store its clients in a PID specific slot in $t->{_client} as clear_clients() deletes this key.
See "client()" in ElasticSearch::Transport::HTTP and "client()" in ElasticSearch::Transport::Thrift for an example implementation.
You can register your Transport backend as follows:
BEGIN { ElasticSearch::Transport->register('mytransport',__PACKAGE__); }
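Putting the pieces together, a backend might look roughly like the sketch below. Everything specific to it is hypothetical: the My::Transport::Echo package, the 'echo' protocol name, the echo_tag parameter and the layout of the per-process client slot. It performs no network I/O and simply echoes the request back as JSON. The registration is guarded so that the sketch also loads where ElasticSearch is not installed; in real use you would load ElasticSearch first and obtain the transport via the main ElasticSearch class rather than blessing a hash directly.

```perl
use strict;
use warnings;

package My::Transport::Echo;    # hypothetical backend name

# -norequire keeps the sketch loadable even where ElasticSearch is absent.
use parent -norequire, 'ElasticSearch::Transport';
use JSON::PP ();

# Register only when the base class is actually loaded.
BEGIN {
    'ElasticSearch::Transport'->register( 'echo', __PACKAGE__ )
        if 'ElasticSearch::Transport'->can('register');
}

sub protocol { return 'echo' }

sub init {
    my ( $self, $params ) = @_;
    # Delete backend-specific parameters from $params (echo_tag is made up).
    my $tag = delete $params->{echo_tag};
    $self->{echo_tag} = defined $tag ? $tag : '';
    return;
}

sub client {
    my ( $self, $server ) = @_;
    # One client per process and server, stored under the _client key.
    # The exact slot layout is an assumption.
    return $self->{_client}{$$}{$server} ||= { server => $server };
}

sub send_request {
    my ( $self, $server, $params ) = @_;
    my $client = $self->client($server);
    # A real backend would talk to the server here; this one echoes.
    return JSON::PP->new->canonical->encode(
        {   server => $client->{server},
            method => $params->{method},
            cmd    => $params->{cmd},
            tag    => $self->{echo_tag},
        }
    );
}

package main;

# Blessing a hash by hand is for illustration only.
my $t      = bless {}, 'My::Transport::Echo';
my $params = { echo_tag => 'demo' };
$t->init($params);

my $json = $t->send_request( '192.168.5.1:9200',
    { method => 'GET', cmd => '/_cluster' } );
print "$json\n";
```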
ElasticSearch
ElasticSearch::Transport::HTTP
ElasticSearch::Transport::HTTPLite
ElasticSearch::Transport::HTTPTiny
ElasticSearch::Transport::Curl
ElasticSearch::Transport::AEHTTP
ElasticSearch::Transport::AECurl
ElasticSearch::Transport::Thrift
Copyright 2010 - 2011 Clinton Gormley.
This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.
See http://dev.perl.org/licenses/ for more information.