NAME

Couchbase::Bucket - Couchbase Cluster data access

SYNOPSIS

    # Imports
    use Couchbase::Bucket;
    use Couchbase::Document;

    # Create a new connection
    my $cb = Couchbase::Bucket->new("couchbases://anynode/bucket", { password => "secret" });

    # Create and store a document
    my $doc = Couchbase::Document->new("idstring", { json => ["encodable", "string"] });
    $cb->insert($doc);
    if (!$doc->is_ok) {
        warn("Couldn't store document: " . $doc->errstr);
    }

    # Retrieve a document:
    $doc = Couchbase::Document->new("user:mnunberg");
    $cb->get($doc);
    printf("Full name is %s\n", $doc->value->{name});

    # Query a view:
    my $res = $cb->view_slurp(['design_name', 'view_name'], limit => 10);
    # $res is actually a subclass of Couchbase::Document
    if (! $res->is_ok) {
        warn("There was an error in querying the view: ".$res->errstr);
    }
    foreach my $row (@{$res->rows}) {
        printf("Key: %s. Document ID: %s. Value: %s\n", $row->key, $row->id, $row->value);
    }

    # Get multiple items at once
    my $batch = $cb->batch;
    $batch->get(Couchbase::Document->new("user:$_")) for (qw(foo bar baz));

    while (($doc = $batch->wait_one)) {
        if ($doc->is_ok) {
            printf("Real name for userid '%s': %s\n", $doc->id, $doc->value->{name});
        } else {
            warn(sprintf("Couldn't get document '%s': %s\n", $doc->id, $doc->errstr));
        }
    }

DESCRIPTION

Couchbase::Bucket is the main module for Couchbase and represents a data connection to the cluster.

The usage model revolves around a Couchbase::Document which is updated for each operation. Normally you will create a Couchbase::Document, populate it with the relevant fields for the operation, and then perform the operation itself. When the operation has completed, the relevant fields are updated to reflect the latest results.

CONNECTING

Connection String

To connect to the cluster, specify a URI-like connection string. The connection string is in the format of SCHEME://HOST1,HOST2,HOST3/BUCKET?OPTION=VALUE&OPTION=VALUE

scheme

This will normally be couchbase://. For SSL connections, use couchbases:// (note the extra s at the end). See "Using SSL" for more details.

host

This can be a single host or a list of hosts. Specifying multiple hosts is not required but may increase availability if the first node is down. Multiple hosts should be separated by a comma.

If your administrator has configured the cluster to use non-default ports then you may specify those ports using the form host:port, where port is the memcached port that the given node is listening on. In the case of SSL this should be the SSL-enabled memcached port.

bucket

This is the data bucket you wish to connect to. If left unspecified, it will revert to the default bucket.

options

There are several options which can modify connection and general settings for the newly created bucket object. Some of these may also be modifiable via Couchbase::Settings (returned via the settings() method). This list only mentions those settings which are specific to the initial connection.

config_total_timeout

Specify the maximum amount of time (in seconds) to wait until the client has been connected.

config_node_timeout

Specify the maximum amount of time (in seconds) to wait for a given node to respond to the initial connection request. This value may not be higher than the value for config_total_timeout.

certpath

If using SSL, this option must be specified and should contain the local path to the copy of the cluster's SSL certificate. The path should also be URI-encoded.

Using SSL

To connect to an SSL-enabled cluster, specify couchbases:// as the scheme. Additionally, ensure that the certpath option contains the correct path, for example:

    my $cb = Couchbase::Bucket->new("couchbases://securehost/securebkt?certpath=/var/cbcert.pem");
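
The parts of a connection string can also be assembled programmatically. The following is a minimal sketch and not part of this module: build_connstr is a hypothetical helper, and the host names and port are placeholders. It percent-encodes every option value outside the URI unreserved set, which is one way to satisfy the URI-encoding requirement for certpath, and it assumes option values are plain byte strings.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Couchbase::Bucket): assemble a connection
# string of the form SCHEME://HOST1,HOST2/BUCKET?OPTION=VALUE&OPTION=VALUE.
sub build_connstr {
    my (%args) = @_;
    my $scheme = $args{ssl} ? 'couchbases' : 'couchbase';
    my $hosts  = join ',', @{ $args{hosts} };
    my $str    = "$scheme://$hosts/$args{bucket}";

    my %opts = %{ $args{options} || {} };
    if (%opts) {
        my @pairs;
        for my $key (sort keys %opts) {
            # Percent-encode everything outside the URI "unreserved" set.
            (my $val = $opts{$key}) =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord $1)/ge;
            push @pairs, "$key=$val";
        }
        $str .= '?' . join '&', @pairs;
    }
    return $str;
}

my $connstr = build_connstr(
    ssl     => 1,
    hosts   => [ 'node1', 'node2:11207' ],    # placeholder hosts and port
    bucket  => 'securebkt',
    options => { certpath => '/var/cbcert.pem', config_total_timeout => 5 },
);
print "$connstr\n";
```

The resulting string can then be passed as the first argument to new().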

Specifying Bucket Credentials

Often, the bucket will be password protected. You can specify the password using the password option in the $options hashref in the constructor.

new($connstr, $options)

Create a new connection to a bucket. $connstr is a "Connection String" and $options is a hashref of options. The only recognized option key is password which is the bucket password, if applicable.

This method will attempt to connect to the cluster, and die if a connection could not be made.

DATA ACCESS

Data access methods operate on a Couchbase::Document object. When the operation has completed, its status is stored in the document's errnum field (you can also use the is_ok method to check that no errors occurred).

get($doc)

get_and_touch($doc)

Retrieve a document from the cluster. $doc is a Couchbase::Document. If the operation is successful, the value of the item will be accessible via its value field.

    my $doc = Couchbase::Document->new("id_to_retrieve");
    $cb->get($doc);
    if ($doc->is_ok) {
        printf("Got value: %s\n", $doc->value);
    }

The get_and_touch variant will also update (or clear) the expiration time of the item. See "Document Expiration" for more details:

    my $doc = Couchbase::Document->new("id", { expiry => 300 });
    $cb->get_and_touch($doc); # Expires in 5 minutes

fetch($id)

This is a convenience method which will create a new document with the given id and perform a get on it. It will then return the resulting document.

    my $doc = $cb->fetch("id_to_retrieve");

insert($doc)

replace($doc, $options)

upsert($doc, $options)

    my $doc = Couchbase::Document->new(
        "mutation_method_names",
        [ "insert", "replace", "upsert"],
        { expiry => 3600 }
    );

    # Store a new item into the cluster, failing if it exists:
    $cb->insert($doc);

    # Unconditionally overwrite the value:
    $cb->upsert($doc);

    # Only replace an existing value
    $cb->replace($doc);

    # Ignore any kind of race conditions:
    $cb->replace($doc, { ignore_cas => 1 });

    # Store the document, wait until it has been persisted
    # on at least 2 nodes
    $cb->replace($doc, { persist_to => 2 });

These three methods set the value of the document on the server. insert will only succeed if the item does not exist, replace will only succeed if the item already exists, and upsert will unconditionally write the new value regardless of whether the item exists.

Storage Format

By default, the document is serialized and stored as JSON. This allows proper integration with other optional functionality of the cluster (such as views and N1QL queries). You may also store items in other formats which may then be transparently serialized and deserialized as needed.

To specify the storage format for a document, specify the `format` setting in the Couchbase::Document object, like so:

    use Couchbase::Document;
    my $doc = Couchbase::Document->new('foo', \1234, { format => COUCHBASE_FMT_STORABLE });

This version of the client uses so-called "Common Flags", allowing seamless integration with Couchbase clients written in other languages.

Encoding Formats

Bear in mind that Perl's default encoding is Latin-1, not UTF-8. Any input is therefore assumed to be Latin-1 unless indicated otherwise. There are various ways to change the "type" of a string; the details can be found in the documentation for the utf8 and Encode modules.

From the perspective of this module, any input string which is marked as being JSON or UTF8 will be marked as being UTF-8. This may have minor performance implications. If this is a concern, you can intercept the JSON decoding function and handle the raw string there.
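
The difference between character strings and encoded byte strings can be seen with the core Encode module alone. This is a small illustration that does not touch the cluster; the sample character is arbitrary.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# A character string holding one character, U+00E9 (e with acute accent).
my $chars = "\x{e9}";

# Encoding produces a byte string; in UTF-8 this character takes two bytes.
my $bytes = encode('UTF-8', $chars);
printf("characters: %d, UTF-8 bytes: %d\n", length($chars), length($bytes));

# Decoding the bytes gives back an equal character string.
my $decoded = decode('UTF-8', $bytes);
printf("round trip ok: %s\n", $decoded eq $chars ? 'yes' : 'no');
```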

CAS Operations

To avoid race conditions when two applications attempt to write to the same document, Couchbase uses a CAS value which represents the last known state of the document. This CAS value is modified each time a change is made to the document, and is returned to the client for each operation. If the $doc item is a document previously used for a successful get or other operation, it will contain the CAS, and the client will send it back to the server. If the current CAS of the document on the server does not match the value embedded in the document, the operation will fail with the code COUCHBASE_KEY_EEXISTS.

To always modify the value (ignoring whether the value may have been previously modified by another application), set the ignore_cas option to a true value in the $options hashref.
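
The CAS check can be pictured with a toy in-memory model. This is only an illustration of the idea and not the client API: toy_get and toy_replace are hypothetical names, the store is a plain hash, and the status strings are invented for the example.

```perl
use strict;
use warnings;

# Toy in-memory model of a CAS-checked store (NOT the client API).
my %store = (greeting => { value => 'hello', cas => 1 });

sub toy_get {
    my ($id) = @_;
    my $item = $store{$id} or return;
    return { id => $id, value => $item->{value}, cas => $item->{cas} };
}

sub toy_replace {
    my ($doc, $opts) = @_;
    my $item = $store{ $doc->{id} } or return 'not found';
    # Reject the write if the caller's CAS is stale, unless ignore_cas is set.
    if (!($opts && $opts->{ignore_cas}) && $doc->{cas} != $item->{cas}) {
        return 'cas mismatch';
    }
    $item->{value} = $doc->{value};
    $doc->{cas} = ++$item->{cas};    # every change yields a new CAS
    return 'ok';
}

# Two writers read the same state of the document.
my $first  = toy_get('greeting');
my $second = toy_get('greeting');

$first->{value} = 'hello from first';
my $r1 = toy_replace($first);                          # CAS matches

$second->{value} = 'hello from second';
my $r2 = toy_replace($second);                         # CAS is now stale
my $r3 = toy_replace($second, { ignore_cas => 1 });    # forced overwrite

print "$r1, $r2, $r3\n";
```

The second writer is rejected because the first write changed the CAS; only the forced write goes through.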

Durability Requirements

Mutation operations in Couchbase are considered successful once they are stored in the master node's cache for a given key. Sometimes extra redundancy and reliability are required, where an application should only proceed once the data has been replicated to a certain number of nodes and possibly persisted to their disks. Use the persist_to and replicate_to options to specify the durability requirements:

persist_to

Wait until the item has been persisted (written to non-volatile storage) on this many nodes. A value of 1 means the master node, while a value of 2 or higher means the master node and n-1 replica nodes.

replicate_to

Wait until the item has been replicated to the RAM of this many replica nodes. Your bucket must have at least this many replicas configured and online for this option to function.

You may specify a negative value for either persist_to or replicate_to to indicate that a "best-effort" behavior is desired, meaning that replication and persistence should take effect on as many nodes as are currently online, which may be less than the number of replicas the bucket was configured with.

You may request replication without persistence by simply setting persist_to=0.
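
The counting rules above can be sketched as follows. resolve_durability is a hypothetical helper, not a client method; it assumes best-effort means "as many nodes as are online right now", and it simply dies on an unsatisfiable requirement, which is not necessarily how the client reports that case.

```perl
use strict;
use warnings;

# Hypothetical helper: work out how many nodes each requirement refers to,
# given the number of replica nodes currently online.
sub resolve_durability {
    my ($persist_to, $replicate_to, $online_replicas) = @_;

    # persist_to counts the master plus replicas; replicate_to counts replicas only.
    my $max_persist   = $online_replicas + 1;
    my $max_replicate = $online_replicas;

    # Negative values request best-effort: use whatever is available right now.
    $persist_to   = $max_persist   if $persist_to < 0;
    $replicate_to = $max_replicate if $replicate_to < 0;

    die "persist_to=$persist_to exceeds $max_persist available nodes\n"
        if $persist_to > $max_persist;
    die "replicate_to=$replicate_to exceeds $max_replicate available replicas\n"
        if $replicate_to > $max_replicate;

    return { persist_to => $persist_to, replicate_to => $replicate_to };
}

my $strict = resolve_durability(2, 1, 2);     # explicit requirement, 2 replicas online
my $best   = resolve_durability(-1, -1, 1);   # best-effort, 1 replica online

printf("strict: persist_to=%d replicate_to=%d\n", @{$strict}{qw(persist_to replicate_to)});
printf("best effort: persist_to=%d replicate_to=%d\n", @{$best}{qw(persist_to replicate_to)});
```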

Document Expiration

In many use cases it may be desirable to have the document automatically deleted after a certain period of time has elapsed (think about session management). You can specify when the document should be deleted, either as an offset from now in seconds (up to 30 days), or as a Unix timestamp.

The expiration is considered a property of the document and is thus configurable via the Couchbase::Document's expiry method.
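
The two ways of expressing an expiry can be told apart by the 30-day limit. The sketch below is an illustration only: expiry_to_epoch is a hypothetical helper, and it assumes that values up to 30 days are offsets, larger values are absolute timestamps, and a false value means no expiration.

```perl
use strict;
use warnings;

use constant THIRTY_DAYS => 30 * 24 * 60 * 60;

# Hypothetical helper: turn an expiry value into an absolute Unix timestamp.
sub expiry_to_epoch {
    my ($expiry, $now) = @_;
    return if !$expiry;    # 0 or undef: assumed to mean "never expires"
    return $expiry <= THIRTY_DAYS() ? $now + $expiry : $expiry;
}

my $now      = 1_700_000_000;                           # fixed "current time" for the example
my $relative = expiry_to_epoch(300, $now);              # offset: five minutes from $now
my $absolute = expiry_to_epoch(1_800_000_000, $now);    # already a timestamp, kept as-is
my $never    = expiry_to_epoch(0, $now);                # undef

printf("relative=%d absolute=%d never=%s\n",
    $relative, $absolute, defined $never ? $never : 'none');
```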

remove($doc, $options)

Remove an item from the cluster. The operation will fail if the item does not exist, or if the item's CAS has been modified.

    my $doc = Couchbase::Document->new("KILL ME PLEASE");
    $cb->remove($doc);
    if ($doc->is_ok) {
        print "Deleted document OK!\n";
    } elsif ($doc->is_not_found) {
        print "Document already deleted!\n";
    } elsif ($doc->is_cas_mismatch) {
        print "Someone modified our document before we tried to delete it!\n";
    }

touch($doc, $options)

Update the item's expiration time. This is more efficient than get_and_touch as it does not return the item's value across the network.

Client Settings

settings()

Returns a hashref of settings (see Couchbase::Settings). Because this is a hashref, its values may be localized.

Set a high timeout for a specified operation:

    {
        local $cb->settings->{operation_timeout} = 20; # 20 seconds
        $cb->get($doc);
    }
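
The effect of local on a hash element can be checked without a cluster. In this sketch ToyBucket is a hypothetical stand-in whose settings() returns a plain hashref, and the starting timeout value is arbitrary.

```perl
use strict;
use warnings;

# Hypothetical stand-in object; only the settings() accessor matters here.
package ToyBucket {
    sub new      { return bless { settings => { operation_timeout => 2.5 } }, shift }
    sub settings { return $_[0]{settings} }
}

my $bucket = ToyBucket->new;
my @seen;

push @seen, $bucket->settings->{operation_timeout};       # before the block
{
    local $bucket->settings->{operation_timeout} = 20;
    push @seen, $bucket->settings->{operation_timeout};   # inside the block
}
push @seen, $bucket->settings->{operation_timeout};       # restored afterwards

print join(', ', @seen), "\n";
```

The old value comes back as soon as the enclosing block exits, including when the block is left through an exception.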

ADVANCED DATA ACCESS

counter($doc, { delta=>n1, initial=>n2 })

    sub example_hit_counter {
        my $page_name = shift;
        my $doc = Couchbase::Document->new("page:$page_name");
        $cb->counter($doc, { initial => 1, delta => 1 });
    }

This method treats the stored value as a number (i.e. a string which can be parsed as a number, such as "42") and atomically modifies its value based on the parameters passed.

The options are:

delta

The amount by which the current value should be modified. If the value for this option is negative then the counter will be decremented.

initial

The initial value to assign to the item on the server if it does not yet exist. If this option is not specified and the item on the server does not exist then the operation will fail.
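
How delta and initial interact can be modelled with a plain hash. This is a toy model, not the client: toy_counter is a hypothetical name, and the sketch assumes that a missing item is set to initial as-is, without delta being applied on that first call.

```perl
use strict;
use warnings;

# Toy in-memory model of the counter semantics (NOT a client method).
my %store;

sub toy_counter {
    my ($id, $opts) = @_;
    if (!exists $store{$id}) {
        # Missing item: seed it with `initial`, or fail if none was given.
        return unless defined $opts->{initial};
        return $store{$id} = $opts->{initial};
    }
    # Existing item: apply the (possibly negative) delta.
    return $store{$id} += $opts->{delta};
}

my $missing = toy_counter('page:home', { delta => 1 });                   # fails: no initial
my $first   = toy_counter('page:home', { delta => 1,  initial => 1 });    # seeded with initial
my $second  = toy_counter('page:home', { delta => 1,  initial => 1 });    # incremented
my $third   = toy_counter('page:home', { delta => -2, initial => 1 });    # decremented

printf("missing=%s first=%d second=%d third=%d\n",
    defined $missing ? $missing : 'failed', $first, $second, $third);
```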

append_bytes($doc, { fragment => "string" })

prepend_bytes($doc, { fragment => "string"} )

These two methods concatenate the fragment value and the existing value on the server. They are equivalent to doing the following:

    # Append:
    $doc->value($doc->value . '_suffix');
    $doc->format('utf8');
    $cb->replace($doc);

    # Prepend:
    $doc->value('prefix_' . $doc->value);
    $doc->format('utf8');
    $cb->replace($doc);

The fragment option must be specified, and the value is not updated in the original document.

Also note that these methods do a raw string-based concatenation, and will thus only produce desired results if the existing value is a plain string. This is in contrast to COUCHBASE_FMT_JSON where a string is stored enclosed in quotation marks.

Thus a JSON string may be stored as "foo", and appending to it will yield "foo"bar, which is typically not what you want.
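
The problem can be reproduced locally with the core JSON::PP module, which stands in here for whatever JSON encoder the client uses. No cluster access is involved.

```perl
use strict;
use warnings;
use JSON::PP;

my $json = JSON::PP->new->allow_nonref;

# A string stored in JSON format includes the surrounding quotation marks.
my $stored = $json->encode('foo');

# A raw append works on those stored bytes, not on the decoded string.
my $appended = $stored . 'bar';

# The result is no longer valid JSON, so decoding it throws an exception.
my $decoded = eval { $json->decode($appended) };
my $failed  = defined $decoded ? 0 : 1;

printf("stored=%s appended=%s decode %s\n",
    $stored, $appended, $failed ? 'failed' : 'succeeded');
```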

BATCH OPERATIONS

Batch operations allow more efficient utilization of the network by reducing latency and increasing the number of commands sent at a single time to the server.

Batch operations are executed by creating a Couchbase::OpContext, associating commands with the context, and waiting for the commands to complete.

To create a new context, use the batch method.

batch()

Returns a new Couchbase::OpContext which may be used to schedule operations.

Batched Durability Requirements

In some scenarios it may be more efficient on the network to submit durability requirement requests as a single large command. The behavior for the persist_to and replicate_to parameters in the upsert() family of methods will cause a durability request to be sent out to the given nodes as soon as the success is received for the newly-modified item. This approach reduces latency at the cost of additional bandwidth.

Some bandwidth may be potentially saved if these requests are all batched together:

durability_batch($options)

Volatile - Subject to change

Creates a new durability batch. A durability batch is a special kind of batch where the contained commands can only be documents whose durability is to be checked.

    my $batch;
    $batch = $cb->batch;
    $batch->upsert($_) for @docs;
    $batch->wait_all;

    $batch = $cb->durability_batch({ persist_to => 1, replicate_to => 2 });
    $batch->endure($_) for @docs;
    $batch->wait_all;

The options passed can be persist_to and replicate_to. See the "Durability Requirements" section for information.

N1QL QUERIES (EXPERIMENTAL)

N1QL queries are available as an experimental feature of the client library.

The N1QL API exposes two functions, both of which function similarly to their view counterparts.

At the time of writing, the server does not include N1QL as an integrated feature (because it is still experimental). This means it must be downloaded as a standalone package (see http://docs.couchbase.com/developer/n1ql-dp4/n1ql-intro.html). Once downloaded and configured, the _host option should be passed to the query function (as detailed below).

N1QL functions return a Couchbase::N1QL::Handle object, which functions similarly to Couchbase::View::Handle (internally, they share a lot of code).

query_slurp("query", $queryargs, $queryopts)

Issue an N1QL query. This will send the query to the server (encoding any parameters as needed).

    my $rv = $cb->query_slurp(
        # Query string
        'SELECT *, META().id FROM travel WHERE travel.country = $country ',

        # Placeholder values
        { country => "Ecuador", },

        # Query options
        { _host => "localhost:8093" }
    );

    foreach my $row (@{$rv->rows}) {
        # do something with decoded JSON
    }

The queryargs parameter can either be a hashref of named placeholders (omitting, of course, the leading $, which is handled internally), or it can be an arrayref of positional placeholders (if your query uses positional placeholders).

The queryopts parameter is a set of other modifiers for the query. Most of these are sent to the server. One special parameter is the _host parameter, which points to a standalone instance of the N1QL Developer Preview installation; a temporary necessity for pre-release versions. The _host parameter will be removed once Couchbase Server is available (in release or pre-release) with an integrated N1QL process.

query_iterator("query", $queryargs, $queryopts)

This function is to query_slurp as view_iterator is to view_slurp. In short, it returns an iterator over the rows, only fetching data from the network as needed. This is more efficient (but a bit less simple to use) than query_slurp.

    my $rv = $cb->query_iterator("select * from default");
    while ((my $row = $rv->next)) {
        # do something with row.
    }

VIEW (MAPREDUCE) QUERIES

View methods come in two flavors. One is an iterator which incrementally fetches data from the network, while the other loads the entire data and then returns. For small queries (i.e. those which do not return many results), which API you use is a matter of personal preference. For larger resultsets, however, it often becomes a necessity to not load the entire dataset into RAM.
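
The trade-off between the two flavors can be illustrated with a toy iterator. Nothing here uses the client or the network: make_iterator and slurp are hypothetical names, and the "chunks" are hard-coded arrays standing in for data arriving incrementally.

```perl
use strict;
use warnings;

# Toy iterator over rows that arrive in chunks (each chunk is an arrayref).
sub make_iterator {
    my @chunks = @_;
    my @buffer;
    return sub {
        # Pull the next chunk only when the buffer has been drained.
        while (!@buffer && @chunks) {
            @buffer = @{ shift @chunks };
        }
        return shift @buffer;    # undef once everything has been consumed
    };
}

# Slurping is just draining the iterator into one array held in memory.
sub slurp {
    my ($next) = @_;
    my @rows;
    while (defined(my $row = $next->())) {
        push @rows, $row;
    }
    return \@rows;
}

my $next  = make_iterator([ 'row1', 'row2' ], [], [ 'row3' ]);
my $first = $next->();       # only the first chunk has been touched so far
my $rest  = slurp($next);    # everything that remains, loaded at once

print "first=$first rest=@$rest\n";
```

With the iterator, at most one chunk is buffered at a time; the slurped result holds every remaining row in memory.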

Both view_slurp and view_iterator return Couchbase::View::Handle objects. This has been changed from previous versions, which returned a Couchbase::View::HandleInfo object (though the APIs remain the same).

view_slurp("design/view", %options)

Queries and returns the results of a view. The first argument may be provided either as a string of "$design/$view" or as a two-element array reference containing the design and view respectively.

The %options are options passed verbatim to the view engine. Some options however are intercepted by the client, and modify how the view is queried.

spatial

Indicate that the queried view is a geospatial view. This is required since the formatting of the internal URI is slightly different.

include_docs

Indicate that the relevant documents should be fetched for each view. The following forms are equivalent.

    # fetching directly:
    my $iter = $bkt->view_iterator(['design', 'view']);
    while ((my $row = $iter->next)) {
        my $doc = Couchbase::Document->new($row->id);
        $bkt->get($doc);
    }

    # using include_docs
    my $iter = $bkt->view_iterator(['design', 'view'], include_docs => 1);
    while ((my $row = $iter->next)) {
        my $doc = $row->doc;
    }

Using include_docs is significantly more efficient than fetching the rows manually as it allows the library to issue gets in bulk for each raw chunk of view results received - and also allows the library to "lazily" fetch documents while other rows are being received.

The returned object contains various status information about the query. The rows themselves may be found inside the rows accessor:

    my $rv = $cb->view_slurp("beer/brewery_beers", limit => 5);
    foreach my $row (@{ $rv->rows }) {
        printf("Got row for key %s with document id %s\n", $row->key, $row->id);
    }

This method returns an instance of Couchbase::View::Handle which may be used to inspect for error messages. The object is in fact a subclass of Couchbase::Document with an additional errinfo method to provide more details about the operation.

    if (!$rv->is_ok) {
        if ($rv->errnum) {
            # handle error code
        }
        if ($rv->http_code !~ /^2/) {
            # Failed HTTP status
        }
    }

As of version 2.0.3, this method is implemented as a wrapper atop view_iterator.

view_iterator("design/view", %options)

This works in much the same way as the view_slurp() method does, except that it returns responses incrementally, which is handy if you expect the query to return a large amount of results:

    my $iter = $cb->view_iterator("beer/brewery_beers");
    while (my $row = $iter->next) {
        printf("Got row for key %s with document id %s\n", $row->key, $row->id);
    }

Note that the contents of the Handle object are only considered valid once the iterator has been through at least one iteration; thus:

Incorrect, because it requests the info object before iteration has started

    my $iter = $cb->view_iterator($dpath);
    if (!$iter->info->is_ok) {
        # ...
    }

Correct

    my $iter = $cb->view_iterator($dpath);
    while (my $row = $iter->next) {
        # ...
    }
    if (!$iter->info->is_ok) {
        # ...
    }

INFORMATIONAL METHODS

These methods return various sorts of info about the cluster or specific items.

stats()

stats("spec")

Retrieves cluster statistics from each server. The return value is a Couchbase::Document with its value field containing a hashref of hashrefs, like so:

    # Dump all the stats, per server:
    my $results = $cb->stats()->value;
    while (my ($server,$stats) = each %$results) {
        while (my ($statkey, $statval) = each %$stats) {
            printf("Server %s: %s=%s\n", $server, $statkey, $statval);
        }
    }

keystats($id)

Returns metadata about a specific document ID. The metadata is returned in the same manner as in the stats() method. This will solicit each server which is either a master or replica for the item to respond with information such as the cas, expiration time, and persistence state of the item.

This method should be used for informative purposes only, as its output and availability may change in the future.

observe($id, $options)

Returns persistence and replication status about a specific document ID. Unlike the keystats method, the information is received from the network as binary and is thus more efficient.

You may also pass a master_only option in the options hashref, in which case only the master node for the item will be contacted.