NAME

Lucy::Index::SegWriter - Write one segment of an index.

DESCRIPTION

SegWriter is a conduit through which information fed to Indexer passes. It manages Segment and Inverter, invokes the Analyzer chain, and feeds low-level DataWriters such as PostingListWriter and DocWriter.

The sub-components of a SegWriter are determined by Architecture. DataWriter components added to the stack of writers via add_writer() have add_inverted_doc() invoked for each document supplied to SegWriter’s add_doc().

METHODS

register

    $seg_writer->register(
        api       => $api,        # required
        component => $component,  # required
    );

Register a DataWriter component with the SegWriter. Registration only makes the writer available via fetch(); for the writer to be fed documents, also call add_writer().

  • api - The name of the DataWriter API which the component implements.

  • component - A DataWriter.
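
Registration usually happens inside an Architecture subclass, which is where a SegWriter's sub-components are chosen. The sketch below follows the pattern used in the Lucy::Plan::Architecture documentation. The package name MyArchitecture, the choice of LucyX::Index::ZlibDocWriter as the component, and the api string are illustrative assumptions, not requirements of SegWriter.

```perl
package MyArchitecture;
use strict;
use warnings;
use base qw( Lucy::Plan::Architecture );

use LucyX::Index::ZlibDocWriter;

# Invoked while a SegWriter's sub-components are being assembled.
sub register_doc_writer {
    my ( $self, $seg_writer ) = @_;

    # Build the component from the SegWriter's own snapshot, segment,
    # and polyreader (accessors inherited from DataWriter).
    my $doc_writer = LucyX::Index::ZlibDocWriter->new(
        snapshot   => $seg_writer->get_snapshot,
        segment    => $seg_writer->get_segment,
        polyreader => $seg_writer->get_polyreader,
    );

    # Make the component retrievable via fetch() ...
    $seg_writer->register(
        api       => "Lucy::Index::DocWriter",    # assumed api name
        component => $doc_writer,
    );

    # ... and push it onto the stack so it is fed each inverted document.
    $seg_writer->add_writer($doc_writer);
}

1;
```

Calling register() without add_writer() would leave the component reachable through fetch() but never fed any documents.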

fetch

    my $obj = $seg_writer->fetch($api);

Retrieve a registered component.

  • api - The name of the DataWriter API which the component implements.

add_writer

    $seg_writer->add_writer($writer);

Add a DataWriter to the SegWriter’s stack of writers.

add_doc

    $seg_writer->add_doc(
        doc   => $doc,    # required
        boost => $boost,  # default: 1.0
    );

Add a document to the segment. Inverts doc, increments the Segment’s internal document id, then calls add_inverted_doc(), feeding all sub-writers.
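
To make the relationship between register(), fetch(), add_writer(), and add_doc() concrete, here is a toy model in plain Perl. It does not use Lucy and is not how SegWriter is implemented. ToyWriter and ToySegWriter are hypothetical classes that mimic only the bookkeeping described above, and the "inversion" step is reduced to wrapping the document in a hash.

```perl
use strict;
use warnings;

# Toy stand-in for a DataWriter: it only records the doc ids it is fed.
package ToyWriter;
sub new { my ( $class, $name ) = @_; return bless { name => $name, seen => [] }, $class }

sub add_inverted_doc {
    my ( $self, %args ) = @_;
    push @{ $self->{seen} }, $args{doc_id};
}

# Toy stand-in for SegWriter, modelling only registry, stack, and doc ids.
package ToySegWriter;
sub new { return bless { by_api => {}, writers => [], doc_count => 0 }, shift }

sub register {
    my ( $self, %args ) = @_;
    $self->{by_api}{ $args{api} } = $args{component};    # lookup only
}
sub fetch      { my ( $self, $api )    = @_; return $self->{by_api}{$api} }
sub add_writer { my ( $self, $writer ) = @_; push @{ $self->{writers} }, $writer }

sub add_doc {
    my ( $self, %args ) = @_;
    # Placeholder for inversion: bundle the doc with its boost.
    my $inverted = { doc => $args{doc}, boost => $args{boost} // 1.0 };
    my $doc_id   = ++$self->{doc_count};    # ids start at 1
    $_->add_inverted_doc( inverter => $inverted, doc_id => $doc_id )
        for @{ $self->{writers} };
    return $doc_id;
}

package main;

my $seg_writer = ToySegWriter->new;
my $stacked    = ToyWriter->new('stacked');
my $lookup     = ToyWriter->new('lookup_only');

# Registered and added to the stack: receives every document.
$seg_writer->register( api => 'Toy::Stacked', component => $stacked );
$seg_writer->add_writer($stacked);

# Registered only: retrievable via fetch(), but never fed.
$seg_writer->register( api => 'Toy::LookupOnly', component => $lookup );

$seg_writer->add_doc( doc => { title => 'first' } );
$seg_writer->add_doc( doc => { title => 'second' }, boost => 2.0 );

printf "%s saw doc ids: %s\n", $_->{name}, join( ',', @{ $_->{seen} } ) || '(none)'
    for $stacked, $lookup;
```

In this model the writer that was only registered sees no documents, which mirrors the note under register() about also calling add_writer().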

add_segment

    $seg_writer->add_segment(
        reader  => $reader,   # required
        doc_map => $doc_map,  # default: undef
    );

Add content from an existing segment into the one currently being written.

  • reader - The SegReader containing content to add.

  • doc_map - An array of integers mapping old document ids to new. Deleted documents are mapped to 0, indicating that they should be skipped.
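
The doc_map convention can be illustrated with a plain Perl array. This is a simplified sketch: in Lucy the map is passed as an object rather than a bare array. The sample values, the helper name surviving_docs, and the assumption that slot 0 is unused because document ids start at 1 are all illustrative.

```perl
use strict;
use warnings;

# Hypothetical map for an old segment holding documents 1..5, where
# documents 2 and 4 were deleted. Slot 0 is unused (assumed: ids start at 1).
my @doc_map = ( 0, 1, 0, 2, 0, 3 );

# Return [old_id, new_id] pairs for every document that survives the merge.
sub surviving_docs {
    my ($doc_map) = @_;
    my @pairs;
    for my $old_id ( 1 .. $#{$doc_map} ) {
        my $new_id = $doc_map->[$old_id];
        next if $new_id == 0;    # mapped to 0: deleted, so skip it
        push @pairs, [ $old_id, $new_id ];
    }
    return @pairs;
}

print "old $_->[0] -> new $_->[1]\n" for surviving_docs( \@doc_map );
```

A writer's add_segment() implementation would perform a similar check on each old document id before copying that document's data into the new segment.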

finish

    $seg_writer->finish();

Complete the segment: close all streams, store metadata, etc.

INHERITANCE

Lucy::Index::SegWriter isa Lucy::Index::DataWriter isa Clownfish::Obj.