Marvin Humphrey
and 1 contributor

Changes for version 0.30_01

  • Highlights:
    • Many new classes and methods.
    • Improved Searcher open times and decreased process memory footprint.
    • Improved sorting support.
    • Improved subclassing support.
    • Improved indexing speed.
    • Schemas serialized and stored with indexes.
    • Improved pluggability.
    • Expanded tutorial documentation.
    • Restored Windows compatibility.
  • New public classes:
    • KinoSearch::Architecture
    • KinoSearch::Doc
    • KinoSearch::Doc::HitDoc
    • KinoSearch::Indexer (replaces InvIndexer)
    • KinoSearch::FieldType (replaces FieldSpec)
    • KinoSearch::FieldType::BlobType
    • KinoSearch::FieldType::FullTextType (replaces FieldSpec::text)
    • KinoSearch::FieldType::StringType
    • KinoSearch::Highlight::HeatMap
    • KinoSearch::Index::DataReader
    • KinoSearch::Index::DataWriter
    • KinoSearch::Index::DocReader
    • KinoSearch::Index::Lexicon
    • KinoSearch::Index::LexiconReader
    • KinoSearch::Index::PolyReader
    • KinoSearch::Index::PostingList
    • KinoSearch::Index::PostingsReader
    • KinoSearch::Index::Segment
    • KinoSearch::Index::SegReader
    • KinoSearch::Index::SegWriter
    • KinoSearch::Index::Snapshot
    • KinoSearch::Obj
    • KinoSearch::Search::ANDQuery
    • KinoSearch::Search::Compiler
    • KinoSearch::Search::HitCollector
    • KinoSearch::Search::HitCollector::BitCollector
    • KinoSearch::Search::LeafQuery
    • KinoSearch::Search::MatchAllQuery
    • KinoSearch::Search::Matcher
    • KinoSearch::Search::NoMatchQuery
    • KinoSearch::Search::NOTQuery
    • KinoSearch::Search::ORQuery
    • KinoSearch::Search::PolyQuery
    • KinoSearch::Search::RangeQuery (replaces RangeFilter)
    • KinoSearch::Search::RequiredOptionalQuery
    • KinoSearch::Search::SortRule (factored out of SortSpec)
    • KinoSearch::Search::Span
    • KinoSearch::Util::BitVector
    • KSx::Index::ByteBufDocReader
    • KSx::Index::ByteBufDocWriter
    • KSx::Index::ZlibDocReader
    • KSx::Index::ZlibDocWriter
    • KSx::Search::MockScorer
  • New/updated documentation:
    • KinoSearch::Docs::Tutorial::Simple (updated)
    • KinoSearch::Docs::Tutorial::BeyondSimple (updated)
    • KinoSearch::Docs::Tutorial::FieldType (new)
    • KinoSearch::Docs::Tutorial::Analysis (new)
    • KinoSearch::Docs::Tutorial::Highlighter (new)
    • KinoSearch::Docs::Tutorial::QueryObjects (new)
    • KinoSearch::Docs::Cookbook::CustomQuery (new)
    • KinoSearch::Docs::Cookbook::CustomQueryParser (new)
    • KinoSearch::Docs::DocIDs (new)
  • Removed/redacted/replaced:
    • KinoSearch::Analysis::Token - redacted pending API overhaul.
    • KinoSearch::Analysis::TokenBatch - redacted pending API overhaul.
    • KinoSearch::Docs::DevGuide - removed.
    • KinoSearch::FieldSpec - replaced by FieldType.
    • KinoSearch::FieldSpec::text - replaced by FullTextType and StringType.
    • KinoSearch::Highlight::Encoder - rolled into Highlighter.
    • KinoSearch::Highlight::Formatter - rolled into Highlighter.
    • KinoSearch::Highlight::SimpleHTMLEncoder - rolled into Highlighter.
    • KinoSearch::Highlight::SimpleHTMLFormatter - rolled into Highlighter.
    • KinoSearch::Index::Term - removed. Now any object can be a term.
    • KinoSearch::InvIndex - removed.
    • KinoSearch::InvIndexer - replaced by Indexer.
    • KinoSearch::Posting - redacted pending API overhaul.
    • KinoSearch::Posting::MatchPosting - redacted pending API overhaul.
    • KinoSearch::Posting::RichPosting - redacted pending API overhaul.
    • KinoSearch::Posting::ScorePosting - redacted pending API overhaul.
    • KinoSearch::Search::BooleanQuery - replaced by ANDQuery, ORQuery, NOTQuery, and RequiredOptionalQuery.
    • KinoSearch::Search::Filter - removed. Filtering can now be achieved via ANDQuery, NOTQuery, etc.
    • KinoSearch::Search::PolyFilter - removed.
    • KinoSearch::Search::QueryFilter - replaced by KSx::Search::Filter.
    • KinoSearch::Search::RangeFilter - replaced by RangeQuery.
    • KinoSearch::Util::Class - removed.
    • KinoSearch::Util::ToolSet - permanently redacted.
  • Renamed:
    • KinoSearch::Analysis::LCNormalizer => KinoSearch::Analysis::CaseFolder
    • KinoSearch::Search::SearchServer => KSx::Remote::SearchServer
    • KinoSearch::Search::SearchClient => KSx::Remote::SearchClient
    • KinoSearch::Simple => KSx::Simple
    • KinoSearch::Search::MultiSearcher => KinoSearch::Search::PolySearcher
  • API Changes:
    • KinoSearch::Analysis::Analyzer
      • analyze_batch() - redacted pending API overhaul.
    • KinoSearch::Analysis::PolyAnalyzer
      • get_analyzers() - added.
    • KinoSearch::Analysis::Tokenizer
      • new() - parameter "token_re" replaced by "pattern".
    • KinoSearch::Highlight::Highlighter
      • Highlighter objects are now single-field.
      • Fields must now be marked as "highlightable" at index time via their FieldType.
      • Excerpts are now created manually rather than automatically inserted via the Hits class.
      • new() - now takes four params instead of none: "searchable", "field", "query", and "excerpt_length".
      • add_spec() - removed.
      • create_excerpt(), highlight(), encode(), set_pre_tag(), get_pre_tag(), set_post_tag(), get_post_tag(), get_searchable(), get_query(), get_compiler(), get_excerpt_length(), get_field() - added.
    • KinoSearch::Index::IndexReader
      • open() - takes an "index" (string filepath or Folder object) instead of an "invindex", plus an optional "snapshot". Always returns a PolyReader (instead of an unspecified IndexReader subclass).
      • max_doc() - replaced by doc_max(), which has slightly different semantics since doc ids now start at 1 rather than 0.
      • num_docs() - renamed to doc_count().
      • del_count(), seg_readers(), offsets(), fetch(), obtain() - added.
    • KinoSearch::Indexer (replaces KinoSearch::InvIndexer)
      • new() - parameters changed. Old: "invindex", "lock_factory". New: "schema", "index", "create", "truncate", "lock_factory".
      • add_doc() - now takes either a hash ref or a Doc object, and optionally takes labeled params.
      • finish() - refactored into commit(), prepare_commit(), and optimize().
      • add_invindexes() - replaced by add_index().
      • delete_by_term() - now takes labeled parameters rather than positional args.
      • delete_by_query() - added.
      • takes "index" (a string filepath or Folder object), "lock_factory", and
    • KinoSearch::QueryParser
      • tree(), expand(), expand_leaf(), prune(), make_term_query(), make_phrase_query(), make_and_query(), make_or_query(), make_not_query(), make_req_opt_query() - added.
    • KinoSearch::Schema
      • No longer an abstract class.
      • "%fields" hash eliminated.
      • Now gets serialized as JSON and stored with index.
      • clobber(), open(), read() - removed.
      • analyzer() - removed.
      • similarity() - removed.
      • pre_sort() - removed.
      • add_field() - replaced by spec_field(), which associates a field name with a FieldType object rather than a class name.
      • num_fields(), all_fields(), fetch_type(), fetch_sim(), architecture(), get_architecture(), get_similarity() - added.
    • KinoSearch::Search::Hits
      • fetch_hit_hashref() - replaced by next(), which returns a HitDoc by default.
      • create_excerpts() - removed.
    • KinoSearch::Search::PhraseQuery
      • new() - now takes params "field" and "terms".
      • add_term() - removed.
      • get_field(), get_terms() - added.
    • KinoSearch::Search::PolySearcher (formerly MultiSearcher)
      • Now supports SortSpec.
    • KinoSearch::Search::Query
      • make_compiler() - added.
    • KinoSearch::Search::Searchable
      • search() - renamed to hits().
      • new(), glean_query(), get_schema(), collect(), doc_max(), doc_freq(), fetch_doc() - added.
    • KinoSearch::Search::SortSpec
      • new() - takes new param "rules", an array of SortRules.
      • add() - removed.
    • KinoSearch::Search::TermQuery
      • new() - now takes "field" and "term" (which is a string rather than a Term object as before).
    • KinoSearch::Searcher
      • new() - now takes "index" (a string filepath, a Folder object, or an IndexReader object), rather than "invindex" or "reader".
      • search() - renamed to hits().
      • set_prune_factor() - removed.
      • collect(), doc_max(), doc_freq(), fetch_doc(), get_schema() - added.
  • Subclassing improvements:
    • Although KinoSearch is now implemented almost entirely in C, pure-Perl dynamic subclassing is supported. (Public methods which are overridden in pure-Perl subclasses are automatically detected and invoked as callbacks by the internal KS object engine.)
  • Significant internal changes:
    • All classes now implemented in C, with Perl and XS only where necessary.
    • Doc IDs now start at 1 rather than 0.
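
A rough migration sketch follows, built only from names in the entries above: a concrete Schema with spec_field(), the Indexer parameters "schema", "index", and "create", add_doc() with a hash ref, commit(), Searcher's "index" parameter, hits(), and Hits' next(). The index path, field names, and sample documents are hypothetical. Several details are assumptions that this changelog does not confirm and that should be checked against the module documentation for this release: the "name"/"type" labels passed to spec_field(), the "analyzer" parameter of the FullTextType constructor (class name as given in the "Removed/redacted/replaced" entry), PolyAnalyzer's "language" parameter, the "query" label passed to hits(), and hash-style field access on the returned HitDoc.

```perl
use strict;
use warnings;

use KinoSearch::Schema;
use KinoSearch::Indexer;
use KinoSearch::Searcher;
use KinoSearch::Analysis::PolyAnalyzer;
use KinoSearch::FieldType::FullTextType;

# Hypothetical index location and sample data.
my $path_to_index = '/path/to/index';
my %source_docs   = (
    'First title'  => 'some body text',
    'Second title' => 'more body text',
);

# Schema is no longer abstract: instantiate it directly and attach
# FieldType objects with spec_field() instead of add_field().
my $schema   = KinoSearch::Schema->new;
my $analyzer = KinoSearch::Analysis::PolyAnalyzer->new(
    language => 'en',          # assumed parameter
);
my $type = KinoSearch::FieldType::FullTextType->new(
    analyzer => $analyzer,     # assumed parameter
);
$schema->spec_field( name => 'title',   type => $type );   # labels assumed
$schema->spec_field( name => 'content', type => $type );

# Indexer replaces InvIndexer; "invindex" becomes "schema" plus "index".
my $indexer = KinoSearch::Indexer->new(
    schema => $schema,
    index  => $path_to_index,
    create => 1,
);
while ( my ( $title, $content ) = each %source_docs ) {
    # add_doc() accepts a plain hash ref (or a Doc object).
    $indexer->add_doc( { title => $title, content => $content } );
}
# finish() is gone; commit() is one of its replacements.
$indexer->commit;

# Searcher takes "index" rather than "invindex"; search() is now hits().
my $searcher = KinoSearch::Searcher->new( index => $path_to_index );
my $hits     = $searcher->hits( query => 'body' );   # "query" label assumed

# fetch_hit_hashref() is replaced by next(), which returns a HitDoc.
while ( my $hit = $hits->next ) {
    print "$hit->{title}\n";    # hash-style access is an assumption
}
```

Code written against the old InvIndexer and Searcher->search() interfaces would need each of these call sites updated; the sketch does not cover Highlighter, SortSpec, or the query classes.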
