Revision history for KinoSearch

0.315 2012-04-16

    * Omit LockFreeRegistry test from CPAN distro.

0.314 2012-04-15

    * Publicize KinoSearch deprecation and direct users towards Apache Lucy.
    * Make various fixes to deal with changes to object destruction in Perl
    * Omit flaky POD checker test from CPAN distro.

0.313 2011-03-24

    * Make FieldType checking more stringent.
    * Fix a latent bug which might theoretically result in incorrect search
      results on highly customized systems.
    * Fix a test which was failing on automated testing systems.

0.312 2011-03-23

    * Fix several bugs related to locking: insufficient retries, stale locks
      left behind or not cleaned up, etc.
    * Guard against excessive search size requests.
    * Skip a test which was prone to failure on automated testing systems.
    * Tweak META.yml so that the Clownfish build files get excluded properly.

0.311 2010-12-28

    * Disallow "\p" constructs in Tokenizer regexes because of a Perl security
      problem affecting untrusted indexes.
    * Fix a sentence boundary detection bug in Highlighter.
    * Fix a stoplist problem in QueryParser.
    * Improve compatibility with XFS file systems.
    * Eliminate non-core dependencies required by Tutorial.

0.31  2010-10-27

  Overview of changes since release 0.165:

    KinoSearch 0.31 is a major upgrade, adding numerous features and
    improvements:

    * Greatly increased speed.
    * Near-real-time indexing.
    * Sorting by field value.
    * Range queries.
    * Robust Unicode support.
    * Improved subclassing support.
    * "Lightweight Searchers", which open quickly and have low process RAM
      requirements.
    * Expanded public API (increasing from 33 to 78 public classes).
    * Refined and improved schema, query-building, and document APIs.
    * Expanded tutorial and cookbook documentation.

    Release 0.31 is not backwards compatible with the 0.1x branch in terms of
    either file format or API.  Users who require the functionality of 0.1x
    should consider the stable fork "KinoSearch1".

  Major internal changes:

    * Almost all core module code ported to C.
    * Greatly increased integration with the OS file system cache.
    * Schema data serialized and stored with index.
    * New internal object model ("Clownfish").

  New classes:

    * KinoSearch::Document::HitDoc
    * KinoSearch::Index::BackgroundMerger
    * KinoSearch::Index::DataReader
    * KinoSearch::Index::DataWriter
    * KinoSearch::Index::DeletionsWriter
    * KinoSearch::Index::DocReader
    * KinoSearch::Index::IndexManager
    * KinoSearch::Index::IndexReader
    * KinoSearch::Index::Lexicon
    * KinoSearch::Index::LexiconReader
    * KinoSearch::Index::PolyReader
    * KinoSearch::Index::PostingList
    * KinoSearch::Index::PostingListReader
    * KinoSearch::Index::Segment
    * KinoSearch::Index::SegReader
    * KinoSearch::Index::SegWriter
    * KinoSearch::Index::Similarity
    * KinoSearch::Index::Snapshot
    * KinoSearch::Object::BitVector
    * KinoSearch::Object::Err
    * KinoSearch::Object::Obj
    * KinoSearch::Plan::Architecture
    * KinoSearch::Plan::BlobType
    * KinoSearch::Plan::FieldType
    * KinoSearch::Plan::FullTextType
    * KinoSearch::Plan::Schema
    * KinoSearch::Plan::StringType
    * KinoSearch::Search::ANDQuery
    * KinoSearch::Search::Collector
    * KinoSearch::Search::Collector::BitCollector
    * KinoSearch::Search::Compiler
    * KinoSearch::Search::LeafQuery
    * KinoSearch::Search::MatchAllQuery
    * KinoSearch::Search::Matcher
    * KinoSearch::Search::NoMatchQuery
    * KinoSearch::Search::NOTQuery
    * KinoSearch::Search::ORQuery
    * KinoSearch::Search::PolyQuery
    * KinoSearch::Search::QueryParser
    * KinoSearch::Search::RangeQuery
    * KinoSearch::Search::RequiredOptionalQuery
    * KinoSearch::Search::Searcher
    * KinoSearch::Search::SortRule
    * KinoSearch::Search::SortSpec
    * KinoSearch::Search::Span
    * KinoSearch::Store::Lock
    * KinoSearch::Store::LockErr
    * KinoSearch::Store::LockFactory
    * KSx::Index::ByteBufDocReader
    * KSx::Index::ByteBufDocWriter
    * KSx::Index::LongFieldSim
    * KSx::Index::ZlibDocReader
    * KSx::Index::ZlibDocWriter
    * KSx::Search::MockScorer
    * KSx::Search::ProximityQuery
    * KSx::Simple

  New documentation:

    * KinoSearch::Docs::Cookbook
    * KinoSearch::Docs::Cookbook::CustomQuery
    * KinoSearch::Docs::Cookbook::CustomQueryParser
    * KinoSearch::Docs::Cookbook::FastUpdates
    * KinoSearch::Docs::DocIDs
    * KinoSearch::Docs::FileLocking
    * KinoSearch::Docs::IRTheory
    * KinoSearch::Docs::Tutorial::Analysis
    * KinoSearch::Docs::Tutorial::BeyondSimple
    * KinoSearch::Docs::Tutorial::FieldType
    * KinoSearch::Docs::Tutorial::Highlighter
    * KinoSearch::Docs::Tutorial::QueryObjects
    * KinoSearch::Docs::Tutorial::Simple

  Moved classes:

    * KinoSearch::InvIndexer               -> KinoSearch::Index::Indexer
    * KinoSearch::Searcher                 -> KinoSearch::Search::IndexSearcher
    * KinoSearch::Analysis::LCNormalizer   -> KinoSearch::Analysis::CaseFolder
    * KinoSearch::QueryParser::QueryParser -> KinoSearch::Search::QueryParser
    * KinoSearch::Search::MultiSearcher    -> KinoSearch::Search::PolySearcher
    * KinoSearch::Search::QueryFilter      -> KSx::Search::Filter
    * KinoSearch::Search::SearchClient     -> KSx::Remote::SearchClient
    * KinoSearch::Search::SearchServer     -> KSx::Remote::SearchServer
    * KinoSearch::Store::InvIndex          -> KinoSearch::Store::Folder
    * KinoSearch::Store::FSInvIndex        -> KinoSearch::Store::FSFolder
    * KinoSearch::Store::RAMInvIndex       -> KinoSearch::Store::RAMFolder

  Removed/redacted classes:

    * KinoSearch::Analysis::Token - Redacted.
    * KinoSearch::Analysis::TokenBatch - Redacted.
    * KinoSearch::Document::Field - Removed.
    * KinoSearch::Highlight::Encoder - See Highlighter.
    * KinoSearch::Highlight::Formatter - See Highlighter.
    * KinoSearch::Highlight::SimpleHTMLEncoder - See Highlighter.
    * KinoSearch::Highlight::SimpleHTMLFormatter - See Highlighter.
    * KinoSearch::Index::Term - Removed.
    * KinoSearch::Search::BooleanQuery - See ANDQuery, ORQuery,
      NOTQuery, RequiredOptionalQuery, and PolyQuery.
    * KinoSearch::Search::Hit - See HitDoc.

  API Changes:

    * KinoSearch::Index::Indexer (formerly KinoSearch::InvIndexer)
      o Modified:
        * new() - Args changed.  
          o Replaced: "invindex" -> "index".
          o Added: "schema", "truncate", "manager".
          o Removed: "analyzer".
      o Removed:
        * spec_field() - See Schema.
        * new_doc() - See Doc->new.
        * finish() - See commit(), prepare_commit(), optimize().
        * delete_docs_by_term() - See delete_by_term().
        * add_invindexes() - See add_index().
      o Added:
        * commit()
        * prepare_commit()
        * optimize()
        * add_index()
        * delete_by_term()
        * delete_by_query()

    * KinoSearch::Search::IndexSearcher (formerly KinoSearch::Searcher)
      o New behaviors:
        * Searcher objects are now "lightweight" (or rather, the IndexReader
          objects they wrap are lightweight) -- they cache index data via the
          OS file system cache rather than in process RAM, allowing them to
          open quickly and share memory across multiple objects.
      o Modified:
        * new() - Args changed.
          o Removed: "analyzer", "invindex".
          o Added: "index".
      o Renamed:
        * search() -> hits()
      o Added:
        * collect()
        * doc_max()
        * doc_freq()
        * get_schema()
        * get_reader()

    * KinoSearch::Analysis::CaseFolder (formerly LCNormalizer)
      o Modified:
        * new() - no-op parameter "language" removed.

    * KinoSearch::Analysis::Stopalizer
      o Modified:
        * new() - The values of a supplied "stoplist" hash are now ignored.

    * KinoSearch::Analysis::Tokenizer
      o Modified:
        * new() - parameter "token_re" replaced by "pattern".

    * KinoSearch::Document::Doc
      o New behavior: field values accessible via hashref overloading.
      o Removed:
        * set_value()
        * get_value()
      o Added:
        * new()
        * get_fields()
        * get_doc_id()

    * KinoSearch::Highlight::Highlighter
      o Modified:
        * new() - Args changed.  
          o Added: "query", "searcher".
          o Removed: "formatter", "encoder", "pre_tag", "post_tag".
          o Replaced: "excerpt_field" -> "field".
      o Added:
        * set_pre_tag()
        * get_pre_tag()
        * set_post_tag()
        * get_post_tag()
        * get_searcher()
        * get_query()
        * get_compiler()
        * get_excerpt_length()
        * get_field()

    * KinoSearch::Search::QueryParser 
      (formerly KinoSearch::QueryParser::QueryParser)
      o Changed behaviors: 
        * Parsing of 'fieldname:value' constructs disabled by
          default, enabled via set_heed_colons().
      o Modified:
        * new() - Args changed.  
          o Added: "schema".
          o Removed: "default_field".
      o Added:
        * parse()
        * tree()
        * expand()
        * expand_leaf()
        * prune()
        * set_heed_colons()
        * make_term_query()
        * make_phrase_query()
        * make_and_query()
        * make_or_query()
        * make_not_query()
        * make_req_opt_query()

    * KinoSearch::Search::PolySearcher 
      (formerly KinoSearch::Search::MultiSearcher)
      o Modified:
        * new() - Args changed.  
          o Added: "schema".
          o Removed: "analyzer".
          o Renamed: "searchables" -> "searchers"
      o Added:
        * doc_max()
        * doc_freq()
        * fetch_doc()
        * get_schema()
      o Renamed: search() -> hits()

    * KinoSearch::Search::Query 
      o Added:
        * make_compiler()

    * KinoSearch::Search::PhraseQuery 
      o Modified:
        * new() - Args changed.  
          o Added: "terms", "field".
      o Added:
        * get_field()
        * get_terms()
      o Removed:
        * add_term()

    * KinoSearch::Search::TermQuery
      o Modified:
        * new() - now accepts "field" and a text "term" as arguments, rather
          than a "term" which is a Term object combining field and term text.

    * KinoSearch::Store::FSFolder
      o Modified:
        * new() - Args changed.
          o Removed: "create".

    * KSx::Remote::SearchClient (formerly KinoSearch::Search::SearchClient)
      o Modified:
        * new() - Args changed.  
          o Removed: "analyzer".

    * KSx::Remote::SearchServer (formerly KinoSearch::Search::SearchServer)
      o Modified:
        * new() - Args changed.  
          o Replaced: "searchable" -> "searcher".

0.30_13 2010-10-25

  API changes:

    * KinoSearch::Search::Searcher and its subclasses (IndexSearcher,
      PolySearcher, SearchClient)
      o Modified:
        * fetch_doc() - Args changed.
          o Now takes one argument, a doc id, instead of three labeled params.

    * KinoSearch::Index::DocReader
      o Replaced:
        * fetch() -> fetch_doc()

    * KinoSearch::Plan::Architecture
      o Added:
        * new()

    * KinoSearch::Search::TermQuery
      o Added:
        * get_term()
        * get_field()

    * KinoSearch::Document::Doc
      o Added:
        * set_doc_id()
        * get_doc_id()


    * KinoSearch::Highlight::HeatMap.


    * Spurious failure on Indexer commit() when some docs contain field values
      of empty string eliminated.


    * Documentation brought up to date and corrected in a few places.

0.30_122 2010-10-06


    * Tune merge algorithm to avoid another situation where too much content
      would be recycled.

0.30_121 2010-09-29


    * Improve build portability for various systems.

0.30_12  2010-09-24

  Backwards-incompatible API changes:

    * KinoSearch::Plan::BlobType
      o new() -- new param 'stored', which is false by default.  Indexes which
        have BlobType fields must be regenerated.


    * Improve Solaris compatibility.
    * Make BackgroundMerger fail gracefully when it can't get a write lock.

0.30_112 2010-08-27


    * Prevent accumulation of dead files for indexes initialized by versions
      of KinoSearch prior to 0.30_11.

0.30_111 2010-08-26


    * Improve error messages when locking fails.
    * Improve reliability for Schema and FieldType compatibility stubs.
    * Various portability fixes.

0.30_11  2010-08-19

  New features:

    * KinoSearch::Search::QueryParser now supports escaping double quotes.

    * KinoSearch::Plan::FieldType
      o Removed set_boost(), set_indexed(), set_store(). 
      o Added sortable().

  Moved, but compatibility stubs retained:

    * KinoSearch::Search::Similarity -> KinoSearch::Index::Similarity
    * KSx::Search::LongFieldSim      -> KSx::Index::LongFieldSim 


    * Sorting problem with complex sort specs on x86.
    * Improved segment recycling to avoid pathological state on large indexes.
    * Add missing compatibility stub for KinoSearch::Doc (now
      KinoSearch::Document::Doc).
    * Various esoteric bugs in QueryParser.

0.30_101 2010-05-01


    * r6044, r6046-6048: fixes to Build process for compiler values
      other than 'cc'.
    * r6078: fix BitCollector bug, where segment offset was ignored.
    * r6079-6081: Fix problem with corruption of the Perl stack by 

0.30_10 2010-03-29

    * Endian portability issue solved for index sort caches.
    * Security issue solved for untrusted indexes (potential file deletion,
      see commit r5970).


    * Index version bumped.

0.30_09 2010-03-26

  New public classes:

    * KSx::Search::ProximityQuery

  New documentation:

    * KinoSearch - new backwards compatibility policy.
    * KinoSearch::Docs::DevGuide


    * Lower memory consumption while indexing sortable fields.
    * Lower address space requirements for very large indexes with sortable
      fields.
    * Improved error reporting for incorrectly implemented subclasses.

  Moved, but compatibility stubs retained:

    * KinoSearch::Schema                  -> KinoSearch::Plan::Schema
    * KinoSearch::Indexer                 -> KinoSearch::Index::Indexer
    * KinoSearch::Searcher                -> KinoSearch::Search::IndexSearcher
    * KinoSearch::FieldType               -> KinoSearch::Plan::FieldType
    * KinoSearch::FieldType::BlobType     -> KinoSearch::Plan::BlobType
    * KinoSearch::FieldType::FullTextType -> KinoSearch::Plan::FullTextType
    * KinoSearch::FieldType::StringType   -> KinoSearch::Plan::StringType
    * KinoSearch::QueryParser             -> KinoSearch::Search::QueryParser
    * KinoSearch::Doc                     -> KinoSearch::Document::Doc
    * KinoSearch::Doc::HitDoc             -> KinoSearch::Document::HitDoc
    * KinoSearch::Search::Searchable      -> KinoSearch::Search::Searcher
    * KinoSearch::Search::HitCollector    -> KinoSearch::Search::Collector
    * KinoSearch::Search::HitCollector::BitCollector
                                -> KinoSearch::Search::Collector::BitCollector

  Public API Changes:

    * KinoSearch::Highlight::Highlighter
      o new() - param "searchable" replaced by "searcher".
      o get_searchable() - replaced by get_searcher().

    * KinoSearch::Search::PolySearcher
      o new() - param "searchables" replaced by "searchers".

    * KinoSearch::Search::Query
      o make_compiler() - parameter "searchable" replaced by "searcher".

    * KinoSearch::Search::Compiler
      o new() - parameter "searchable" replaced by "searcher".
      o highlight_spans() - parameter "searchable" replaced by "searcher".

0.30_083 2010-03-03


    * r5835: Add missing NULL-termination to Charmonizer strdup() clone.
    * r5883, r5884: fix missing get_analyzers() method in PolyAnalyzer.

0.30_082 2010-01-30

    * Improve compatibility with Perl 5.11.x.

0.30_081 2010-01-29

    * Improve compatibility with OS X 10.4 Tiger.


    * Make build process less verbose.

0.30_08 2010-01-28


    * FullTextType fields may now be sortable.
    * QueryParser's parse() method now invokes tree(), expand(), and prune()
      as methods, so that overriding one of them in a subclass affects
      parse().


    * Search speed improvement for AND conjunctions: the "skipping"
      optimization for posting lists has been fixed and re-enabled.
    * Fixed a problem where a stale, invalid "write.lock.temp" written after
      disk fillup could block subsequent indexing.
    * Permission problems with subfolders in indexes now trigger exceptions
      rather than segfaults.

  Moved, but compatibility stubs retained:

    * KinoSearch::Architecture   -> KinoSearch::Plan::Architecture
    * KinoSearch::Obj            -> KinoSearch::Object::Obj
    * KinoSearch::Obj::BitVector -> KinoSearch::Object::BitVector


    * KinoSearch::Index::PostingsReader 
        -> KinoSearch::Index::PostingListReader

  Classes with API Changes:

    * KinoSearch::Index::IndexManager
      o new() - "hostname" param replaced by "host".
      o get_hostname() - replaced by get_host().

    * KinoSearch::Store::LockFactory
      o new() - "hostname" param replaced by "host".

    * KinoSearch::Store::Lock
      o new() - "hostname" param replaced by "host".

0.30_072 2009-12-23


    * Update XS binding code for improved compatibility with Perl 5.11.x.

0.30_071 2009-12-16


    * Fix an intermittent problem with lost deletions.
    * Handle UTF-8 hash keys properly within JSON writing code, fixing
      serialization of non-English stoplists.
    * Fix a build-time memory error that could cause some platforms (e.g.
      FreeBSD 7.x) to abort the build.
    * Update Tokenizer for compatibility with Perl 5.11.
    * Improve compatibility with unrecognized compiler platforms.
    * Fix a theoretical bug with truncated offsets for index files > 2 GB.
    * Reword and de-glitch the FastUpdates cookbook entry.

0.30_07  2009-08-30


    * Revisit the bug in IndexManager's recycle().  The 0.30_06 fix attempt
      had made it manifest less often, but had not completely eliminated it.
    * Throw an error in Indexer if recycle() returns duplicated segments.

0.30_06  2009-08-17


    * Solve a fencepost error in IndexManager's recycle() method which could
      cause document duplication and lost deletions.

0.30_05  2009-08-06

    * Support for near-real-time indexing.

  New public classes:

    * KinoSearch::Index::IndexManager
    * KinoSearch::Index::BackgroundMerger
    * KinoSearch::Index::DeletionsWriter
    * KinoSearch::Obj::Err
    * KinoSearch::Store::LockErr

  New documentation:

    * KinoSearch::Docs::Cookbook::FastUpdates

  API changes:

    * KinoSearch::Indexer
      o new() - param "lock_factory" replaced by param "manager"

    * KinoSearch::Index::IndexReader
      o open() - param "lock_factory" replaced by param "manager"

    * KinoSearch::Index::SegReader 
      o get_seg_num() - added.
      o get_seg_name() - added.

    * KinoSearch::Highlight::Highlighter
      o Three dots replaced by Unicode ellipsis.

    * KinoSearch::Store::Lock
      o Now an abstract class.
      o new() 
        * Now an abstract constructor.
        * param "agent_id" renamed to "hostname".
      o get_agent_id() - replaced by get_hostname().
      o request() - added.
      o shared() - added.

    * KinoSearch::Store::LockFactory
      o new() - param "agent_id" renamed to "hostname"
      o make_shared_lock() - Now returns a Lock (instead of a SharedLock).


    * KinoSearch::Store::SharedLock

    * KinoSearch::Util::BitVector -> KinoSearch::Object::BitVector.
      (Compatibility subclass left in place for now.)


    * Fields with empty strings could produce corrupt Lexicons.
    * Segment data files (cf.dat) over 2 GB could cause search-time crashes.


    * File-format compatible with 0.30_04.

0.30_04  2009-07-05

    * Stemmer had been malfunctioning, producing incorrect stems in some cases
      and bailing out with "invalid UTF-8" errors in others.  Bug found and
      solution proposed by Nick Wellnhofer.


    * Memory mapping implemented for Windows, so now that platform gets fast
      Searcher opens and minimized search-time process memory footprint too.


    * Indexes that utilize Stemmer must be regenerated.

0.30_03  2009-07-03


    * Fix a problem in SortCollector that led to ranking errors for some
    * Eliminate a symbol conflict with MSVC.

0.30_02  2009-06-29

  API Changes:

    * KinoSearch::Indexer
      o new() - "schema" argument now required only at index creation.

    * KinoSearch::QueryParser::QueryParser - ancient compatibility stub
      redacted, use KinoSearch::QueryParser instead.

    * Various C compatibility tweaks for Solaris, PowerPC Linux, etc. 


    * Index version bumped.

0.30_01  2009-06-18


    * Many new classes and methods.
    * Improved Searcher open times and decreased process memory footprint.
    * Improved sorting support.
    * Improved subclassing support.
    * Improved indexing speed.
    * Schemas serialized and stored with indexes.
    * Improved pluggability.
    * Expanded tutorial documentation.
    * Restored Windows compatibility.

  New public classes:

    * KinoSearch::Architecture
    * KinoSearch::Doc
    * KinoSearch::Doc::HitDoc
    * KinoSearch::Indexer (replaces InvIndexer)
    * KinoSearch::FieldType (replaces FieldSpec)
    * KinoSearch::FieldType::BlobField
    * KinoSearch::FieldType::FullTextField (replaces FieldSpec::text)
    * KinoSearch::FieldType::StringField
    * KinoSearch::Highlight::HeatMap
    * KinoSearch::Index::DataReader
    * KinoSearch::Index::DataWriter
    * KinoSearch::Index::DocReader
    * KinoSearch::Index::Lexicon
    * KinoSearch::Index::LexiconReader
    * KinoSearch::Index::PolyReader
    * KinoSearch::Index::PostingList
    * KinoSearch::Index::PostingsReader
    * KinoSearch::Index::Segment
    * KinoSearch::Index::SegReader
    * KinoSearch::Index::SegWriter
    * KinoSearch::Index::Snapshot
    * KinoSearch::Obj
    * KinoSearch::Search::ANDQuery
    * KinoSearch::Search::Compiler
    * KinoSearch::Search::HitCollector
    * KinoSearch::Search::HitCollector::BitCollector
    * KinoSearch::Search::LeafQuery
    * KinoSearch::Search::MatchAllQuery
    * KinoSearch::Search::Matcher
    * KinoSearch::Search::NoMatchQuery
    * KinoSearch::Search::NOTQuery
    * KinoSearch::Search::ORQuery
    * KinoSearch::Search::PolyQuery
    * KinoSearch::Search::RangeQuery (replaces RangeFilter)
    * KinoSearch::Search::RequiredOptionalQuery
    * KinoSearch::Search::SortRule (factored out of SortSpec)
    * KinoSearch::Search::Span
    * KinoSearch::Util::BitVector
    * KSx::Index::ByteBufDocReader
    * KSx::Index::ByteBufDocWriter
    * KSx::Index::ZlibDocReader
    * KSx::Index::ZlibDocWriter
    * KSx::Search::MockScorer

  New/updated documentation:

    * KinoSearch::Docs::Tutorial::Simple            (updated)
    * KinoSearch::Docs::Tutorial::BeyondSimple      (updated)
    * KinoSearch::Docs::Tutorial::FieldType         (new)
    * KinoSearch::Docs::Tutorial::Analysis          (new)
    * KinoSearch::Docs::Tutorial::Highlighter       (new)
    * KinoSearch::Docs::Tutorial::QueryObjects      (new)
    * KinoSearch::Docs::Cookbook::CustomQuery       (new)
    * KinoSearch::Docs::Cookbook::CustomQueryParser (new)
    * KinoSearch::Docs::DocIDs                      (new)


    * KinoSearch::Analysis::Token - redacted pending API overhaul.
    * KinoSearch::Analysis::TokenBatch - redacted pending API overhaul.
    * KinoSearch::Docs::DevGuide - removed.
    * KinoSearch::FieldSpec - replaced by FieldType.
    * KinoSearch::FieldSpec::text - replaced by FullTextType and StringType.
    * KinoSearch::Highlight::Encoder - rolled into Highlighter.
    * KinoSearch::Highlight::Formatter - rolled into Highlighter.
    * KinoSearch::Highlight::SimpleHTMLEncoder - rolled into Highlighter.
    * KinoSearch::Highlight::SimpleHTMLFormatter - rolled into Highlighter.
    * KinoSearch::Index::Term - removed.  Now any object can be a term.
    * KinoSearch::InvIndex - removed.
    * KinoSearch::InvIndexer - replaced by Indexer.
    * KinoSearch::Posting - redacted pending API overhaul.
    * KinoSearch::Posting::MatchPosting - redacted pending API overhaul.
    * KinoSearch::Posting::RichPosting - redacted pending API overhaul.
    * KinoSearch::Posting::ScorePosting - redacted pending API overhaul.
    * KinoSearch::Search::BooleanQuery - replaced by ANDQuery, ORQuery,
      NOTQuery, and RequiredOptionalQuery.
    * KinoSearch::Search::Filter - removed.  Filtering can now be achieved via
      ANDQuery, NOTQuery, etc.
    * KinoSearch::Search::PolyFilter - removed.
    * KinoSearch::Search::QueryFilter - replaced by KSx::Search::Filter
    * KinoSearch::Search::RangeFilter - replaced by RangeQuery.
    * KinoSearch::Util::Class - removed.
    * KinoSearch::Util::ToolSet - permanently redacted.


    * KinoSearch::Analysis::LCNormalizer => KinoSearch::Analysis::CaseFolder
    * KinoSearch::Search::SearchServer   => KSx::Remote::SearchServer
    * KinoSearch::Search::SearchClient   => KSx::Remote::SearchClient
    * KinoSearch::Simple                 => KSx::Simple
    * KinoSearch::Search::MultiSearcher  => KinoSearch::Search::PolySearcher

  API Changes:

    * KinoSearch::Analysis::Analyzer
      o analyze_batch() - redacted pending API overhaul.

    * KinoSearch::Analysis::PolyAnalyzer
      o get_analyzers() - added.

    * KinoSearch::Analysis::Tokenizer
      o new() - parameter "token_re" replaced by "pattern".

    * KinoSearch::Highlight::Highlighter
      o Highlighter objects are now single-field.
      o Fields must now be marked as "highlightable" at index time via
        their FieldType.
      o Excerpts are now created manually rather than automatically inserted
        via the Hits class.
      o new() - now takes four params instead of none: "searchable", "field",
        "query", and "excerpt_length".
      o add_spec() - removed.
      o create_excerpt(), highlight(), encode(), set_pre_tag(), get_pre_tag(),
        set_post_tag(), get_post_tag(), get_searchable(), get_query(),
        get_compiler(), get_excerpt_length(), get_field() - added.

    * KinoSearch::Index::IndexReader
      o open() - takes an "index" (string filepath or Folder object) instead
        of an "invindex", plus an optional "snapshot".  Always returns a
        PolyReader (instead of an unspecified IndexReader subclass).
      o max_doc() - replaced by doc_max(), which has slightly different
        semantics since doc ids now start at 1 rather than 0.
      o num_docs() - renamed to doc_count().
      o del_count(), seg_readers(), offsets(), fetch(), obtain() - added.

    * KinoSearch::Indexer (replaces KinoSearch::InvIndexer)
      o new() - parameters changed.  Old: "invindex", "lock_factory".  New:
        "schema", "index", "create", "truncate", "lock_factory".
      o add_doc() - now takes either a hash ref or a Doc object, and
        optionally takes labeled params.
      o finish() - refactored into commit(), prepare_commit(), and optimize().
      o add_invindexes() - replaced by add_index().
      o delete_by_term() - now takes labeled parameters rather than positional
        parameters.
      o delete_by_query() - added.
      takes "index" (a string filepath or Folder object),
      "lock_factory", and 

    * KinoSearch::QueryParser
      o tree(), expand(), expand_leaf(), prune(), make_term_query(),
        make_phrase_query(), make_and_query(), make_or_query(),
        make_not_query(), make_req_opt_query() - added.

    * KinoSearch::Schema
      o No longer an abstract class.
      o "%fields" hash eliminated.
      o Now gets serialized as JSON and stored with index.
      o clobber(), open(), read() - removed.
      o analyzer() - removed.
      o similarity() - removed.
      o pre_sort() - removed.
      o add_field() - replaced by spec_field(), which associates a field name
        with a FieldType object rather than a class name.
      o num_fields(), all_fields(), fetch_type(), fetch_sim(), architecture(),
        get_architecture(), get_similarity() - added.

    * KinoSearch::Search::Hits
      o fetch_hit_hashref() - replaced by next(), which returns a HitDoc by
      o create_excerpts() - removed.

    * KinoSearch::Search::PhraseQuery
      o new() - now takes params "field" and "terms".
      o add_term() - removed.
      o get_field(), get_terms() - added.

    * KinoSearch::Search::PolySearcher (formerly MultiSearcher)
      o Now supports SortSpec.

    * KinoSearch::Search::Query
      o make_compiler() - added.

    * KinoSearch::Search::Searchable
      o search() - renamed to hits().
      o new(), glean_query(), get_schema(), collect(), doc_max(), doc_freq(),
        fetch_doc() - added.

    * KinoSearch::Search::SortSpec
      o new() - takes new param "rules", an array of SortRules.
      o add() - removed.

    * KinoSearch::Search::TermQuery
      o new() - now takes "field", and "term" (which is a string rather than a
        Term object as before).

    * KinoSearch::Searcher
      o new() - now takes "index" (a string filepath, a Folder object, or an
        IndexReader object), rather than "invindex" or "reader".
      o search() - renamed to hits().
      o set_prune_factor() - removed.
      o collect(), doc_max(), doc_freq(), fetch_doc(), get_schema() - added.

  Subclassing improvements:

    * Although KinoSearch is now implemented almost entirely in C, pure-Perl
      dynamic subclassing is supported.  (Public methods which are overridden
      in pure-Perl subclasses are automatically detected and invoked as
      callbacks by the internal KS object engine.)

  Significant internal changes:

    * All classes now implemented in C, with Perl and XS only where necessary.
    * Doc IDs now start at 1 rather than 0.

0.20_051 2008-01-20

  Bug Fixes:

    * Occasionally incorrect search results fixed by disabling Skip_To

0.20_05 2007-10-27

  API Changes:

    * KinoSearch::Search::Hits 
      o seek() - Removed. (Patch by Nathan Kurz.)

    * KinoSearch::Schema::FieldSpec has become KinoSearch::FieldSpec::text.
      o The old class is retained for now as a compatibility alias.

    * KinoSearch::Schema
      o %fields hash now accepts 'text' as an alias for

  Significant Bug fixes:

    * Fix index-corrupting bug affecting deletions.  Reported by Scott Beck.
    * Insecure temp file creation during test suite eliminated. Reported by
      Andreas Koenig as RT #28777.
    * Fix phrase matching failure due to underflow.  Repeatable test scenario
      provided by Matthew O'Connor.  Diagnosis and patch provided by Nathan
      Kurz.
    * RangeFilter now works with multi-segment indexes. Patch by 
      Chris Nandor.
    * Occasional runaway memory usage curtailed.

0.20_04 2007-06-20


    * Several bug fixes.

  New public classes:

    * KinoSearch::Simple


    * KinoSearch::QueryParser::QueryParser => KinoSearch::QueryParser

  API Changes:

    * KinoSearch::QueryParser 
      o No longer recognizes 'field:term_text' construct by default.
      o set_heed_colons() - Added.

    * KinoSearch::InvIndex
      o create() - Removed.
      o read() - Added.
      o open() - Behavior changed -- now creates an index if none detected.

    * KinoSearch::Schema
      o create() - Removed.
      o read() - Added.
      o open() - Behavior changed -- now creates an index if none detected.


    * Bug reports from Henry Combrinck, Chris Nandor, and Marco Barromeo.

0.20_03 2007-05-08 


    * Combining filters now possible using PolyFilter.
    * Significantly improved indexing speed.
    * Better NFS compatibility using LockFactory.

  New public classes:

    * KinoSearch::Index::IndexReader
    * KinoSearch::Posting
    * KinoSearch::Posting::ScorePosting
    * KinoSearch::Posting::RichPosting
    * KinoSearch::Search::PolyFilter
    * KinoSearch::Store::Lock
    * KinoSearch::Store::SharedLock
    * KinoSearch::Store::LockFactory

  New/updated documentation:

    * KinoSearch::Docs::IRTheory
    * KinoSearch::Docs::FileFormat


    * KinoSearch::Docs::NFS

  Renamed classes:

    * KinoSearch::Contrib::LongFieldSim => KSx::Search::LongFieldSim

  Classes with API changes:

    * KinoSearch::Schema
      o %FIELDS must now be spelled %fields (resolving conflict with Perl core
        pragmas and
      o pre_sort() - Added. (experimental)

    * KinoSearch::Schema::FieldSpec
      o store_pos_boost() - Removed.
      o posting_type() - Added. (experimental)

    * KinoSearch::Analysis::Analyzer
      o analyze() - Removed.
      o analyze_batch() - Added.

    * KinoSearch::Analysis::Stopalizer
      o Now removes stopwords rather than turning them to empty strings.
    * KinoSearch::InvIndex
      o get_folder() - Added.
      o get_schema() - Added.
    * KinoSearch::InvIndexer
      o new() - Parameters changed.
        * host_id - Removed.
        * lock_factory - Added.
    * KinoSearch::Highlight::Highlighter
      o new() - All arguments removed.
      o add_spec() - Added, making it possible to customize multiple excerpts.

    * KinoSearch::Highlight::SimpleHTMLEncoder
      o Now uses HTML::Entities::encode_entities, so more entities are

    * KinoSearch::Searcher
      o get_reader() - Added.
      o set_prune_factor() - Added. (experimental)

    * KinoSearch::Search::Hits
      o Now supports multiple highlighted excerpts per document.
      o Excerpts now use key of "excerpts" rather than "excerpt".

    * KinoSearch::Search::RangeFilter
      o Now supports "open ended searches": all above or all below a bound.
      o new() - Default values added.  


    * Chris Nandor was the driving force behind PolyFilter and Filter,
      contributing code, tests, bug reports and bug fixes.
    * Patches and failing test cases contributed by Edward Betts, Henry
      Combrinck, Simon Cozens, and Peter Karman.

0.20_02 2007-03-06
  * Rework Schema API.
    o Add instance method add_field(), facilitating dynamic schemas.
    o Remove init_fields().
    o Require the declaration of a %FIELDS hash.
    o Change how field names are associated with FieldSpecs.
    o Update documentation throughout KinoSearch to reflect the new API.
  * Fix crashing bug in TermListWriter/TermListReader isolated by Edward

0.20_01 2007-02-26
  KinoSearch 0.20 is a major rewrite, adding many new features.  It also
  breaks backwards compatibility in a number of ways.  
  Two key features, UTF-8 support and custom sorting, were not possible to
  implement while preserving backwards compatibility.  Once the decision was
  made to proceed with them, breaking all existing installations, it made
  little sense to proceed by half measures, so the API has been given a
  significant overhaul.

  KinoSearch has always carried an "alpha code" warning; it is being invoked
  for this release.  While it will continue to carry the "alpha" warning for
  a short while longer, the point of jamming so many changes into one release
  is to cause disruption only once; once the code in 0.20 proves itself,
  hopefully no more backwards incompatible changes will be needed any time
  soon.

  New behaviors:

    * KinoSearch now uses UTF-8 for all input and output, throughout the
      entire library.  This affects many classes, but particularly those under
      Analysis, Highlight, and QueryParser.
    * The default scoring algorithm has changed subtly -- aggressive 
      per-field boosting is no longer important or even desirable.  The old
      behavior is available from KinoSearch::Contrib::LongFieldSim.

  New public classes:

    * KinoSearch::Schema
    * KinoSearch::Schema::FieldSpec
    * KinoSearch::InvIndex
    * KinoSearch::Analysis::Token
    * KinoSearch::Search::RangeFilter
    * KinoSearch::Search::SortSpec
    * KinoSearch::Search::Similarity
    * KinoSearch::Contrib::LongFieldSim

  New documentation:

    * KinoSearch::Docs::NFS

  Removed classes:

    * KinoSearch::Document::Doc
    * KinoSearch::Document::Field
    * KinoSearch::Search::Hit

  Renamed classes:

    * KinoSearch::Store::InvIndex    => KinoSearch::Store::Folder
    * KinoSearch::Store::FSInvIndex  => KinoSearch::Store::FSFolder
    * KinoSearch::Store::RAMInvIndex => KinoSearch::Store::RAMFolder

  Updated documentation:

    * KinoSearch
    * KinoSearch::Docs::DevGuide
    * KinoSearch::Docs::FileFormat
    * KinoSearch::Docs::Tutorial

  Classes with API changes:

    * KinoSearch::InvIndexer
      o new() - Args changed.
        * create - Removed.
        * analyzer - Removed.
        * lock_id - Added.
      o spec_field() - Removed.
      o new_doc() - Removed.
      o add_doc() - Args changed.
        * Takes a hashref rather than a Doc object.
        * Accepts optional labeled param 'boost'.
      o delete_docs_by_term() - Removed.
      o delete_by_term() - Added.  (Behavior differs subtly from

    * KinoSearch::Searcher
      o new() - Args changed.
        * analyzer - Removed.
      o search() - Now calls Hits->seek before returning Hits object.  Args
        changed.
        * offset - Added.
        * num_wanted - Added.
        * sort_spec - Added.

    * KinoSearch::Search::Hits
      o Now comes pre-seeked, courtesy of changes to Searcher.
      o seek() - No longer triggers new number crunching if requested values
        can be accommodated using results of prior search.
      o fetch_hit() - Removed.
      o create_excerpts() - Now puts multiple excerpts under $hit->{excerpts}
        rather than one under $hit->{excerpt}.

    * KinoSearch::Search::MultiSearcher
      o new() - Args changed.
        * schema - Added.
        * analyzer - Removed.

    * KinoSearch::Highlight::Highlighter
      o new() - Args changed.
        * fields - Added.
        * excerpt_length - Now specified in characters rather than bytes.
        * excerpt_field - Removed.
        * pre_tag - Removed.
        * post_tag - Removed.

    * KinoSearch::QueryParser::QueryParser
      o new() - Args changed.
        * schema - Added.
        * default_field - Removed.
        * analyzer - No longer required -- now used to override schema.

    * KinoSearch::Analysis::TokenBatch
      o new() - Args changed.
        * text - Added.
      o next() - Returns a Token instead of a boolean.
      o reset() - Added.
      o add_many_tokens() - Added.
      o set_text(), get_text(), set_start_offset(), get_start_offset(),
        set_end_offset(), get_end_offset(), set_pos_inc(), get_pos_inc() - All

  Internal changes:

    Large-scale refactoring has taken place.  The most significant 
    changes are...

    * OO framework imposed on C code via, with
      KinoSearch::Util::Obj as the base class.
    * Charmonizer added.
    * perlapi functions and data structures replaced whenever possible.
    * Lots of classes, especially under KinoSearch::Index, reorganized around
      Schema and SegInfo.  
    * Many tests added, removed, or revised to accommodate changes in the main
      library code.
    * C code moved to dedicated files.
    * Build.PL custom code moved to buildlib/

  File Format:

    * Significantly redesigned.  The most visible change is that the segments
      file is now encoded using YAML rather than an arbitrary binary format.
    * Old indexes cannot be read and must be regenerated.


    * write.lock files now located in the index directory rather than 
      under /tmp.
    * Commit locks are no longer needed due to file format changes.
    * Stale write locks are now removed without warning.

0.15 2006-12-04
  * Remove dead lock files when possible (with a warning), rather than failing
    outright.  (Credit to Matthew O'Connor, Luke Closs, Socialtext for
    providing initial implementation and test.)
  * Fix package name glitch in SearchClient.

0.14 2006-11-12 
  * Add MultiSearcher, SearchServer and SearchClient.

0.13 2006-08-19
  * Fix "negate operator" bug in QueryParser.
  * Allow multiple fields to be spec'd for QueryParser.
  * Add Finnish stoplist.
  * Add ExtUtils::ParseXS and ExtUtils::CBuilder as prereqs, since
    Module::Build doesn't handle C code as of 0.28.

0.12 2006-06-26
  * Modify Highlighter API 
    o Deprecate pre_tag, post_tag arguments to new().
    o Now encodes some HTML entities by default.
    o Add support for new classes Encoder, SimpleHTMLEncoder, Formatter, 
      and SimpleHTMLFormatter.
  * Add new class KinoSearch::Search::Hit.
  * Add Hits::fetch_hit, which returns a Hit object.
  * Expose experimental API for TokenBatch.
  * Expose experimental API for Analyzer::analyze().
  * Fix bug with Stopalized indexes and QueryParser.
  * Fix bug: returned hits now sort secondarily on doc_num as advertised.

0.11 2006-05-17
  * Restore Stopalizer functionality.
  * Launder filenames so they pass taint check when index is initialized.
  * Restore call to optimize() in Lucene benchmarker.

0.10 2006-05-04
  * Improved Windows compatibility.
  * Make it possible to subclass some KinoSearch classes. 
  * Add InvIndexer::add_invindexes().
  * Add bin/dump_index, contributed by Brian Phillips.
  * Tighten up C code for ISO C90 compliance.
  * Improved support for Russian and KOI8-R encoding.
  * Fixed bug affecting indexes with segments bigger than 4 GB.
  * Fixed bug #18899, KinoSearch and locale.

0.09 2006-04-13
  * Incremental indexing enabled
    o delete_docs_by_term() added to InvIndexer.
    o option 'optimize' added to InvIndexer::finish.
  * Hits now returns the top 100 matches by default unless seek() has been
    called.
  * QueryFilter added.
  * Benchmarking scripts added.

0.08  2006-03-10
  * Restore ability to overwrite invindexes.

0.07  2006-03-10
  * Cut down on file descriptor requirements at search-time, eliminating "too
    many open files" error.
  * Make cleaning of invindex dir less aggressive when create => 1 is

0.06  2006-03-02
  * Backwards incompatible file format change (another is coming).
  * Opened up APIs for Query subclasses and QueryParser.
  * Added KinoSearch::Highlight::Highlighter.
  * Document, field, and query boosting enabled.
  * Behavior of KinoSearch::Search::Hits::fetch_hit_hashref modified.
  * KinoSearch::Document::Doc::to_hashref privatized.
  * Dependencies pared down.
  * Fixed bug affecting invindexes with 10 or more fields.

0.05  2006-01-24
  * KinoSearch, a complete rewrite, supersedes Search::Kinosearch.