
NAME

Wiki::Toolkit::Search::Base - Base class for Wiki::Toolkit search plugins.

SYNOPSIS

  my $search = Wiki::Toolkit::Search::XXX->new( @args );
  my %wombat_nodes = $search->search_nodes("wombat");

This class details the methods that need to be overridden by search plugins.

METHODS

new

  my $search = Wiki::Toolkit::Search::XXX->new( @args );

Creates a new searcher. By default the arguments are just passed to _init, so you may wish to override that instead.

search_nodes

  # Find all the nodes which contain the word 'expert'.
  my %results = $search->search_nodes('expert');

Returns a (possibly empty) hash whose keys are the node names and whose values are relevance scores. The scoring system is not yet fully defined; for OR searches, a score could initially be the number of search terms that appear in the node.

Defaults to AND searches: an OR search is done only if the optional $and_or argument is supplied and is OR or or. Any other value, or no value at all, gives an AND search.
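
The defaulting rule can be sketched as follows. This is only an illustration of the documented behaviour, not the module's own code; search_mode is a hypothetical helper name that does not exist in Wiki::Toolkit.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Wiki::Toolkit): maps the optional
# $and_or argument to the search mode described in the text.
sub search_mode {
    my ($and_or) = @_;
    return 'or'
        if defined $and_or && ( $and_or eq 'OR' || $and_or eq 'or' );
    return 'and';    # default, and the fallback for any other value
}

print search_mode('OR'),  "\n";    # or
print search_mode(),      "\n";    # and
print search_mode('any'), "\n";    # and
```

A real backend may be more lenient about letter case than this sketch; check the plugin you are using.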

Searches are case-insensitive.

analyze

    @terms = $self->analyze($string)

Splits a string into a set of terms for indexing and searching. Typically this is done case-insensitively, splitting at word boundaries and keeping only the pieces that contain at least one word character.
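
A minimal sketch of such a splitter, assuming exactly the behaviour described above. The name analyze_sketch is hypothetical, and a real plugin's analyze may filter terms differently (for example by minimum length).

```perl
use strict;
use warnings;

# Illustrative stand-in for an analyze method; the name is hypothetical.
sub analyze_sketch {
    my ($string) = @_;
    return grep { /\w/ }    # keep pieces with at least one word character
           split /\b/,      # split at word boundaries
           lc $string;      # fold case so matching is case-insensitive
}

my @terms = analyze_sketch("Nice pub in Bloomsbury.");
print join( ",", @terms ), "\n";    # nice,pub,in,bloomsbury
```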

fuzzy_title_match

  $wiki->write_node( "King's Cross St Pancras", "A station." );
  my %matches = $search->fuzzy_title_match( "Kings Cross St. Pancras" );

Returns a (possibly empty) hash whose keys are the node names and whose values are relevance scores. The scoring system is not yet fully defined.

Note that even if an exact match is found, any other similar enough matches will also be returned. However, any exact match is guaranteed to have the highest relevance score.
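
Because an exact match always scores highest, a caller can pick the best candidate by sorting on the score. The sketch below assumes a made-up result hash; the node names and scores are illustrative only, and by_relevance is a hypothetical helper, not a Wiki::Toolkit method.

```perl
use strict;
use warnings;

# Hypothetical return value of fuzzy_title_match; the scores are made up.
my %matches = (
    "King's Cross St Pancras" => 2,
    "Kings Cross St Pancras"  => 1,
);

# Return node names ordered from most to least relevant.
# Ties are broken alphabetically so the order is stable.
sub by_relevance {
    my (%scores) = @_;
    return sort { $scores{$b} <=> $scores{$a} or $a cmp $b } keys %scores;
}

my ($best) = by_relevance(%matches);
print "Best match: $best\n" if defined $best;
```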

The matching is done against "canonicalised" forms of the search string and the node titles in the database: stripping vowels, repeated letters and non-word characters, and lowercasing.

index_node

  $search->index_node( $node, $content, $metadata );

Indexes or reindexes the given node in the search engine indexes. You must supply both the node name and its content, but metadata is optional.

If you do supply metadata, it will be used if and only if your chosen search backend supports metadata indexing (see supports_metadata_indexing). It should be a reference to a hash where the keys are the names of the metadata fields and the values are either scalars or references to arrays of scalars. For example:

  $search->index_node( "Calthorpe Arms", "Nice pub in Bloomsbury.",
                       { category => [ "Pubs", "Bloomsbury" ],
                         postcode => "WC1X 8JR" } );
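
Since a metadata value may be either a scalar or an array reference, a backend that indexes metadata has to handle both shapes. One way to do that, sketched here under that assumption, is to normalise every value to an array reference first. normalise_metadata is a hypothetical helper, not part of Wiki::Toolkit.

```perl
use strict;
use warnings;

# Hypothetical helper: turn every metadata value into an array reference,
# so single-valued and multi-valued fields can be indexed the same way.
sub normalise_metadata {
    my ($metadata) = @_;
    my %normalised;
    for my $field ( keys %{ $metadata || {} } ) {
        my $value = $metadata->{$field};
        $normalised{$field} =
            ref($value) eq 'ARRAY' ? [@$value] : [$value];
    }
    return \%normalised;
}

my $meta = normalise_metadata(
    { category => [ "Pubs", "Bloomsbury" ], postcode => "WC1X 8JR" } );
print join( ", ", @{ $meta->{category} } ), "\n";    # Pubs, Bloomsbury
print join( ", ", @{ $meta->{postcode} } ), "\n";    # WC1X 8JR
```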

canonicalise_title

    $fuzzy = $self->canonicalise_title( $node );

Returns the node title in a form suitable for fuzzy searching: lowercased, with punctuation, spaces and vowels removed, and repeated letters squashed to a single letter.
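
The steps can be sketched as below. This is an illustration of the description rather than the module's actual implementation: canonicalise_sketch is a hypothetical name, and the exact set of characters treated as vowels is an assumption.

```perl
use strict;
use warnings;

# Illustrative canonicaliser; name and vowel set are assumptions.
sub canonicalise_sketch {
    my ($title) = @_;
    return "" unless defined $title;
    my $canon = lc $title;         # lowercase
    $canon =~ s/\W//g;             # drop punctuation and spaces
    $canon =~ s/[aeiou]//g;        # drop vowels
    $canon =~ s/(.)\1+/$1/g;       # squash runs of a repeated character
    return $canon;
}

# Both spellings reduce to the same canonical form.
print canonicalise_sketch("King's Cross St Pancras"), "\n";    # kngscrstpncrs
print canonicalise_sketch("Kings Cross St. Pancras"), "\n";    # kngscrstpncrs
```

Two titles that canonicalise to the same string are candidates for a fuzzy match, which is why the differently punctuated spellings above find each other.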

delete_node

  $search->delete_node($node);

Removes the given node from the search indexes. NOTE: It's up to you to make sure the node is removed from the backend store. Croaks on error.

supports_phrase_searches

  if ( $search->supports_phrase_searches ) {
      return $search->search_nodes( '"fox in socks"' );
  }

Returns true if this search backend supports phrase searching, and false otherwise.

supports_fuzzy_searches

  if ( $search->supports_fuzzy_searches ) {
      return $search->fuzzy_title_match("Kings Cross St Pancreas");
  }

Returns true if this search backend supports fuzzy title matching, and false otherwise.

supports_metadata_indexing

  if ( $search->supports_metadata_indexing ) {
      print "This search backend indexes metadata as well as content.";
  }

Returns true if this search backend supports metadata indexing, and false otherwise.

SEE ALSO

Wiki::Toolkit