Barry King


Apache::Wyrd::Site::Index - Wrapper Index for the Apache::Wyrd::Site classes


  use base qw(Apache::Wyrd::Site::Index);

  # Pass the index settings to the parent constructor.
  sub new {
    my ($class) = @_;
    my $init = {
      file => '/var/www/data/pageindex.db',
      debug => 0,
      reversemaps => 1,
      bigfile => 1,
      attributes => [qw(doctype meta)],
      maps => [qw(meta)]
    };
    return &Apache::Wyrd::Site::Index::new($class, $init);
  }

  # Return the user agent used to fetch pages for indexing.
  sub ua {
    return BASENAME::UA->new;
  }

  # Return 1 to keep a file out of the index, 0 to index it.
  sub skip_file {
    my ($self, $file) = @_;
    return 1 if ($file eq 'test.html');
    return 0;
  }


Apache::Wyrd::Site::Index provides an extended version of the Apache::Wyrd::Services::Index object for use in the Apache::Wyrd::Site hierarchy.

It does not add indexable attributes to the defaults of the parent class (attributes: reverse, timestamp, digest, data, word, wordcount, title, keywords, description; maps: word). However, several attributes used by Pull Wyrds in the hierarchy must be passed in the initialization hash (see SYNOPSIS for an example) before they can be used. These are the attributes doctype, section, parent, shorttitle, published, auth, orderdate, eventdate, tags, children and the maps tags, children. See Apache::Wyrd::Site::Page.


Note: This class extends the Apache::Wyrd::Services::Index class, so check the documentation of that module for most methods. It provides an index of Apache::Wyrd::Site::Page objects.

(format: (returns) name (arguments after self))

(arrayref of hashrefs) get_children (scalar, hashref)

Given a page name (see Page in this subclass), the method returns the entries of all children of that page in the navigation hierarchy. The arrayref is in the order determined by the Index object (see Apache::Wyrd::Services::Index). The data returned for each entry can optionally be limited by the parameters in the hashref, which is handed directly to the get_entry method (see the get_entry method of the Apache::Wyrd::Services::Index class).
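
A minimal sketch of listing the children of a page follows. MySite::Index and the page name are hypothetical, the index is assumed to have been initialized with the children attribute and map, and the optional hashref is simply omitted.

  use strict;
  use warnings;
  use MySite::Index;   # hypothetical subclass of Apache::Wyrd::Site::Index

  my $index = MySite::Index->new;

  # '/products/index.html' is a hypothetical page name
  my $children = $index->get_children('/products/index.html');

  # Each element is a hashref of index data for one child page.
  foreach my $child (@$children) {
    my $title = defined $child->{title} ? $child->{title} : '(untitled)';
    print "$title\n";
  }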

(scalar) index_site (Apache req handle, scalar)

This method is an obsolete way of running through the files of a site and committing them to the index. Please use the newer and more fault-tolerant Apache::Wyrd::Site::IndexBot.

That being said, the method takes the current Apache request object handle and a scalar flag. The flag indicates whether the method should perform a complete index or only update what has changed since the last time this flag was non-null. The method returns the text output of the update process.

(hashref) lookup (scalar)


(scalar) lookup (scalar, scalar)

Look up and return data from the index. In both forms, the first argument is a scalar representation of the page. This can be either the page name (the path after the document root) or the page's internal index ID (an integer).

If the second argument, an attribute name, is not given, the method returns a hashref of the full data for the page. If the attribute is given, only the value of that attribute is returned.
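
The two forms can be sketched as follows. MySite::Index and the page name are hypothetical; title is one of the default attributes listed above.

  use strict;
  use warnings;
  use MySite::Index;   # hypothetical subclass of Apache::Wyrd::Site::Index

  my $index = MySite::Index->new;
  my $page  = '/about/index.html';   # hypothetical page name

  # One-argument form: a hashref of the full data for the page
  my $entry = $index->lookup($page);
  if (ref $entry eq 'HASH') {
    foreach my $attribute (sort keys %$entry) {
      my $value = defined $entry->{$attribute} ? $entry->{$attribute} : '';
      print "$attribute: $value\n";
    }
  }

  # Two-argument form: the value of a single attribute
  my $title = $index->lookup($page, 'title');
  print "Title: $title\n" if defined $title;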

(scalar) purge_missing (Apache request handle)

Like index_site, this is an obsolete method of removing deleted documents from an index. Please use the more fault-tolerant Apache::Wyrd::Site::IndexBot object.

It takes the Apache req object as an argument, and returns a scalar of the text of the output from that purge.

(scalar) skip_file (scalar)

Simple filter for removing files from consideration by the index. Intended as an over-loadable handle. Returns 0 if the file should be indexed and a true value if it should be skipped. Defaults to 0.
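
As a sketch, a subclass might skip files by pattern rather than by exact name. The package name and both patterns are hypothetical, and the argument is assumed to be a bare file name, as in the SYNOPSIS.

  package MySite::Index;   # hypothetical subclass
  use strict;
  use warnings;
  use base qw(Apache::Wyrd::Site::Index);

  sub skip_file {
    my ($self, $file) = @_;
    return 1 if $file =~ /~$/;     # editor backup files
    return 1 if $file =~ /^\./;    # hidden files
    return 0;                      # index everything else
  }

  1;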

(objectref) ua (hashref)

Another over-loadable handle. Should return a handle to an LWP user agent object (see LWP::UserAgent) appropriate for navigating the site, which is to say it should be able to handle whatever access control and authentication the site uses. There is no default ua; the webmaster will need to define it in order to use this object.
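
One possible shape for such a user agent, assuming the site is protected by HTTP basic authentication, is sketched below. The package names, host, realm, and account are all hypothetical; only the LWP::UserAgent methods (timeout, agent, credentials) are standard.

  package MySite::UA;   # hypothetical user agent class
  use strict;
  use warnings;
  use base qw(LWP::UserAgent);

  sub new {
    my ($class, %args) = @_;
    my $self = $class->SUPER::new(%args);
    $self->timeout(30);
    $self->agent('MySite-Indexer/0.1');
    # hypothetical host:port, realm, user name, and password
    $self->credentials('www.example.com:80', 'Members', 'indexer', 'secret');
    return $self;
  }

  package MySite::Index;   # hypothetical index subclass
  use base qw(Apache::Wyrd::Site::Index);

  sub ua {
    return MySite::UA->new;
  }

  1;

A site that uses cookie-based or ticket-based login would need a different approach, such as a cookie jar on the user agent.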

Note that this method is required by Apache::Wyrd::Site::IndexBot.


An obsolete appendix to an obsolete mechanism. Reserves the new method, which it passes unaltered to Apache::Wyrd::Services::Index. index_site, skip_file, and purge_missing are obsolete and may be dropped in future versions. See Apache::Wyrd::Services::Index for other bugs/warnings.


Barry King



General-purpose HTML-embeddable perl object


General-purpose search engine index object


Copyright 2002-2007 Wyrdwright, Inc. and licensed under the GNU GPL.

See LICENSE under the documentation for Apache::Wyrd.