
NAME

cpanest - generate a Hyper Estraier index for CPAN

SYNOPSIS

cpanest [-clean] [-noclean] [-cpan url or directory] [-node node_uri] [-force] [-noforce] [-keep directory] [-match regexp] [-test level] [-trust_mtime] [-notrust_mtime]
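The paired -clean / -noclean style flags above match what Getopt::Long calls negatable options. The following is a minimal sketch of how such a command line could be parsed; it is not taken from the cpanest source. The helper name parse_cpanest_args is hypothetical, and the default of 0 for -test is an assumption (the other defaults are the ones stated under OPTIONS).

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parse a cpanest-style argument list into a hash.
# Options without a documented default (node, keep, trust_mtime) stay unset.
sub parse_cpanest_args {
    my @args = @_;
    my %opt = (
        clean => 0,
        force => 0,
        cpan  => 'ftp://ftp.rz.ruhr-uni-bochum.de/pub/CPAN',
        match => 'authors/id/',
        test  => 0,    # assumption: normal operation unless told otherwise
    );
    GetOptionsFromArray(
        \@args, \%opt,
        'clean!', 'force!', 'trust_mtime!',    # "!" also accepts the -no form
        'cpan=s', 'node=s', 'keep=s', 'match=s',
        'test=i',
    ) or die "bad options\n";
    return \%opt;
}

my $opt = parse_cpanest_args(
    qw(-noclean -force -node http://localhost:1978/node/cpan -test 1)
);
printf "%s=%s\n", $_, $opt->{$_} for sort keys %$opt;
```

Getopt::Long accepts long option names with a single leading dash in its default configuration, which is why the synopsis spelling works unchanged.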

DESCRIPTION

This is a port of cpanwait, from the WAIT Perl search engine, to the node API of Hyper Estraier.

All the hard work was done by Ulrich Pfeifer, who wrote all the parsers and formatters. I only added support for the Hyper Estraier back-end afterwards.

This documentation is somewhat incomplete and out of sync with the code.

OPTIONS

-clean / -noclean

Clean the table before indexing. Default is off.

-cpan url or directory

Default directory or URL for indexing. If a URL is given, there currently must be a file indices/find-ls.gz relative to it which contains the output of find . -ls | gzip. Default is ftp://ftp.rz.ruhr-uni-bochum.de/pub/CPAN.
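To illustrate what a find-ls.gz file contains, here is a rough sketch that decompresses such a listing and extracts path and size for regular files. It assumes the common eleven-column find -ls layout (inode, blocks, permissions, links, owner, group, size, month, day, time or year, path); parse_find_ls is a hypothetical helper, the sample listing is made up, and none of this is claimed to be how cpanest reads the file.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# Hypothetical helper: turn gzipped `find . -ls` output (passed as a
# reference to a scalar) into a list of { path, size } records.
sub parse_find_ls {
    my ($gz_ref) = @_;
    my $z = IO::Uncompress::Gunzip->new($gz_ref)
        or die "gunzip failed: $GunzipError\n";
    my @files;
    while (defined(my $line = $z->getline)) {
        chomp $line;
        # Assumed columns: inode blocks perms links owner group size
        # month day time-or-year path
        my @f = split ' ', $line, 11;
        next unless @f == 11 && $f[2] =~ /^-/;    # regular files only
        (my $path = $f[10]) =~ s{^\./}{};         # drop the leading "./"
        push @files, { path => $path, size => $f[6] };
    }
    return @files;
}

# Made-up sample listing, compressed in memory so the sketch is self-contained.
my $listing = <<'EOT';
  4711    4 drwxr-xr-x   2 ftp      ftp          4096 Mar  3  2005 ./authors/id/X/XA/XAMPLE
  4712   12 -rw-r--r--   1 ftp      ftp         10240 Mar  3  2005 ./authors/id/X/XA/XAMPLE/Foo-Bar-0.01.tar.gz
EOT
gzip \$listing => \my $gz or die "gzip failed: $GzipError\n";

print "$_->{size}\t$_->{path}\n"
    for grep { $_->{path} =~ m{authors/id/} } parse_find_ls(\$gz);
```

The final grep mirrors the documented default of -match (authors/id/). Paths containing " -> " (symbolic links) are skipped here because only lines whose permission column starts with "-" are kept.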

-node http://localhost:1978/node/cpan

Specify the node URI.

-force

Force reindexing, even if cpan thinks files are up to date. Default is off.

-keep directory

If fetching from a remote server, keep files in directory. Default is /app/unido-i06/src/share/lang/perl/96a/CPAN/sources.

-match regexp

Limit to paths matching regexp. Default is authors/id/.

-test level

Set the test level, where 0 means normal operation, 1 means don't really index, and 2 means don't even get archives and examine them.

-trust_mtime / -notrust_mtime

If on, the files' mtimes are used to decide which version of an archive is the newest. If off, the version extracted is used (beware, there are far more version numbering schemes than cpan can parse).
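The difference between the two modes can be sketched as follows. This is an illustration only: archive_version, cmp_versions and newest are hypothetical helpers, the version parser handles just plain dotted numbers (one scheme among the many the warning above refers to), and the candidate records are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: pull a plain dotted-number version out of an
# archive file name; returns undef for anything more exotic.
sub archive_version {
    my ($file) = @_;
    return $file =~ /-v?(\d+(?:\.\d+)*)\.(?:tar\.gz|tgz|zip)$/ ? $1 : undef;
}

# Compare two dotted versions component by component (an assumption,
# not necessarily the comparison the script itself performs).
sub cmp_versions {
    my ($x, $y) = @_;
    my @x = split /\./, $x;
    my @y = split /\./, $y;
    while (@x || @y) {
        my $c = (shift(@x) // 0) <=> (shift(@y) // 0);
        return $c if $c;
    }
    return 0;
}

# Pick the newest candidate, either by mtime or by parsed version.
sub newest {
    my ($trust_mtime, @cand) = @_;
    my @sorted;
    if ($trust_mtime) {
        @sorted = sort { $a->{mtime} <=> $b->{mtime} } @cand;
    }
    else {
        @sorted = sort {
            cmp_versions(archive_version($a->{file}) // '0',
                         archive_version($b->{file}) // '0')
        } @cand;
    }
    return $sorted[-1];
}

# Made-up candidates: the lower version has the more recent mtime.
my @cand = (
    { file => 'Foo-1.9.tar.gz',  mtime => 200 },
    { file => 'Foo-1.10.tar.gz', mtime => 100 },
);
print "trust_mtime:   ", newest(1, @cand)->{file}, "\n";
print "notrust_mtime: ", newest(0, @cand)->{file}, "\n";
```

With these candidates the mtime-based choice is Foo-1.9.tar.gz and the version-based choice is Foo-1.10.tar.gz, which shows why the two settings can disagree.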

AUTHORS

Ulrich Pfeifer <pfeifer@ls6.informatik.uni-dortmund.de>

Dobrica Pavlinusic <dpavlin@rot13.org>

COPYRIGHT

Copyright (c) 1996-1997, Ulrich Pfeifer

Copyright (c) 2005, Dobrica Pavlinusic

NAME

HyperEstraier::WAIT::Table

DESCRIPTION

This is a module that emulates some of WAIT::Table's functionality.

There are some limitations: only one key attribute is supported (and it is used for @uri).

Porting from WAIT to this module.

Since only one key is supported (and used as the @uri attribute), use the first parameter of keyset as key.

The full-text index is specified as invindex, but you need only the names of the fields.

You will probably need to add

 use WAIT::Parse::Base;

to your code after you remove WAIT::Config and WAIT::Database.
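The two porting rules above can be sketched as a small argument translation. The input layout is an assumption, not something this document confirms: keyset is taken to be a list of key lists, and invindex a flat list of field name and per-field specification pairs. port_table_args is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical helper: map WAIT-style table arguments onto the reduced
# set described above.
# Assumed input layout:
#   keyset   => [ [ 'docid', ... ], ... ]
#   invindex => [ name => $spec, synopsis => $spec, ... ]
sub port_table_args {
    my (%wait) = @_;
    my @inv = @{ $wait{invindex} || [] };
    my @fields;
    while (@inv) {
        my ($name, $spec) = splice @inv, 0, 2;
        push @fields, $name;    # keep the field name, drop its spec
    }
    return (
        attr     => $wait{attr},
        key      => $wait{keyset}[0][0],    # first parameter of keyset
        invindex => \@fields,
    );
}

my %new = port_table_args(
    attr     => [qw/docid headline source size parent/],
    keyset   => [ ['docid'] ],
    invindex => [ name => 'spec-a', synopsis => 'spec-b' ],  # placeholder specs
);
print "key: $new{key}\n";
print "invindex: @{ $new{invindex} }\n";
```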

METHODS

new

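  # attr and invindex are passed as array references here (an assumption);
  # a bare qw// list would be flattened into the surrounding argument list.
  my $tb = HyperEstraier::WAIT::Table->new(
        uri     => 'http://localhost:1978/node/cpan',
        attr    => [ qw/docid headline source size parent/ ],
        key     => 'docid',
        invindex => [ qw/name synopsis bugs description text environment example author/ ],
  );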

have

  if ( $tb->have(docid => $something) ) { ... }

insert

  my $key = $tb->insert(
        docid   => $base,
        headline => 'Something',
        ...
  );

delete_by_key

  $tb->delete_by_key($key);

delete

  $tb->delete( docid => $did, ... );
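As a rough illustration of how have, insert and delete might be combined when reindexing, the sketch below uses My::MockTable, a hypothetical in-memory stand-in that mimics only the call shapes listed above. It does not talk to Hyper Estraier, and the real module's return values and behaviour may differ; reindex and its $force flag are likewise assumptions for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in: same method names and argument shapes as the
# METHODS section, backed by a plain hash instead of a node.
package My::MockTable;

sub new {
    my ($class, %arg) = @_;
    return bless { key => $arg{key}, docs => {} }, $class;
}

sub have {
    my ($self, %rec) = @_;
    return exists $self->{docs}{ $rec{ $self->{key} } } ? 1 : 0;
}

sub insert {
    my ($self, %rec) = @_;
    my $k = $rec{ $self->{key} };
    $self->{docs}{$k} = {%rec};
    return $k;
}

sub delete_by_key {
    my ($self, $k) = @_;
    my $old = CORE::delete $self->{docs}{$k};
    return defined $old ? 1 : 0;
}

sub delete {
    my ($self, %rec) = @_;
    return $self->delete_by_key( $rec{ $self->{key} } );
}

package main;

# Insert a record unless it is already present; with $force set,
# replace the existing record instead of skipping it.
sub reindex {
    my ($tb, $force, %rec) = @_;
    if ( $tb->have( docid => $rec{docid} ) ) {
        return 'skipped' unless $force;
        $tb->delete( docid => $rec{docid} );
    }
    $tb->insert(%rec);
    return 'indexed';
}

my $tb = My::MockTable->new( key => 'docid' );
print reindex( $tb, 0, docid => 'Foo-1.0', headline => 'Foo' ), "\n";
print reindex( $tb, 0, docid => 'Foo-1.0', headline => 'Foo' ), "\n";
print reindex( $tb, 1, docid => 'Foo-1.0', headline => 'Foo again' ), "\n";
```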