NAME

  run_QF.pl -- simple HTTP server for query filtering of SRU

SYNOPSIS

  run_QF.pl [--testdict] [--testquery] [--verbose] <AlvisDir>

DESCRIPTION

--testdict Load up dictionaries, do simple checking, then quit.

--testquery Transform queries and return response without forwarding query to a real SRU server.

--verbose Print additional trace data.

This is a simple SRU query filter built using HTTP::Daemon. All configuration data is read from the ALVIS configuration file at <AlvisDir>/alvis.cnf. Error messages and a simple URL trail go to stderr. The linguistic resources used by Alvis::QueryFilter are located in <AlvisDir>/resources.

It is intended to be copied and modified for any application.

CONFIGURATION

QF_PORT Port number for this server.

QF_TEXT Space delimited list of fields that text matches go to.

YAZ_PORT Port number to forward transformed SRU queries to.
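
As a minimal sketch of how these three settings might be sanity-checked once they have been read, the following assumes the values are already available as a plain hash of strings. The on-disk syntax of alvis.cnf is not described here, so no file parsing is attempted. The helper name check_qf_config, the port values and the field names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: validate settings that have already been read
# from alvis.cnf into a hash (the file syntax itself is not assumed).
sub check_qf_config {
    my (%conf) = @_;

    # Both port settings must be present and look like TCP port numbers.
    for my $key (qw(QF_PORT YAZ_PORT)) {
        my $port = $conf{$key};
        die "$key is missing\n" unless defined $port;
        die "$key must be a port number between 1 and 65535\n"
            unless $port =~ /^\d+$/ && $port >= 1 && $port <= 65535;
    }

    # QF_TEXT is a space-delimited list of field names.
    die "QF_TEXT is missing\n" unless defined $conf{QF_TEXT};
    my @text_fields = split ' ', $conf{QF_TEXT};
    die "QF_TEXT names no fields\n" unless @text_fields;

    return {
        qf_port     => $conf{QF_PORT} + 0,
        yaz_port    => $conf{YAZ_PORT} + 0,
        text_fields => \@text_fields,
    };
}

# Illustrative values only.
my $settings = check_qf_config(
    QF_PORT  => '8080',
    YAZ_PORT => '9999',
    QF_TEXT  => 'title  abstract body',
);
print join(',', @{ $settings->{text_fields} }), "\n";
```

Splitting with ' ' rather than a literal single space tolerates repeated or leading spaces in the QF_TEXT value.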

DATA

All resources have one entry per line, and each entry has fields that are tab delimited. Spacing within a field should be standardised to single spaces. The "types" file should not exist if named entities are also listed as having ontology nodes.
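
The entry format can be illustrated with a short sketch that splits one line on tabs and standardises the spacing inside each field. The function name parse_resource_line and the sample entry are hypothetical, and trimming leading and trailing spaces is an assumption beyond what is stated above. This is not the parser used by Alvis::QueryFilter.

```perl
use strict;
use warnings;

# Split one resource line into its tab-delimited fields and collapse
# runs of whitespace inside each field to a single space.
sub parse_resource_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;        # drop the line terminator
    return if $line eq '';       # skip empty lines

    my @fields = split /\t/, $line, -1;   # -1 keeps trailing empty fields
    for my $field (@fields) {
        $field =~ s/\s+/ /g;     # standardise to single spaces
        $field =~ s/^ //;        # assumption: also trim the ends
        $field =~ s/ $//;
    }
    return @fields;
}

# Illustrative entry in the (text-occurrence, canonical-form) shape.
my ($occurrence, $canonical) = parse_resource_line("E.  coli\tEscherichia coli\n");
print "$occurrence => $canonical\n";
```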

<AlvisDir>/resources/lemmas : Lists (text-occurrence,lemma-form) for lemmatising words.

<AlvisDir>/resources/NEs : Lists (text-occurrence,canonical-form) for matching named entities.

<AlvisDir>/resources/onto_nodes : Lists (canonical-form,ontology-node) for matching lemmas, terms and named entities that are located in the ontology.

<AlvisDir>/resources/onto_paths : Lists (ontology-node,ontology-path) giving fully expanded path for each node.

<AlvisDir>/resources/terms : Lists (text-occurrence,canonical-form) for matching terms.

<AlvisDir>/resources/types : Lists (canonical-form,type) for named entities. Types are short text items (e.g., 'species', 'company', 'person') used to categorise named entities when no ontology is in use.

Entries in "NEs" and "terms" are applied as rules to query words, with longest match applying first. Once all these are done, the typing or ontology forms are applied.
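
One way to read "longest match applying first" is a left-to-right scan that, at each query word, tries the longest candidate phrase before shorter ones. The sketch below implements that reading only. The function name apply_rules, the exact-match comparison and the sample rules are assumptions for illustration, and the actual behaviour of Alvis::QueryFilter may differ.

```perl
use strict;
use warnings;

# Replace text occurrences in a list of query words with their
# canonical forms, preferring the longest phrase at each position.
# $rules maps a space-joined text occurrence to its canonical form.
sub apply_rules {
    my ($rules, @words) = @_;

    # Find the length in words of the longest rule.
    my $max_len = 1;
    for my $occurrence (keys %$rules) {
        my @parts = split ' ', $occurrence;
        $max_len = @parts if @parts > $max_len;
    }

    my @out;
    my $i = 0;
    while ($i < @words) {
        my $matched = 0;
        my $longest = $max_len;
        $longest = @words - $i if $longest > @words - $i;

        # Try the longest candidate first, then shorter ones.
        for (my $len = $longest; $len >= 1; $len--) {
            my $candidate = join ' ', @words[$i .. $i + $len - 1];
            if (exists $rules->{$candidate}) {
                push @out, $rules->{$candidate};
                $i += $len;
                $matched = 1;
                last;
            }
        }

        # No rule starts here: keep the word unchanged.
        unless ($matched) {
            push @out, $words[$i];
            $i++;
        }
    }
    return @out;
}

# Illustrative rules, as might come from the "NEs" and "terms" files.
my %rules = (
    'new york'       => 'New York',
    'new york times' => 'The New York Times',
);
print join(' | ', apply_rules(\%rules, qw(the new york times archive))), "\n";
```

With these sample rules the three-word entry wins over the two-word entry that shares its prefix, so "new york times" is replaced as a single unit.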

Resources are best manipulated and imported/exported as a single XML file using the routines of zebractl(1).

SEE ALSO

Alvis::QueryFilter(3), zebractl(1), zebrad(1), HTTP::Daemon(3).

See http://www.alvis.info/alvis/Architecture_2fFormats#queryfilter for sample use, the XML formats and the schema. See http://www.alvis.info/alvis/Architecture_2fFormats#filterresources for description of the linguistic resources and an XML Schema.

AUTHOR

Kimmo Valtonen, Wray Buntine

COPYRIGHT AND LICENSE

Copyright (C) 2006 Kimmo Valtonen, Wray Buntine

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.