Martin H. Sluka


Tie::Hash::Abbrev::BibRefs - match bibliographic references to the original titles


  use Tie::Hash::Abbrev::BibRefs;

  tie my %hash, 'Tie::Hash::Abbrev::BibRefs',
      preprocess => sub { s/\s+[[:upper:]]:.*// },
      stopwords  => [ qw( a and de del der des di
                          et for für i if in la las
                          of on part Part Pt. Sect.
                          the to und ) ],
      exceptions => { jpn => 'japan',
                      natl => 'national' };

  $hash{'Physical Review B'} = '0163-1829';

  print $hash{'Phys. Rev. B: Condens. Matter Mater. Phys.'};
    # will print '0163-1829'


This module is an attempt to ease the mapping of often abbreviated bibliographical references to the original titles.

To achieve this, it simplifies the title according to parameterizable rules and stores it as a normalized key.

When accessing the hash, the key given is also normalized and compared to the normalized version of the original title. In addition, each word (words are separated by whitespace) may be abbreviated by specifying only the first few letters.
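The word-by-word abbreviation rule can be illustrated with a small standalone sketch. The helper name abbrev_matches is hypothetical and not part of this module; the sketch also assumes that trailing words of the title may be left out, which the example in the synopsis suggests but the text does not state outright.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only (not the module's own code).
# Returns 1 if every word of $abbrev is a case-insensitive prefix of the
# word at the same position in $title, 0 otherwise.
# Assumption: trailing words of $title may be omitted in $abbrev.
sub abbrev_matches {
    my ( $abbrev, $title ) = @_;
    my @short = split ' ', $abbrev;
    my @full  = split ' ', $title;
    return 0 if !@short || @short > @full;
    for my $i ( 0 .. $#short ) {
        # index() returns 0 only when the abbreviation starts the full word
        return 0 unless index( lc( $full[$i] ), lc( $short[$i] ) ) == 0;
    }
    return 1;
}

print abbrev_matches( 'Phys Rev', 'Physical Review B' )
    ? "match\n"
    : "no match\n";
```

The real lookup works on normalized keys, so punctuation such as the dots in "Phys. Rev." is already gone before any comparison of this kind takes place.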

If more than one matching hash entry is found, the values of all matching entries are compared; as long as they are all equal (or all undef), the lookup is still considered to be successful.
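The rule for multiple matches can be sketched as a standalone check. The helper name values_agree is hypothetical; it only shows the decision described above, not how the module implements it.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only (not the module's own code).
# Returns 1 if all given values are equal or all are undef, 0 otherwise.
sub values_agree {
    my @values = @_;
    return 0 unless @values;
    my $first = shift @values;
    for my $v (@values) {
        my $same = defined $first
            ? ( defined $v && $v eq $first )
            : !defined $v;
        return 0 unless $same;
    }
    return 1;
}

# Two titles that share one value still count as a successful lookup.
print values_agree( '0163-1829', '0163-1829' ) ? "success\n" : "ambiguous\n";
```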


The process of normalization is implemented as follows:

  1. execute any preprocessing code (see the example in "SYNOPSIS" above), which is expected to operate on $_. You can use subroutine references or strings here; strings will be evaluated with eval().

  2. split the key into parts (at whitespace).

  3. remove any parts contained in the list of stopwords (see example above).

  4. replace any parts contained in the list of exceptions by their corresponding value. If the value is undef, the entire part will be removed. (In the example above, "Jpn" would be replaced by "japan".) This lookup is done case-insensitively.

  5. remove any non-word characters that stand at the end of a part or are followed by a dash.
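The five steps can be sketched as a standalone function. This is a minimal illustration, not the module's implementation: the name normalize_sketch and its option handling are hypothetical, only code references are accepted for preprocessing, stopwords are assumed to be compared exactly (before punctuation is stripped), and the regular expression in step 5 is one possible reading of the rule.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only (not the module's own code).
sub normalize_sketch {
    my ( $key, %opt ) = @_;
    my %stop = map { ( $_ => 1 ) } @{ $opt{stopwords} || [] };
    my %exc  = map { ( lc($_) => $opt{exceptions}{$_} ) }
               keys %{ $opt{exceptions} || {} };

    # Step 1: each preprocessing sub operates on $_.
    for my $code ( @{ $opt{preprocess} || [] } ) {
        local $_ = $key;
        $code->();
        $key = $_;
    }

    my @parts;
    for my $part ( split ' ', $key ) {       # Step 2: split at whitespace
        next if $stop{$part};                # Step 3: drop stopwords (exact match assumed)
        if ( exists $exc{ lc $part } ) {     # Step 4: case-insensitive exceptions
            $part = $exc{ lc $part };
            next unless defined $part;       # an undef value removes the part
        }
        $part =~ s/\W+(?=-|\z)//g;           # Step 5: non-word chars at end or before a dash
        push @parts, $part if length $part;
    }
    return join ' ', @parts;
}

print normalize_sketch(
    'J. of the Jpn Phys.-Chem. Soc. B: Foo',
    preprocess => [ sub { s/\s+[[:upper:]]:.*// } ],
    stopwords  => [qw( of the )],
    exceptions => { jpn => 'japan' },
), "\n";    # prints "J japan Phys-Chem Soc"
```

The title in the usage line is made up for the example; it is chosen so that every step changes something.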



Turns debug mode on (when given a true value as argument) or off (when given a false value). Returns the (possibly new) value.

In debug mode, the "find" method will print debug messages to STDERR.


  my @deleted = tied(%hash)->delete_abbrev('foo','bar');

Deletes all elements matched by the unambiguous abbreviations given as arguments and returns a (possibly empty) list of all deleted values.


Gets or sets the exceptions table for the hash. Expects hash references or undef, which clears the table. Returns a reference to the new exceptions table.


Sets up the preprocessing code chain for the hash. Any code references or strings will be added to the chain; an undef will clear the chain.


Gets or sets the list of stopwords for the hash. Any arguments given will be added to the list of stopwords. An undef as argument will clear the list of stopwords. The method returns the new list of stopwords (in no particular order).


The following methods should usually not be called "from the outside"; the main intention of documenting them is that the author still wants to understand his own module in case changes become necessary later. :o)


Expects a key as its first and a position as its second argument. Returns the position if the given key equals (case-insensitively) the real key stored at that position, or undef if not.


This is the central method for lookups, used by exists() and FETCH.

It expects a key as its only argument.

On success, the method returns the array index at which the corresponding value can be found; otherwise it returns undef.


Given a key as its only argument, this method returns the normalized key in scalar context and a three-element list in list context, consisting of


the "prefix"


the "search pattern" and


the "normalized key".


Expects a (usually normalized) key as its only argument and returns the position at which this key is stored (if it exists) or at which it should be inserted to keep the keys sorted (if it does not exist yet).
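A position lookup of this kind can be sketched as a binary search over a sorted list of keys. Both the helper name position_for and the use of binary search with a case-insensitive string comparison are assumptions made for illustration; the module's actual storage and search strategy are not described here.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only (not the module's own code).
# $keys is a reference to an array sorted case-insensitively.
# Returns the index of $key if present, else the index at which it
# would have to be inserted to keep the array sorted.
sub position_for {
    my ( $keys, $key ) = @_;
    my ( $lo, $hi ) = ( 0, scalar @$keys );
    while ( $lo < $hi ) {
        my $mid = int( ( $lo + $hi ) / 2 );
        if ( lc( $keys->[$mid] ) lt lc($key) ) {
            $lo = $mid + 1;    # key sorts after the middle element
        }
        else {
            $hi = $mid;        # key is at or before the middle element
        }
    }
    return $lo;
}

my @keys = ( 'applied physics', 'nature', 'physical review' );
print position_for( \@keys, 'nature' ), "\n";    # existing key: prints 1
print position_for( \@keys, 'optics' ), "\n";    # new key: prints 2
```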


Expects no arguments and simply resets the iterator for the hash, so that the next call to each() will return the first key/value pair again.


None known so far.


        Martin H. Sluka


Thanks to Dr. Hermann Schier from the Max Planck Institute for Solid State Research in Stuttgart, Germany, for initiating and underwriting the development of this module and for contributing a lot of ideas.


This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

The full text of the license can be found in the LICENSE file included with this module.


