lccnorm - normalize Library of Congress Classification call numbers


lccnorm [option...] [file...]

lccnorm -h|--help

lccnorm -V|--version


lccnorm transforms LC-style call numbers into a form that may be used in a straight ASCII sort.
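To see why normalization is needed at all, the following minimal sketch compares a plain ASCII sort of raw call numbers with a sort on padded keys. The `sort_key` function, its regular expression, and the padded key layout are assumptions made for illustration; they are not lccnorm's actual output format.

```python
import re

# Hypothetical pattern: class letters, class number, optional decimal, then the rest.
_CALLNUM = re.compile(r"^\s*([A-Za-z]{1,3})\s*(\d{1,4})(?:\.(\d+))?\s*(.*)$")


def sort_key(callnum: str) -> str:
    """Return an illustrative ASCII-sortable key (not lccnorm's real format)."""
    m = _CALLNUM.match(callnum)
    if m is None:
        raise ValueError(f"unnormalizable call number: {callnum!r}")
    letters, number, decimal, rest = m.groups()
    # Pad the letters to 3 columns and zero-pad the class number to 4 digits
    # so that string comparison agrees with numeric order.
    key = f"{letters.upper():<3}{int(number):04d}"
    if decimal:
        key += "." + decimal
    # Collapse dots and runs of whitespace in the remainder (Cutters, dates).
    rest = " ".join(rest.upper().replace(".", " ").split())
    if rest:
        key += " " + rest
    return key


raw = ["B 708", "B 71", "B 8", "B 713 .H94"]
print(sorted(raw))                # plain ASCII order puts B 8 last
print(sorted(raw, key=sort_key))  # padded keys give B 8, B 71, B 708, B 713 .H94
```

In the raw strings, "B 8" sorts after "B 708" because the comparison is character by character; once the class number is zero-padded, string order and shelf order agree.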

By default, each line of input is assumed to consist of a number of tab-delimited fields, of which the first contains an LC-style call number or class.

If no file is specified, or if the file name - is specified, standard input will be processed.

Normalization of call number ranges is a special challenge, because ranges are not normally specified using the exact endpoint. Consider the range B708-B713: B708 does indicate the beginning point (no call number that comes before B 708 can fall within the range), but the end point is only a guide, not a strict limit, since the intent is that call numbers such as B 713 .H94 and B 713 .W55 L86 do fall within the range. Unfortunately, ranges are often specified ambiguously; for example, the call number B 713.14 G92 might or might not be considered to fall within this range.
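The end-point problem can be made concrete with a small membership test. This is only a sketch: it assumes both endpoints share the same class letters and are whole class numbers (as in B708-B713), and the names `parse_class`, `in_range`, and `end_includes_decimals` are hypothetical. The ambiguous decimal case is exposed as a parameter rather than decided.

```python
import re

# Hypothetical pattern: class letters, whole class number, optional decimal part.
_CLASS = re.compile(r"^\s*([A-Za-z]{1,3})\s*(\d+)(?:\.(\d+))?")


def parse_class(callnum):
    """Split off (letters, whole class number, decimal digits or '')."""
    m = _CLASS.match(callnum)
    if m is None:
        raise ValueError(f"cannot parse {callnum!r}")
    letters, number, decimal = m.groups()
    return letters.upper(), int(number), decimal or ""


def in_range(callnum, begin, end, end_includes_decimals=True):
    """Treat begin as a strict lower bound and end as a guide, not a hard limit."""
    letters, number, decimal = parse_class(callnum)
    b_letters, b_number, _ = parse_class(begin)
    _, e_number, _ = parse_class(end)
    if letters != b_letters:
        return False                  # sketch assumption: one set of class letters
    if number < b_number or number > e_number:
        return False
    # Anything filed under the end class number itself (Cutters etc.) is inside.
    # A decimal extension such as 713.14 is the ambiguous case.
    if number == e_number and decimal and not end_includes_decimals:
        return False
    return True


print(in_range("B 713 .H94", "B708", "B713"))
print(in_range("B 713.14 G92", "B708", "B713", end_includes_decimals=False))
```

Whether `end_includes_decimals` should default to true is exactly the ambiguity described above; the sketch does not claim to match the choice lccnorm makes.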


-d, --delimiter string

Specify a string other than a single tab (ASCII character 9) to delimit the fields in a line of input. This also provides the default for joining fields in the output; see option -j below.
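The way the input delimiter doubles as the default output joiner can be sketched as follows. The function names and the `delimiter`/`join` parameters are hypothetical stand-ins for the values of -d and -j; this is not lccnorm's code.

```python
def resolve_separators(delimiter=None, join=None):
    """Return (input delimiter, output joiner) under the documented defaults."""
    # No -d: fields are split on a single tab.
    split_on = "\t" if delimiter is None else delimiter
    # No -j: output fields are joined with whatever was used for splitting.
    join_with = split_on if join is None else join
    return split_on, join_with


def rejoin(line, delimiter=None, join=None):
    """Split a line into fields and join them again for output."""
    split_on, join_with = resolve_separators(delimiter, join)
    return join_with.join(line.split(split_on))


print(rejoin("PL:4501:4509", delimiter=":"))            # joiner defaults to ":"
print(rejoin("PL:4501:4509", delimiter=":", join="|"))  # explicit joiner wins
```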

-f, --field range

The call number (or range) is found in the given range of fields. Fields are 1-based (the first field is field 1, not field 0) and are separated by a single tab character (unless option -d is used to specify an alternate delimiter).

When parsing call numbers (not ranges), all fields are concatenated using a single space to form the call number that will be normalized.

When parsing ranges, there are four possibilities:

1 field

The field contains a range in one of three forms: a prefix (e.g., J80), a closed range (e.g., ML566-566.6 or ML566-ML566.6), or a half-open range (e.g., KME451<KME500).

2 fields

The first field is the beginning of the (closed) range, and the second is the end.

3 fields

The first field is a prefix common to both the beginning and end of the range, and the second and third fields are the remainders.

4 fields

The first and second fields together are the beginning of the range, and the third and fourth are the end.

For example, the following all produce identical output:

    $ echo 'PL4501-4509'     | lccnorm -d: -f1
    $ echo 'PL4501:PL4509'   | lccnorm -d: -f1-2
    $ echo 'PL:4501:4509'    | lccnorm -d: -f1-3
    $ echo 'PL:4501:PL:4509' | lccnorm -d: -f1-4
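
One way to picture how the four field layouts reduce to a single pair of endpoints is sketched below. The function `range_endpoints` is hypothetical. The sketch assumes that range parts are concatenated without a space, that a closed range whose end omits the class letters borrows them from the beginning, and that four fields pair up as in the PL:4501:PL:4509 invocation above.

```python
import re


def range_endpoints(fields):
    """Return (begin, end, kind) for 1 to 4 range fields; illustrative only."""
    n = len(fields)
    if n == 1:
        text = fields[0].strip()
        if "<" in text:                       # half-open, e.g. KME451<KME500
            begin, end = text.split("<", 1)
            return begin, end, "half-open"
        if "-" in text:                       # closed, e.g. ML566-566.6
            begin, end = text.split("-", 1)
            if not end[:1].isalpha():         # end omits the class letters
                letters = re.match(r"[A-Za-z]*", begin).group(0)
                end = letters + end
            return begin, end, "closed"
        return text, text, "prefix"           # prefix, e.g. J80
    if n == 2:                                # begin, end
        return fields[0], fields[1], "closed"
    if n == 3:                                # common prefix plus two remainders
        return fields[0] + fields[1], fields[0] + fields[2], "closed"
    if n == 4:                                # assumed pairing: 1+2 and 3+4
        return fields[0] + fields[1], fields[2] + fields[3], "closed"
    raise ValueError(f"expected 1 to 4 fields, got {n}")


for line in ["PL4501-4509", "PL4501:PL4509", "PL:4501:4509", "PL:4501:PL:4509"]:
    print(range_endpoints(line.split(":")))
```

All four inputs reduce to the same endpoints in this sketch, which mirrors the claim that the four invocations produce identical output.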

-j, --join string

Specify a string to use when joining fields for output. The default is to use the same string specified in option -d, or a single tab if -d was not given.


Don't delete the input fields from which call numbers (or ranges) were obtained. Either the -b or the -e option must be provided to specify where the normalized string is to be placed.

-b, --begin

Place normalized strings at the beginning of output lines.

-e, --end

Place normalized strings at the end of output lines.
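As a rough illustration of how -b and -e might interact with keeping or dropping the source fields, consider the sketch below. The function `build_output` and all of its parameters are hypothetical, and the sketch deliberately does not cover what lccnorm does when neither -b nor -e is given.

```python
def build_output(fields, source_indexes, normalized, placement, keep=False):
    """Assemble the output fields for one line (illustrative sketch only).

    fields          all input fields of the line
    source_indexes  0-based indexes of the fields the call number came from
    normalized      the normalized string for this line
    placement       "begin" (like -b) or "end" (like -e)
    keep            True to retain the source fields in the output
    """
    if placement not in ("begin", "end"):
        raise ValueError("placement must be 'begin' or 'end'")
    if keep:
        remaining = list(fields)
    else:
        # Drop the fields that supplied the call number or range.
        remaining = [f for i, f in enumerate(fields) if i not in source_indexes]
    if placement == "begin":
        return [normalized] + remaining
    return remaining + [normalized]


row = ["PL4501", "title", "1999"]
print(build_output(row, {0}, "KEY", "begin"))
print(build_output(row, {0}, "KEY", "end", keep=True))
```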

-D, --die-on-error

Exit with a non-zero status as soon as an unnormalizable input is encountered. The default is to issue a warning and normalize to the empty string.
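The two error policies can be sketched as a small driver loop. Everything here is an assumption for illustration: `normalize_all`, the `die_on_error` parameter, the message wording, and the stand-in `toy_normalize` function, which is not a real call number normalizer.

```python
import sys


def toy_normalize(text):
    """Stand-in normalizer: accepts only text that starts with a letter."""
    if not text[:1].isalpha():
        raise ValueError(f"cannot normalize {text!r}")
    return text.upper()


def normalize_all(lines, normalize, die_on_error=False):
    """Normalize each line, either warning or exiting on bad input."""
    results = []
    for line in lines:
        try:
            results.append(normalize(line))
        except ValueError as err:
            if die_on_error:
                # Like -D: stop at the first unnormalizable input.
                print(f"error: {err}", file=sys.stderr)
                sys.exit(1)
            # Default policy: warn and normalize to the empty string.
            print(f"warning: {err}", file=sys.stderr)
            results.append("")
    return results


print(normalize_all(["b708", "???"], toy_normalize))
```

With the default policy every input line still yields an output entry, so line counts stay aligned; with `die_on_error=True` the loop stops at the first bad line.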

-v, --verbose

Be verbose. This currently has no effect unless used with the -V or --version option.

-h, --help

Print help information and exit.

-V, --version

Print the version number and exit. If the -v or --verbose option is specified, print out additional information.

-M, --manual

View the manual page for lccnorm.

-L, --license

View the license under which lccnorm is distributed.


Paul Hoffman (nkuitse AT nkuitse DOT com)


Copyright 2007 Paul M. Hoffman.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl, that is, under either:

the GNU General Public License as published by the Free Software Foundation; either version 1, or (at your option) any later version; or

the "Artistic License".

For the full text of these licenses, see the script file itself or enter the command lccnorm -L.