NAME

Chemistry::Canonicalize - Number the atoms in a molecule in a unique way

SYNOPSIS

    use Chemistry::Canonicalize ':all';

    # $mol is a Chemistry::Mol object
    canonicalize($mol);
    print "The canonical number for atom 1 is: ",
        $mol->atoms(1)->attr("canon/class"), "\n";
    print "The symmetry class for atom 1 is: ",
        $mol->atoms(1)->attr("canon/symmetry_class"), "\n";

DESCRIPTION

This module provides functions for "canonicalizing" a molecular structure; that is, numbering the atoms in a unique way regardless of the input order.

The canonicalization algorithm is based on: Weininger et al., J. Chem. Inf. Comput. Sci. 29[2], 97-101 (1989).

This module is part of the PerlMol project, http://www.perlmol.org/.

ATOM ATTRIBUTES

During the canonicalization process, the following attributes are set on each atom:

canon/class

The unique canonical number; it is an integer going from 1 to the number of atoms.

canon/symmetry_class

The symmetry class number. Atoms that have the same symmetry class are considered to be topologically equivalent. For example, the two methyl carbons of 2-propanol would have the same symmetry class.
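As a minimal sketch of how these attributes might be read back, the following builds the heavy-atom skeleton of 2-propanol by hand and groups the atoms by symmetry class. The variable names and the choice to leave out hydrogens are assumptions made for illustration; the molecule is built with the new_atom and new_bond methods of Chemistry::Mol.

```perl
use strict;
use warnings;
use Chemistry::Mol;
use Chemistry::Canonicalize ':all';

# Heavy-atom skeleton of 2-propanol, CH3-CH(OH)-CH3 (hydrogens omitted
# for brevity; this is an illustrative assumption).
my $mol = Chemistry::Mol->new;
my $c1 = $mol->new_atom(symbol => 'C');   # methyl carbon
my $c2 = $mol->new_atom(symbol => 'C');   # central carbon
my $c3 = $mol->new_atom(symbol => 'C');   # methyl carbon
my $o  = $mol->new_atom(symbol => 'O');   # hydroxyl oxygen
$mol->new_bond(atoms => [$c1, $c2]);
$mol->new_bond(atoms => [$c2, $c3]);
$mol->new_bond(atoms => [$c2, $o]);

canonicalize($mol);

# Group atom ids by the symmetry class assigned during canonicalization.
my %by_class;
for my $atom ($mol->atoms) {
    push @{ $by_class{ $atom->attr("canon/symmetry_class") } }, $atom->id;
}

# Print one line per symmetry class with the ids of its member atoms.
for my $class (sort { $a <=> $b } keys %by_class) {
    print "class $class: @{ $by_class{$class} }\n";
}
```

If the description above holds, the two methyl carbons should end up on the same line, while the central carbon and the oxygen each get a line of their own.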

FUNCTIONS

These functions may be exported, although nothing is exported by default.

canonicalize($mol, %opts)

Canonicalizes the molecule. It sets the canon/class and canon/symmetry_class attributes on every atom, as discussed above. This function may take the following options:

sort

If true, sort the atoms in the molecule in ascending canonical number order.

invariants

This should be a subroutine reference that takes an atom and returns a number. This number should be based on the topologically invariant properties of the atom, such as symbol, charge, number of bonds, etc.
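The two options can be combined. The sketch below is only an illustration: simple_invariant is a hypothetical subroutine, not part of this module, and its formula (atomic number and bond count packed into one number) is an assumption chosen for readability rather than the module's default.

```perl
use strict;
use warnings;
use Chemistry::Mol;
use Chemistry::Canonicalize ':all';

# Heavy-atom skeleton of ethanol, C-C-O (hydrogens omitted for brevity).
my $mol = Chemistry::Mol->new;
my $c1 = $mol->new_atom(symbol => 'C');
my $c2 = $mol->new_atom(symbol => 'C');
my $o  = $mol->new_atom(symbol => 'O');
$mol->new_bond(atoms => [$c1, $c2]);
$mol->new_bond(atoms => [$c2, $o]);

# Hypothetical invariant: pack the atomic number and the number of
# bonds into a single number. Both values depend only on the topology,
# not on the order in which the atoms were entered.
sub simple_invariant {
    my ($atom) = @_;
    my @bonds = $atom->bonds;
    return 100 * $atom->Z() + scalar(@bonds);
}

canonicalize($mol, sort => 1, invariants => \&simple_invariant);

# With sort => 1, the atoms are expected to be stored in ascending
# canonical number order, so this loop lists them in that order.
for my $atom ($mol->atoms) {
    printf "%s\t%d\n", $atom->symbol, $atom->attr("canon/class");
}
```

A custom invariant is mainly useful when the distinction you care about, for example isotopes or user-defined atom labels, is not one you expect the default to capture.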

VERSION

0.11

TO DO

Add some tests.

CAVEATS

Currently there is an atom limit of about 430 atoms.

This algorithm is known to fail to discriminate between non-equivalent atoms in some complicated cases. These are usually highly bridged structures explicitly designed to break canonicalization algorithms; I don't know of any "real-looking" structure (meaning something that someone would actually synthesize or find in nature) that fails, but don't say I didn't warn you!

SEE ALSO

Chemistry::Mol, Chemistry::Atom, Chemistry::Obj, http://www.perlmol.org/.

AUTHOR

Ivan Tubert <itub@cpan.org>

COPYRIGHT

Copyright (c) 2009 Ivan Tubert. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.