NAME

Chemistry::File::SMILES - SMILES linear notation parser/writer

SYNOPSIS

    #!/usr/bin/perl
    use Chemistry::File::SMILES;

    # parse a SMILES string
    my $s = 'C1CC1(=O)[O-]';
    my $mol = Chemistry::Mol->parse($s, format => 'smiles');

    # print a SMILES string
    print $mol->print(format => 'smiles');

    # print a unique (canonical) SMILES string
    print $mol->print(format => 'smiles', unique => 1);

    # parse a SMILES file
    my @mols = Chemistry::Mol->read("file.smi", format => 'smiles');

    # write a multiline SMILES file
    Chemistry::Mol->write("file.smi", mols => [@mols]);

DESCRIPTION

This module parses and writes SMILES (Simplified Molecular Input Line Entry System) strings. It is a file I/O driver for the PerlMol project (http://www.perlmol.org/) and registers the 'smiles' format with Chemistry::Mol.

This parser interprets anything after whitespace as the molecule's name; for example, when the following SMILES string is parsed, $mol->name will be set to "Methyl chloride":

    CCl  Methyl chloride

The name is not included by default on output. However, if the name option is defined, the name will be included after the SMILES string, separated by a tab.

    print $mol->print(format => 'smiles', name => 1);
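The name handling described above can be sketched as a short round trip. This is a minimal illustration that uses only the calls shown in this document (parse, name, and print with the name option); the variable names are made up for the example, and the exact SMILES string written on output is not assumed here.

```perl
use strict;
use warnings;
use Chemistry::Mol;
use Chemistry::File::SMILES;

# Anything after the first run of whitespace becomes the molecule's name
my $named = Chemistry::Mol->parse('CCl  Methyl chloride', format => 'smiles');
print $named->name, "\n";

# name => 1 appends a tab and the name after the SMILES string on output
my $line = $named->print(format => 'smiles', name => 1);
chomp $line;    # harmless if the writer adds no trailing newline
print "$line\n";
```

Without the name option, only the SMILES string itself is written.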

Multiline SMILES and SMILES files

A file or string can contain multiple molecules, one per line.

    CCl  Methyl chloride
    CO   Methanol

Files with the extension '.smi' are assumed to have this format.
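As a hedged sketch of the multiline case, the two-line example above can be parsed straight from a string. It assumes that parse returns every molecule when called in list context, in the same way that read returns a list in the synopsis; the variable names are illustrative only.

```perl
use strict;
use warnings;
use Chemistry::Mol;
use Chemistry::File::SMILES;

# One molecule per line, each followed by an optional name
my $text = "CCl  Methyl chloride\nCO   Methanol\n";

# Assumption: in list context, parse returns all molecules in the string
my @parsed = Chemistry::Mol->parse($text, format => 'smiles');

printf "%d molecules\n", scalar @parsed;
print $_->name, "\n" for @parsed;
```

The same content saved as a '.smi' file could be loaded with Chemistry::Mol->read, as in the synopsis.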

OPTIONS

aromatic

On output, detect aromatic atoms and bonds by means of the Chemistry::Ring module, and represent the organic aromatic atoms with lowercase symbols.

unique

When used on output, canonicalize the structure if it hasn't been canonicalized already and generate a unique SMILES string. This option implies "aromatic".

kekulize

When used on input, assign single or double bond orders to "aromatic" or otherwise unspecified bonds (i.e., generate the Kekule structure). If false, the bond orders will remain single. This option is true by default. This uses assign_bond_orders from the Chemistry::Bond::Find module.
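The three options can be tried together on benzene. This is a sketch under the assumptions that the module's dependencies for ring detection and canonicalization are installed and that bond objects expose an order method, as Chemistry::Bond does; the exact strings printed are not assumed.

```perl
use strict;
use warnings;
use Chemistry::Mol;
use Chemistry::File::SMILES;

# Input: kekulize is on by default, so aromatic bonds get single/double orders
my $kek = Chemistry::Mol->parse('c1ccccc1', format => 'smiles');

# With kekulize => 0 the bond orders are left as single
my $raw = Chemistry::Mol->parse('c1ccccc1', format => 'smiles', kekulize => 0);

# Count double bonds in each molecule (grep in scalar context returns a count)
my $kek_doubles = grep { $_->order == 2 } $kek->bonds;
my $raw_doubles = grep { $_->order == 2 } $raw->bonds;
print "double bonds: kekulize on = $kek_doubles, off = $raw_doubles\n";

# Output: aromatic => 1 writes organic aromatic atoms in lowercase;
# unique => 1 canonicalizes first and implies aromatic
my $benzene  = Chemistry::Mol->parse('C1=CC=CC=C1', format => 'smiles');
my $aromatic = $benzene->print(format => 'smiles', aromatic => 1);
my $unique   = $benzene->print(format => 'smiles', unique => 1);
print "$aromatic\n$unique\n";
```

Turning kekulize off may be useful when only connectivity matters and the cost of assigning bond orders is not wanted.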

CAVEATS

The parser does not support branches that start before an atom, such as (OC)C, which according to some variants of the SMILES specification should be equivalent to C(OC) and COC. Many other tools don't implement this rule either.

VERSION

0.42

SEE ALSO

Chemistry::Mol, Chemistry::File

The SMILES Home Page at http://www.daylight.com/dayhtml/smiles/

The Daylight Theory Manual at http://www.daylight.com/dayhtml/doc/theory/theory.smiles.html

The PerlMol website http://www.perlmol.org/

AUTHOR

Ivan Tubert-Brohman <itub@cpan.org>

COPYRIGHT

Copyright (c) 2004 Ivan Tubert-Brohman. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.