NAME

Chemistry::Harmonia - Solutions to simple and difficult chemical puzzles.

SYNOPSIS

  use Chemistry::Harmonia qw(:all);
  use Data::Dumper;

  for my $formula ('Fe3O4', '[Cr(CO(NH2)2)6]4[Cr(CN)6]3'){
    my $ose = oxidation_state( $formula );
    print Dumper $ose;
  }

Will print something like:

  $VAR1 = {
          'O' => {
                  'num' => [ 4 ],
                  'OS' => [ [ -2 ] ]
                 },
          'Fe' => {
                  'num' => [ 3 ],
                  'OS' => [ [ 2, 3, 3 ] ]
                 }
         };
  $VAR1 = {
          'H' => { 'num' => [ 96 ],
                   'OS' => [ [ 1 ] ]
                 },
          'O' => { 'num' => [ 24 ],
                   'OS' => [ [ -2 ] ]
                 },
          'N' => { 'num' => [ 48, 18 ],
                   'OS' => [ [ -3 ], [ -3 ] ]
                 },
          'C' => { 'num' => [ 24, 18 ],
                   'OS' => [ [ 4 ], [ 2 ] ]
                 },
          'Cr' => { 'num' => [ 4, 3 ],
                    'OS' => [ [ 3 ], [ 2 ] ]
                  }
        };

Splitting a chemical mix into arrays of reactants and products:

  my $chemical_equation = 'KMnO4 + NH3 --> N2 + MnO2 + KOH + H2O';
  print Dumper parse_chem_mix( $chemical_equation );

Will print:

  $VAR1 = [
           ['KMnO4', 'NH3'],
           ['N2', 'MnO2', 'KOH','H2O']
          ];

Building a chemical mix (equation) from arrays of reactants and products:

  my $ce = [ [ 'K', 'O2'], [ 'K2O', 'Na2O2', 'K2O2', 'KO2' ] ];
  my $k = { 'K2O' => 1, 'Na2O2' => 0, 'K2O2' => 2, 'KO2' => 3 };
  print prepare_mix( $ce, { 'coefficients' => $k } ),"\n";

Will output:

  K + O2 == 1 K2O + 0 Na2O2 + 2 K2O2 + 3 KO2

'Synthesis' of the good :) chemical formula(s):

  my $abracadabra = 'ggg[crr(cog(nhz2)2)6]4[qcr(cn)5j]3qqq';
  print Dumper good_formula( $abracadabra );

Will output:

  $VAR1 = [
           '[Cr(CO(NH2)2)6]4[Cr(CN)5I]3',
           '[Cr(Co(NH2)2)6]4[Cr(CN)5I]3'
          ];

Calculating the CLASS-CIR and the brutto (gross) formulas of the substances in a reaction:

  my $mix = '2 KMnO4 + 5 H2O2 + 3 H2SO4 --> 1 K2SO4 + 2 MnSO4 + 8 H2O + 5 O2';
  my %cf;
  my $ce = parse_chem_mix( $mix, \%cf );
  print Dumper class_cir_brutto( $ce, \%cf );

Will output:

  $VAR1 = [
          'HKMnOS',
          1504979632,
          {
            'O2' => 'O2',
            'MnSO4' => 'Mn1O4S1',
            'KMnO4' => 'K1Mn1O4',
            'K2SO4' => 'K2O4S1',
            'H2SO4' => 'H2O4S1',
            'H2O2' => 'H2O2',
            'H2O' => 'H2O1'
          }
        ];

TTC of the reaction, continuing the example above:

  print Dumper ttc_reaction( $ce );

Will output:

  $VAR1 = {
          'r' => 5,
          'a' => 5,
          's' => 7
        };

DESCRIPTION

The module provides subroutines for solving some puzzles of general inorganic and physical chemistry. All the methods implemented in this module follow known rules and laws of general and physical chemistry.

SUBROUTINES

Chemistry::Harmonia provides these subroutines:

    oxidation_state( $formula_of_substance )
    parse_chem_mix( $mix_of_substances [, \%coefficients ] )
    good_formula( $abracadabra [, { 'zero2oxi' => 1 } ] )
    prepare_mix( \@reactants_and_products [, \%facultative_parameters ] )
    class_cir_brutto( \@reactants_and_products [, \%coefficients ] )
    ttc_reaction( \@reactants_and_products )

All of them are context-sensitive.

oxidation_state( $formula_of_substance )

This subroutine returns a reference to a nested hash for the inorganic $formula_of_substance. Each element maps to a hash holding its integer oxidation states (key 'OS') and its number of atoms (key 'num').

Always write the first character of an element name in upper case and the second character in lower case, as in the Periodic Table. Examples: Na, Ag, Co, Ba, C, O, N, F, etc. Compare: Co is cobalt and CO is carbon monoxide.

For a very difficult or mysterious formula (usually organic) the subroutine returns undef. It helps to write, for example, 'Pb3C2O7' and 'Pt2Cl6' as '{PbCO3}2{PbO}' and '{PtCl2}{PtCl4}'.

If you do not know the symbols of the chemical elements and/or the Periodic Table, use the subroutine good_formula(). I insist on doing so in any case :)

oxidation_state() has now been checked against more than 6200 unique inorganic substances.
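As a minimal sketch of how the returned structure can be walked, the snippet below copies the Fe3O4 hash from the SYNOPSIS as literal data and sums the oxidation states over all atoms. net_charge_sketch() is a hypothetical helper, not part of Chemistry::Harmonia. It assumes, judging only from the SYNOPSIS output, that each entry of 'num' pairs with the entry of 'OS' at the same index, that an 'OS' entry with a single value applies to every atom of that group, and that a longer entry lists one value per atom.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Literal copy of the SYNOPSIS output for 'Fe3O4' (no module call here).
my $ose = {
    'O'  => { 'num' => [4], 'OS' => [ [-2] ] },
    'Fe' => { 'num' => [3], 'OS' => [ [ 2, 3, 3 ] ] },
};

# Hypothetical helper: sums the oxidation states over all atoms.
sub net_charge_sketch {
    my ($states) = @_;
    my $total = 0;
    for my $element ( keys %$states ) {
        my ( $num, $os ) = @{ $states->{$element} }{qw(num OS)};
        for my $i ( 0 .. $#$num ) {
            my @values = @{ $os->[$i] };
            # Assumption: a single value is shared by all atoms of the group,
            # otherwise the list holds one value per atom.
            $total += @values == 1 ? $num->[$i] * $values[0] : sum(@values);
        }
    }
    return $total;
}

print net_charge_sketch($ose), "\n";    # prints 0 for this neutral compound
```

A non-zero total would suggest either a charged species or that the assumptions above do not hold for that formula.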

parse_chem_mix( $mix_of_substances [, \%coefficients ] )

A chemical equation consists of the chemical formulas of the reactants and products. This subroutine parses $mix_of_substances (usually a chemical equation) into arrays of the reactants (initial substances) and the products (substances formed in the chemical reaction). It is the simplest and cheapest way to carry out a reaction without reactants :).

The separator between reactants and products can be a sequence of '=' or '-', with or without one or more trailing '>'. For example: =, ==, =>, ==>, ==>>, -, --, ->, -->, ->>> etc. Spaces around a separator are not significant. If no separator is given, only the last substance of the mix is treated as a product.

Each individual substance's chemical formula is separated from the others by a plus sign ('+'), comma (','), semicolon (';') and/or space. A valid example:

  print Dumper parse_chem_mix( 'KNO3 + S ; K2SO4 , NO SO2' );

Will print:

  $VAR1 = [
            [ 'KNO3','S','K2SO4','NO' ],
            [ 'SO2' ]
        ];
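The separator and delimiter rules above can be sketched with two regular expressions. split_mix_sketch() is a hypothetical helper written only for illustration; it is not the module's implementation and it ignores coefficients, zero handling and every other special case that parse_chem_mix() recognizes.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Chemistry::Harmonia.
sub split_mix_sketch {
    my ($mix) = @_;

    # Separator: a run of '=' or '-' optionally followed by '>' characters.
    my @sides = split /\s*[=-]+>*\s*/, $mix, 2;

    # Substances are delimited by '+', ',', ';' and/or whitespace.
    my @parts = map { [ grep { length } split /[\s+,;]+/, $_ ] } @sides;

    if ( @parts == 1 ) {
        # No separator: the last substance becomes the only product.
        my @all  = @{ $parts[0] };
        my $last = pop @all;
        return [ \@all, [$last] ];
    }
    return \@parts;
}

my $ce = split_mix_sketch('KMnO4 + NH3 --> N2 + MnO2 + KOH + H2O');
print join( ' | ', map { join ' ', @$_ } @$ce ), "\n";
```

For real input, parse_chem_mix() remains the right tool; the sketch only shows why '-->' and a missing separator lead to the arrays printed above.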

If $mix_of_substances contains stoichiometric coefficients, they are collected in the hash referenced by \%coefficients. For example:

  my %coef;
  my $chem_eq = 'BaS + 2 H2O = Ba(OH)2 + 1 Ba(SH)2';

  my $out_ce = parse_chem_mix( $chem_eq, \%coef );
  print Dumper( $out_ce, \%coef );

Will print something like:

  $VAR1 = [ [ 'BaS', 'H2O' ], [ 'Ba(OH)2', 'Ba(SH)2' ] ];
  $VAR2 = {
            'Ba(SH)2' => '1',
            'H2O' => '2'
          };

Zero (0) coefficients can be used to eliminate substances from the mix. For example:

  my $chem_eq = '2Al O2 = Al2O3 0 CaO*Al2O3';

Will output something like:

  $VAR1 = [ [ 'Al', 'O2' ], [  'Al2O3' ] ];
  $VAR2 = {  'Al' => '2' };

However:

  $chem_eq = '2Al O2  Al2O3 0 CaO*Al2O3';

Will output something like:

  $VAR1 = [ [ 'Al', 'O2', 'Al2O3', 'O' ], [  'CaO*Al2O3' ] ];
  $VAR2 = {  'Al' => '2' };

This is because, without a separator ('=' or similar), the last substance becomes the product.

If $mix_of_substances contains a zero (0) that looks like oxygen, it is replaced by oxygen. Certainly, oxygen is life. I love oxygen :) Some more examples:

  $chem_eq = '2Al 02 Ca CaO*Al2O3';

Will output something like:

  $VAR1 = [ [ 'Al', 'O2', 'Ca' ], [ 'CaO*Al2O3' ] ];
  $VAR2 = { 'Al' => '2' };

Input:

  $chem_eq = '2Al 102 Ca CaO*Al2O3';

Output:

  $VAR1 = [ [ 'Al', 'Ca' ], [ 'CaO*Al2O3' ] ];
  $VAR2 = { 'Al' => '2', 'Ca' => '102' };

Input:

  $chem_eq = '2Al --> 0 0Al2O3 0';

Output:

  $VAR1 = [ [ 'Al' ], [ 'O' ] ];
  $VAR2 = { 'Al' => '2' };

Input:

  $chem_eq = 'Al O2 = 1 0Al2O3';

Output:

  $VAR1 = [ [ 'Al', 'O2' ], [  'OAl2O3' ] ];
  $VAR2 = { 'OAl2O3' => '1' };

The forced conversion of a single zero to oxygen is switched on by the parameter 'zero2oxi', which is added to \%coefficients. For example:

  $coef{ 'zero2oxi' } = 1;
  $chem_eq = 'Al CaO = 0 Al2O3';

Output:

  $VAR1 = [ ['Al', 'CaO'], ['O', 'Al2O3'] ];

Without 'zero2oxi' the output is:

  $VAR1 = [ ['Al'], ['CaO'] ];

In fact, the subroutine recognizes many more difficult situations. Here are some examples:

  '2Al 1 02 Ca Al2O3' to-> [ ['Al', 'O2', 'Ca'], ['Al2O3'] ], {'Al' => 2, 'O2' => 1}
  '2Al 102 Ca Al2O3'  to-> [ ['Al', 'Ca'], ['Al2O3'] ], {'Al' => 2, 'Ca' => 102}
  '2Al 1 02 4O2 = 1 0Al2O3' to-> [ ['Al', 'O2'], ['OAl2O3'] ], {'Al' => 2, 'O2' => 4, 'OAl2O3' => 1}
  '2Al = 00 Al2O3'  to-> [ ['Al'], ['O0', 'Al2O3'] ], {'Al' => 2}
  '2Al O 2 = Al2O3'  to-> [ ['Al', 'O2'], ['Al2O3'] ], {'Al' => 2}
  '2Al O = '  to-> [ ['Al', 'O'], ['='] ], {'Al' => 2}
  ' = 2Al O'  to-> [ ['=','Al'], ['O'] ], {'Al' => 2}
  '0Al = O2 Al2O3'  to-> [ ['O2'], ['Al2O3'] ]
  '2Al 1 2 3 4 Ca 5 6 Al2O3 7 8 9'  to-> [ ['Al', 'Ca'], ['Al2O3'] ], {'Al2O3' => 56, 'Al' => 2, 'Ca' => 1234}
  '2Al 1 2 3 4 Ca 5 6 Al2O3'  to-> [ ['Al', 'Ca'], ['Al2O3'] ], {'Al2O3' => 56, 'Al' => 2, 'Ca' => 1234}
  '2Al 1 2 3 4 Ca 5 6 = Al2O3'  to-> [ ['Al', 'Ca56'], ['Al2O3'] ], {'Al' => 2, 'Ca56' => 1234}
  '2Al 1 2 3 4 Ca 5 6 = Al2O3 CaO 9'  to-> [ ['Al', 'Ca56'], ['Al2O3', 'CaO'] ], {'Al' => 2, 'Ca56' => 1234}
  'Al O + 2 = Al2O3'  to-> [ ['Al', 'O'], ['Al2O3'] ], {'Al2O3' => 2}
  'Cr( OH )  3 + NaOH = Na3[ Cr( OH )  6  ]'  to-> [ ['Cr(OH)3', 'NaOH'], ['Na3[Cr(OH)6]'] ]

good_formula( $abracadabra [, { 'zero2oxi' => 1 } ] )

This subroutine parses $abracadabra into a reference to an array of "good" chemical formula(s). A "good" formula does NOT mean a chemically correct one. The subroutine oxidation_state() will help in choosing the chemically correct formula.

The algorithm is based on common sense and chemical experience.

  'Co'   to->  'Co'
  'Cc'   to->  'CC'
  'co'   to->  'CO', 'Co'
  'CO2'  to->  'CO2'
  'Co2'  to->  'Co2', 'CO2'
  'mo2'  to->  'Mo2'

The good formula(s) contain only chemical elements, brackets ()[]{} and digits. good_formula() loves oxygen. Fractions are scaled to integers.

Fragments A*B, xC*yD are transformed to {A}{B}, {C}x{D}y (here A, B, C, D are groups of chemical elements, digits and brackets ()[]{}; x, y are digits only). Examples:

  '0.3al2o3*1.5sio2'  to->  '{Al2O3}{SIO2}5', '{Al2O3}{SiO2}5'
  'al2(so4)3*10h20'   to->  '{Al2(SO4)3}{H20}10'
  '..,,..mg0,,,,.*si0...s..,..'  to->  '{MgO}{SIOS}', '{MgO}{SiOS}'
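The scaling of fractional multipliers can be illustrated with a short sketch. star_to_braces_sketch() and gcd_sketch() are hypothetical helpers, not the module's code, and they assume the input already consists of well-formed formulas (correct letter case, no stray characters). The multipliers are scaled by a power of ten using string operations, then divided by their greatest common divisor.

```perl
use strict;
use warnings;

# Greatest common divisor of two non-negative integers (Euclid).
sub gcd_sketch {
    my ( $x, $y ) = @_;
    ( $x, $y ) = ( $y, $x % $y ) while $y;
    return $x;
}

# Hypothetical helper: 'xC*yD' -> '{C}x{D}y' with the multipliers
# scaled to the smallest integers. Returns nothing on malformed input.
sub star_to_braces_sketch {
    my ($input) = @_;
    my @parts;
    for my $fragment ( split /\*/, $input ) {
        my ( $mult, $formula ) = $fragment =~ /^(\d+(?:\.\d+)?)?(\D.*)$/
            or return;
        push @parts, [ defined $mult ? $mult : '1', $formula ];
    }

    # Largest number of decimal places among the multipliers.
    my $places = 0;
    for my $part (@parts) {
        my ($frac) = $part->[0] =~ /\.(\d+)$/;
        $places = length($frac) if defined $frac && length($frac) > $places;
    }

    # Scale every multiplier to an integer without floating point.
    my @scaled;
    for my $part (@parts) {
        my ( $int, $frac ) = split /\./, $part->[0];
        $frac = '' unless defined $frac;
        push @scaled, 0 + ( $int . $frac . '0' x ( $places - length($frac) ) );
    }

    my $g = 0;
    $g = gcd_sketch( $g, $_ ) for @scaled;
    $g ||= 1;

    my $result = '';
    for my $i ( 0 .. $#parts ) {
        my $n = $scaled[$i] / $g;
        $result .= '{' . $parts[$i][1] . '}' . ( $n == 1 ? '' : $n );
    }
    return $result;
}

print star_to_braces_sketch('0.3Al2O3*1.5SiO2'), "\n";    # {Al2O3}{SiO2}5
```

Here 0.3 and 1.5 become 3 and 15, and dividing by their common divisor 3 leaves 1 and 5; a multiplier of 1 is omitted from the output.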

Superfluous brackets are removed:

  'Irj(){}[]'  to->  'IrI'
  '[{(na)}]'   to->  'Na'

However:

  '{[[([[CaO]])*((SiO2))]]}'  to->  '{([[CaO]])}{((SiO2))}'

If $abracadabra contains a zero (0) that looks like oxygen, it is replaced by oxygen. I still love oxygen :) Examples:

  '00O02'  to->  'OOOO2'
  'h02'    to->  'Ho2', 'HO2'

However:

  'h20'    to->  'H20'

The forced conversion of zero to oxygen is set by parameter 'zero2oxi':

  my $chem_formulas = good_formula( 'h20', { 'zero2oxi' => 1 } );

Output @$chem_formulas:

  'H20', 'H2O'

If paranoid mode is needed, convert $abracadabra to lower case first:

    lc $abracadabra

Beware of using a very long $abracadabra!

prepare_mix( \@reactants_and_products [, \%facultative_parameters ] )

This subroutine is simple but useful. It builds a chemical mix (equation) from a reference to an array of arrays \@reactants_and_products, i.e. it is the opposite of parse_chem_mix.

\%facultative_parameters may contain: 'substances' - a reference to an array of real (required) substances, 'coefficients' - a reference to a hash of stoichiometric coefficients for the substances. Full examples:

  my $ce = [ [ 'O2', 'K' ], [ 'K2O', 'Na2O2', 'K2O2', 'KO2' ] ];
  my $k = { 'K' => 2, 'K2O2' => 1, 'KO2' => 0 };

  my $mix = prepare_mix( $ce, { 'coefficients' => $k } );

Will output $mix:

  O2 + 2 K == K2O + Na2O2 + 1 K2O2 + 0 KO2

For real substances:

  my $real = [ 'K', 'O2', 'K2O2', 'KO2' ];
  print prepare_mix( $ce, { 'coefficients' => $k, 'substances' => $real } );

Will print:

  O2 + 2 K == 1 K2O2 + 0 KO2
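The behaviour shown in these two examples can be approximated in a few lines. prepare_mix_sketch() is a hypothetical stand-in written for illustration; it assumes that ' + ' joins the substances, ' == ' joins the two sides, and that a coefficient is printed only when the hash has a key for that substance. It is not the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical stand-in for prepare_mix(), for illustration only.
sub prepare_mix_sketch {
    my ( $ce, $opt ) = @_;
    $opt ||= {};
    my $coef = $opt->{'coefficients'} || {};
    my %keep = map { $_ => 1 } @{ $opt->{'substances'} || [] };

    my @sides;
    for my $side (@$ce) {
        # With a 'substances' list, only the listed substances survive.
        my @kept = grep { !%keep || $keep{$_} } @$side;
        push @sides, join ' + ',
            map { exists $coef->{$_} ? "$coef->{$_} $_" : $_ } @kept;
    }
    return join ' == ', @sides;
}

my $ce = [ [ 'O2', 'K' ], [ 'K2O', 'Na2O2', 'K2O2', 'KO2' ] ];
my $k  = { 'K' => 2, 'K2O2' => 1, 'KO2' => 0 };
print prepare_mix_sketch( $ce, { 'coefficients' => $k } ), "\n";
```

Note that a zero coefficient is printed as given; in this sketch a substance is dropped only through the 'substances' filter.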

class_cir_brutto( \@reactants_and_products [, \%coefficients ] )

This subroutine calculates the Unique Common Identifier of the Reaction \@reactants_and_products with stoichiometry \%coefficients, together with the brutto (gross) formulas of the substances. It returns a reference to an array: element 0 is the alphabetic CLASS, element 1 is the Chemical Integer Reaction Identifier (CIR), element 2 is a hash of the brutto formulas of the substances.

  my $reaction = '1 H2O + 1 CO2 --> 1 H2CO3';
  my %cf;
  my $ce = parse_chem_mix( $reaction, \%cf );
  print Dumper class_cir_brutto( $ce, \%cf );

Will print:

  $VAR1 = [
          'CHO',
          1334303561,
          {
            'H2CO3' => 'C1H2O3',
            'CO2' => 'C1O2',
            'H2O' => 'H2O1'
          }
        ];
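Judging from the outputs shown in this document, a brutto formula lists each element with its total atom count in alphabetical order, and CLASS is the alphabetical list of all elements in the reaction. The sketch below reproduces that reading with hypothetical helpers (count_atoms_sketch, brutto_sketch, class_sketch); it assumes balanced brackets and correctly cased element symbols, and it is not the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical helper: counts atoms in a formula made of element
# symbols, digits and balanced brackets ()[]{}.
sub count_atoms_sketch {
    my ($formula) = @_;
    my @stack = ( {} );    # one counter hash per open bracket level
    while ( $formula =~ /\G(?:([A-Z][a-z]?)(\d*)|([(\[{])|[)\]}](\d*))/gc ) {
        if ( defined $1 ) {
            $stack[-1]{$1} += length($2) ? $2 : 1;
        }
        elsif ( defined $3 ) {
            push @stack, {};
        }
        else {
            # Closing bracket: fold the group into the enclosing level.
            my $mult  = length($4) ? $4 : 1;
            my $group = pop @stack;
            $stack[-1]{$_} += $group->{$_} * $mult for keys %$group;
        }
    }
    return $stack[0];
}

# Elements in alphabetical order, each followed by its count.
sub brutto_sketch {
    my $count  = count_atoms_sketch(shift);
    my $brutto = '';
    $brutto .= $_ . $count->{$_} for sort keys %$count;
    return $brutto;
}

# Alphabetical list of all elements found in the given formulas.
sub class_sketch {
    my %seen;
    for my $formula (@_) {
        $seen{$_} = 1 for keys %{ count_atoms_sketch($formula) };
    }
    return join '', sort keys %seen;
}

print brutto_sketch('H2CO3'), "\n";                  # C1H2O3
print class_sketch( 'H2O', 'CO2', 'H2CO3' ), "\n";   # CHO
```

The CIR value itself depends on how the module normalizes the equation, which this sketch does not attempt to reproduce.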

CIR is a 32-bit CRC of the normalized chemical equation; it is the same CRC value that the POSIX GNU cksum program generates. The returned CIR is always a non-negative integer in the range 0..2^32-1, i.e. 0..4,294,967,295.

Nature is diverse, but we search for simple solutions :)

class_cir_brutto() has been tested on the CLASS-CIR of more than 22,100 unique inorganic reactions. Yes, it was hard for me to do.
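For readers unfamiliar with the cksum checksum, here is a minimal pure-Perl sketch of the POSIX algorithm (polynomial 0x04C11DB7, most significant bit first, message length appended, result complemented). posix_cksum_sketch() is a hypothetical name; the module itself lists String::CRC::Cksum among its dependencies, which this sketch does not use. Because the normalized form of the equation is not described here, the sketch cannot reproduce actual CIR values; it only shows how a byte string is turned into a number in the stated range.

```perl
use strict;
use warnings;

# Lookup table for the polynomial 0x04C11DB7, most significant bit first.
my @TABLE;
for my $byte ( 0 .. 255 ) {
    my $c = $byte << 24;
    for my $bit ( 1 .. 8 ) {
        $c = ( $c & 0x80000000 ) ? ( ( $c << 1 ) ^ 0x04C11DB7 ) : ( $c << 1 );
        $c &= 0xFFFFFFFF;
    }
    push @TABLE, $c;
}

# Hypothetical helper: POSIX cksum CRC of a byte string.
sub posix_cksum_sketch {
    my ($data) = @_;
    my $crc = 0;
    for my $octet ( unpack 'C*', $data ) {
        $crc = ( ( $crc << 8 ) & 0xFFFFFFFF )
            ^ $TABLE[ ( ( $crc >> 24 ) ^ $octet ) & 0xFF ];
    }

    # POSIX appends the message length, least significant octet first.
    for ( my $n = length $data ; $n > 0 ; $n >>= 8 ) {
        $crc = ( ( $crc << 8 ) & 0xFFFFFFFF )
            ^ $TABLE[ ( ( $crc >> 24 ) ^ ( $n & 0xFF ) ) & 0xFF ];
    }
    return ~$crc & 0xFFFFFFFF;
}

print posix_cksum_sketch(''), "\n";    # 4294967295, the top of the range
```

The empty string yields the largest possible value because nothing is processed and the initial zero is simply complemented.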

ttc_reaction( \@reactants_and_products )

This subroutine calculates the Tactico-Technical Characteristics (TTC) of the reaction \@reactants_and_products (sorry for the military slang :), i.e. the SAR quantities: (s)ubstances, (a)toms and (r)ank of the reaction. Continuing the example above:

  print Dumper ttc_reaction( $ce );

Will output:

  $VAR1 = {
          'r' => 2,
          'a' => 3,
          's' => 3
        };
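One reading that reproduces both TTC outputs shown in this document is: 's' is the number of substances, 'a' is the number of distinct elements, and 'r' is the rank of the matrix of atom counts (one row per substance, one column per element). This interpretation is an assumption inferred from the examples, not something the source states. The sketch below uses hypothetical helpers and takes the atom-count matrix as literal data.

```perl
use strict;
use warnings;

# Rank of a numeric matrix by Gaussian elimination (works on a copy).
sub matrix_rank_sketch {
    my ($matrix) = @_;
    my @m    = map { [@$_] } @$matrix;
    my $cols = @m ? scalar @{ $m[0] } : 0;
    my $rank = 0;
    for my $col ( 0 .. $cols - 1 ) {
        last if $rank == @m;
        my ($pivot) = grep { abs( $m[$_][$col] ) > 1e-9 } $rank .. $#m;
        next unless defined $pivot;
        @m[ $rank, $pivot ] = @m[ $pivot, $rank ];
        for my $row ( $rank + 1 .. $#m ) {
            my $factor = $m[$row][$col] / $m[$rank][$col];
            $m[$row][$_] -= $factor * $m[$rank][$_] for $col .. $cols - 1;
        }
        $rank++;
    }
    return $rank;
}

# Hypothetical helper: rows are substances, columns are elements.
sub ttc_sketch {
    my ($rows) = @_;
    my $elements = @$rows ? scalar @{ $rows->[0] } : 0;
    return {
        's' => scalar @$rows,
        'a' => $elements,
        'r' => matrix_rank_sketch($rows),
    };
}

# Atom counts in the column order (H, C, O) for H2O, CO2 and H2CO3.
my $ttc = ttc_sketch( [ [ 2, 0, 1 ], [ 0, 1, 2 ], [ 2, 1, 3 ] ] );
printf "s=%d a=%d r=%d\n", @{$ttc}{qw(s a r)};    # s=3 a=3 r=2
```

The rank is 2 here because the H2CO3 row is the sum of the H2O and CO2 rows.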

EXPORT

Chemistry::Harmonia exports nothing by default. Each of the subroutines can be exported on demand, for example:

  use Chemistry::Harmonia qw( oxidation_state );

The tag redox exports the subroutines oxidation_state, parse_chem_mix and prepare_mix:

  use Chemistry::Harmonia qw(:redox);

The tag equation exports the subroutines good_formula, parse_chem_mix, prepare_mix, class_cir_brutto and ttc_reaction:

  use Chemistry::Harmonia qw(:equation);

The tag all exports them all:

  use Chemistry::Harmonia qw(:all);

DEPENDENCIES

Chemistry::Harmonia is known to run under perl 5.8.8 on Linux. The distribution uses Chemistry::File::Formula, Algorithm::Combinatorics, Regexp::Common, Math::BigInt, Math::BigRat, Math::Assistant, String::CRC::Cksum, Inline::Files and Carp.

SEE ALSO

Greenwood, Norman N.; Earnshaw, Alan. (1997), Chemistry of the Elements (2nd ed.), Oxford: Butterworth-Heinemann

Irving Langmuir. The arrangement of electrons in atoms and molecules. J. Am. Chem. Soc. 1919, 41, 868-934.

Chemistry-Elements, Chemistry::Mol, Chemistry::File and Chemistry::MolecularMass.

AUTHOR

Alessandro Gorohovski, <angel@feht.dgtu.donetsk.ua>

COPYRIGHT AND LICENSE

Copyright (C) 2011 by A. N. Gorohovski

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.