
NAME

Data::Reconciliation - Perl extension for data reconciliation

SYNOPSIS

   use Data::Table;

   use Data::Reconciliation;
   use Data::Reconciliation::Rule;

   my $src1 = Data::Table::fromCSV('test1.dat');
   my $src2 = Data::Table::fromCSV('test2.dat');

   my $rule = new Data::Reconciliation::Rule($src1, $src2);

   $rule->identification([<col_names>], \&canon_sub_1,
                         [<col_names>], \&canon_sub_2);
   $rule->add_comparison([<col_names>], \&canon_sub_3,
                         [<col_names>], \&canon_sub_4,
                         \&compare_sub, \@constants);

   my $r = new Data::Reconciliation($src1, $src2,
                                    -rules => [$rule]);

   $r->build_signatures(0);

   my($dup_signs_1,
      $dup_signs_2) = $r->duplicate_signatures;

   my($dup_signs_1,
      $dup_signs_2) = $r->delete_dup_signatures;

   my($widow_signs_1,
      $widow_signs_2) = $r->widow_signatures;

   my($widow_signs_1,
      $widow_signs_2) = $r->delete_wid_signatures;

   my @diffs = $r->reconciliate(0);

   package UserFunctions;

   sub fun_1 (\@\@\@\@;\@$) {
       my $field_names_1  = shift;
       my $field_values_1 = shift;
       my $field_names_2  = shift;
       my $field_values_2 = shift;
       
       my $constants      = shift;
       my $func_name      = shift;
       
       my $ok = (...);
       
       return undef if $ok;
       return "Not ok (comparing with $func_name)";
   }
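
The `(...)` placeholder in `fun_1` can be filled in many ways. As one hedged illustration, the sketch below treats two numeric values as equal when they differ by no more than a tolerance taken from the constants list. The name `within_tolerance`, the meaning given to the first constant, and the message format are assumptions made for this example. It is written without a prototype and called directly here, not through the module.

```perl
use strict;
use warnings;

# Hypothetical comparison function following the calling convention shown
# in the SYNOPSIS: names/values of source 1, names/values of source 2,
# then an optional constants array ref and the function name.
sub within_tolerance {
    my ($names_1, $values_1, $names_2, $values_2, $constants, $func_name) = @_;

    # Assumption: the first constant, if any, is the accepted tolerance.
    my $tolerance = ($constants && @$constants) ? $constants->[0] : 0;

    my $ok = abs($values_1->[0] - $values_2->[0]) <= $tolerance;

    # Same contract as fun_1: undef means "no difference".
    return undef if $ok;
    return sprintf 'SRC1.%s=[%s] <> SRC2.%s=[%s]',
        $names_1->[0], $values_1->[0], $names_2->[0], $values_2->[0];
}

# Direct calls, for illustration only.
my $loose  = within_tolerance(['amt'], [122.45], ['Amount'], [122.47], [0.05]);
my $strict = within_tolerance(['amt'], [122.45], ['Amount'], [122.47], [0.001]);
```

With a tolerance of 0.05 the two amounts are accepted and `$loose` is undefined; with 0.001 they are reported and `$strict` holds the difference message.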

DESCRIPTION

CONSTRUCTOR

new

This creates a new Data::Reconciliation object. The first two parameters are the sources to be reconciled. They must be Data::Table objects.

The other parameters are optional named parameters.

CONSTRUCTOR OPTIONS

-rules => [ <rule list> ]

Provides the reconciliation rules. Each rule must be a Data::Reconciliation::Rule object (see Data::Reconciliation::Rule). The default rule uses the first column for identification and compares the remaining columns one to one.

METHODS

build_signatures

This method initialises a reconciliation process. It sets up the data needed to identify the records to be compared in the two sources. The rule number must be provided as a parameter.
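
To make the idea of a signature concrete, here is a minimal pure-Perl sketch that groups record indices by a signature computed from the identification columns. The helper `build_signature_index`, its arguments, and the default join on `-` are hypothetical. It illustrates the concept and is not the module's internal implementation.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Data::Reconciliation: map each
# signature to the list of indices of the records that produce it.
sub build_signature_index {
    my ($rows, $id_cols, $canon) = @_;
    $canon ||= sub { join '-', @_ };    # assumed default canonicaliser

    my %index;
    for my $i (0 .. $#$rows) {
        # Slice out the identification columns and canonicalise them.
        my $signature = $canon->(@{ $rows->[$i] }[@$id_cols]);
        push @{ $index{$signature} }, $i;
    }
    return \%index;
}

my @rows = (
    ['1234', 0, '123,45',  'FRF'],
    ['1234', 1, '-123,45', 'FRF'],
    ['1237', 0, '121,50',  'FRF'],
    ['1237', 0, '50,121',  'CHF'],
);

# Columns 0 and 1 play the role of 'dealnb' and 'leg'.
my $index = build_signature_index(\@rows, [0, 1]);
```

Here `$index->{'1237-0'}` refers to two record indices, which is the situation the duplicate handling described further on is meant to detect.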

signatures

Returns two hash refs, one per source, containing signatures as keys and array refs of record indices as values. These are the signatures actually built by the build_signatures method above.

duplicate_signatures

This method identifies the signatures that are not unique within each of the two sources. The rule number must be provided as a parameter. (The actual reconciliation algorithm only works on sources with cardinality 1..1.)

Returns two hash refs containing duplicate signatures as keys and array refs containing record indices as values.
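
The shape of the returned data can be pictured with a short stand-alone sketch. The `%signatures` hash below is made-up sample data in the documented layout (signature as key, array ref of record indices as value), and the filtering shown is only one plausible way to find duplicates, not necessarily how the module does it.

```perl
use strict;
use warnings;

# Made-up signature index for one source.
my %signatures = (
    '1234-0' => [0],
    '1234-1' => [1],
    '1237-0' => [4, 5],
);

# A signature is a duplicate when more than one record produces it.
my %duplicates = map  { ($_ => $signatures{$_}) }
                 grep { @{ $signatures{$_} } > 1 }
                 keys %signatures;

print join(', ', sort keys %duplicates), "\n";    # prints "1237-0"
```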

delete_dup_signatures

Deletes the duplicate signatures from both sources and returns two hash refs containing the deleted signatures as keys and array refs of record indices as values. The duplicate signatures are computed by calling the duplicate_signatures method.

widow_signatures

Returns two hash refs. Each contains, as keys, the signatures present in one data source but missing from the other and, as values, array refs of record indices.
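
As a hedged illustration of what a widow is, the sketch below compares two made-up signature indexes and keeps the entries of each that have no counterpart in the other. The helper name `widows` is hypothetical and the code does not use the module.

```perl
use strict;
use warnings;

# Made-up signature indexes for the two sources.
my %sigs_1 = ('1234-0' => [0], '1235-0' => [2]);
my %sigs_2 = ('1234-0' => [0], '1235-0' => [2], '1239-0' => [4]);

# Hypothetical helper: entries of $mine whose signature is absent from $other.
sub widows {
    my ($mine, $other) = @_;
    my %widows = map  { ($_ => $mine->{$_}) }
                 grep { !exists $other->{$_} }
                 keys %$mine;
    return \%widows;
}

my $widows_1 = widows(\%sigs_1, \%sigs_2);    # nothing missing from source 2
my $widows_2 = widows(\%sigs_2, \%sigs_1);    # '1239-0' only exists in source 2
```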

delete_wid_signatures

Deletes the widow signatures from both sources and returns two hash refs containing the deleted signatures as keys and array refs of record indices as values. The widow signatures are computed by calling the widow_signatures method.

reconciliate

Returns a list of array refs, one per difference found. Each array contains, in order: the signature, an array ref holding the record indices in the two sources, a reference to the applied rule, and a string describing the difference as returned by the comparison function (which may be user defined).

For reconciliate to work properly, duplicate and widow signatures must be removed first.
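
The entries returned by reconciliate can be unpacked positionally. The sketch below runs on mocked data so that it is self-contained. The values are modelled on the example output in this document, the rule slot is left undefined, and the assumption that the record indices form a flat pair (one index per source) is a guess at the layout, not something confirmed here.

```perl
use strict;
use warnings;

# Mocked result in the documented order:
# [ signature, record indices, applied rule, difference message ].
my @diffs = (
    ['1236-0', [3, 3], undef, 'SRC1.ccy=[FRF] <> SRC2.ccy=[DEM]'],
    ['1235-0', [2, 2], undef, 'SRC1.amt=[122.45] <> SRC2.Amount=[122.47]'],
);

my @report;
for my $diff (@diffs) {
    my ($signature, $indices, $rule, $message) = @$diff;

    # Assumption: $indices holds one record index per source.
    push @report, sprintf '%s (records %s): %s',
        $signature, join('/', @$indices), $message;
}

print "$_\n" for @report;
```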

EXAMPLE

    #!/usr/local/bin/perl -w

    use lib qw(../lib);

    use Data::Table;

    use Data::Reconciliation;
    use Data::Reconciliation::Rule;

    my $file1 = new Data::Table
        ([['1234',  0,  '123,45', 'FRF'],
          ['1234',  1, '-123,45', 'FRF'],
          ['1235',  0,  '122,45', 'FRF'],
          ['1236',  0,  '121,50', 'FRF'],
          ['1237',  0,  '121,50', 'FRF'],
          ['1237',  0,  '50,121', 'CHF']],
         ['dealnb', 'leg', 'amt',     'ccy']);
    my $file2 = new Data::Table
        ([['1234-0',  123.45, 'FRF'],
          ['1234-1', -123.45, 'FRF'],
          ['1235-0',  122.47, 'FRF'],
          ['1236-0',  121.50, 'DEM'],
          ['1239-0',  50.121, 'CHF']],
         ['external-key', 'Amount',    'ccy']);

    my $rule = new Data::Reconciliation::Rule($file1, $file2);

    $rule->identification(['dealnb', 'leg'], sub{ join '-', @_ },
                          ['external-key'], undef);
    $rule->add_comparison(['amt'], sub {(my $v = shift) =~ tr/,/./; $v},
                          ['Amount'], undef,
                      undef);
    $rule->add_comparison(['ccy'], undef,
                          ['ccy'], undef,
                          undef);

    my $r = new Data::Reconciliation($file1,
                                     $file2,
                                     -rules => [$rule]);

    $r->build_signatures(0);

    my($dup_signs_from_1,
       $dup_signs_from_2) = $r->delete_dup_signatures;

    my($widow_signs_1,
       $widow_signs_2) = $r->delete_wid_signatures;

    print "The following signatures in Table1 leads to multiple entries :\n\t[",
        join('][', sort keys %$dup_signs_from_1), "]\n"
        if keys %$dup_signs_from_1;

    print "The following signatures in Table2 leads to multiple entries :\n\t[",
        join('][', sort keys %$dup_signs_from_2), "]\n"
        if keys %$dup_signs_from_2;

    print "The following entries in Table1 have no correspondant in Table 2 :\n\t[",
        join('][', sort keys %$widow_signs_1), "]\n"
        if keys %$widow_signs_1;

    print "The following entries in Table2 have no correspondant in Table 1 :\n\t[",
        join('][', sort keys %$widow_signs_2), "]\n"
        if keys %$widow_signs_2;

    my @diffs = $r->reconciliate(0);
    print "The following entries were found to be different :\n\t",
        join("\n\t", map {$_->[0] . ': ' .  $_->[3]} @diffs), "\n"
        if @diffs;

EXAMPLE OUTPUT

   The following signatures in Table1 leads to multiple entries :
        [1237-0]
   The following entries in Table2 have no correspondant in Table 1 :
        [1239-0]
   The following entries were found to be different :
        1236-0: SRC1.ccy=[FRF] <> SRC2.ccy=[DEM]
        1235-0: SRC1.amt=[122.45] <> SRC2.Amount=[122.47]

AUTHORS

Martial.Chateauvieux@sfs.siemens.de, O.Capdevielle@cadextan.fr

SEE ALSO

Data::Reconciliation::Rule, Data::Table