NAME

 File::MultipleDiff - Compare multiple files

VERSION

 Version 0.02

SYNOPSIS

 use File::MultipleDiff;
 multiple_file_diff ( <input_directory>
                    , <file_name_pattern>
                    , 'c|b'
                    , <max_digits_amount> );

DESCRIPTION

 Compares many files with each other.
 Writes the comparison results into a symmetric matrix whose number of
 rows and number of columns both equal the number of compared files.
 Every matrix element contains the number of differences between the
 corresponding pair of compared files.

 If a directory contains file1, ... file5 and only these files, then
 multiple_file_diff ('directory');   produces the following output:

 ---------------------------
       |  f   f   f   f   f 
       |  i   i   i   i   i 
       |  l   l   l   l   l 
       |  e   e   e   e   e 
       |  1   2   3   4   5 
 ---------------------------
 file1 |  0   1   2   0   3  
 file2 |  -   0   3   1   4  
 file3 |  -   -   0   2   5  
 file4 |  -   -   -   0   3  
 file5 |  -   -   -   -   0  
 
 In the example above file1 is identical to file4,
                      file2 has 4 differences from file5.
 The number of differences is the number of lines that differ between two files.

 Comparison of 2 objects is a commutative operation, that is, its result does not
 depend on the order of the compared objects.
 That means, if A is equal to B, then B is equal to A,
             if A is not equal to B, then B is not equal to A.

 This property makes the comparison matrix a symmetric one.
 The entries of a symmetric matrix are symmetric with respect to its main diagonal.
 See   http://en.wikipedia.org/wiki/Symmetric_matrix

 For performance's sake only one half of the matrix (the main diagonal and the
 elements above it) is filled in.
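
 The half-filled matrix can be sketched as follows. This is only an
 illustration, not the module's own code: the names count_differences and
 upper_triangle are hypothetical, the "files" are in-memory lists of lines,
 and the stand-in comparison simply counts lines that occur in only one of
 the two lists.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real comparison: counts the distinct
# lines that occur in only one of the two line lists.
sub count_differences {
    my ($lines_x, $lines_y) = @_;
    my (%in_x, %in_y);
    @in_x{@$lines_x} = ();
    @in_y{@$lines_y} = ();
    my $only_x = grep { !exists $in_y{$_} } keys %in_x;
    my $only_y = grep { !exists $in_x{$_} } keys %in_y;
    return $only_x + $only_y;
}

# Fill only the main diagonal and the cells above it;
# the cells below the diagonal stay undefined.
sub upper_triangle {
    my (@files) = @_;    # each element is an array ref of lines
    my @matrix;
    for my $row (0 .. $#files) {
        for my $col ($row .. $#files) {
            $matrix[$row][$col] =
                count_differences($files[$row], $files[$col]);
        }
    }
    return \@matrix;
}

my @files = ( [qw(a b c)], [qw(a b d)], [qw(a b c)] );
my $matrix = upper_triangle(@files);

# Print the matrix, showing '-' for cells that were never computed.
for my $row (0 .. $#files) {
    my @cells = map { defined $matrix->[$row][$_] ? $matrix->[$row][$_] : '-' }
                0 .. $#files;
    print join('  ', @cells), "\n";
}
```

 For n files this evaluates n*(n+1)/2 pairs instead of n*n, which is where
 the saving comes from.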

 Example:
 fileA        fileB
 ------------------
 line1        line1
 line2        line3
 line3        line4
 line4        line5

 The Perl module Algorithm::Diff is used for file comparison.
 This module minimizes the number of differences, and for the example above
 it finds 2 (not 3) differences between these files.

 fileA        fileB
 ------------------
 line1        line1
 line2                 1st difference
 line3        line3
 line4        line4
              line5    2nd difference

 The "secret" of this minimization is the longest common subsequence (LCS) method,
 which is implemented in that module.
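
 The effect of the LCS method on the fileA/fileB example can be reproduced
 with a small sketch. It does not use Algorithm::Diff and is not necessarily
 how this module counts; it assumes a textbook dynamic-programming LCS and a
 hypothetical counting rule in which every line outside the LCS is one
 difference. All subroutine names are made up for illustration.

```perl
use strict;
use warnings;

# Length of the longest common subsequence of two line lists,
# computed with the classic dynamic-programming table.
sub lcs_length {
    my ($lines_x, $lines_y) = @_;
    my @table = map { [ (0) x (@$lines_y + 1) ] } 0 .. @$lines_x;
    for my $i (1 .. @$lines_x) {
        for my $j (1 .. @$lines_y) {
            if ($lines_x->[$i - 1] eq $lines_y->[$j - 1]) {
                $table[$i][$j] = $table[$i - 1][$j - 1] + 1;
            }
            else {
                my ($up, $left) = ($table[$i - 1][$j], $table[$i][$j - 1]);
                $table[$i][$j] = $up > $left ? $up : $left;
            }
        }
    }
    return $table[-1][-1];
}

# Assumed counting rule: every line outside the LCS is one difference.
sub lcs_differences {
    my ($lines_x, $lines_y) = @_;
    my $common = lcs_length($lines_x, $lines_y);
    return @$lines_x + @$lines_y - 2 * $common;
}

# Naive alternative: compare line N of one file with line N of the other.
sub positional_differences {
    my ($lines_x, $lines_y) = @_;
    my $last = @$lines_x > @$lines_y ? $#$lines_x : $#$lines_y;
    my $count = grep {
           !defined $lines_x->[$_]
        || !defined $lines_y->[$_]
        || $lines_x->[$_] ne $lines_y->[$_]
    } 0 .. $last;
    return $count;
}

my @file_a = qw(line1 line2 line3 line4);
my @file_b = qw(line1 line3 line4 line5);

print "positional: ", positional_differences(\@file_a, \@file_b), "\n";  # 3
print "LCS-based:  ", lcs_differences(\@file_a, \@file_b), "\n";         # 2
```

 The common subsequence here is line1, line3, line4, so only line2 (missing
 from fileB) and line5 (missing from fileA) remain as differences.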

Remark for the more curious

 Have you noticed the cheat above?
 The number of differences between 2 files is, strictly speaking, not commutative
 if Algorithm::Diff is used.
 Nevertheless I've decided to create a triangular matrix, as if the full matrix
 were indeed symmetric.
 This is acceptable, because this module is meant as a "chaosmeter".
 Assume your expectation is that some kind of configuration file must be
 identical on many computers and you want to check this.
 Hopefully most of the files are identical, but some of them differ.
 Zeroes in the matrix mean identical files, and the identity check is a commutative
 operation. Non-zero matrix elements mean divergence of file contents and
 a level of chaos. The larger the matrix element, the larger the distance between 2 files.
 The conversion made by Algorithm::Diff is similar to a metric, or distance
 function, known from mathematics.
 A metric that lacks symmetry is known as a quasimetric.
 Quote from http://en.wikipedia.org/wiki/Metric_(mathematics)#Quasimetrics
 "Quasimetrics are common in real life. ...
 Example is a taxicab geometry topology having one-way streets, where a path from
 point A to point B comprises a different set of streets than a path from B to A."

EXPORT

 multiple_file_diff

SUBROUTINES

multiple_file_diff

 multiple_file_diff (
     <input_directory>        # Directory that contains all files to compare;

   , <file_name_pattern>      # Regular expression, optional parameter.
                              # Default: all files in the input directory;
                             
   , 'c|b'                    # Output mode of the comparison, optional.
                              # c - colour, b - black/white output (b is default).
                              # Colour mode requires a terminal that
                              # supports ANSI escape sequences.
                              # More about this:
                              # http://search.cpan.org/~rra/Term-ANSIColor-4.02/ANSIColor.pm ;

   , <max_digits_amount> );   # Max number of digits in the numbers of differences.
                              # Optional parameter, default value is 2.
                              # This parameter expands automatically and supports
                              # up to 9999 differences.
                              # You can ignore the last parameter.

 Only the 1st parameter of this subroutine must be specified.
 Undefined or empty further parameters are replaced by default values.

AUTHOR

Mart E. Rivilis, <rivilism@cpan.org>

BUGS

 Please report any bugs or feature requests to bug-file-multiplediff@rt.cpan.org,
 or through the web interface at
 http://rt.cpan.org/NoAuth/ReportBug.html?Queue=File-MultipleDiff.
 I will be notified, and then you'll automatically be notified of progress on your
 bug as I make changes.

SUPPORT

 You can find documentation for this module with the perldoc command.
    perldoc File::MultipleDiff

 You can also look for information at:
  • RT: CPAN's request tracker (report bugs here)

     http://rt.cpan.org/NoAuth/Bugs.html?Dist=File-MultipleDiff
  • AnnoCPAN: Annotated CPAN documentation

     http://annocpan.org/dist/File-MultipleDiff
  • CPAN Ratings

     http://cpanratings.perl.org/d/File-MultipleDiff
  • Search CPAN

     http://search.cpan.org/dist/File-MultipleDiff/

LICENSE AND COPYRIGHT

 Copyright 2013 Mart E. Rivilis.
 This program is free software; you can redistribute it and/or modify it
 under the terms of the Artistic License (2.0).