NAME

Test::Files - A Test::Builder based module to ease testing with files and dirs.

In general, the following can be tested:

  • If the contents of the file being tested match the expected pattern.

  • If the file being tested is identical to the expected file in regard to contents, or size, or existence. If necessary, some parts of the contents can be excluded from the comparison.

  • If the directory being tested contains all expected files.

  • If the files in the directory being tested are identical to the files in the reference directory in regard to contents, or size, or existence. If necessary, some files as well as some parts of contents can be excluded from the comparison.

  • If all files in the directory being tested fulfill certain requirements.

  • If the archive (container) being tested is logically identical to the reference archive (container). If necessary, some members of archives, some parts of their contents, and some metadata can be excluded from the comparison.

SYNOPSIS

All examples listed below can be found and executed using xt/synopsis.t located on GitHub.

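use Archive::Zip qw( :ERROR_CODES );  # provides AZ_OK used in the archive example
use Path::Tiny qw( path );
use Test2::V0;                        # provides plan() and ok()
use Test::Files;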

my $got_file       = path( 'path' )->child( qw( got file ) );
my $reference_file = path( 'path' )->child( qw( reference file ) );
my $got_dir        = path( 'path' )->child( qw( got dir ) );
my $reference_dir  = path( 'path' )->child( qw( reference dir with some stuff ) );
my @file_list      = qw( expected file );
my ( $content_check, $expected, $filter, $options );

plan( 24 );

# Simply compares file contents to a string:
$expected = "contents\nof file";
file_ok( $got_file, $expected, 'got file has expected contents' );

# Two identical variants comparing file contents
# to a string ignoring differences in time stamps:
$expected = "filtered contents\nof file\ncreated at 00:00:00";
$filter   = sub {
  shift =~ s{ \b (?: [01] \d | 2 [0-3] ) : (?: [0-5] \d ) : (?: [0-5] \d ) \b }
            {00:00:00}grx
};
$options  = { FILTER => $filter };
file_ok       (
  $got_file, $expected, $options,
  "'$got_file' has contents expected after filtering"
);
file_filter_ok(
  $got_file, $expected, $filter,
  "'$got_file' has contents expected after filtering"
);

# Simply compares two file contents:
compare_ok( $got_file, $reference_file, 'files are the same' );

# Two identical variants comparing contents of two files
# ignoring differences in time stamps:
$filter  = sub {
  shift =~ s{ \b (?: [01] \d | 2 [0-3] ) : (?: [0-5] \d ) : (?: [0-5] \d ) \b }
            {00:00:00}grx
};
$options = { FILTER => $filter };
compare_ok       (
  $got_file, $reference_file, $options, 'files are almost the same'
);
compare_filter_ok(
  $got_file, $reference_file, $filter,  'files are almost the same'
);

# Verifies if both got file and reference file exist:
$options = { EXISTENCE_ONLY => 1 };
compare_ok( $got_file, $reference_file, $options, 'both files exist' );

# Verifies if got file and reference file have identical size:
$options = { SIZE_ONLY => 1 };
compare_ok(
  $got_file, $reference_file, $options, 'both files have identical size'
);

# Verifies if the directory has all expected files (not recursively!):
$expected = [ qw( files got_dir must contain ) ];
dir_contains_ok( $got_dir, $expected, 'directory has all files in list' );

# Two identical variants doing the same verification as before,
# but additionally verifying if the directory has nothing
# but the expected files (not recursively!):
$options = { SYMMETRIC => 1 };
dir_contains_ok     (
  $got_dir, $expected, $options, 'directory has exactly the files in the list'
);
dir_only_contains_ok(
  $got_dir, $expected,           'directory has exactly the files in the list'
);

# The same as before, but recursive:
$options = { RECURSIVE => 1, SYMMETRIC => 1 };
dir_contains_ok(
  $got_dir, $expected, $options,
  'directory and its subdirectories have exactly the files in the list'
);

# The same as before, but ignoring files
# whose names do not match the required pattern (file "must" will be skipped):
$options = { NAME_PATTERN => '^[cfg]', RECURSIVE => 1, SYMMETRIC => 1 };
dir_contains_ok(
  $got_dir, $expected, $options,
  'directory and its subdirectories ' .
  "have exactly the files in the list except for file 'must'"
);

# Compares two directories by comparing file contents (not recursively!):
compare_dirs_ok(
  $got_dir, $reference_dir,
  "all files from '$got_dir' are the same in '$reference_dir' " .
  '(same names, same contents), subdirs are skipped'
);

# The same as before, but subdirectories are considered, too:
$options = { RECURSIVE => 1 };
compare_dirs_ok(
  $got_dir, $reference_dir, $options,
  "all files from '$got_dir' and its subdirs are the same in '$reference_dir'"
);

# The same as before, but only file sizes are compared:
$options = { RECURSIVE => 1, SIZE_ONLY => 1 };
compare_dirs_ok(
  $got_dir, $reference_dir, $options,
  "all files from '$got_dir' and its subdirs have same sizes in '$reference_dir'"
);

# The same as before, but only file existence is verified:
$options = { EXISTENCE_ONLY => 1, RECURSIVE => 1 };
compare_dirs_ok(
  $got_dir, $reference_dir, $options,
  "all files from '$got_dir' and its subdirs exist in '$reference_dir'"
);

# The same as before, but only files with base names starting with 'A' are considered:
$options = { EXISTENCE_ONLY => 1, NAME_PATTERN => '^A', RECURSIVE => 1 };
compare_dirs_ok(
  $got_dir, $reference_dir, $options,
  "all files from '$got_dir' and its subdirs " .
  "with base names starting with 'A' exist in '$reference_dir'"
);

# The same as before, but the symmetric verification is requested:
$options = {
  EXISTENCE_ONLY => 1,
  NAME_PATTERN   => '^A',
  RECURSIVE      => 1,
  SYMMETRIC      => 1,
};
compare_dirs_ok(
  $got_dir, $reference_dir, $options,
  "all files from '$got_dir' and its subdirs with base names " .
  "starting with 'A' exist in '$reference_dir' and vice versa"
);

# Two identical variants of comparison of two directories by file contents,
# whereas these contents are first filtered
# so that time stamps in form of 'HH:MM:SS' are replaced by '00:00:00'
# like in examples for file_filter_ok and compare_filter_ok:
$filter  = sub {
  shift =~ s{ \b (?: [01] \d | 2 [0-3] ) : (?: [0-5] \d ) : (?: [0-5] \d ) \b }
            {00:00:00}grx
};
$options = { FILTER => $filter };
compare_dirs_ok(
  $got_dir, $reference_dir, $options,
  "all files from '$got_dir' are the same in '$reference_dir', " .
  'subdirs are skipped, differences of time stamps ignored'
);
compare_dirs_filter_ok(
  $got_dir, $reference_dir, $filter,
  "all files from '$got_dir' are the same in '$reference_dir', " .
  'subdirs are skipped, differences of time stamps ignored'
);

# Verifies if all plain files in directory and its subdirectories
# contain the word 'good' (take into consideration the -f test below
# excluding special files from comparison!):
$content_check = sub {
  my ( $file ) = @_;
  ! -f $file or path( $file )->slurp =~ / \b good \b /x;
};
$options       = { RECURSIVE => 1 };
find_ok(
  $got_dir, $content_check, $options,
  "all files from '$got_dir' and subdirectories contain the word 'good'"
);

# Compares PKZIP archives considering both global and file comments.
# Both archives contain the same members in different order:
my $extract = sub {
  my ( $file ) = @_;
  my $zip = Archive::Zip->new();
  die( "Cannot read '$file'" ) if $zip->read( $file ) != AZ_OK;
  die( "Cannot extract from '$file'" ) if $zip->extractTree != AZ_OK;
};
my $meta_data = sub {
  my ( $file ) = @_;
  my $zip = Archive::Zip->new();
  die( "Cannot read '$file'" ) if $zip->read( $file ) != AZ_OK;
  my %meta_data = ( '' => $zip->zipfileComment );
  $meta_data{ $_->fileName } = $_->fileComment foreach $zip->members;
  return \%meta_data;
};
my $got_compressed_content       = path( "$got_file.zip"       )->slurp;
my $reference_compressed_content = path( "$reference_file.zip" )->slurp;
ok(
  $got_compressed_content ne $reference_compressed_content,
  "'$got_file.zip' and '$reference_file.zip' are physically different, but"
);
compare_archives_ok(
  "$got_file.zip", "$reference_file.zip", { EXTRACT => $extract, META_DATA => $meta_data },
  "'$got_file.zip' and '$reference_file.zip' are logically identical"
);

DESCRIPTION

This module is like Test2::V0 or Test::Expander; in fact, you should load one of them before using it. It supports comparison of files and directories in different ways.

Any file or directory passed to the functions of this module can be either a string or a Path::Tiny object.

Although the test name, i.e. the last parameter of every function, is optional, you should name each test for better maintainability.

You should follow the lead of the "SYNOPSIS" examples and use Path::Tiny or, if you prefer, File::Spec. This makes it much more likely that your tests will pass on a different operating system.

All of the contents comparison routines provide diff diagnostic output when they report failure. The diff output style can be changed using the option STYLE (see below).

The filter function receives each line of each file. It may perform any necessary transformations (like excising dates), and it must then return the line in its (possibly) transformed state. For example, the first filter written by Phil Crow, the creator of this module, was:

sub chop_dates {
  my $line = shift;
  $line =~ s/\d{4}(.\d\d){5}//g;
  return $line;
}

This removes all strings like 2003.10.14.14.17.37. Everything else is left unchanged, so tests that had been failing only because of such dates started passing as they should. If you want to exclude a line from consideration, return an empty string or undef.
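The effect of such a filter can be tried out without any test framework. The following is a minimal sketch using core Perl only; the helper apply_filter and the sample text are hypothetical, and the loop merely imitates line-by-line filtering as described above, it is not the module's actual implementation.

```perl
use strict;
use warnings;

# Hypothetical filter: drops comment lines and masks HH:MM:SS time stamps.
my $filter = sub {
  my ( $line ) = @_;
  return undef if $line =~ / ^ \s* \# /x;                     # exclude the line
  return $line =~ s{ \b \d\d : \d\d : \d\d \b }{00:00:00}grx; # mask time stamps
};

# Imitation of line-by-line filtering (an assumption, NOT the module's code):
# the text is split after each line break, every line is passed to the filter,
# and an undefined return value is treated as an empty string.
sub apply_filter {
  my ( $code, $text ) = @_;
  return join( '', map { $code->( $_ ) // '' } split( / (?<= \n ) /x, $text ) );
}

my $got = apply_filter( $filter, "# generated\nstarted at 12:34:56\ndone\n" );
print $got;
```

Running a filter over a small sample like this before handing it to a test function makes it easier to see which lines survive and in what form.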

FUNCTIONS

file_ok

There are two forms of calls:

The generic form.

file_ok( $got_file, $expected_string, \%options, $test_name )

The short form, which is also backward compatible.

file_ok( $got_file, $expected_string, $test_name )

Compares the contents of a file $got_file to a string $expected_string.

In the generic form, if the parameter \%options is passed and contains the key FILTER, file_ok provides the same functionality as file_filter_ok.

Supported options:

FILTER

Code reference providing filtering of file contents before comparison. The only expected parameter is the current line from the file contents; the return value replaces this line. In addition, the special variable $. representing the number of the current line in the file can be used. If the return value is undefined, an empty string is used instead. Line breaks are neither removed nor added after the execution.

Defaults to undef, i.e. no filtering is applied.

All options supported by Text::Diff except FILENAME_A and FILENAME_B.

The most useful of them seems to be STYLE defining the style of output for content differences. Defaults to Unified.
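As an illustration, the generic form can combine FILTER with a Text::Diff option. This is only a sketch under assumptions: the temporary file, its contents, and the filter $mask_build are made up for the example, and Test2::V0 is used as the accompanying test framework.

```perl
use strict;
use warnings;

use Path::Tiny qw( path );
use Test2::V0;
use Test::Files;

# Temporary file standing in for the output of the code under test.
my $got_file = Path::Tiny->tempfile;
$got_file->spew( "build 17\nstatus ok\n" );

# Hypothetical filter masking the volatile build number.
my $mask_build = sub { shift =~ s{ \b build \s+ \d+ }{build N}grx };

# STYLE is passed through to Text::Diff and only matters if the test fails.
file_ok(
  $got_file, "build N\nstatus ok\n",
  { FILTER => $mask_build, STYLE => 'Context' },
  'file matches after masking the build number'
);

done_testing;
```

Note that only the file contents pass through the filter, so the expected string is written in its already masked form.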

file_filter_ok

There is only one form of call, namely file_filter_ok( $got_file, $expected_string, \&filter_func, $test_name ).

Works like file_ok with the option FILTER, i.e. compares the contents of a file to a string, but first filters the file using &filter_func. The expected string itself is not filtered; apply the filter to it beforehand if necessary.

This function is deprecated and stays for backward compatibility reasons only.

compare_ok

There are two forms of calls:

The generic form.

compare_ok( $got_file, $reference_file, \%options, $test_name )

The short form, which is also backward compatible.

compare_ok( $got_file, $reference_file, $test_name )

Compares two files.

In the generic form, if the parameter \%options is passed and contains the key FILTER, compare_ok provides the same functionality as compare_filter_ok.

Supported options:

EXISTENCE_ONLY

Boolean. If set to true, only the existence of both $got_file and $reference_file is verified.

Defaults to false.

FILTER

Code reference providing filtering of file contents before comparison, applied to both $got_file and $reference_file. The only expected parameter is the current line from the file contents; the return value replaces this line. In addition, the special variable $. representing the number of the current line in the file can be used. If the return value is undefined, an empty string is used instead. Line breaks are neither removed nor added after the execution.

Ignored if either EXISTENCE_ONLY or SIZE_ONLY is set to true.

Defaults to undef, i.e. no filtering is applied.

SIZE_ONLY

Boolean. If set to true and the option EXISTENCE_ONLY is not set to true, $got_file and $reference_file are compared by size only.

Defaults to false.

All options supported by Text::Diff except FILENAME_A and FILENAME_B.

The most useful of them seems to be STYLE defining the style of output for content differences. Defaults to Unified.
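The difference between the comparison modes can be seen with two files that have the same size but different contents. The sketch below assumes Test2::V0 as test framework; the file names and contents are invented for the example.

```perl
use strict;
use warnings;

use Path::Tiny qw( path );
use Test2::V0;
use Test::Files;

# Two temporary files of equal size (4 bytes each) with different contents.
my $dir            = Path::Tiny->tempdir;
my $got_file       = $dir->child( 'got.txt' );
my $reference_file = $dir->child( 'reference.txt' );
$got_file->spew( "abc\n" );
$reference_file->spew( "xyz\n" );

# Only the existence of both files is verified.
compare_ok(
  $got_file, $reference_file, { EXISTENCE_ONLY => 1 }, 'both files exist'
);

# Only the sizes are compared, the contents are not read.
compare_ok(
  $got_file, $reference_file, { SIZE_ONLY => 1 }, 'both files have the same size'
);

done_testing;
```

A call without options would compare the contents and is therefore expected to fail for these two files.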

compare_filter_ok

There is only one form of call, namely compare_filter_ok( $got_file, $reference_file, \&filter_func, $test_name ).

Works like compare_ok with the option FILTER, i.e. compares the contents of two files, but sends each line through the filter &filter_func so that things that shouldn't count against success can be stripped.

This function is deprecated and stays for backward compatibility reasons only.

dir_contains_ok

There are two forms of calls:

The generic form.

dir_contains_ok( $got_dir, \@file_list, \%options, $test_name )

The short form, which is also backward compatible.

dir_contains_ok( $got_dir, \@file_list, $test_name )

Verifies that the directory $got_dir contains the files listed in @file_list. If $got_dir is a symlink, this is accepted, but symlinks therein are not followed. Subdirectories themselves are not involved in the verification, but files located therein are considered if the recursive approach is requested (see the option RECURSIVE below). Special files like named pipes are involved in the verification only if the sole file existence is required (see the option EXISTENCE_ONLY below); otherwise they are skipped and reported as an error.

In the generic form, if the parameter \%options is passed and contains the key SYMMETRIC set to true, dir_contains_ok provides the same functionality as dir_only_contains_ok.

Supported options:

NAME_PATTERN

String containing RegEx. Files with base names not matching this RegEx will be skipped.

Defaults to the dot sign (.) i.e. no file will be skipped.

RECURSIVE

Boolean. If set to true, subdirectories of $got_dir will be checked, too.

Defaults to false.

SYMMETRIC

Boolean. If set to true, additionally verifies if all files from $got_dir are listed in @file_list.

Defaults to false.
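How NAME_PATTERN and SYMMETRIC interact may be easier to see on a small directory. The following sketch assumes Test2::V0 as test framework; the directory layout and file names are hypothetical, and the expected outcome follows from the option descriptions above.

```perl
use strict;
use warnings;

use Path::Tiny qw( path );
use Test2::V0;
use Test::Files;

# Temporary directory with two text files and one log file.
my $got_dir = Path::Tiny->tempdir;
$got_dir->child( $_ )->spew( "data\n" ) foreach qw( a.txt b.txt notes.log );

my $expected = [ qw( a.txt b.txt ) ];

# Non-symmetric check: additional files such as notes.log do not matter.
dir_contains_ok( $got_dir, $expected, 'directory has all listed files' );

# Symmetric check restricted to *.txt: notes.log does not match
# the pattern and is therefore skipped.
dir_contains_ok(
  $got_dir, $expected, { NAME_PATTERN => '\.txt$', SYMMETRIC => 1 },
  'directory has exactly the listed .txt files'
);

done_testing;
```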

dir_only_contains_ok

There is only one form of call, namely dir_only_contains_ok( $got_dir, \@file_list, $test_name ).

Works like dir_contains_ok with the option SYMMETRIC set to true, i.e. checks the directory, without following symlinks therein, to ensure that the listed files are present and that they are the only ones present.

This function is deprecated and stays for backward compatibility reasons only.

compare_dirs_ok

There are two forms of calls:

The generic form.

compare_dirs_ok( $got_dir, $reference_dir, \%options, $test_name )

The short form, which is also backward compatible.

compare_dirs_ok( $got_dir, $reference_dir, $test_name )

Compares all files in the directories $got_dir and $reference_dir reporting differences. If $got_dir or $reference_dir is a symlink, this will be accepted, but symlinks therein are not followed.

In the generic form, if the parameter \%options is passed and contains the key FILTER, compare_dirs_ok provides the same functionality as compare_dirs_filter_ok.

Supported options:

EXISTENCE_ONLY

Boolean. If set to true, only checks if every file from $reference_dir is found in $got_dir.

Defaults to false.

FILTER

Code reference providing filtering of file contents before comparison, applied to files from both $got_dir and $reference_dir. The only expected parameter is the current line from the file contents; the return value replaces this line. In addition, the special variable $. representing the number of the current line in the file can be used. If the return value is undefined, an empty string is used instead. Line breaks are neither removed nor added after the execution.

Ignored if either EXISTENCE_ONLY or SIZE_ONLY is set to true.

Defaults to undef, i.e. no filtering is applied.

NAME_PATTERN

String containing RegEx. Files with base names not matching this RegEx will be skipped both in $got_dir and $reference_dir.

Defaults to the dot sign (.) i.e. no file will be skipped.

RECURSIVE

Boolean. If set to true, subdirectories of both $got_dir and $reference_dir will be checked, too.

Defaults to false.

SIZE_ONLY

Boolean. If set to true and the option EXISTENCE_ONLY is not set to true, files from $got_dir and $reference_dir are compared by size only.

Defaults to false.

SYMMETRIC

Boolean. If set to true, additionally verifies if all files from $got_dir exist in $reference_dir, too.

Defaults to false.

All options supported by Text::Diff except FILENAME_A and FILENAME_B.

The most useful of them seems to be STYLE defining the style of output for content differences. Defaults to Unified.

compare_dirs_filter_ok

There is only one form of call, namely compare_dirs_filter_ok( $got_dir, $reference_dir, \&filter_func, $test_name ).

Works like compare_dirs_ok with the option FILTER, i.e. calls the filter function &filter_func on each line of every file, allowing you to exclude or alter some text to avoid spurious failures (like timestamp disagreements).

This function is deprecated and stays for backward compatibility reasons only.

find_ok

The signature is find_ok( $got_dir, \&content_check_func, \%options, $test_name ).

Verifies that the condition &content_check_func is true for all files in the directory $got_dir. The code reference &content_check_func, which must return a boolean, is called for every type of file except directories, i.e. also for symlinks, devices, etc. Its only parameter is the fully qualified file name. If you want to consider plain files only, you must apply the file test operator -f to the parameter as shown in "SYNOPSIS".

Supported options:

RECURSIVE

Boolean. If set to true, subdirectories of $got_dir will be checked, too.

Defaults to false.
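A typical use is a property that every plain file must fulfill. The sketch below, assuming Test2::V0 as test framework, verifies that no plain file is empty; the directory layout and the check $not_empty are invented for the example.

```perl
use strict;
use warnings;

use Path::Tiny qw( path );
use Test2::V0;
use Test::Files;

# Temporary directory with one file on top level and one in a subdirectory.
my $got_dir = Path::Tiny->tempdir;
my $sub_dir = $got_dir->child( 'sub' );
mkdir( "$sub_dir" ) or die( "Cannot create '$sub_dir': $!" );
$got_dir->child( 'top.txt' )->spew( "x\n" );
$sub_dir->child( 'nested.txt' )->spew( "y\n" );

# Hypothetical check: anything that is not a plain file is accepted (-f),
# plain files must have a non-zero size (-s).
my $not_empty = sub {
  my ( $file ) = @_;
  ! -f $file or -s $file;
};

find_ok(
  $got_dir, $not_empty, { RECURSIVE => 1 },
  "no plain file in '$got_dir' and its subdirectories is empty"
);

done_testing;
```

Without RECURSIVE only top.txt would be passed to the check.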

compare_archives_ok

The signature is compare_archives_ok( $got_archive, $reference_archive, \%options, $test_name ).

Verifies if the archives (containers) $got_archive and $reference_archive are logically identical. The term "logically identical" means that these files might be physically different e.g. because their members are stored in different order, or because some members are marked as deleted, but the metadata relevant for the current test case and the members are identical.

Which metadata and which members must be compared can be controlled using \%options.

The comparison itself begins with the extraction and comparison of metadata; if they are not identical, no further comparison is performed and the test fails. If the metadata comparison succeeds, the members of $got_archive and $reference_archive are extracted into temporary directories and compared in the same manner as compare_dirs_ok does.

Supported options:

All options supported by compare_dirs_ok.

EXTRACT

Code reference. Extracts members from the archive into the current directory. The only expected parameter is the archive file name. At the time of extraction, the current directory is a temporary directory that is removed after the test.

The return value is ignored.

Defaults to empty function sub {}.

META_DATA

Code reference. Returns metadata e.g. comments from a PKZIP archive. The only expected parameter is the archive file name.

Defaults to empty function sub {}.

SEE ALSO

Consult Test::Simple, Test2::V0, and Test::Builder for more testing help. This module really just adds functions to what Test2::V0 does. As recommended by the author of Test::More and Test2::V0, the latter module should be preferred; that is why Test::More is not listed in "SYNOPSIS".

BUGS

Please report any bugs or feature requests through the web interface at https://github.com/jsf116/Test-Files/issues.

CAVEATS

Although this module can also cope with binary files and confirm their equality, a proper representation of the comparison results is not guaranteed if they differ.

AUTHOR

Phil Crow, <philcrow2000@yahoo.com>

Jurij Fajnberg, <fajnbergj@gmail.com>

COPYRIGHT AND LICENSE

Copyright 2003-2007 by Phil Crow

Copyright 2020-2024 by Jurij Fajnberg

This module is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.