Devel::Git::MultiBisect - Study build and test output over a range of git commits
You will typically construct an object of a class which is a child of Devel::Git::MultiBisect, such as Devel::Git::MultiBisect::AllCommits, Devel::Git::MultiBisect::Transitions or Devel::Git::MultiBisect::BuildTransitions. All methods documented in this parent package may be called from any of these child classes.
    use Devel::Git::MultiBisect::AllCommits;
    $self = Devel::Git::MultiBisect::AllCommits->new(\%parameters);
... or
    use Devel::Git::MultiBisect::Transitions;
    $self = Devel::Git::MultiBisect::Transitions->new(\%parameters);

    use Devel::Git::MultiBisect::BuildTransitions;
    $self = Devel::Git::MultiBisect::BuildTransitions->new(\%parameters);
... and then:
    $commit_range = $self->get_commits_range();
    $full_targets = $self->set_targets(\@target_args);
... or, under certain circumstances:
    $full_targets = $self->set_outside_targets(\@target_args);
    $outputs = $self->run_test_files_on_one_commit($commit_range->[0]);
... followed by methods specific to the child class.
... and then perhaps also:
$timings = $self->get_timings();
Given a Perl library or application kept in git for version control, it is often useful to be able to compare the output collected from running one or more test files over a range of git commits. If that range is sufficiently large, a test may fail in more than one way over that range.
If that is the case, then simply asking, "When did this file start to fail?" -- a question which git bisect is designed to answer -- is insufficient. In order to identify more than one point of failure, we may need to (a) capture the test output for each commit; or, (b) capture the test output only at those commits where the output changed. The output of a run of a test file may change for a variety of reasons: test failures, segfaults, changes in the number or content of tests, etc.
Devel::Git::MultiBisect provides methods to achieve that objective. Its child classes, Devel::Git::MultiBisect::AllCommits and Devel::Git::MultiBisect::Transitions, provide different flavors of that functionality for objectives (a) and (b), respectively. Please refer to their documentation for further discussion.
Child class Devel::Git::MultiBisect::BuildTransitions focuses on failures during the build process rather than during testing. It can handle three different types of problems which arise when you run make to build a Perl library or to build Perl itself:
Exceptions detected by the C-compiler
Warnings emitted by the C-compiler
Warnings emitted by perl or other languages invoked during make
See that class's documentation for further details.
commit
A source code change set entered ("committed") to a git repository. Each commit is denoted by a SHA. In this library, whenever a commit is called for as the argument to a function, you can also use a git tag.
commit range
The range of sequential commits (determined by git log) requested for analysis.
target
A test file from the test suite of the application or library under study.
outside_target
A test file outside the test suite of the application or library under study.
test output
What is sent to STDOUT or STDERR as a result of calling a test program such as prove or t/harness on an individual target file. Currently we assume that all such test programs are written based on the Test Anything Protocol (TAP).
transitional commit
A commit at which the test output for a given target (or outside target) changes from that of the commit immediately preceding.
digest
A string holding the output of a cryptographic process run on test output which uniquely identifies that output. (Currently, we use the Digest::MD5::md5_hex algorithm.) We assume that if the digest of the test output does not change from one commit to the next, then the later commit is not a transitional commit.
Note: Before taking a digest on a particular test output, we exclude text such as timings which are highly likely to change from one run to the next and which would introduce spurious variability into the digest calculations.
multisection or multibisection
A series of configure-build-test process sequences at those commits within the commit range which are selected by a bisection algorithm.
Normally, when we bisect (via git bisect, Porting/bisect.pl or otherwise), we are seeking a single point where a Boolean result -- yes/no, true/false, pass/fail -- is returned. What the test run outputs to STDOUT or STDERR is a lesser concern.
In multisection we bisect repeatedly to determine all points where the output of the test command changes -- regardless of whether that change is a PASS, FAIL or whatever. We capture the output for later human or programmatic examination.
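The repeated bisection described above can be sketched roughly as follows. This is a simplified illustration, not the library's actual algorithm: the `@digests` list is a hypothetical stand-in for running the configure-build-test sequence at each commit, and `digest_at` and `transitions_between` are names invented for this example. The sketch assumes that if two commits have the same digest, no commit between them differs, so a change that appears and then reverts inside such a span would be missed.

```perl
use strict;
use warnings;

# Hypothetical stand-in for "configure, build, test, digest" at one commit:
# each commit's digest is simply looked up in an in-memory list.
my @digests = qw(aaa aaa aaa bbb bbb ccc ccc ccc ccc ddd);

my %seen;      # digests already computed, keyed by commit index
my $runs = 0;  # number of simulated test runs actually performed

sub digest_at {
    my ($idx) = @_;
    return $seen{$idx} if exists $seen{$idx};
    $runs++;
    return $seen{$idx} = $digests[$idx];
}

# Return the indices of all transitional commits after $lo, up to and
# including $hi, assuming that equal digests at both ends mean nothing
# changed in between.
sub transitions_between {
    my ($lo, $hi) = @_;
    return ()    if digest_at($lo) eq digest_at($hi);
    return ($hi) if $hi == $lo + 1;
    my $mid = int(($lo + $hi) / 2);
    return (transitions_between($lo, $mid), transitions_between($mid, $hi));
}

my @transitions = transitions_between(0, $#digests);
print "Transitional commits at indices: @transitions\n";
print "Simulated test runs: $runs of ", scalar(@digests), " commits\n";
```

With this sample data the transitional commits are at indices 3, 5 and 9. The savings over testing every commit grow with the length of the range and shrink as the number of transitions rises.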
new()
Purpose
Constructor.
Arguments
$self = Devel::Git::MultiBisect::AllCommits->new(\%params);
or
$self = Devel::Git::MultiBisect::Transitions->new(\%params);
$self = Devel::Git::MultiBisect::BuildTransitions->new(\%params);
Reference to a hash, typically the return value of Devel::Git::MultiBisect::Opts::process_options().
The hashref passed as argument must contain key-value pairs for gitdir and outputdir. new() tests for the existence of each of these directories.
Return Value
Object of Devel::Git::MultiBisect child class.
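As a rough illustration of the directory checks the constructor is described as performing, one could validate a parameter hash before passing it on. The helper `check_required_dirs` is hypothetical and not part of the library, and the temporary directories merely stand in for a real git checkout and output directory.

```perl
use strict;
use warnings;
use Carp qw(croak);
use File::Temp qw(tempdir);

# Hypothetical helper (not part of the library): confirm that the
# required keys are present and name existing directories.
sub check_required_dirs {
    my ($params) = @_;
    croak "Argument must be a hash reference" unless ref($params) eq 'HASH';
    for my $key (qw(gitdir outputdir)) {
        croak "Missing required key '$key'" unless defined $params->{$key};
        croak "Directory '$params->{$key}' for '$key' not found"
            unless -d $params->{$key};
    }
    return 1;
}

my %params = (
    gitdir    => tempdir(CLEANUP => 1),  # stand-in for a real git checkout
    outputdir => tempdir(CLEANUP => 1),
);
print "Parameters look usable\n" if check_required_dirs(\%params);
```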
get_commits_range()
Identify the SHA of each git commit identified by new().
$commit_range = $self->get_commits_range();
None; all data needed is already in the object.
Array reference, each element of which is a SHA.
set_targets()
Identify the test files which will be run at different points in the commits range. We shall assume that each such test file has existed with its name unchanged over the entire commit range. We further assume that each such test file resides in or under the top-level directory of the git checkout, i.e., that the file can be specified by its relative path from the top-level directory. (Should the latter assumption not be valid, use set_outside_targets().)
set_outside_targets()
    $target_args = [
        't/44_func_hashes_mult_unsorted.t',
        't/45_func_hashes_alt_dual_sorted.t',
    ];
    $full_targets = $self->set_targets($target_args);
Reference to an array holding the relative paths beneath the gitdir to the test files selected for examination.
Reference to an array holding hash references with these elements:
path
Absolute path to the test file selected for examination. The file is tested for its existence.
stub
String composed by taking an element in the array ref passed as argument and substituting underscores (_) for forward slash (/) and dot (.) characters. So,
t/44_func_hashes_mult_unsorted.t
... becomes:
t_44_func_hashes_mult_unsorted_t
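The substitution described above can be sketched in one line of Perl. The helper name `stub_for` is hypothetical and the library may compute the stub differently; this only reproduces the documented result.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a test file path into a filename-safe stub
# by replacing every forward slash and dot with an underscore.
sub stub_for {
    my ($path) = @_;
    (my $stub = $path) =~ tr{/.}{_};
    return $stub;
}

print stub_for('t/44_func_hashes_mult_unsorted.t'), "\n";
# prints: t_44_func_hashes_mult_unsorted_t
```

The same rule applied to an absolute path such as /tmp/gh-22159-class.t yields a stub with a leading underscore.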
Identify the test files which will be run at different points in the commits range. This method differs from set_targets() in that it assumes that the targeted test file sits outside the git repository in which the source code resides and, consequently, must be specified with an absolute path.
    $target_args = [
        '/tmp/gh-22159-class.t',
    ];
    $full_targets = $self->set_outside_targets($target_args);
Reference to an array holding the absolute paths to the test files selected for examination. NOTE: This method has not yet been tested with more than one file in $target_args.
For example, the stub for /tmp/gh-22159-class.t is _tmp_gh-22159-class_t.
run_test_files_on_one_commit()
Capture the output from running the selected test files at one specific git checkout.
$outputs = $self->run_test_files_on_one_commit("2a2e54a");
    $excluded_targets = [
        't/45_func_hashes_alt_dual_sorted.t',
    ];
    $outputs = $self->run_test_files_on_one_commit("2a2e54a", $excluded_targets);
String holding the SHA from a single commit in the repository. This string would typically be one of the elements in the array reference returned by $self->get_commits_range(). If no argument is provided, the method will default to using the first element in the array reference returned by $self->get_commits_range().
Reference to array of target test files to be excluded from a particular invocation of this method. Optional, but will die if argument is not an array reference.
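The handling of the optional second argument might look something like the following sketch. The helper `targets_to_run` is hypothetical, not the library's code. It only illustrates the documented behavior: the excluded list is optional, but anything other than an array reference is fatal.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical helper: drop excluded test files from the list of targets.
sub targets_to_run {
    my ($targets, $excluded) = @_;
    return @{$targets} unless defined $excluded;
    croak "Excluded targets must be an array reference"
        unless ref($excluded) eq 'ARRAY';
    my %skip = map { $_ => 1 } @{$excluded};
    return grep { !$skip{$_} } @{$targets};
}

my @targets = (
    't/44_func_hashes_mult_unsorted.t',
    't/45_func_hashes_alt_dual_sorted.t',
);
my @run = targets_to_run(\@targets, ['t/45_func_hashes_alt_dual_sorted.t']);
print "$_\n" for @run;
```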
Reference to an array, each element of which is a hash reference with the following elements:
commit
String holding the SHA from the commit passed as argument to this method (or the default described above).
commit_short
String holding the value of commit (above) truncated to the number of characters specified in the short element passed to the constructor; defaults to 7.
file_stub
String holding a rewritten version of the relative path beneath gitdir of the test file being run. In this relative path forward slash (/) and dot (.) characters are changed to underscores (_). So,
t/44_func_hashes_mult_unsorted.t
... becomes:
t_44_func_hashes_mult_unsorted_t
file
String holding the full path to the file holding the TAP output collected while running one test file at the given commit. The following example shows how that path is calculated. Given:
    output directory (outputdir)    => '/tmp/DQBuT_SRAY/'
    SHA (commit)                    => '2a2e54af709f17cc6186b42840549c46478b6467'
    shortened SHA (commit_short)    => '2a2e54a'
    test file (target->[$i])        => 't/44_func_hashes_mult_unsorted.t'
... the file is placed in the directory specified by outputdir. We then join commit_short (the shortened SHA), file_stub (the rewritten relative path) and the strings output and txt with a dot to yield this value for the file element:
2a2e54a.t_44_func_hashes_mult_unsorted_t.output.txt
md5_hex
String holding the return value of Devel::Git::MultiBisect::Auxiliary::hexdigest_one_file() run with the file designated by the file element as an argument. (More precisely, the file as modified by Devel::Git::MultiBisect::Auxiliary::clean_outputfile().)
Example:
    [
      {
        commit => "2a2e54af709f17cc6186b42840549c46478b6467",
        commit_short => "2a2e54a",
        file => "/tmp/1mVnyd59ee/2a2e54a.t_44_func_hashes_mult_unsorted_t.output.txt",
        file_stub => "t_44_func_hashes_mult_unsorted_t",
        md5_hex => "31b7c93474e15a16d702da31989ab565",
      },
      {
        commit => "2a2e54af709f17cc6186b42840549c46478b6467",
        commit_short => "2a2e54a",
        file => "/tmp/1mVnyd59ee/2a2e54a.t_45_func_hashes_alt_dual_sorted_t.output.txt",
        file_stub => "t_45_func_hashes_alt_dual_sorted_t",
        md5_hex => "6ee767b9d2838e4bbe83be0749b841c1",
      },
    ]
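To make the relationship between these elements concrete, the following sketch derives commit_short, file_stub and file from the inputs used in the path example above. It is an illustration under stated assumptions rather than the library's own code: the variable names are chosen for this example, and the use of File::Spec to join the directory and file name is an assumption.

```perl
use strict;
use warnings;
use File::Spec;

my $outputdir = '/tmp/DQBuT_SRAY/';
my $commit    = '2a2e54af709f17cc6186b42840549c46478b6467';
my $short     = 7;    # length of the shortened SHA; 7 is the documented default
my $target    = 't/44_func_hashes_mult_unsorted.t';

# Shorten the SHA and rewrite the relative path into a stub.
my $commit_short = substr($commit, 0, $short);
(my $file_stub = $target) =~ tr{/.}{_};

# Join the pieces with dots, then place the file in the output directory.
my $file = File::Spec->catfile(
    $outputdir,
    join('.', $commit_short, $file_stub, 'output', 'txt'),
);
print "$file\n";
```

On a Unix-like system the printed path ends in 2a2e54a.t_44_func_hashes_mult_unsorted_t.output.txt, matching the file name shown earlier.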
Comment
In this method's current implementation, we start with a git checkout from the repository at the specified commit. We configure (e.g., perl Makefile.PL) and build (e.g., make) the source code. We then test each of the test files we have targeted (e.g., prove -vb relative/path/to/test_file.t). We redirect both STDOUT and STDERR to outputfile, clean up the outputfile to remove the line containing timings (as that introduces unwanted variability in the md5_hex values) and compute the digest.
This implementation is very much subject to change.
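The cleaning and digest steps can be illustrated with a small, self-contained sketch. It is not the code in Devel::Git::MultiBisect::Auxiliary. The function `digest_without_timings` is hypothetical, and the pattern for timing lines is an assumption modeled on the summary line that prove prints.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Assumption: timing information appears on a summary line such as
# "Files=1, Tests=2,  0 wallclock secs ...". The pattern is illustrative only.
sub digest_without_timings {
    my ($output) = @_;
    my @kept = grep { !/^Files=\d+, Tests=\d+,/ } split /^/m, $output;
    return md5_hex(join '', @kept);
}

# Two runs whose output differs only in the timing line.
my $run1 = "ok 1 - loads\nok 2 - runs\n1..2\n"
         . "Files=1, Tests=2,  0 wallclock secs\nResult: PASS\n";
my $run2 = "ok 1 - loads\nok 2 - runs\n1..2\n"
         . "Files=1, Tests=2,  3 wallclock secs\nResult: PASS\n";

print "Digests match\n"
    if digest_without_timings($run1) eq digest_without_timings($run2);
```

Because the timing line is dropped before hashing, the two runs produce the same digest, whereas a genuine change in test results would not.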
If a true value for verbose has been passed to the constructor, the method prints Created [outputfile] to STDOUT before returning.
Note: While this method is publicly documented, in actual use you probably will not need to call it directly. Instead, you will probably use either Devel::Git::MultiBisect::AllCommits::run_test_files_on_all_commits() or Devel::Git::MultiBisect::Transitions::multisect_all_targets().
get_timings()
Get information on the time a multisection took to run.
Hash reference. The selection of elements in this hashref will depend on which subclass of Devel::Git::MultiBisect you are using and may differ among subclasses. Example:
{ elapsed => 4297, mean => 186.83, runs => 23 }
In this example (taken from a run of one test file over 220 commits in Perl 5 blead), 23 runs were needed to achieve a result. These took 4297 seconds (approximately 71 minutes) with a mean run time of approximately 3 minutes each.
Method will return undefined value if timings are not yet available within the object.
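The arithmetic in the example can be checked with a few lines of Perl. The hash below copies the sample return value shown above instead of calling get_timings(), and the check for an undefined value mirrors the documented behavior when timings are not yet available.

```perl
use strict;
use warnings;

# Sample data copied from the example above, standing in for
# the return value of $self->get_timings().
my $timings = { elapsed => 4297, mean => 186.83, runs => 23 };

if (defined $timings) {
    # Recompute the mean from elapsed seconds and number of runs.
    my $mean = sprintf '%.2f', $timings->{elapsed} / $timings->{runs};
    printf "%d runs in %.1f minutes; mean %.2f seconds (%.1f minutes) per run\n",
        $timings->{runs}, $timings->{elapsed} / 60, $mean, $mean / 60;
}
else {
    print "Timings not yet available\n";
}
```

The recomputed mean agrees with the mean element of the sample hashref.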
Please report any bugs by mail to bug-Devel-Git-MultiBisect@rt.cpan.org or through the web interface at http://rt.cpan.org.
James E. Keenan (jkeenan at cpan dot org). When sending correspondence, please include 'Devel::Git::MultiBisect' or 'Devel-Git-MultiBisect' in your subject line.
Creation date: October 12 2016. Last modification date: April 23 2024.
Development repository: https://github.com/jkeenan/devel-git-multibisect
Thanks to the following contributors and reviewers:
Smylers
For naming suggestion: http://www.nntp.perl.org/group/perl.module-authors/2016/10/msg10851.html
Ricardo Signes
For feedback during initial development.
Eily and Monk::Thomas
For diagnosis of regex problems in http://perlmonks.org/?node_id=1175983.
Max Maischein
For diagnosis of File::Temp problems in https://perlmonks.org/?node_id=11136181.
Copyright (c) 2016-2021 James E. Keenan. United States. All rights reserved. This is free software and may be distributed under the same terms as Perl itself.