NAME

Devel::Git::MultiBisect::BuildTransitions - Gather build-time output where it changes over a range of git commits

SYNOPSIS

    use Devel::Git::MultiBisect::BuildTransitions;

    $self = Devel::Git::MultiBisect::BuildTransitions->new(\%parameters);

    $commit_range = $self->get_commits_range();

    $self->multisect_builds();

    $multisected_outputs = $self->get_multisected_outputs();

    $transitions = $self->inspect_transitions();

DESCRIPTION

When the number of commits in the specified range is large and you only need the build-time output at those commits where the output materially changed, you can use this package, Devel::Git::MultiBisect::BuildTransitions.

METHODS

new()

multisect_builds()

  • Purpose

    With a given set of configuration options and a specified range of git commits, identify the point where the "build command" -- typically, make -- first threw exceptions and then all subsequent commits where the build-time exceptions materially changed. A "material change" would be either a correction of all exceptions or a set of different build-time exceptions from those first observed. Store the build output at those transition points for human inspection.

  • Arguments

        $self->multisect_builds();
    
        $self->multisect_builds({ probe => 'error' });
    
        $self->multisect_builds({ probe => 'warning' });

    Optionally takes one hash reference. At present that hashref may contain only one element whose key is probe and whose possible values are error or warning. Defaults to error. Select between those values depending on whether you are probing for changes in errors or changes in warnings.

  • Return Value

    Returns a true value upon success.

  • Comment

    As multisect_builds() runs it does two kinds of things:

    • It stores results data within the object which you can subsequently access through method calls.

    • It captures error messages from each commit run and writes them to a file on disk for later human inspection.

While multisect_builds() runs, the object's all_outputs attribute ($self->{all_outputs}) is an array ref with one element per commit in the commit range. If a commit has been visited, the element is a hash ref with 4 key-value pairs like the ones below. If the commit has not yet been visited, the element is undef.

    [
      {
        commit => "7c9c5138c6a704d1caf5908650193f777b81ad23",
        commit_short => "7c9c513",
        file => "/home/jkeenan/learn/perl/multisect/7c9c513.make.errors.rpt.txt",
        md5_hex => "d41d8cd98f00b204e9800998ecf8427e",
      },
      undef,
      undef,
      ...
      undef,
      {
        commit => "8f6628e3029399ac1e48dfcb59c3cd30e5127c3e",
        commit_short => "8f6628e",
        file => "/home/jkeenan/learn/perl/multisect/8f6628e.make.errors.rpt.txt",
        md5_hex => "fdce7ff2f07a0a8cd64005857f4060d4",
      },
    ]

Unlike Devel::Git::MultiBisect::Transitions -- where we could have been testing multiple test files on each commit -- here we're only concerned with recording the presence or absence of build-time errors. Hence, we only need an array of hash refs rather than an array of arrays of hash refs.

The multisection process entails running run_build_on_one_commit() over each commit selected by the multisection algorithm. Each run inserts a hash ref with the 4 key-value pairs shown above into @{$self->{all_outputs}}. At the end of the multisection process, those elements which we did not need to visit will still be undef. We then analyze the defined elements to identify the transitional commits.
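The shape of one such record can be sketched as follows. This is only an illustration, not the module's implementation: make_record() is a hypothetical helper, the output directory is a temporary one, and the error text is passed in directly rather than captured from a real build. Only core modules (Digest::MD5, File::Spec, File::Temp) are used.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper (not part of the module's API): write the normalized
# error text for one commit to a report file and return the 4-key record.
sub make_record {
    my ($commit, $outputdir, $errors) = @_;
    my $short = substr($commit, 0, 7);
    my $file  = File::Spec->catfile($outputdir, "$short.make.errors.rpt.txt");
    open my $fh, '>', $file or die "Cannot write $file: $!";
    print {$fh} $errors;
    close $fh or die "Cannot close $file: $!";
    return {
        commit       => $commit,
        commit_short => $short,
        file         => $file,
        md5_hex      => md5_hex($errors),
    };
}

my $outputdir = tempdir(CLEANUP => 1);

# One slot per commit in the range; unvisited commits stay undef.
my @all_outputs = (undef) x 5;

# A build with no errors yields an empty report, whose digest is the
# MD5 of the empty string.
$all_outputs[0] = make_record(
    '7c9c5138c6a704d1caf5908650193f777b81ad23', $outputdir, '',
);
```

Note that the digest d41d8cd98f00b204e9800998ecf8427e in the listing above is the MD5 of empty content, so a record carrying that value corresponds to an empty error report.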

The objective of multisection is to identify the git commits at which the build output -- as reflected in a file on disk holding a list of normalized errors -- materially changed. We are using an md5_hex value for that error file as a presumably valid unique identifier for that file's content. A transition point is a commit at which the output file's md5_hex differs from that of the immediately preceding commit. So, to identify the first transition point, we need to locate the commit at which the md5_hex changed from that found in the very first commit in the designated commit range. Once we've identified the first transition point, we'll look for the second transition point, i.e., that where the md5_hex changed from that observed at the first transition point. We'll continue that process until we get to a transition point where the md5_hex is identical to that of the very last commit in the commit range.
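The search described above can be sketched with a repeated bisection. The code below is a minimal illustration under stated assumptions, not the module's actual algorithm: the digests are made-up strings held in a hypothetical @digest_for array, digest_at() stands in for an expensive build of one commit, and each digest is assumed to occupy one contiguous run of commits.

```perl
use strict;
use warnings;

# Hypothetical per-commit digests; in real use each would require a build.
my @digest_for = qw(aaa aaa aaa bbb bbb bbb bbb ccc ccc ccc);

# One slot per commit; stays undef until that commit is "built".
my @visited = (undef) x @digest_for;
my $builds  = 0;

# Stand-in for building one commit and digesting its error report.
sub digest_at {
    my ($idx) = @_;
    unless (defined $visited[$idx]) {
        $visited[$idx] = $digest_for[$idx];
        $builds++;
    }
    return $visited[$idx];
}

sub find_transitions {
    my ($last) = @_;
    my @transitions;
    my $start = 0;
    my $final = digest_at($last);

    # Stop once the current run has the same digest as the last commit.
    while (digest_at($start) ne $final) {
        my $target = digest_at($start);
        my ($lo, $hi) = ($start, $last);

        # Invariant: digest at $lo equals $target; digest at $hi differs.
        while ($hi - $lo > 1) {
            my $mid = int(($lo + $hi) / 2);
            if (digest_at($mid) eq $target) { $lo = $mid }
            else                            { $hi = $mid }
        }
        push @transitions, { older => $lo, newer => $hi };
        $start = $hi;    # look for the next transition from here
    }
    return \@transitions;
}

my $transitions = find_transitions($#digest_for);
printf "transition between index %d and index %d\n", $_->{older}, $_->{newer}
    for @{$transitions};
printf "%d of %d commits built\n", $builds, scalar @digest_for;
```

Commits the bisection never probes remain undef in @visited, which mirrors the undef elements left in the all_outputs array.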

get_multisected_outputs()

  • Purpose

    Get the results of multisect_builds() (other than the build output files created), reported on a per-commit basis.

  • Arguments

        my $multisected_outputs = $self->get_multisected_outputs();

    None; all data needed is already present in the object.

  • Return Value

    Reference to an array with one element for each commit in the commit range.

    • If a particular commit was not visited in the course of multisect_builds(), then the array element is undefined. (The point of multisection, of course, is to not have to visit every commit in the commit range in order to figure out the commits at which build output changed.)

    • If a particular commit was visited in the course of multisect_builds(), then the array element is a hash reference whose elements have the following keys:

          commit
          commit_short
          file
          md5_hex
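A caller will usually want to skip the undefined elements. The sketch below walks a hand-written array shaped like the return value described above; the two records reuse values from the earlier listing, and the array is shortened to four elements purely for illustration.

```perl
use strict;
use warnings;

# Hand-written stand-in for the value returned by get_multisected_outputs().
my $multisected_outputs = [
    {
        commit       => "7c9c5138c6a704d1caf5908650193f777b81ad23",
        commit_short => "7c9c513",
        file         => "/home/jkeenan/learn/perl/multisect/7c9c513.make.errors.rpt.txt",
        md5_hex      => "d41d8cd98f00b204e9800998ecf8427e",
    },
    undef,
    undef,
    {
        commit       => "8f6628e3029399ac1e48dfcb59c3cd30e5127c3e",
        commit_short => "8f6628e",
        file         => "/home/jkeenan/learn/perl/multisect/8f6628e.make.errors.rpt.txt",
        md5_hex      => "fdce7ff2f07a0a8cd64005857f4060d4",
    },
];

# Indices of the commits that were actually visited.
my @visited_idx = grep { defined $multisected_outputs->[$_] }
                  0 .. $#{$multisected_outputs};

for my $idx (@visited_idx) {
    my $rec = $multisected_outputs->[$idx];
    printf "%3d  %s  %s\n", $idx, $rec->{commit_short}, $rec->{md5_hex};
}
```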

inspect_transitions()

  • Purpose

    Get a data structure which reports on the most meaningful results of multisect_builds(), namely, the first commit, the last commit and all transitional commits.

  • Arguments

        my $transitions = $self->inspect_transitions();

    None; all data needed is already present in the object.

  • Return Value

    Reference to a hash with 3 key-value pairs. Each element's value is another hash reference. The elements of the top-level hash are:

    • oldest

      Value is a reference to a hash keyed on idx, md5_hex and file, whose values are, respectively, the index position of the very first commit in the commit range, the digest of that commit's build output and the path to the file holding that output.

    • newest

      Value is a reference to a hash keyed on idx, md5_hex and file, whose values are, respectively, the index position of the very last commit in the commit range, the digest of that commit's build output and the path to the file holding that output.

    • transitions

      Value is a reference to an array with one element for each transitional commit. Each such element is a reference to a hash with keys older and newer. In this context older refers to the last commit in a sub-sequence with a particular digest; newer refers to the immediately following commit, which is the first commit in a new sub-sequence with a new digest.

      The values of older and newer are, in turn, references to hashes with keys idx, md5_hex and file. Their values are, respectively, the index position of the particular commit in the commit range, the digest of that commit's build output and the path to the file holding that output.

    Example:

  • Comment

    The return value of inspect_transitions() should be useful to the developer trying to determine the various points in a long series of commits where the build output changed in meaningful ways. Hence, it is really the whole point of Devel::Git::MultiBisect::BuildTransitions.
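The shape of that return value can be illustrated with a small sketch. Everything here is an assumption made for illustration: summarize() is a hypothetical function, not the module's code, and the digests and file paths are invented. It derives the oldest, newest and transitions entries from an array shaped like the one returned by get_multisected_outputs(), treating two adjacent visited commits with different digests as a transition.

```perl
use strict;
use warnings;

# Hypothetical function: build a structure with the documented keys
# (oldest, newest, transitions) from a per-commit array of records.
sub summarize {
    my ($outputs) = @_;

    # Reduce one record to the idx / md5_hex / file triple.
    my $pick = sub {
        my ($idx) = @_;
        return {
            idx     => $idx,
            md5_hex => $outputs->[$idx]{md5_hex},
            file    => $outputs->[$idx]{file},
        };
    };

    my @transitions;
    for my $i (1 .. $#{$outputs}) {
        my ($prev, $curr) = @{$outputs}[$i - 1, $i];
        next unless defined $prev && defined $curr;    # need both visited
        next if $prev->{md5_hex} eq $curr->{md5_hex};  # no change here
        push @transitions, { older => $pick->($i - 1), newer => $pick->($i) };
    }

    return {
        oldest      => $pick->(0),
        newest      => $pick->($#{$outputs}),
        transitions => \@transitions,
    };
}

# Invented data: digests and paths are placeholders.
my $outputs = [
    { md5_hex => 'aaa', file => '/tmp/c0.txt' },
    undef,
    { md5_hex => 'aaa', file => '/tmp/c2.txt' },
    { md5_hex => 'bbb', file => '/tmp/c3.txt' },
    undef,
    { md5_hex => 'bbb', file => '/tmp/c5.txt' },
];

my $transitions = summarize($outputs);
for my $t (@{ $transitions->{transitions} }) {
    printf "output changed between idx %d (%s) and idx %d (%s)\n",
        $t->{older}{idx}, $t->{older}{file},
        $t->{newer}{idx}, $t->{newer}{file};
}
```

With a structure like this, the two files named in each older/newer pair are the ones to compare by hand when investigating what changed.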