NAME

Git::More - A Git extension with some goodies for hook developers

VERSION

version 1.14.1

SYNOPSIS

    use Git::More;

    my $git = Git::More->repository();

    my $config  = $git->get_config();
    my $branch  = $git->get_current_branch();
    my @commits = $git->get_commits($oldcommit, $newcommit);
    my $message = $git->get_commit_msg('HEAD');

    my $files_modified_by_commit = $git->filter_files_in_index('AM');
    my $files_modified_by_push   = $git->filter_files_in_range('AM', $oldcommit, $newcommit);

DESCRIPTION

This is an extension of the Git class. It's meant to implement a few extra methods commonly needed by Git hook developers.

In particular, it's used by the standard hooks implemented by the Git::Hooks framework.

CONFIGURATION VARIABLES

CONFIG_ENCODING

Git configuration files usually contain just ASCII characters, but values and sub-section names may contain any characters, except newline. If your config files have non-ASCII characters you should ensure that they are properly decoded by specifying their encoding like this:

    $Git::More::CONFIG_ENCODING = 'UTF-8';

The acceptable values for this variable are all the encodings supported by the Encode module.
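
As an illustration of what decoding means here, the following sketch calls the Encode module directly. The byte string is made up for the example, and the snippet does not show how Git::More applies the setting internally.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Raw bytes as they might be read from a config file:
# "cafe" with an acute accent on the "e", encoded in UTF-8.
my $bytes = "caf\xc3\xa9";

# The encoding name is the kind of value you would assign to
# $Git::More::CONFIG_ENCODING.
my $encoding = 'UTF-8';

# Turn the bytes into a Perl character string.
my $text = decode($encoding, $bytes);

printf "%d bytes, %d characters\n", length($bytes), length($text);
```

The two-byte UTF-8 sequence becomes a single character once decoded, which is why the byte count and the character count differ.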

METHODS

get_config [SECTION [VARIABLE]]

This method groks the configuration options for the repository by invoking git config --list. The configuration is cached in the Git::More object during the first invocation. So, if the configuration is changed afterwards, the method won't notice it. This is usually ok for hooks, though.

With no arguments, the options are returned as a hash-ref pointing to a two-level hash. For example, if the config options are these:

    section1.a=1
    section1.b=2
    section1.b=3
    section2.x.a=A
    section2.x.b=B
    section2.x.b=C

Then, it'll return this hash:

    {
        'section1' => {
            'a' => [1],
            'b' => [2, 3],
        },
        'section2.x' => {
            'a' => ['A'],
            'b' => ['B', 'C'],
        },
    }

The first level keys are the part of the option names before the last dot. The second level keys are everything after the last dot in the option names. You won't get more levels than two. In the example above, you can see that the option "section2.x.a" is split in two: "section2.x" in the first level and "a" in the second.

The values are always array-refs, even if there is only one value for a specific option. For some options, it makes sense to have a list of values attached to them. But even if you expect a single value for an option, you may have it defined in the global scope and redefined in the local scope. In this case, it will appear as a two-element array, the last element being the local value.

So, if you want to treat an option as single-valued, you should fetch it like this:

    $h->{section1}{a}[-1]
    $h->{'section2.x'}{a}[-1]
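
The splitting rule described above can be sketched in plain Perl. This is only an illustration of the rule, not the module's actual code: parse_config_lines is a hypothetical helper, and it assumes simple "name=value" lines without embedded newlines.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Git::More: builds the two-level
# hash from "name=value" lines such as those printed by
# "git config --list".
sub parse_config_lines {
    my @lines = @_;
    my %config;
    for my $line (@lines) {
        # Split at the first '=' only, so values may contain '='.
        my ($name, $value) = split /=/, $line, 2;

        # The greedy (.+) keeps everything before the LAST dot as the
        # first-level key; the rest is the second-level key.
        next unless $name =~ /^(.+)\.([^.]+)\z/;
        my ($section, $key) = ($1, $2);

        # Values are always accumulated in an array-ref.
        push @{ $config{$section}{$key} }, $value;
    }
    return \%config;
}

my $h = parse_config_lines(qw(
    section1.a=1
    section1.b=2
    section1.b=3
    section2.x.a=A
    section2.x.b=B
    section2.x.b=C
));

print $h->{section1}{a}[-1], "\n";        # prints 1
print $h->{'section2.x'}{b}[-1], "\n";    # prints C
```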

If the SECTION argument is passed, the method returns the second-level hash for it. So, following the example above, this call:

    $git->get_config('section1');

would return this hash:

    {
        'a' => [1],
        'b' => [2, 3],
    }

If the section doesn't exist, an empty hash is returned. Any key/value added to the returned hash will be available in subsequent invocations of get_config.

If the VARIABLE argument is also passed, the method returns the value(s) of the configuration option SECTION.VARIABLE. In list context the method returns the list of all values or the empty list, if the variable isn't defined. In scalar context, the method returns the variable's last value or undef, if it's not defined.
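
The context-sensitive return value can be pictured with Perl's wantarray. The lookup function below is a hypothetical stand-in operating on a hash shaped like the one in the earlier example; it is a sketch of the documented behaviour, not Git::More's implementation.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the SECTION/VARIABLE lookup.
sub lookup {
    my ($config, $section, $variable) = @_;

    my @values;
    if (exists $config->{$section}
        && exists $config->{$section}{$variable}) {
        @values = @{ $config->{$section}{$variable} };
    }

    # List context: every value (possibly none).
    # Scalar context: the last value, or undef if there is none.
    return wantarray ? @values : $values[-1];
}

my $config = { section1 => { a => [1], b => [2, 3] } };

my @all     = lookup($config, 'section1', 'b');          # (2, 3)
my $last    = lookup($config, 'section1', 'b');          # 3
my @none    = lookup($config, 'section1', 'missing');    # ()
my $nothing = lookup($config, 'section1', 'missing');    # undef
```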

cache SECTION

This method may be used by plugin developers to cache information in the context of a Git::More object. SECTION is a string (usually a plugin name) that is associated with a hash-ref. The method simply returns the hash-ref, which can be used by the caller to store any kind of information.

clean_cache SECTION

This method deletes the cache entry for SECTION. It may be used by hooks just before returning to Git::Hooks::run_hooks in order to get rid of any value kept in the SECTION's cache.

get_commit COMMIT

This method returns a hash representing COMMIT. It obtains this information by invoking git rev-list --no-walk --encoding=UTF-8 COMMIT.

The returned hash has the following structure (the codes are explained in the git help rev-list document):

    {
        commit          => %H:  commit hash
        tree            => %T:  tree hash
        parent          => %P:  parent hashes (space separated)
        author_name     => %aN: author name
        author_email    => %aE: author email
        author_date     => %ai: author date in ISO8601 format
        committer_name  => %cN: committer name
        committer_email => %cE: committer email
        committer_date  => %ci: committer date in ISO8601 format
        body            => %B:  raw body (aka commit message)
    }

All character data is UTF-8 encoded.

get_commits OLDCOMMIT NEWCOMMIT

This method returns a list of hashes representing every commit reachable from NEWCOMMIT but not from OLDCOMMIT. It obtains this information by invoking git rev-list NEWCOMMIT ^OLDCOMMIT.

There are two special cases, though:

If NEWCOMMIT is the null SHA-1, i.e., '0000000000000000000000000000000000000000', this means that a branch, pointing to OLDCOMMIT, has been removed. In this case the method returns an empty list, meaning that no new commit has been created.

If OLDCOMMIT is the null SHA-1, this means that a new branch pointing to NEWCOMMIT is being created. In this case we want all commits reachable from NEWCOMMIT but not reachable from any other branch. The syntax for this is "NEWCOMMIT ^B1 ^B2 ... ^Bn", i.e., NEWCOMMIT followed by every other branch name prefixed by a caret. We can get at their names using the technique described in, e.g., this discussion.
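
The three cases can be summarized as a small function that builds the revision arguments for git rev-list. This is a sketch under stated assumptions: rev_list_args is a hypothetical helper, and the list of other branch names is simply passed in by the caller rather than discovered from the repository.

```perl
use strict;
use warnings;

my $NULL_SHA1 = '0' x 40;

# Hypothetical helper, not part of Git::More. Returns the revision
# arguments to pass to "git rev-list", or an empty list when there is
# nothing to list.
sub rev_list_args {
    my ($old, $new, @other_branches) = @_;

    # A branch was removed: no new commits.
    return () if $new eq $NULL_SHA1;

    # A branch was created: exclude everything reachable from the
    # other branches.
    return ($new, map("^$_", @other_branches)) if $old eq $NULL_SHA1;

    # Regular update: commits reachable from NEW but not from OLD.
    return ($new, "^$old");
}

my @removed = rev_list_args('a' x 40, $NULL_SHA1);
my @created = rev_list_args($NULL_SHA1, 'b' x 40, 'master', 'dev');
my @updated = rev_list_args('a' x 40, 'b' x 40);
```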

get_commit_msg COMMIT_ID

This method returns the commit message (a.k.a. body) of the commit identified by COMMIT_ID. The result is a string.

read_commit_msg_file FILENAME

This method returns the relevant contents of the commit message file called FILENAME. It's useful during the commit-msg and the prepare-commit-msg hooks.

The file is read using the character encoding defined by the i18n.commitencoding configuration option or utf-8 if not defined.

Some non-relevant contents are stripped off the file. Specifically:

  • diff data

    Sometimes, the commit message file contains the diff data for the commit. This data begins with a line starting with the fixed string diff --git a/. Everything from such a line on is stripped off the file.

  • comment lines

    Every line beginning with a # character is stripped off the file.

  • trailing spaces

    Any trailing space is stripped off from all lines in the file.

  • trailing empty lines

    Any empty line at the end is stripped off from the file, making sure it ends in a single newline.

All this cleanup is performed to make it easier for different plugins to analyse the commit message using a canonical base.
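
The cleanup steps listed above can be approximated with four substitutions. The function name clean_commit_msg is hypothetical, the input is assumed to be an already decoded string, and the regular expressions are an illustration of the documented rules rather than the module's own code.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Git::More.
sub clean_commit_msg {
    my ($msg) = @_;

    # Diff data: drop everything from the first "diff --git a/" line on.
    $msg =~ s/^diff --git a\/.*//ms;

    # Comment lines: drop every line beginning with '#'.
    $msg =~ s/^#.*\n?//mg;

    # Trailing spaces: strip them from every line.
    $msg =~ s/[ \t]+$//mg;

    # Trailing empty lines: end the text in exactly one newline.
    $msg =~ s/\n*\z/\n/;

    return $msg;
}

my $raw = "Fix bug  \n\nDetails here\n"
        . "# Please enter the commit message\n\n\n"
        . "diff --git a/foo b/foo\n+new line\n";

print clean_commit_msg($raw);
```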

write_commit_msg_file FILENAME, MSG, ...

This method writes the list of strings MSG to FILENAME. It's useful during the commit-msg and the prepare-commit-msg hooks.

The file is written to using the character encoding defined by the i18n.commitencoding configuration option or utf-8 if not defined.

An empty line (\n\n) is inserted between every pair of consecutive MSG arguments when more than one is passed.
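
A minimal sketch of the joining and encoding described here, written against a temporary file. The message parts are made up, the encoding is hard-coded to UTF-8 instead of being read from i18n.commitencoding, and the final newline is an assumption of this example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my @msgs = (
    'Subject line',
    'Body paragraph',
    'Signed-off-by: A Developer <dev@example.com>',
);

# Consecutive parts are separated by an empty line.
my $contents = join("\n\n", @msgs);

# Write through an encoding layer so characters are stored as UTF-8.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close $tmp_fh;

open my $out, '>:encoding(UTF-8)', $filename
    or die "Can't open '$filename' for writing: $!";
print {$out} $contents, "\n";    # final newline is this sketch's choice
close $out
    or die "Can't close '$filename': $!";
```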

filter_files_in_index FILTER

This method returns a list of the names of the files that are changed in the index (staging area) compared to the HEAD commit. It's useful in the pre-commit hook when you want to know which files are being modified in the upcoming commit.

FILTER specifies which kinds of changes you're interested in. It's passed as the argument to the --diff-filter option of git diff-index, which is documented like this:

  --diff-filter=[(A|C|D|M|R|T|U|X|B)...[*]]

    Select only files that are Added (A), Copied (C), Deleted (D), Modified
    (M), Renamed (R), have their type (i.e. regular file, symlink,
    submodule, ...) changed (T), are Unmerged (U), are Unknown (X), or have
    had their pairing Broken (B). Any combination of the filter characters
    (including none) can be used. When * (All-or-none) is added to the
    combination, all paths are selected if there is any file that matches
    other criteria in the comparison; if there is no file that matches other
    criteria, nothing is selected.

filter_files_in_range FILTER, FROM, TO

This method returns a list of the names of the files that are changed between FROM and TO commits. It's useful in the update and the pre-receive hooks when you want to know which files are being modified in the commits being received by a git push command.

FILTER specifies which kinds of changes you're interested in. Please read the filter_files_in_index documentation above.

FROM and TO are revision parameters (see git help revisions) specifying two commits. They're passed as arguments to git diff-tree in order to compare them and grok the files that differ between them.

filter_files_in_commit FILTER, COMMIT

This method returns a list of the names of the files that are changed in COMMIT. It's useful in the patchset-created and the draft-published hooks when you want to know which files are being modified in the single commit being received by a git push command.

FILTER specifies which kinds of changes you're interested in. Please read the filter_files_in_index documentation above.

COMMIT is a revision parameter (see git help revisions) specifying the commit. It's passed as an argument to git diff-tree in order to compare it to its parents and grok the files that changed in it.

Merge commits are treated specially. Only files that are changed in COMMIT with respect to all of its parents are returned. The reasoning behind this is that if a file isn't changed with respect to one or more of COMMIT's parents, then it must have been checked already in those commits and we don't need to check it again.
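
The rule for merge commits amounts to intersecting the per-parent lists of changed files. The sketch below assumes those lists have already been computed, for instance one list per parent; changed_wrt_all_parents is a hypothetical helper and the file names are invented.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Git::More. Takes one array-ref of
# changed file names per parent and keeps only the files that appear
# in every list.
sub changed_wrt_all_parents {
    my @per_parent = @_;
    return () unless @per_parent;

    my %count;
    for my $files (@per_parent) {
        # Count each file at most once per parent.
        my %seen;
        $count{$_}++ for grep { !$seen{$_}++ } @$files;
    }

    # A file qualifies only if it was counted for every parent.
    my @common = sort grep { $count{$_} == @per_parent } keys %count;
    return @common;
}

my @files = changed_wrt_all_parents(
    [ 'a.txt', 'b.txt' ],    # changed with respect to the first parent
    [ 'b.txt', 'c.txt' ],    # changed with respect to the second parent
);

print "@files\n";    # prints b.txt
```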

set_affected_ref REF OLDCOMMIT NEWCOMMIT

This method should be used in the beginning of an update, pre-receive, or post-receive hook in order to record the references that were affected by the push command. The information recorded will be later used by the following get_affected_ref* methods.

get_affected_refs

This method returns the list of names of references that were affected by the current push command, as they were set by calls to the set_affected_ref method.

get_affected_ref_range(REF)

This method returns the two-element list of commit ids representing the OLDCOMMIT and the NEWCOMMIT of the affected REF.

get_affected_ref_commit_ids(REF)

This method returns the list of commit ids leading from the affected REF's NEWCOMMIT to OLDCOMMIT.

get_affected_ref_commits(REF)

This method returns the list of commits leading from the affected REF's NEWCOMMIT to OLDCOMMIT. The commits are represented by hashes, as returned by the get_commits method.
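
The bookkeeping behind set_affected_ref and the get_affected_ref* methods can be pictured as a hash keyed by reference name. The sketch below uses hypothetical free functions and made-up commit ids; it only relies on the fact that a pre-receive hook receives one "OLDCOMMIT NEWCOMMIT REF" line per pushed reference on STDIN.

```perl
use strict;
use warnings;

my %affected;    # REF => [OLDCOMMIT, NEWCOMMIT]

# Hypothetical stand-ins for the methods documented above.
sub set_ref {
    my ($ref, $old, $new) = @_;
    $affected{$ref} = [ $old, $new ];
    return;
}

sub get_refs {
    my @refs = sort keys %affected;
    return @refs;
}

sub get_range {
    my ($ref) = @_;
    return @{ $affected{$ref} };
}

# Sample input in the format a pre-receive hook reads from STDIN.
my @stdin_lines = (
    join(' ', 'a' x 40, 'b' x 40, 'refs/heads/master') . "\n",
    join(' ', '0' x 40, 'c' x 40, 'refs/heads/topic') . "\n",
);

for my $line (@stdin_lines) {
    chomp $line;
    my ($old, $new, $ref) = split ' ', $line;
    set_ref($ref, $old, $new);
}

for my $ref (get_refs()) {
    my ($old, $new) = get_range($ref);
    print "$ref: $old -> $new\n";
}
```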

authenticated_user

This method returns the username of the authenticated user performing the Git action. It groks it from the githooks.userenv configuration variable specification, which is described in the Git::Hooks documentation. It's useful for most access control check plugins.

push_input_data DATA

This method gets a single value and tucks it in an internal list so that every piece of data can be retrieved later with the get_input_data method below.

It's used by Git::Hooks to save arguments read from STDIN by some Git hooks like pre-receive, post-receive, pre-push, and post-rewrite.

get_input_data

This method returns an array-ref pointing to a list of all pieces of data saved by calls to the push_input_data method above.

set_authenticated_user USERNAME

This method can be used to set the username of the authenticated user when the default heuristics defined above aren't enough. The name will be cached so that subsequent invocations of authenticated_user will return it.

get_current_branch

This method returns the repository's current branch name, as indicated by the git symbolic-ref HEAD command.

If the repository is in a detached HEAD state, i.e., if HEAD points to a commit instead of to a branch, the method returns undef.

get_sha1 REV

This method returns the SHA1 of the commit represented by REV, using the command

  git rev-parse --verify REV

It's useful, for instance, to grok the HEAD's SHA1 so that you can pass it to the get_commit method.

get_head_or_empty_tree

This method returns the string "HEAD" if the repository already has commits. Otherwise, if it is a brand new repository, it returns the SHA1 representing the empty tree. It's useful to come up with the correct argument for, e.g., git diff during a pre-commit hook. (See the default pre-commit.sample script which comes with Git to understand how this is used.)
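
For reference, the SHA1 of the empty tree is a fixed value that can be derived without a repository, because Git hashes an object as its type, a space, its size, a NUL byte, and then its content. The sketch below computes it with Digest::SHA; it applies to repositories using SHA-1 object names, and the $has_commits flag is a hypothetical placeholder for whatever check a hook would perform.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

# An empty tree has size 0 and no content, so the hashed data is just
# the header "tree 0" followed by a NUL byte.
my $empty_tree = sha1_hex("tree 0\0");

print "$empty_tree\n";    # prints 4b825dc642cb6eb9a060e54bf8d69288fbee4904

# Hypothetical flag: true when the repository already has commits.
my $has_commits = 0;

# Choose what to compare the index against.
my $against = $has_commits ? 'HEAD' : $empty_tree;
```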

blob REV, FILE, ARGS...

This method returns the name of a temporary file into which the contents of the file FILE in revision REV has been copied.

It's useful for hooks that need to read the contents of changed files in order to check anything in them.

These objects are cached so that if more than one hook needs to get at them they're created only once.

By default, all temporary files are removed when the Git::More object is destroyed.

Any remaining ARGS are passed as arguments to File::Temp::newdir so that you can have more control over the temporary file creation.

If REV:FILE does not exist or if there is any other error while trying to fetch its contents the method throws a Git::Simple or a Git::Error::Command exception.

file_size REV FILE

This method returns the size (in bytes) of FILE (a path relative to the repository root) in revision REV.

error PREFIX MESSAGE [DETAILS]

This method should be used by plugins to record consistent error or warning messages. It gets two or three arguments. The PREFIX is usually the plugin's package name. The MESSAGE is a one-line string. These two arguments are combined to produce a single line like this:

  [PREFIX] MESSAGE

DETAILS is an optional string. If present, it is appended to the line above, separated by an empty line, and with its lines prefixed by two spaces, like this:

  [PREFIX] MESSAGE

    DETAILS
    MORE DETAILS...

The method simply records the formatted error message and returns. It doesn't die.
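
The message layout described above can be reproduced with a few lines of Perl. format_error is a hypothetical function that only builds the string; it is a sketch of the documented format, and unlike the error method it does not record anything.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Git::More.
sub format_error {
    my ($prefix, $message, $details) = @_;

    my $text = "[$prefix] $message";

    if (defined $details && length $details) {
        # Prefix every line of DETAILS with two spaces.
        (my $indented = $details) =~ s/^/  /mg;

        # Separate DETAILS from the first line by an empty line.
        $text .= "\n\n$indented";
    }

    return $text;
}

print format_error('My::Plugin', 'something is wrong'), "\n";
print format_error('My::Plugin', 'something is wrong',
                   "DETAILS\nMORE DETAILS..."), "\n";
```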

get_errors

This method returns a list of all error messages recorded with the error method.

nocarp

By default, all errors produced by Git::Hooks use Carp::croak, so that they contain a suffix telling where the error occurred. Sometimes you may not want this. For instance, if the user is going to receive the error message produced by a server hook, they won't be able to use that information.

This method makes the error method strip any such suffixes from its DETAILS argument and produce its own message with warn instead of carp.

SEE ALSO

Git

AUTHOR

Gustavo L. de M. Chaves <gnustavo@cpan.org>

COPYRIGHT AND LICENSE

This software is copyright (c) 2016 by CPqD <www.cpqd.com.br>.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.