The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Git::Hooks::CheckFile - Git::Hooks plugin for checking files

VERSION

version 3.6.0

SYNOPSIS

As a Git::Hooks plugin you don't use this Perl module directly. Instead, you may configure it in a Git configuration file like this:

  [githooks]

    # Enable the plugin
    plugin = CheckFile

    # These users are exempt from all checks
    admin = joe molly

    # These groups are used in ACL specs below
    groups = architects = tiago juliana
    groups = dbas       = joao maria

  [githooks "checkfile"]

    # Check specific files with specific commands
    name = *.p[lm] perlcritic --stern --verbose 10
    name = *.pp    puppet parser validate --verbose --debug
    name = *.pp    puppet-lint --no-variable_scope-check --no-documentation-check
    name = *.sh    bash -n
    name = *.sh    shellcheck --exclude=SC2046,SC2053,SC2086
    name = *.yml   yamllint
    name = *.js    eslint -c ~/.eslintrc.json

    # Reject files bigger than 1MiB
    sizelimit = 1M

    # Reject files with names that would conflict with other files in the
    # repository in case-insensitive filesystems, such as the ones on Windows.
    deny-case-conflict = true

    # Subject MS-Office and OpenDocument documents to the locking mechanism
    lock = *.docx
    lock = *.pptx
    lock = *.xlsx
    lock = qr/\\.(od[fgpst])$/

    # Reject commits adding scripts without the executable bit set.
    executable = *.sh
    executable = *.csh
    executable = *.ksh
    executable = *.zsh

    # Reject commits adding source files with the executable bit set.
    not-executable = qr/\\.(?:c|cc|java|pl|pm|txt)$/

    # Only architects may add, modify, or delete pom.xml files.
    acl = deny  AMD ^(?i).*pom\\.xml
    acl = allow AMD ^(?i).*pom\\.xml by @architects

    # Only dbas may add or delete SQL files under database/
    acl = deny  AD ^database/.*\\.sql$
    acl = allow AD ^database/.*\\.sql$ by @dba

    # Reject new files containing dangerous characters, avoiding names which may
    # cause problems.
    acl = deny  A   ^.*[^a-zA-Z0-1/_.-]

DESCRIPTION

This Git::Hooks plugin hooks itself to the hooks below to check if the names and contents of files added to or modified in the repository meet specified constraints. If they don't, the commit/push is aborted.

  • pre-applypatch

  • pre-commit

  • update

  • pre-receive

  • ref-update

  • patchset-created

  • draft-published

To enable it you should add it to the githooks.plugin configuration option:

    [githooks]
      plugin = CheckFile

NAME

CheckFile - Git::Hooks plugin for checking files

CONFIGURATION

The plugin is configured by the following git options under the githooks.checkfile subsection.

It can be disabled for specific references via the githooks.ref and githooks.noref options about which you can read in the Git::Hooks documentation.

name PATTERN COMMAND

This directive tells which COMMAND should be used to check files matching PATTERN.

Only the file's basename is matched against PATTERN.

PATTERN is usually expressed with globbing to match files based on their extensions, for example:

    [githooks "checkfile"]
      name = *.pl perlcritic --stern

If you need more power than globs can provide you can match using regular expressions, using the qr// operator, for example:

    [githooks "checkfile"]
      name = qr/xpto-\\d+.pl/ perlcritic --stern

COMMAND is everything that comes after the PATTERN. It is invoked once for each file matching PATTERN with the name of a temporary file containing the contents of the matching file passed to it as a last argument. If the command exits with any code different than 0 it is considered a violation and the hook complains, rejecting the commit or the push.

If the filename can't be the last argument to COMMAND you must tell where in the command-line it should go using the placeholder {} (like the argument to the find command's -exec option). For example:

    [githooks "checkfile"]
      name = *.xpto cmd1 --file {} | cmd2

COMMAND is invoked as a single string passed to system, which means it can use shell operators such as pipes and redirections.

Some real examples:

    [githooks "checkfile"]
      name = *.p[lm] perlcritic --stern --verbose 5
      name = *.pp    puppet parser validate --verbose --debug
      name = *.pp    puppet-lint --no-variable_scope-check
      name = *.sh    bash -n
      name = *.sh    shellcheck --exclude=SC2046,SC2053,SC2086
      name = *.erb   erb -P -x -T - {} | ruby -c

COMMAND may rely on the GIT_COMMIT environment variable to identify the commit being checked according to the hook being used, as follows.

Since the external commands may take much time to run, the plugin checks if the githooks.timeout option has been violated after each command runs.

  • pre-commit

    This hook does not check a complete commit, but the index tree. So, in this case the variable is set to :0. (See git help revisions.)

  • update, pre-receive, ref-updated

    In these hooks the variable is set to the SHA1 of the new commit to which the reference has been updated.

  • patchset-created, draft-published

    In these hooks the variable is set to the argument of the --commit option (a SHA1) passed to them by Gerrit.

The reason that led to the introduction of the GIT_COMMIT variable was to enable one to invoke an external command to check files which needed to grok some configuration from another file in the repository. Specifically, we wanted to check Python scripts with the pylint command passing to its --rcfile option the configuration file pylint.rc sitting on the repository root. So, we configured CheckFile like this:

    [githooks "checkfile"]
      name = *.py mypylint.sh

And the mypylint.sh script was something like this:

    #!/bin/bash

    # Create a temporary file do save the pylint.rc
    RC=$(tempfile)
    trap 'rm $RC' EXIT

    git cat-file $GIT_COMMIT:pylint.rc >$RC

    pylint --rcfile=$RC "$@"

sizelimit INT

This directive specifies a size limit (in bytes) for any file in the repository. If set explicitly to 0 (zero), no limit is imposed, which is the default. But it can be useful to override a global specification in a particular repository.

basename.sizelimit BYTES REGEXP

This directive takes precedence over the githooks.checkfile.sizelimit for files which basename matches REGEXP.

deny-case-conflict BOOL

This directive checks for newly added files that would conflict in case-insensitive file-systems.

Git itself is case-sensitive with regards to file names. Many operating system's file-systems are case-sensitive too, such as Linux, macOS, and other Unix-derived systems. But Windows's file-systems are notoriously case-insensitive. So, if you want your repository to be absolutely safe for Windows users you don't want to add two files which filenames differ only in a case-sensitive manner. Enable this option to be safe

Note that this check have to check the newly added files against all files already in the repository. It can be a little slow for large repositories. Take heed!

lock PATTERN

This multi-valued directive implements a poor man's version of the Subversion locking mechanism. The PATTERNs specified by one or more of these directives match, case-insensitively, the files that should be subject to the locking checks. Whenever one pushes a commit which creates or copies a file with a name matching one of the PATTERNs, the hook checks if the file is locked. If it is, the commit is rejected.

The PATTERN argument can be expressed either as a globbing pattern or as a regular expression. Take a look at the explanation for the PATTERN argument to the name directive above for the details.

To lock a file you just need to create another file with the same name and suffixed with .lock. For instance, suppose you're going to edit a file called doc/file.odt. In order to signal to your colleagues that they should refrain from editing the file, just create another file called doc/file.odt.lock and commit it. The lock file may even be empty.

If a colleague doesn't notice the lock and tries to push a new version of the file, the push is rejected telling them that the file is locked and that they should coordinate with you how to integrate their changes into yours.

When you commit the changes to the file, you should delete the lock file in the same commit. This way, when you push the commit, the commit will be accepted, the file will be changed and the lock removed.

Note that this mechanism isn't able to alert users that they shouldn't edit locked files. In Subversion, files subject to locking are kept read-only so that the users have to execute the svn lock command before editing them. There is no such alert in Git. The users should really remember to create a lock file before editing any file. If they forget, they will be reminded that the file is locked when they try to push them, which will avoid the race condition, but it will be too late to avoid the need to manually integrate the changes later. You should consider this mechanism as an informational protocol only. All people involved in changing the repository must learn to pay attention to it.

max-path LENGTH

This directive checks for newly added or copied files if their full path is at most LENGTH characters long.

This is useful to avoid problems with the Windows maximum path length limitation, which is defined as 260 characters.

Note that this check counts just the path length inside the repository. When cloning it on Windows you have to make some allowance for the path length of the clone's root directory. So, you should allow less than 260 characters!

executable PATTERN

This directive requires that all added or modified files with names matching PATTERN must have the executable permission. This allows you to detect common errors such as forgetting to set scripts as executable.

PATTERN is specified as described in the githooks.checkfile.name directive above.

You can specify this option multiple times so that all PATTERNs are considered.

non-executable PATTERN

This directive requires that all added or modified files with names matching PATTERN must not have the executable permission. This allows you to detect common errors such as setting source code as executable.

PATTERN is specified as described in the githooks.checkfile.name directive above.

You can specify this option multiple times so that all PATTERNs are considered.

acl RULE

This multi-valued option specifies rules allowing or denying specific users to perform specific actions on specific files. By default any user can perform any action on any file. So, the rules are used to impose restrictions.

The acls are grokked by the Git::Repository::Plugin::GitHooks's grok_acls method. Please read its documentation for the general documentation.

A RULE takes three or four parts, like this:

  (allow|deny) [AMD]+ <filespec> (by <userspec>)?

Some parts are described below:

  • [AMD]+

    The second part specifies which actions are being considered by a combination of letters: (A) for files added, (M) for files modified, and (D) for files deleted. (These are the same letters used in the --diff-filter option of the git diff-tree command.) You can specify one, two, or the three letters.

  • <filespec>

    The third part specifies which files are being considered. In its simplest form, a filespec is a complete path beginning at the repository root, without a leading slash (e.g. lib/Git/Hooks.pm). These filespecs match a single file exactly.

    If the filespec starts with a caret (^) it's interpreted as a Perl regular expression, the caret being kept as part of the regexp. These filespecs match potentially many files (e.g. ^lib/.*\\.pm$).

See the "SYNOPSIS" section for some examples.

[DEPRECATED after v2.13.0] deny-token REGEXP

This option is deprecated. Please, use the CheckDiff::deny-token option instead.

This directive rejects commits or pushes which diff (patch) matches REGEXP. This is a multi-valued directive, i.e., you can specify it multiple times to check several REGEXes.

It is useful to detect marks left by developers in the code while developing, such as FIXME or TODO. These marks are usually a reminder to fix things before commit, but as it so often happens, they end up being forgotten.

Note that this option requires Git 1.7.4 or newer.

AUTHOR

Gustavo L. de M. Chaves <gnustavo@cpan.org>

COPYRIGHT AND LICENSE

This software is copyright (c) 2023 by CPQD <www.cpqd.com.br>.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.