NAME

Porting/updateAUTHORS.pl - Automatically update AUTHORS and .mailmap and Porting/exclude_contrib.txt based on commit data.

SYNOPSIS

Porting/updateAUTHORS.pl [OPTIONS] [GIT_REF_RANGE]

By default scans the commit history specified (or the entire history from the current commit) and then updates AUTHORS and .mailmap so all contributors are properly listed.

 Options:
   --help               brief help message
   --man                full documentation
   --verbose            be verbose

 Commit Range:
   --from=GIT_REF       Select commits to use
   --to=GIT_REF         Select commits to use, defaults to HEAD

 File Locations:
   --authors-file=FILE  override default of 'AUTHORS'
   --mailmap-file=FILE  override default of '.mailmap'

 Action Modifiers
   --no-update          Do not update.
   --validate           output TAP about status and change nothing
   --exclude-missing    Add new names to the exclude file so they never
                        appear in AUTHORS or .mailmap.

 Details Changes
    Update canonical name or email in AUTHORS and .mailmap properly.
    --exclude-contrib       NAME_AND_EMAIL
    --exclude-me
    --change-name           OLD_NAME=NEW_NAME
    --change-name-for-email OLD_ADDR=NEW_NAME
    --change-email-for-name OLD_NAME=NEW_ADDR
    --change-email          OLD_ADDR=NEW_EMAIL

 Reports About People
    --stats             detailed report of authors and what they did
    --who               Sorted, wrapped list of who did what
    --thanks-applied    report who applied stuff for others
    --rank              report authors by number of commits created

 Reports About Files
    --files             detailed report of files that were modified
    --activity          simple report of files that grew the most
    --chainsaw          simple report of files that shrank the most

 Report Modifiers
    --percentage        show percentages not counts
    --cumulative        show cumulative numbers not individual
    --reverse           show reports in reverse order
    --numstat           show additional file based data in some reports
                        (not needed for most reports)
    --as-list           show reports with names with common values
                        folded into a list like checkAUTHORS.pl used to
    --numbered          add rank numbers to reports where they are missing

OPTIONS

--help

Prints a brief help message and exits.

--man

Prints the manual page and exits.

--verbose

Be verbose about what is happening. Can be repeated more than once.

--no-update

Do not update files on disk even if they need to be changed.

--validate
--tap

Instead of modifying files, test to see which would be modified and output TAP test output about the validation.
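Because the output is TAP, it can be post-processed with ordinary text tools. The following is a minimal sketch, not part of the tool: tap_failures is a hypothetical helper name, and it relies only on the TAP convention that a failing test line starts with "not ok".

```shell
# Hypothetical helper (not provided by updateAUTHORS.pl): count failing
# TAP lines read from standard input. Lines marked "# TODO" are counted
# too, so treat the number as an upper bound on real failures.
tap_failures() {
    # grep -c prints 0 and exits non-zero when nothing matches;
    # "|| true" keeps the helper's own exit status at zero.
    grep -c '^not ok' || true
}

# Intended use, from the root of a perl checkout:
#   perl Porting/updateAUTHORS.pl --validate | tap_failures
```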

--authors-file=FILE
--authors_file=FILE

Override the default location of the authors file, which is by default the AUTHORS file in the current directory.

--mailmap-file=FILE
--mailmap_file=FILE

Override the default location of the mailmap file, which is by default the .mailmap file in the current directory.

--exclude-file=FILE
--exclude_file=FILE

Override the default location of the exclude file, which is by default the Porting/exclude_contrib.txt file reachable from the current directory.

--exclude-contrib=NAME_AND_EMAIL
--exclude_contrib=NAME_AND_EMAIL

Exclude a specific name/email combination from our contributor datasets. Can be repeated on the command line to remove multiple items at once. If the contributor details correspond to a canonical identity of a contributor (one that is in the AUTHORS file or on the left in the .mailmap file) then ALL records, including those linked to that identity in .mailmap, will be marked for exclusion. This is similar to --exclude-missing but it only affects the specifically named users. Note that the format for NAME_AND_EMAIL is similar to that of the .mailmap file: email addresses and @github style identifiers should be wrapped in angle brackets, like this: <@github>. Users with no email in the AUTHORS file should use <unknown>.

For example:

  Porting/updateAUTHORS.pl --exclude-contrib="Joe B <b@joe.com>"

Would remove all references to "Joe B" from AUTHORS and .mailmap and add the required entries to Porting/exclude_contrib.txt such that the contributor would never be automatically added back, and would be automatically removed should someone re-add them manually.
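The three address forms described above (a plain email, an @github style identifier, and the <unknown> placeholder) can be illustrated with a small sketch. contrib_spec is a hypothetical helper, not something the tool provides, and "@joeb" is a made-up identifier used only for illustration.

```shell
# Hypothetical helper: build a NAME_AND_EMAIL value for --exclude-contrib.
# The address is wrapped in angle brackets; when no address is given the
# <unknown> placeholder is used instead.
contrib_spec() {
    name=$1
    addr=${2:-unknown}
    printf '%s <%s>\n' "$name" "$addr"
}

contrib_spec "Joe B" "b@joe.com"   # prints: Joe B <b@joe.com>
contrib_spec "Joe B" "@joeb"       # prints: Joe B <@joeb>
contrib_spec "Joe B"               # prints: Joe B <unknown>

# Intended use, from the root of a perl checkout:
#   perl Porting/updateAUTHORS.pl --exclude-contrib="$(contrib_spec "Joe B" "@joeb")"
```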

--exclude-missing
--exclude_missing
--exclude

Normally when the tool is run it *adds* missing data only. If this option is set then the reverse will happen: any author data that is missing will be marked as intentionally missing, in such a way that future "normal" runs of the script ignore the author(s) that were excluded.

The exclude data is stored in Porting/exclude_contrib.txt as a SHA-256 digest (in base64) of the user name and email being excluded, so that the list itself doesn't contain the contributor details in plain text.

The general idea is that if you want to remove someone from AUTHORS and .mailmap you delete their details manually, and then run this tool with the --exclude option. It is probably a good idea to run it first without any arguments to make sure you don't exclude something or someone you did not intend to.

--stats

Show detailed stats about committers and the work they did in a tabular form. If the --numstat option is provided this report will provide additional data about the files a developer worked on. May be slow the first time it is used as git unpacks the relevant data.

--who

Show a list of which committers and authors contributed to the project in the selected range of commits. The list will contain names only, and will be sorted according to Unicode collation rules. This list is suitable for use in release notes and similar contexts.

--thanks-applied

Show a report of which committers applied work on behalf of someone else, including counts. Modified by the --as-list and --numbered options.

--rank

Show a report ranking authors by the number of commits they created. Modified by the --as-list and --numbered options.

--files

Show detailed stats about the files that have been modified in the selected range of commits. Implies --numstat. May be slow the first time it is used as git unpacks the relevant data.

--activity

Show simple stats about which files had the most additions. Implies --numstat. May be slow the first time it is used as git unpacks the relevant data.

--chainsaw

Show simple stats about which files had the most removals. Implies --numstat. May be slow the first time it is used as git unpacks the relevant data.

--percentage

Show numeric data as percentages of the total, not counts.

--cumulative

Show numeric data as cumulative counts in the reports.

--reverse

Show the reports in reverse order to normal.

--numstat

Gather additional data about the files that were changed, not just the authors who did the changes. This option currently is only necessary for the --stats option, which will display additional data when this option is also provided.

--as-list

Show the reports with name data rolled up together into a list like the older checkAUTHORS.pl script would have.

--numbered

Show an additional column with the rank number of a row in the report in reports that do not normally show the rank number.

--change-name OLD_NAME=NEW_NAME
--change-name-for-email OLD_EMAIL=NEW_NAME
--change-email OLD_EMAIL=NEW_EMAIL
--change-email-for-name OLD_NAME=NEW_EMAIL

Change email or name based on OLD_NAME or OLD_EMAIL.

For example:

    --change-name-for-email somebody@gmail.com="Bob Rob"

would cause the preferred name for the person with the preferred email somebody@gmail.com to change to "Bob Rob" in our records. If that person's name was "Daniel Dude" then we might have done this as well:

    --change-name "Daniel Dude"="Bob Rob"

DESCRIPTION

This program will automatically manage updates to the AUTHORS file and .mailmap file based on the data in our commits and the data in the files themselves. It uses no other sources of data. It expects to be run from the root directory of a git repo of perl.

Simply put, execute the script and it will either die with a helpful message or update the files as necessary, possibly not at all if there is no need to do so. If the --validate option is provided then the content will not be updated and instead the tool will act as a test script validating that the AUTHORS and .mailmap files are up to date.

By default the script operates on the *entire* history of Perl development that is reachable from HEAD. This can be overridden by using the --from and --to options, or by providing a git commit range as an argument after the options, just like you might do with git log.

The script can also be used to produce various reports and other content about the commits it has analyzed.

ADDING A NEW CONTRIBUTOR

Commit your changes. Run the tool with no arguments. It will add anything that is missing. Check the changes and then commit them.
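The steps above can be sketched as a small shell wrapper. This is an illustration under stated assumptions, not part of the tool: TOOL is a hypothetical override variable and refresh_authors a hypothetical function name. The only behaviour taken from this document is that the tool dies (and so exits non-zero) when something is wrong.

```shell
# TOOL is a hypothetical override (not an option of the script itself);
# it defaults to the invocation used in RECIPES.
TOOL=${TOOL:-perl Porting/updateAUTHORS.pl}

# Run the tool, then list which of the files it manages were touched.
refresh_authors() {
    # $TOOL is deliberately unquoted so "perl Porting/..." splits into words.
    $TOOL "$@" || return 1
    git status --short -- AUTHORS .mailmap Porting/exclude_contrib.txt
}

# Typical use, from the root of a perl checkout, after committing your work:
#   refresh_authors
#   git diff -- AUTHORS .mailmap
#   git commit AUTHORS .mailmap -m "Update AUTHORS and .mailmap"
```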

CHANGING A CONTRIBUTOR'S CANONICAL NAME OR EMAIL

Use the --change-name option or one of its relatives (--change-name-for-email, --change-email, --change-email-for-name). This will do things "properly" and update all the files.

A CONTRIBUTOR WANTS TO BE FORGOTTEN

There are several ways to do this:

Manual Exclusion

Manually modify AUTHORS and .mailmap so the user details are removed and then run this tool with the --exclude option. This should result in various SHA-256 digests (in base64) being added to Porting/exclude_contrib.txt. Commit the changes afterwards.

Exclude Yourself

Use the --exclude-me option to the tool, review and commit the results. This will use roughly the same rules that git would to figure out what your name and email are.
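Before running --exclude-me it can help to see which identity git itself would use. The sketch below asks git directly; since the tool only follows "roughly" the same rules, treat the result as a preview rather than a guarantee of what will be excluded. git_identity is a hypothetical helper name.

```shell
# Print "Name <email>" as git would record it for a new commit.
# `git var GIT_AUTHOR_IDENT` also appends a timestamp and timezone,
# which the sed expression strips off.
git_identity() {
    git var GIT_AUTHOR_IDENT | sed 's/ [0-9][0-9]* [-+][0-9][0-9]*$//'
}

# Intended use:
#   git_identity
#   perl Porting/updateAUTHORS.pl --exclude-me
```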

Exclude Someone Else

Use the --exclude-contrib option and specify their name and email. For example:

 --exclude-contrib="Their Name <email@provider.com>"

This should exclude the person with this name and email from our files.

Note that excluding a person by canonical details (that is, the details in the AUTHORS file) will result in their .mailmap'ed names being excluded as well. Excluding a person's secondary account details will simply block that specific email from being listed, and is likely not what you want to do most of the time.

AFTER RUNNING THE TOOL

Review the changes to make sure they are sane. If they are ok (and they should be most of the time) commit. If they are not then update the AUTHORS or .mailmap files as is appropriate and run the tool again.

Do not panic if your email details get added to .mailmap. This is by design, so that your chosen name and email are displayed on GitHub and in casual use of git log and other git tooling.

RECIPES

  perl Porting/updateAUTHORS.pl --who --from=v5.31.6 --to=v5.31.7
  perl Porting/updateAUTHORS.pl --who v5.31.6..v5.31.7
  perl Porting/updateAUTHORS.pl --rank --percentage --from=v5.31.6
  perl Porting/updateAUTHORS.pl --thanks-applied --from=v5.31.6
  perl Porting/updateAUTHORS.pl --tap --from=v5.31.6
  perl Porting/updateAUTHORS.pl --files --from=v5.31.6
  perl Porting/updateAUTHORS.pl --activity --from=v5.31.6
  perl Porting/updateAUTHORS.pl --chainsaw v5.31.6..HEAD
  perl Porting/updateAUTHORS.pl --change-name "Old Name"="New Name"
  perl Porting/updateAUTHORS.pl --change-name-for-email "x@y.com"="Name"
  perl Porting/updateAUTHORS.pl --change-email-for-name "Name"="p@q.com"

RELATED FILES

AUTHORS, .mailmap, Porting/exclude_contrib.txt

TODO

More documentation and testing.

AUTHOR

Yves Orton <demerphq@gmail.com>

THANKS

Loosely based on the older Porting/checkAUTHORS.pl script which this tool replaced. Thanks to the contributors of that tool. See the Perl change log.