
NAME

unpack-log-json-gz - Correctly unpack a .log.json.tar.gz file created by Test::Against::Dev

SYNOPSIS

    perl scripts/unpack-log-json-gz \
        --topdir=/home/username/tmp/scratch \
        --gzfile=/home/username/var/tad/results/perl-5.29.0/storage/cpan-river-3000.perl-5.29.0.log.json.tar.gz

DESCRIPTION

When you use Test::Against::Dev or Test::Against::Commit methods to assess the impact of changes in the Perl 5 core distribution against selected subsets of CPAN modules, your program uses cpanm to attempt to install those modules. The cpanm build.log file is then parsed to create one .log.json file in the .../results/<perl_version>/analysis directory for each distribution attempted. Those files are tarred up and gzip-compressed, and the resulting tarball is deposited in the corresponding .../results/<perl_version>/storage directory.

Unpacking that tarball can be a bit tricky because certain characters in the build.log may result in malformed JSON. This program unpacks the tarball and, along the way, builds a list of the files with malformed JSON.
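
The unpack-and-check idea can be sketched as follows. This is a minimal illustration and not the script's actual code. It assumes the core JSON::PP module in place of JSON so that it runs without CPAN installs, it builds its own two-file tarball with made-up file names, and it works in temporary directories rather than a real topdir.

```perl
use strict;
use warnings;
use Archive::Tar;
use Cwd qw( getcwd );
use File::Spec;
use File::Temp qw( tempdir );
use JSON::PP;

# Build a tiny stand-in tarball (hypothetical file names): one valid JSON
# file and one containing a byte (0xFF) that can never occur in UTF-8.
my $workdir = tempdir( CLEANUP => 1 );
my $gzfile  = File::Spec->catfile( $workdir, 'sample.log.json.tar.gz' );
my $writer  = Archive::Tar->new;
$writer->add_data( 'analysis/GOOD.Dist-1.00.log.json', '{"grade":"PASS"}' );
$writer->add_data( 'analysis/BAD.Dist-0.01.log.json',  qq({"grade":"\xFF"}) );
$writer->write( $gzfile, COMPRESS_GZIP ) or die $writer->error;

# Unpack beneath a scratch directory standing in for --topdir.
my $topdir = tempdir( CLEANUP => 1 );
my $origin = getcwd();
chdir $topdir or die "Cannot chdir to $topdir: $!";
my $reader    = Archive::Tar->new($gzfile) or die Archive::Tar->error;
my @extracted = $reader->extract;
die "Nothing extracted: " . $reader->error unless @extracted;

# Decode each extracted file, recording the exception text for any failure.
my ( %good, %bad );
for my $name ( sort $reader->list_files ) {
    next unless $name =~ /\.json\z/;
    open my $fh, '<:raw', $name or die "Cannot open $name: $!";
    my $bytes = do { local $/; <$fh> };
    close $fh;
    if ( eval { JSON::PP->new->utf8->decode($bytes); 1 } ) {
        $good{$name} = 1;
    }
    else {
        chomp( my $err = $@ );
        $bad{$name} = $err;
    }
}
chdir $origin or die "Cannot chdir back to $origin: $!";

printf "Extracted %d files\n", scalar @extracted;
printf "%d problematic .json files:\n", scalar keys %bad;
print "  $_ => $bad{$_}\n" for sort keys %bad;
```

Wrapping the decode call in eval lets one bad file be recorded without stopping the run over the remaining files.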

USAGE

Command-Line Invocation

The program takes 3 command-line options, 2 of which are mandatory.

    perl scripts/unpack-log-json-gz \
        --topdir=/home/username/tmp/scratch \
        --gzfile=/home/username/var/tad/results/perl-5.29.0/storage/cpan-river-3000.perl-5.29.0.log.json.tar.gz \
        --verbose 1>output 2>&1

  • topdir

    String holding path to directory underneath which the tarball will be unpacked. Required.

    The files themselves are extracted into an analysis directory immediately beneath the directory provided to topdir.

  • gzfile

    String holding path to tarball of .log.json files. Required.

  • verbose

    Prints extra information to STDOUT. Optional. If you use it, you will probably want to redirect the output to a file for later review.
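
For illustration, the three options above could be handled with Getopt::Long roughly as follows. This is a sketch under stated assumptions, not the script's own code: the parse_options helper and its error messages are hypothetical, and the argument list is passed in directly instead of being read from @ARGV.

```perl
use strict;
use warnings;
use Carp;
use Getopt::Long qw( GetOptionsFromArray );

# Hypothetical helper: parse the three documented options from a list.
sub parse_options {
    my @args = @_;
    my ( $topdir, $gzfile, $verbose ) = ( '', '', 0 );
    GetOptionsFromArray(
        \@args,
        'topdir=s' => \$topdir,     # required string
        'gzfile=s' => \$gzfile,     # required string
        'verbose'  => \$verbose,    # optional flag
    ) or croak "Error in command-line arguments";
    croak "--topdir is required" unless length $topdir;
    croak "--gzfile is required" unless length $gzfile;
    return { topdir => $topdir, gzfile => $gzfile, verbose => $verbose };
}

my $opts = parse_options(
    '--topdir=/home/username/tmp/scratch',
    '--gzfile=/home/username/var/tad/results/perl-5.29.0/storage/cpan-river-3000.perl-5.29.0.log.json.tar.gz',
    '--verbose',
);
printf "Unpacking %s beneath %s (verbose: %d)\n",
    @{$opts}{qw( gzfile topdir verbose )};
```

The sketch only checks that the two required options were supplied. Whether the real program also verifies that the paths exist is not stated here.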

Results

The tarball will be unpacked as described above.

Any file whose JSON raised an exception during decoding is recorded in an internal data structure, whose contents are dumped to STDOUT at the conclusion of the program.

    2 problematic .json files:
    {
      "analysis/AUTHORA.HTML-Widget-1.11.log.json"   => "malformed UTF-8 character in JSON string, at character offset 8516 (before \"\\x{fffd} /><input cl...\")",
      "analysis/AUTHORB.IO-Util-1.5.log.json"        => "malformed UTF-8 character in JSON string, at character offset 5544 (before \"\\x{fffd}'. Assuming ...\")",
    }

If verbose is requested on the command line, you will in addition get:

  • The total number of files extracted.

        Extracted 2961 files

  • A list -- probably a very large one -- of the files whose JSON was satisfactory.

        2956 good .json files:
        {
          "analysis/AAR.Net-LDAP-Server-0.43.log.json"  => 1,
          "analysis/ABELTJE.V-0.13.log.json"            => 1,
          ...
        }

  • An indication that the program has concluded successfully.

        Finished!

PREREQUISITES

Perl 5 Core Distribution

    Archive::Tar
    Carp
    Cwd
    File::Spec
    Getopt::Long

CPAN-only Distributions

    Data::Dump
    JSON
    Path::Tiny

NOTES

As this is a helper script, not library code, the author reserves the right to change the interface and functionality of the program at any time. The author has used this program satisfactorily but is not providing a test suite for it.

While this program was designed to meet the needs of users of the Test-Against-Dev CPAN distribution, it could probably be used to unpack any gzipped tarball of files in JSON format. YMMV.

Same copyright, licensing, etc., as other parts of the Test-Against-Dev CPAN distribution.