Eric Pozharski

OVERVIEW

Regexp::Common::debian is a collection of REs for various strings found in the Debian Project <http://debian.org>. It is in no way intended to be a validation tool.

R::C::d needs perl v5.10.0 or later because:

  • 4 patterns (29%) rely on features of recent perls (one doesn't work at all with anything older);

  • It's time to move on: v5.10.0 is six years old and has been unsupported for three years;

  • lenny is four years old and has been unsupported for a year.

INSTALL

R::C::d builds with Module::Build.

    $ perl Build.PL
    $ perl Build
    $ perl Build test
    $ perl Build install

Since this is all about strings, we need lots of strings to test against (Test::More, unspecified version). To access them easily (it's all about reuse, not implemented yet) I need appropriate storage. As it happens, that's YAML::Tiny (unspecified version).

v0.2.1 Reading cpantesters reports, I've come to the conclusion that YAML::Tiny isn't popular. (v0.2.13 Wandering through the errors of v0.2.12, I should say it totally is.) And not installing build requirements (or being unable to install them; there could be reasons) isn't that uncommon. Still, I strongly believe that some YAML reader happens to be installed anyway. And I still can't find a way to make %build_requires ask for any one of the YAML readers known to me rather than all of them. So here is a dirty trick. t::TestSuite attempts to require() one of the known (to me, see below) YAML readers. Then (upon the initial perl Build.PL) t::TestSuite is asked what it has found (if nothing, a cosmetic Compilation failed in require message will be seen). Whatever has been found is added to %build_requires; if nothing was found, YAML::Tiny is added. (I think that's fair because YAML::Tiny is pure-Perl, has a small footprint, and has no dependencies.)

(note) I'm talking about "YAML readers known to me" because I've found out that different YAML readers treat the source differently. So I attempt to keep the t/*.yaml files semantically equal and syntactically correct. Hopefully there are no differences among versions in the wild.
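The probing trick described above can be sketched roughly as follows. This is a minimal illustration, not the actual t::TestSuite code: the helper name find_yaml_reader and the order of the candidates are assumptions made for the example.

```perl
use 5.010;
use strict;
use warnings;

# Candidate readers; the order is an assumption of this sketch.
my @candidates = qw(YAML::Syck YAML::XS YAML::Tiny YAML);

# Hypothetical helper: return the first module that can be loaded,
# or nothing (undef in scalar context) when none of them is installed.
sub find_yaml_reader {
    for my $module (@_) {
        (my $file = "$module.pm") =~ s{::}{/}g;
        return $module if eval { require $file; 1 };
    }
    return;
}

my $found = find_yaml_reader(@candidates);

# Whatever was found becomes the build requirement; YAML::Tiny otherwise.
my %build_requires = (($found // 'YAML::Tiny') => 0);
say "build_requires: $_" for sort keys %build_requires;
```

Wrapping require() in eval keeps a missing module from aborting the probe, which is why only one requirement ends up in the hash.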

v0.2.13 (Actually, this feature has been here for years.) Any supported YAML reader can be forced with the $ENV{RCD_YAML_ENGINE} magic (regardless of any build-time choice):

    RCD_YAML_ENGINE=syck ./Build test

Readers are selected by nickname. Here they are:

  • syck -- YAML::Syck.

  • xs -- YAML::XS.

  • tiny -- YAML::Tiny.

  • old -- YAML.

  • data -- Data::YAML::Reader. Fails with 'does not support multi-line quoted scalars' and 'YAML document header not found'; unsupported, so far.
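As a rough illustration of how the nicks above might map to modules, here is a minimal sketch. The helper pick_engine is hypothetical, and dying on an unknown nick is an assumption; the test suite's real behaviour may differ.

```perl
use strict;
use warnings;

# Nick-to-module table, taken from the list above ("data" is unsupported).
my %module_for = (
    syck => 'YAML::Syck',
    xs   => 'YAML::XS',
    tiny => 'YAML::Tiny',
    old  => 'YAML',
);

# Hypothetical helper: an explicit nick wins over the build-time choice.
sub pick_engine {
    my ($nick, $build_time_choice) = @_;
    return $build_time_choice unless defined $nick && length $nick;
    # Rejecting unknown nicks is an assumption of this sketch.
    die "unknown YAML engine nick '$nick'\n" unless exists $module_for{$nick};
    return $module_for{$nick};
}

print pick_engine('syck', 'YAML::Tiny'), "\n";   # explicit nick given
print pick_engine(undef,  'YAML::Tiny'), "\n";   # no override in effect
```

In real use the first argument would come from $ENV{RCD_YAML_ENGINE}.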

v0.2.2 All test-units except t/preferences.t and t/sourceslist.t understand the magic variable $ENV{RCD_ASK_DEBIAN}. Use it this way (enabling all possible external inquiries):

    RCD_ASK_DEBIAN=all ./Build test

or this way (separate keys with any non-word character):

    RCD_ASK_DEBIAN=binary,architecture ./Build test

When it's set, a test-unit asks Debian's commands or inspects Debian-specific files for the information it is interested in. For obvious reasons that magic will fail on a non-Debian system, so don't use it there. Used correctly, though, it can warn of strange (not known before) compatibility problems. Details:

architecture of t/architecture.t

This asks dpkg-architecture -L for the list of known architectures (per Section 11.1 of debian-policy). That won't find dropped architectures (has that ever happened?), but omissions won't go unnoticed anymore.

binary of t/archive.misc.t

v0.2.3 Inspects all records in /var/lib/apt/lists/*_Packages, extracts Filename: entries and matches all of them against m/^$RE{debian}{archive}{binary}$/. All failures (if any) are reported at the end.

changelog of t/changelog.t

v0.2.8 This inspects /usr/share/doc/*/changelog.Debian files. A complete scan takes loads of time (really). You should understand that it's not enough to just run through the changelogs: it has to be verified that no entry is skipped. The only reliable (for the sake of interface and, trivially, presence) source of verification is dpkg-parsechangelog. And here's the fork-mare: perl forks a shell, then perl, then perl again. There seems to be a fork of tail too. And that's for each entry. (Not counting gunzip to decompress the changelog.) loadavg climbs to 1.50..2.00. You get the picture. Although that's where the choices begin.

v0.2.12 It happens that urgency=high, probably when it really is that high, is written in capitals (like this: urgency=HIGH). $RE{d}{changelog} preserves case, while dpkg-parsechangelog(1) strikes back and lowercases it. From now on such differences won't fail an entry.

changelog

v0.2.9 That defaults to changelog=5. See below.

changelog=package

Only one changelog is checked: the one whose package name is equal (eq) to the given one. The package name is taken from the directory name.

changelog=a

Only those changelogs whose package name matches m/^a/ are checked.

changelog=5

v0.2.9 That will check all changelogs, although it will look no more than the requested number of entries deep.

     v0.2.9 ~15min ~1.2K changelogs
    v0.2.12 ~30min ~1.3K changelogs ~6.0K subchecks
    v0.2.13 ~35min ~1.3K changelogs ~6.1K subchecks

And that makes perfect sense. Did you know that cron once changed its name to Cron (beware the leading capital) (cron_3.0pl1-46)? C'mon, that happened 12 (twelve) years ago! (And you know what? That default is pretty fair (liblog-log4perl-perl_1.16-1). Probably it should look at time passed rather than entry number.)

changelog=-5

v0.2.9 That's different. It checks as many entries as possible (there are changelogs where $RE{d}{changelog} finds more entries than dpkg-parsechangelog does (dpkg_1.2.13 vs dpkg_0.93.79)), but if the offending record is more than that far from the top, it's reported and otherwise ignored.

     v0.2.9   ~3h ~1.2K changelogs ~45K subchecks
    v0.2.12 ~5¼h ~1.3K changelogs ~63K subchecks
    v0.2.13 ~5½h ~1.3K changelogs ~59K subchecks

changelog=_5

v0.2.12 That's a mix of changelog=5 and changelog=-5 (thanks to irda-utils_0.9.18-8.1 and mime-support_3.49-1). It goes no more than the configured number of entries deep and ignores (and reports) any errors.

changelog=0

(bug) v0.2.9 That will check all changelogs, check all possible entries, and BAIL_OUT on the first failure. In short: don't. You're warned. (Although, do it. t/changelog.t will give up pretty soon.)

To slightly sweeten all that, t/changelog.t attempts to filter duplicates. And it BAIL_OUTs upon the first failure.
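The numeric forms of the changelog key can be told apart with a few anchored matches. The sketch below is only an illustration: the helper changelog_mode and the mode names are invented for the example, and it doesn't try to distinguish changelog=package from changelog=a, since how t/changelog.t does that isn't described here.

```perl
use 5.010;
use strict;
use warnings;

# Hypothetical helper: classify the value that follows "changelog=".
# The mode names are made up for this sketch.
sub changelog_mode {
    my ($arg) = @_;
    $arg = '5' unless defined $arg && length $arg;   # bare "changelog"

    return +{ mode => 'bail_out' }                      if $arg eq '0';
    return +{ mode => 'limit',          depth => $1 }   if $arg =~ /^(\d+)$/;
    return +{ mode => 'tolerate_deep',  depth => $1 }   if $arg =~ /^-(\d+)$/;
    return +{ mode => 'limit_tolerant', depth => $1 }   if $arg =~ /^_(\d+)$/;
    return +{ mode => 'by_name',        name  => $arg };   # package or prefix
}

for my $arg (undef, '5', '-5', '_5', '0', 'perl') {
    my $m = changelog_mode($arg);
    printf "%-6s => %s\n", $arg // '(none)',
        join ' ', map { "$_=$m->{$_}" } sort keys %$m;
}
```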

package of t/package.t

v0.2.10 Nothing special. Output of dpkg-query -f '${Package}\n' -W is matched against m/^$RE{debian}{package}$/. Probably should parse *_Packages.

source of t/archive.source.t

v0.2.3 Inspects all records in /var/lib/apt/lists/*_Sources, extracts Files: entries, then collects trailing filenames. They are matched against m/^$RE{debian}{archive}{source_1_0}$/, m/^$RE{debian}{archive}{patch_1_0}$/, m/^$RE{debian}{archive}{source_3_0_native}$/, m/^$RE{debian}{archive}{source_3_0_quilt}$/, m/^$RE{debian}{archive}{patch_3_0_quilt}$/, and m/^$RE{debian}{archive}{dsc}$/ (in fact ||). If none matches, the filename is reported at the end. m/$RE{debian}{archive}{changes}/ is missing here because there is no source of such files on a system that doesn't build packages.
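The "collect trailing filenames, then match with ||" step can be sketched as below. This is not the code of t/archive.source.t: the helpers trailing_filenames and unmatched, the sample stanza, and the two stand-in patterns are all made up for illustration. With the module installed, the stand-ins would be replaced by the $RE{debian}{archive}{...} patterns listed above.

```perl
use strict;
use warnings;

# Collect the trailing filename of every line under the Files: field.
sub trailing_filenames {
    my ($stanza) = @_;
    my (@names, $in_files);
    for my $line (split /\n/, $stanza) {
        if ($line =~ /^Files:/) { $in_files = 1; next }
        if ($line !~ /^\s/)     { $in_files = 0; next }   # another field starts
        push @names, $1 if $in_files && $line =~ /(\S+)\s*$/;
    }
    return @names;
}

# Return the names that match none of the anchored patterns (the "||" part).
sub unmatched {
    my ($names, @patterns) = @_;
    my @failed;
    NAME: for my $name (@$names) {
        for my $re (@patterns) {
            next NAME if $name =~ /^$re$/;
        }
        push @failed, $name;
    }
    return @failed;
}

# Made-up sample stanza and stand-in patterns, NOT the module's own REs.
my $stanza = <<'END';
Package: hello
Files:
 0123456789abcdef0123456789abcdef 1234 hello_2.10-1.dsc
 0123456789abcdef0123456789abcdef 5678 hello_2.10.orig.tar.gz
 0123456789abcdef0123456789abcdef 9012 hello-2.10.zip
Section: devel
END
my @stand_ins = (qr/[^_]+_[^_]+\.dsc/, qr/[^_]+_[^_]+\.orig\.tar\.gz/);

print "unmatched: $_\n" for unmatched([ trailing_filenames($stanza) ], @stand_ins);
```

Only hello-2.10.zip fails every stand-in pattern here, so it is the single name left for the end-of-run report.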

version of t/version.t

v0.2.10 Again nothing special. Output of dpkg-query -f '${Version}\n' -W is matched against m/^$RE{debian}{version}$/. Probably should parse *_Packages too.

If any test string fails I need to know what failed and how. To provide that info I've picked Test::Differences (maybe there's another option I'm not aware of?) (Now I am: Test::Deep.) v0.60 of T::D closes [38320@rt.cpan.org] and [41241@rt.cpan.org].

AVAILABILITY

Distribution -- http://search.cpan.org/dist/Regexp-Common-debian/

BUGS

Please report here -- http://rt.cpan.org/Public/Dist/Display.html?Name=Regexp-Common-debian

COPYRIGHT AND LICENSING
