NAME

File::Raw::Hash - Cryptographic and integrity digests as a File::Raw plugin

VERSION

Version 0.02

SYNOPSIS

Loading the module registers one plugin (hash) with File::Raw. It is a passthrough on the data path: bytes flow through unchanged, and the digest is delivered through a caller-supplied scalar or hash reference passed via the into option.

use File::Raw::Hash;
use File::Raw qw(import);

# Single algorithm, scalar destination.
my $bytes = file_slurp("input.bin",
    plugin => 'hash',
    algo   => 'sha256',
    into   => \my $digest,
);
# $digest is now the lowercase-hex SHA-256 of the file.

# Multiple algorithms in one pass; result lands in a hash.
my %digests;
file_slurp("input.bin",
    plugin => 'hash',
    algos  => [qw(sha256 md5 crc32)],
    into   => \%digests,
);

# Streaming: each_line digests the file in chunks; the digest
# arrives in $d after iteration completes. RAM stays bounded
# regardless of file size.
each_line("huge.log", sub { ... },
    plugin => 'hash',
    algo   => 'sha256',
    into   => \my $d,
);

# Same plugin, write side. The bytes spewed are the original
# payload; the digest of that payload lands in $d.
file_spew("output.bin", $payload,
    plugin => 'hash',
    algo   => 'sha256',
    into   => \my $d,
);

OPTIONS

algo

Single algorithm name. One of sha256 (default), sha512, sha1, md5, crc32, xxh64, blake3. Names are case-insensitive and tolerate dashes / underscores: SHA-256, sha_256, SHA256 all resolve to the same algorithm. Mutually exclusive with algos.
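
The name matching described above can be pictured with a small sketch. The helper normalize_algo and its error message are hypothetical and are not part of File::Raw::Hash; the sketch only assumes that normalisation amounts to lowercasing and dropping dashes and underscores.

```perl
use strict;
use warnings;

# Algorithm names listed in this document.
my %KNOWN = map { $_ => 1 } qw(sha256 sha512 sha1 md5 crc32 xxh64 blake3);

# Hypothetical helper (an assumption, not the module's API): reduce a
# user-supplied name to its canonical lowercase form.
sub normalize_algo {
    my ($name) = @_;
    $name = 'sha256' unless defined $name;    # documented default
    (my $canon = lc $name) =~ s/[-_]//g;      # SHA-256, sha_256 -> sha256
    die "unknown algorithm '$name'\n" unless $KNOWN{$canon};
    return $canon;
}

print normalize_algo($_), "\n" for qw(SHA-256 sha_256 SHA256);
```

All three spellings print sha256. How the real plugin reports an unknown name is not covered here.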

algos

Arrayref of algorithm names; one pass, all digests computed in lockstep. The result hash is keyed by canonical lowercase name (sha256, not SHA-256). Mutually exclusive with algo.
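
As an illustration of what "one pass, in lockstep" means, the sketch below feeds each chunk of input to several digest contexts before reading the next chunk. It uses the core Digest::SHA and Digest::MD5 modules rather than the plugin's vendored code, and digest_all is a hypothetical helper, not part of File::Raw::Hash.

```perl
use strict;
use warnings;
use Digest::SHA ();
use Digest::MD5 ();

# Hypothetical helper (an assumption for illustration): every chunk is
# handed to every digest context, so the input is read exactly once.
sub digest_all {
    my ($fh, $into, $chunk_size) = @_;
    $chunk_size ||= 64 * 1024;
    my %ctx = (
        sha256 => Digest::SHA->new(256),
        md5    => Digest::MD5->new,
    );
    binmode $fh;
    while (read($fh, my $buf, $chunk_size)) {
        $_->add($buf) for values %ctx;
    }
    # One entry per algorithm, keyed by canonical lowercase name.
    # Other keys already present in %$into are left alone.
    $into->{$_} = $ctx{$_}->hexdigest for keys %ctx;
    return $into;
}

my $data = "line one\nline two\n" x 1000;
open my $fh, '<', \$data or die "cannot open in-memory handle: $!";
my %digests = (note => 'kept');
digest_all($fh, \%digests, 4096);
close $fh;

print "$_ => $digests{$_}\n" for sort keys %digests;
```

Memory use is bounded by the chunk size, not by the input size, and the pre-existing note key survives alongside the two digest entries.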

into

Required. Where the digest goes.

  • For single-algo mode (algo or default): a scalar reference. The referent is overwritten with the formatted digest.

  • For multi-algo mode (algos): a hash reference. Existing keys are left alone; one entry is stored per requested algo.

format

Output format. One of:

  • hex (default) - lowercase hexadecimal.

  • HEX - uppercase hexadecimal. The format name is case-sensitive: the uppercase spelling is what selects uppercase output.

  • base64 - RFC 4648 section 4 base64, padded with =.

  • base64url - RFC 4648 section 5 URL-safe base64, no padding.

  • raw - raw binary digest bytes.
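
To make the difference between the formats concrete, here is a minimal sketch that renders one raw SHA-256 digest in each of them using the core Digest::SHA and MIME::Base64 modules. format_digest is a hypothetical helper written for this illustration; it is not the plugin's internal code.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256);
use MIME::Base64 qw(encode_base64 encode_base64url);

# Hypothetical helper (an assumption): render raw digest bytes in one
# of the format names listed above.
sub format_digest {
    my ($raw, $format) = @_;
    $format = 'hex' unless defined $format;
    return unpack('H*', $raw)       if $format eq 'hex';        # lowercase hex
    return uc(unpack('H*', $raw))   if $format eq 'HEX';        # uppercase hex
    return encode_base64($raw, '')  if $format eq 'base64';     # padded, no newline
    return encode_base64url($raw)   if $format eq 'base64url';  # URL-safe, unpadded
    return $raw                     if $format eq 'raw';
    die "unknown format '$format'\n";
}

my $raw = sha256('abc');    # 32 raw bytes
printf "%-9s %s\n", $_, format_digest($raw, $_)
    for qw(hex HEX base64 base64url);
```

For a 32-byte digest the hex forms are 64 characters long, base64 is 44 characters including one = of padding, and base64url is 43 characters.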

hmac_key

If set, switches the algorithm to RFC 2104 HMAC mode. Available for sha256, sha512, sha1, and md5; rejected for crc32, xxh64, and blake3. The key may be any byte string, including binary or empty; keys longer than the algorithm's block size are hashed down per the spec.

my $mac;
file_slurp("payload.bin",
    plugin   => 'hash',
    algo     => 'sha256',
    hmac_key => $secret,
    into     => \$mac,
);
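
The long-key rule can be checked independently with the core Digest::SHA module. This is a sketch for cross-checking only; it does not call File::Raw::Hash, and it assumes nothing beyond RFC 2104 behaviour.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256 hmac_sha256_hex);

my $data     = "payload bytes";
my $long_key = 'k' x 100;    # longer than SHA-256's 64-byte block size

# Digest::SHA takes the data first and the key last.
my $mac_long   = hmac_sha256_hex($data, $long_key);

# RFC 2104: an over-long key is replaced by its hash before use, so
# keying with sha256($long_key) directly must give the same MAC.
my $mac_hashed = hmac_sha256_hex($data, sha256($long_key));

print "long key:   $mac_long\n";
print "hashed key: $mac_hashed\n";
print "identical\n" if $mac_long eq $mac_hashed;
```

A MAC produced through the plugin with format hex should be comparable to the Digest::SHA result for the same key and data, provided both follow RFC 2104 as stated.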

xxh64_seed

Seed for the xxh64 algorithm. Default 0. Ignored by every other algorithm.

PHASES

read, write, stream, and record are all implemented.

The plugin is a passthrough on read and write: the byte stream is returned unchanged. record_fn behaves the same way: the original record is returned and its digest is appended to the caller-supplied arrayref. Record-phase output always goes into an arrayref (one entry per record), in both single-algo and multi-algo mode:

my @per_line_digests;
# As of File::Raw 0.11 there is no public per-record iterator API
# yet; the helper below dispatches a single record at a time and
# is the same path future high-level entry points will use.
File::Raw::Hash::_test_record_one("the line",
    algo => 'sha256',
    into => \@per_line_digests);
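
The record-phase contract (record returned unchanged, one digest appended per record) can be sketched without the plugin. record_digest below is a hypothetical stand-in built on the core Digest::SHA module; it is not the module's record_fn.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);

# Hypothetical stand-in for the record phase (an assumption): append
# the record's digest to the caller's array and pass the record through.
sub record_digest {
    my ($record, $into) = @_;
    push @$into, sha256_hex($record);
    return $record;
}

my @per_line_digests;
my @records = map { record_digest($_, \@per_line_digests) }
    ("alpha", "beta", "gamma");

printf "%-5s %s\n", $records[$_], $per_line_digests[$_] for 0 .. $#records;
```

After the loop, @per_line_digests holds one entry per record, in record order.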

Because read and write are passthroughs, the plugin can sit anywhere in a chain, so the caller chooses which representation gets checksummed:

# Hash the wire bytes (the .gz file as it sits on disk):
my $payload = file_slurp("data.json.gz",
    plugin => ['hash', 'gzip', 'json'],
    hash   => { algo => 'sha256', into => \my $disk_digest },
);

# Hash the decompressed payload (after gunzip, before JSON parse):
my $payload = file_slurp("data.json.gz",
    plugin => ['gzip', 'hash', 'json'],
    hash   => { algo => 'sha256', into => \my $payload_digest },
);

(Plugin chains require File::Raw 0.10+.)
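
To see why the position in the chain matters, the sketch below computes both digests by hand with the core IO::Compress::Gzip, IO::Uncompress::Gunzip, and Digest::SHA modules. It does not use File::Raw plugins; the sample JSON string is made up for the illustration.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

my $json = '{"answer":42}';

# The "wire" representation: the bytes a .gz file would hold on disk.
gzip(\$json => \my $wire) or die "gzip failed: $GzipError";
my $disk_digest = sha256_hex($wire);

# The decompressed representation: what a JSON parser would receive.
gunzip(\$wire => \my $plain) or die "gunzip failed: $GunzipError";
my $payload_digest = sha256_hex($plain);

print "disk:    $disk_digest\n";
print "payload: $payload_digest\n";
```

The two digests differ because they cover different byte strings. The payload digest stays the same if the file is recompressed with other settings, while the disk digest generally does not.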

ALGORITHM CHOICE

  • sha256 - modern default. Cryptographically secure for new designs.

  • sha512 - same SHA-2 family as sha256. Often faster than sha256 on 64-bit hosts for large inputs, because it works on 64-bit words and 1024-bit blocks.

  • sha1 - kept for git/openssh/legacy interop. Cryptographically broken; do not use for new security designs.

  • md5 - kept for content fingerprinting and upstream interop (etags, checksum manifests). Cryptographically broken; do not use for security.

  • crc32 - IEEE 802.3 polynomial, the same CRC zlib/gzip/PNG/Ethernet use. Not a cryptographic hash function: it detects accidental corruption only. Use it for integrity checks / dedup, never for authenticity.

  • xxh64 - non-cryptographic 64-bit hash by Yann Collet. Very fast; useful for content fingerprinting / dedup. Optional 64-bit seed via the xxh64_seed option (default 0). Not for security.

  • blake3 - modern cryptographic hash (32-byte default output). Faster than SHA-2 family on most modern hardware. Sequential reference implementation (no SIMD) in v0.01; multi-threaded fan-out is a future enhancement.
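
One way to sanity-check a crc32 result is against zlib's own CRC, which uses the same IEEE 802.3 polynomial. The sketch below uses the core Compress::Zlib module and the conventional check string "123456789". Rendering the value as eight lowercase hex digits is an assumption made here for comparison; this document does not state how the plugin formats crc32 output.

```perl
use strict;
use warnings;
use Compress::Zlib;    # exports crc32() by default

# CRC-32 of the conventional check input, as computed by zlib.
my $crc = crc32("123456789");

# Assumption: eight lowercase hex digits, zero-padded.
my $hex = sprintf '%08x', $crc;
print "$hex\n";    # cbf43926
```

cbf43926 is the published check value for this CRC variant, so any implementation of the same polynomial should reproduce it for this input.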

IMPLEMENTATION

All algorithms are vendored, public-domain reference implementations. There is no external library dependency (no OpenSSL, no libsodium).

SEE ALSO

File::Raw, Digest::SHA, Digest::MD5, Digest::CRC.

AUTHOR

LNATION <email@lnation.org>

LICENSE AND COPYRIGHT

This software is Copyright (c) 2026 by LNATION <email@lnation.org>.

This is free software, licensed under:

The Artistic License 2.0 (GPL Compatible)