NAME

File::Info - Store file information persistently for fast lookup

SYNOPSIS

  use File::Info qw( $PACKAGE $VERSION );

  my $info = File::Info->new($dir);
  # $fn is "basename"; contains no directory portion
  my $hex  = $info->md5_hex($fn);  # Reads cached data if possible

DESCRIPTION

This package stores per-file information for speedy lookup later. It is intended for file information that takes significant time to determine (e.g., the MD5 sum of a large file), so that it need not be recalculated unnecessarily. This may be particularly helpful when searching across many files for some specific property.

File statistics are recalculated on demand. If a file's size or modification time has changed since the values were last calculated, the cached values are purged and recalculated.
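The invalidation rule above can be sketched as follows. This is a minimal illustration only, not the module's implementation: the in-memory %cache and the cached_lookup function are hypothetical names, and the real module stores its data persistently in a per-directory file.

```perl
use strict;
use warnings;

# Hypothetical in-memory cache: path => { size, mtime, value }.
my %cache;

# Return the cached value for $path, calling $compute->($path) again
# only when the file's size or modification time has changed.
sub cached_lookup {
    my ($path, $compute) = @_;

    my @st = stat $path
        or die "Cannot stat $path: $!";
    my ($size, $mtime) = @st[7, 9];

    my $entry = $cache{$path};
    if ($entry && $entry->{size} == $size && $entry->{mtime} == $mtime) {
        return $entry->{value};        # cached value is still fresh
    }

    # Stale or missing: discard the old entry and recalculate.
    my $value = $compute->($path);
    $cache{$path} = { size => $size, mtime => $mtime, value => $value };
    return $value;
}
```

Comparing both size and modification time is cheap (one stat call) compared with rereading a large file, which is the point of caching in the first place.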

File information is stored on a per-directory basis. Each file info file is stored in the directory containing the files it describes, and refers to those files by name, without paths.

CLASS CONSTANTS

TYPE_CONSTANTS

These are the values returned by the type method. The constants are exported on request, either individually or together with the ':types' tag.

TYPE_UNKNOWN

File type not identified

TYPE_JPEG

A 'JPEG' image file.

TYPE_PAR

A 'par' (parity archive) file.

CLASS COMPONENTS

CLASS HIGHER-LEVEL FUNCTIONS

CLASS HIGHER-LEVEL PROCEDURES

add_global_lookup

Add a lookup function to the class, making it available to every instance. A method with the same name will be created, to provide the cached lookup.

ARGUMENTS
name

The name may consist only of letters, digits, and underscore characters. The first character must be a letter, and at least one digit or lower-case letter must be present.

Built-in names will always be lower-case. If you stick to this, then you will need to make no change if your identifier should get absorbed into the core. On the other hand, if you use some upper-case letters (e.g., StudlyCaps), then you are assured that you will never clash with internal names.

These other names are reserved:

  add_local_lookup add_global_lookup isa import new dirname
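The naming rules above can be expressed as a small check. This is a sketch of the rules as documented here, not the module's own validation code; is_valid_lookup_name and %RESERVED are hypothetical names introduced for illustration.

```perl
use strict;
use warnings;

# Names the documentation lists as reserved.
my %RESERVED = map { $_ => 1 }
    qw( add_local_lookup add_global_lookup isa import new dirname );

# Return 1 if $name satisfies the documented rules, 0 otherwise.
sub is_valid_lookup_name {
    my ($name) = @_;
    return 0 unless defined $name;
    # Letters, digits and underscores only; must start with a letter.
    return 0 unless $name =~ /\A[A-Za-z][A-Za-z0-9_]*\z/;
    # At least one digit or lower-case letter must be present.
    return 0 unless $name =~ /[a-z0-9]/;
    return 0 if $RESERVED{$name};
    return 1;
}
```

Under these rules, names such as sha1_hex and StudlyCaps are accepted, while ALLCAPS, _private, 9lives and new are rejected.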
code

The code to call to calculate the value. The code will be passed the absolute name of the file to look up, and is expected to return a suitable value. The value will be cached.

INSTANCE CONSTRUCTION

new

Create and return a new File::Info instance.

ARGUMENTS
_dirname

Name of the directory that the instance represents.

INSTANCE COMPONENTS

INSTANCE HIGHER-LEVEL FUNCTIONS

dirname

The name of the directory to which this instance refers.

STANDARD LOOKUPS

Each of the following functions takes a filename (without path, relative to the directory of the instance), and returns the relevant value for the file.

Alternatively, they may be called as class methods, in which case the filename value must be absolute. This mode will never invoke a local method (see add_local_lookup), and is less efficient if multiple lookups are made on files in the same directory.

md5_hex

The MD5 signature of the file, as 16 pairs of hex characters. The Digest::MD5 module (version 2 or above) is required to be present.

md5

The MD5 signature of the file, as a 16-byte binary value. The Digest::MD5 module (version 2 or above) is required to be present.

md5_16khex

The MD5 signature of the first 16k of the file, as 16 pairs of hex characters. The Digest::MD5 module (version 2 or above) is required to be present.

md5_16k

The MD5 signature of the first 16k of the file, as a 16-byte binary value. The Digest::MD5 module (version 2 or above) is required to be present.
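As an illustration of what a partial-file signature involves, here is a minimal sketch using Digest::MD5 directly. It assumes "16k" means 16384 bytes, which the text above does not state explicitly; md5_first_16k and md5_first_16k_hex are hypothetical helper names, not part of this module.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5);

# Assumption: "16k" is taken to mean 16384 bytes.
my $LIMIT = 16384;

# Return the 16-byte binary MD5 of the first $LIMIT bytes of $path
# (or of the whole file, if it is shorter than that).
sub md5_first_16k {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;

    my $buf = '';
    # read() may return fewer bytes than requested, so loop until
    # the limit is reached or end of file is seen.
    while (length($buf) < $LIMIT) {
        my $n = read $fh, $buf, $LIMIT - length($buf), length($buf);
        die "Read error on $path: $!" unless defined $n;
        last if $n == 0;
    }
    close $fh;
    return md5($buf);
}

# The same signature as 16 pairs of hex characters.
sub md5_first_16k_hex {
    my ($path) = @_;
    return unpack 'H*', md5_first_16k($path);
}
```

Hashing only a fixed-size prefix gives a quick fingerprint for large files, at the cost of not noticing differences beyond that prefix.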

line_count

The number of lines in the file. More accurately, the number of "\n" characters in the file (as for wc). No attempt is made to guess the line terminator of the running system, since that would lead to inconsistent results for the same file on (say) a Samba-mounted drive accessed from both Windoze and UN*X.
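A platform-independent count of this kind can be sketched as below. This is an illustration of the counting rule described above, not the module's code; count_newlines is a hypothetical name. The file is read in binary mode so that no CRLF translation takes place on any system.

```perl
use strict;
use warnings;

# Count "\n" bytes in $path, reading raw bytes in fixed-size chunks.
sub count_newlines {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;    # raw bytes: identical results on every platform

    my $count = 0;
    my $buf;
    while (1) {
        my $n = read $fh, $buf, 65536;
        die "Read error on $path: $!" unless defined $n;
        last if $n == 0;
        $count += ($buf =~ tr/\n//);    # tr counts without modifying
    }
    close $fh;
    return $count;
}
```

As with wc, a final line that lacks a trailing "\n" is not counted.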

type

The file type, as determined by reading the file itself. This is similar in intent to the file command under UN*X, with the following distinctions:

  • The means of identification is consistent across all systems, rather than relying on a system-specific magic file

  • The type is returned as a constant (which happens to be a simple string), rather than having to parse the output of file

  • This method only returns the basic type, not any details about versions, bitrates, sizes, etc. This is a feature. Other details may be queried elsewhere with the same module.

  • The file database is considerably smaller. Of course, if you submit some additions, it will grow 8*).

The returned value is a TYPE_x constant.

par_set_hash

Behaviour is defined only for files whose type is TYPE_PAR.

This is the hash used to identify par files that belong to a single set. It is a 16-byte binary value.

par_set_hash_hex

Behaviour is defined only for files whose type is TYPE_PAR.

As for par_set_hash, but as 16 pairs of hex characters representing the 16 bytes.
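The relationship between the binary and hex forms (here and for the MD5 lookups) is the usual one provided by Perl's pack and unpack. The helper names below are hypothetical, for illustration only.

```perl
use strict;
use warnings;

# Convert a binary string to lower-case hex, two characters per byte.
sub bin_to_hex {
    my ($bin) = @_;
    return unpack 'H*', $bin;
}

# Convert a hex string back to the binary string it represents.
sub hex_to_bin {
    my ($hex) = @_;
    return pack 'H*', $hex;
}
```

A 16-byte binary value therefore becomes a 32-character hex string, and converting back yields the original bytes.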

INSTANCE HIGHER-LEVEL PROCEDURES

add_local_lookup

Add a lookup function to this instance only. A method with the same name will be created, to provide the cached lookup.

The lookup is available on this instance only; other instances keep their own local methods. The local method will override any global method of the same name. However, calling through the class interface (e.g., File::Info->local($absname)) will always invoke the global method, if any (and fail, if not).

ARGUMENTS
name

The name may consist only of letters, digits, and underscore characters. The first character must be a letter, and at least one digit or lower-case letter must be present.

Built-in names will always be lower-case. If you stick to this, then you will need to make no change if your identifier should get absorbed into the core. On the other hand, if you use some upper-case letters (e.g., StudlyCaps), then you are assured that you will never clash with internal names.

These other names are reserved:

  add_local_lookup add_global_lookup isa import new dirname
code

The code to call to calculate the value. The code will be passed the absolute name of the file to look up, and is expected to return a suitable value. The value will be cached.

EXAMPLES

BUGS

REPORTING BUGS

Email the author.

AUTHOR

Martyn J. Pearce fluffy@cpan.org

COPYRIGHT

Copyright (c) 2002, 2003 Martyn J. Pearce. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

SEE ALSO
