NAME

 FileHash::FormatString - Supports parsing of formatted lines of file data.

SYNOPSIS

 use FileHash::FormatString;
 $obj  = FileHash::FormatString->alloc;

 $obj  = $obj->init  ($formatline);
 $hash = $obj->parse (@lexemes);
 $cnt  = $obj->fields;

Inheritance

 UNIVERSAL

Description

This is an internal class used by FileHashes.

Format strings are used to map a positionally significant list of lexemes to a set of field names.

If the format line is empty, the format will default to a single SKIP field which will absorb an entire line of input during parse.

The class was created primarily to make it easy to read assorted dumps of file metadata that may be lying around on a system, and which might help establish what files used to be in a directory you have just deleted.

Field Names

The following are the field names which may appear in a format string.

 pathQuoted             "C:/home/amon/Photo for Dale 00000.jpg"
 path                   C:/home/amon/Photo_for_Dale_00000.jpg
 deviceQuoted           "C:"
 device                 C:
 directoryQuoted        "/home/amon"
 directory              /home/amon
 fileQuoted             "Photo for Dale 00000.jpg"
 file                   Photo_for_Dale_00000.jpg
 mode                   33152
 modeChars              -rw-------
 modeOctal              0600
 atime                  1214479354
 atimeQuoted            "2008-06-26 12:22"
 atimeDate              2008-06-26
 atimeTime              12:22
 ctime                  1203083422
 ctimeQuoted            "2008-02-15 13:50"
 ctimeDate              2008-02-15
 ctimeTime              13:50
 mtime                  1124835415
 mtimeQuoted            "2005-08-23 23:16"
 mtimeDate              2005-08-23
 mtimeTime              23:16
 uidName                amon
 uid                    1000
 gidName                amon
 gid                    1000
 hardlinks              1
 sizeBytes              661340
 inode                  2163352
 blocksAllocated        1304
 blocksizePreference    4096
 deviceSpecialId        0
 deviceNumber           771
 md5sum                 2d6431f79028879f7aa2976e8222e76e
 SKIP                   arbitraryword

Any space-delimited item which does not match one of these names exactly, including capitalization, is replaced with the no-op field name 'SKIP'. Later, during parsing, this causes the corresponding item in the list of lexemes to be ignored, i.e. dumped into the 'SKIP' bucket.
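The replacement rule can be sketched as follows. This is a minimal illustration, not the module's actual code: normalize_format and %KNOWN are hypothetical names, and %KNOWN holds only a subset of the field names listed above. The empty-format default follows the Description section.

```perl
use strict;
use warnings;

# Hypothetical subset of the recognized field names; the full list is above.
my %KNOWN = map { $_ => 1 } qw(
    pathQuoted path file mode modeChars modeOctal
    hardlinks uidName gidName sizeBytes mtimeDate mtimeTime SKIP
);

# normalize_format is an illustrative helper, not part of FileHash::FormatString.
sub normalize_format {
    my ($formatline) = @_;
    my @names = split ' ', ($formatline // '');
    # An empty format line defaults to a single SKIP field.
    @names = ('SKIP') unless @names;
    # Anything that is not an exact, case-sensitive match becomes SKIP.
    return map { $KNOWN{$_} ? $_ : 'SKIP' } @names;
}

my @fields = normalize_format("modeChars Hardlinks owner sizeBytes file");
print join(' ', @fields), "\n";   # prints "modeChars SKIP SKIP sizeBytes file"
```

Note that 'Hardlinks' is rejected purely because of its capital letter.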

If a field name is repeated in a format string, only the last instance is meaningful: parsed values for the earlier tokens are overwritten by later ones. This is also true of 'SKIP' tokens, including those added as replacements for unknown field names.

If there is likely to be junk at the end of the line, a single SKIP at the end will absorb all of the remaining text to the end of the line.
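Both behaviours fall out naturally if the line is split with a limit equal to the number of fields and the lexemes are stored by name. The sketch below assumes that approach; map_lexemes is a hypothetical helper and is not the module's parse method.

```perl
use strict;
use warnings;

# map_lexemes is an illustrative helper, not part of FileHash::FormatString.
sub map_lexemes {
    my ($line, @names) = @_;
    # Limiting the split to the number of fields leaves everything
    # remaining on the line in the last lexeme.
    my @lexemes = split ' ', $line, scalar @names;
    my %record;
    for my $i (0 .. $#names) {
        # A repeated name simply overwrites the earlier value.
        $record{ $names[$i] } = $lexemes[$i] // '';
    }
    return \%record;
}

my $rec = map_lexemes(
    "-rw------- 1 amon amon trailing junk here",
    qw(modeChars SKIP uidName SKIP SKIP),
);
print "$rec->{modeChars} $rec->{uidName} [$rec->{SKIP}]\n";
# prints "-rw------- amon [trailing junk here]"
```

The first two SKIP values ('1' and 'amon') are overwritten, so only the trailing text survives in the SKIP bucket.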

If more than one field supplies the same piece of information about a file, all of them should have the same value, but only the 'best' one is selected. The prioritization is as follows:

For the path name of the file

 1 pathQuoted
 2 path
 3 1 deviceQuoted  1 directoryQuoted  1 fileQuoted
   2 device        2 directory        2 file

The end result will be strings for device, directory and file, with the null string for any that are missing.

For atime, ctime and mtime:

 1 *time
 2 *timeQuoted
 3 1 *timeDate  1 *timeTime
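When only the date and time pair is present, the two fields have to be combined into a single time value. The sketch below shows one way to do that with the core Time::Local module. It is an illustration only: epoch_from_date_time is a hypothetical helper, and treating the fields as UTC is an assumption, since the text above does not say which time zone the module uses.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# epoch_from_date_time is an illustrative helper, not part of the module.
# It interprets the fields as UTC, which is an assumption.
sub epoch_from_date_time {
    my ($date, $time) = @_;
    my ($year, $mon, $mday) = $date =~ /^(\d{4})-(\d{2})-(\d{2})$/ or return;
    my ($hour, $min)        = $time =~ /^(\d{2}):(\d{2})$/         or return;
    # timegm takes a zero-based month; seconds are not in the format.
    return timegm(0, $min, $hour, $mday, $mon - 1, $year);
}

my $epoch = epoch_from_date_time("2005-08-23", "23:16");
print "$epoch\n";
```

Because the date and time fields carry no seconds, a value rebuilt this way can differ from a recorded *time field by up to 59 seconds, plus any time zone offset.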

For the mode value:

 1 mode
 2 modeOctal
 3 modeChars
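Each of the priority lists above amounts to "take the first field that has a value". A minimal sketch of that selection, using a hypothetical first_available helper that is not part of the module:

```perl
use strict;
use warnings;

# first_available is an illustrative helper: it returns the name and value
# of the first field in @$priority that is defined and non-empty in %$record.
sub first_available {
    my ($record, $priority) = @_;
    for my $name (@$priority) {
        my $value = $record->{$name};
        return ($name, $value) if defined $value && length $value;
    }
    return;   # nothing usable was found
}

# Here 'mode' is absent, so the second choice wins.
my %record = (modeOctal => '0600', modeChars => '-rw-------');
my ($name, $value) = first_available(\%record, [qw(mode modeOctal modeChars)]);
print "$name = $value\n";   # prints "modeOctal = 0600"
```

The sketch returns the raw string; converting the chosen value to a common representation is a separate step not shown here.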

If the original line contains incomplete path data, the calling object may supply the missing part by prepending a pathQuoted or directoryQuoted. If the file system uses a non-null device and it is missing from the line, a deviceQuoted should be included as well.

Examples

 use FileHash::FormatString;
 my $fmt  = "modeChars hardlinks uidName gidName sizeBytes mtimeDate mtimeTime file";
 my $line = "-rwxr-xr-x 1 root root       262 2003-08-23 15:58 20030823-ipsec1";
 my $fs   = FileHash::FormatString->alloc;

 $fs->init ($fmt);
 # Split on whitespace into at most $fs->fields lexemes.
 my @lexemes = split ' ', $line, $fs->fields;
 my $hash    = $fs->parse (@lexemes);

Class Variables

 None.

Instance Variables

 fields         Number of lexemes required for this line format.
 format         List of field names to match sequentially to lexemes.
 notepad        Notepad object used to record the unexpected.

Class Methods

$obj = FileHash::FormatString->alloc

Allocate an empty FormatString object.

Instance Methods

$cnt = $obj->fields

Returns the number of format fields, including SKIP tokens, expected by this object.

$obj = $obj->init ($formatline)

Initialize a FormatString object. It has one required argument, a format line which contains field names from the list given earlier.

For example, a format line usable with a current Linux 'ls -l' output line is:

 "modeChars hardlinks uidName gidName sizeBytes mtimeDate mtimeTime file"

$hash = $obj->parse (@lexemes)

Match the format field names one to one with the list of lexemes and then return a hash with the 'best data' from those fields in cases where different fields should contain the same information in different forms.

The returned hash uses field names suitable for direct insertion in a FileHash::Entry object.

Private Class Methods

 None.

Private Instance Methods

 None.

Errors and Warnings

 Lots.

KNOWN BUGS

 See TODO.

SEE ALSO

 File::Spec, HTTP::Date, Fault::Notepad, Fault::Logger

AUTHOR

Dale Amon <amon@vnl.com>