NAME

File::Searcher -- Searches for files and performs search/replacements on matching files

SYNOPSIS

use File::Searcher;
my $search = File::Searcher->new('*.cgi');
$search->add_expression(name=>'street',
    search=>'1234 Easy St.',
    replace=>'456 Hard Way',
    options=>'i');
$search->add_expression(name=>'department',
    search=>'(Dept\.|Department)(\s+)(\d+)',
    replace=>'$1$2$3',
    options=>'im');
$search->add_expression(name=>'place',
    search=>'Portland, OR(.*?)97212',
    replace=>'Vicksburg, MI${1}49097',
    options=>'is');
$search->start;
# $search->interactive; SEE File::Searcher::Interactive
my @files_matched = $search->files_matched;
print "Files Matched\n";
print "\t" . join("\n\t", @files_matched) . "\n";
print "Total Files:\t" . $search->file_cnt . "\n";
print "Directories:\t" . $search->dir_cnt . "\n";
my @files_replaced = $search->expression('street')->files_replaced;
my %matches = $search->expression('street')->matches;
my %replacements = $search->expression('street')->replacements;

DESCRIPTION

File::Searcher traverses a directory tree looking for files matching a Perl regular expression. When a match is found, its statistics are stored, and if the file is a text file, a series of searches and replacements can be performed on it. File::Searcher has options for backing up and archiving files, and provides OO access to reports and statistics on matches and replacements.
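The traversal described above can be approximated with core modules only. The following is a minimal sketch, not File::Searcher's own implementation: the helper name find_text_files is hypothetical, it assumes the pattern is applied to the base name of each file, and it uses Perl's heuristic -T test to decide what counts as a text file.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Find qw(find);
use File::Temp qw(tempdir);

# Hypothetical helper, not part of File::Searcher: return the text files
# under $dir whose base name matches $name_re, sorted by path.
sub find_text_files {
    my ($dir, $name_re) = @_;
    my @found;
    find(
        {
            no_chdir => 1,    # keep $File::Find::name usable as a path
            wanted   => sub {
                my $path = $File::Find::name;
                return unless -f $path;                    # plain files only
                return unless basename($path) =~ $name_re; # name filter
                return unless -T $path;                    # heuristic text test
                push @found, $path;
            },
        },
        $dir,
    );
    my @sorted = sort @found;
    return @sorted;
}

# Demo on a throwaway tree so nothing real is touched.
my $root = tempdir(CLEANUP => 1);
mkdir "$root/sub" or die "mkdir failed: $!";
for my $name ('index.cgi', 'sub/form.cgi', 'notes.txt') {
    open my $fh, '>', "$root/$name" or die "open failed: $!";
    print {$fh} "1234 Easy St.\n";
    close $fh or die "close failed: $!";
}

my @cgi_files = find_text_files($root, qr/\.cgi\z/);
print "$_\n" for @cgi_files;
```

The demo prints the two .cgi files and skips notes.txt, since only the name filter differs between them.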

USAGE

General Use

 # constructor - with options

 my $search = File::Searcher->new(
   file_expression  => '*.txt',           # required unless files
   files            => \@files,           # required unless file_expression
   start_directory  => '/path/to/dir',    # default './'
   backup_extension => '~',               # default '.bak'
   do_backup        => '0',               # default 1 will create backup file
   recurse_subs     => '0',               # default 1 will recurse subs
   do_replace       => '1',               # default 0 will not replace matches
   log_mode         => '111',             # unimplemented
   archive          => 'my_archive.tgz',  # default is /start_directory/(system time).tgz
   do_archive       => '1',               # default 0 will not archive matched files
 );

 # constructor - with file expression

 my $search = File::Searcher->new('*.txt');

 # constructor - with ref to array of absolute paths

 my $search = File::Searcher->new(\@files);

The constructor comes in three flavors: with options, with a file expression, or with a reference to an array of absolute paths. Options not given to the constructor can be set afterwards through accessor methods.

$search->start_directory('/path/to/dir');
$search->backup_extension('~');
$search->do_backup(0);
$search->recurse_subs(0);
$search->do_replace(1);
$search->archive('my_archive.tgz');
$search->do_archive(0);
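To make the do_backup and backup_extension settings more concrete, here is a minimal sketch of what a backup step could look like using the core File::Copy module. It is not the module's own code: the helper name backup_file is hypothetical, and it assumes the backup name is simply the original path with the extension appended.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Temp qw(tempdir);

# Hypothetical helper, not File::Searcher's own code. Assumes the backup
# name is the original path with the extension appended.
sub backup_file {
    my ($path, $extension) = @_;
    $extension = '.bak' unless defined $extension && length $extension;
    my $backup = $path . $extension;
    copy($path, $backup) or die "Cannot copy $path to $backup: $!";
    return $backup;
}

# Demo in a temporary directory.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/page.cgi";
open my $out, '>', $file or die "open failed: $!";
print {$out} "1234 Easy St.\n";
close $out or die "close failed: $!";

my $backup = backup_file($file, '~');
print "backed up to $backup\n";
```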

Next, add the search expressions along with their options. Expressions are searched in the order in which they were added to the search.

$search->add_expression(
   name=>'street', # required
   search=>'1234 Easy St.',
   replace=>'456 Hard Way',
   case_insensitive=>1,
);

$search->add_expression(
   name=>'department',
   search=>'(Dept\.|Department)(\s+)(\d+)',
   replace=>'$1$2$3',
   case_insensitive=>1,
   multiline=>1,
);

$search->add_expression(
   name=>'place',
   search=>'Portland, OR(.*?)97212',
   replace=>'Vicksburg, MI${1}49097',
   singleline=>1,
);
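The replacement strings above refer to capture groups as $1 or ${1}; the braces keep the group number apart from digits that follow it, as in ${1}49097. The sketch below shows one way such templates could be expanded and the expressions applied in order. It is an illustration under stated assumptions, not the module's implementation: apply_expression and the flags key are hypothetical names.

```perl
use strict;
use warnings;

# Hypothetical helper, not File::Searcher's own code: apply one
# search/replace pair to $text. $replace may refer to capture groups as
# $1 or ${1}; $flags is a string of inline modifiers such as 'is'.
sub apply_expression {
    my ($text, $search, $replace, $flags) = @_;
    my $prefix = (defined $flags && length $flags) ? "(?$flags)" : '';
    my $re     = qr/$prefix$search/;

    my ($out, $pos, $count) = ('', 0, 0);
    while ($text =~ /$re/g) {
        my ($from, $to) = ($-[0], $+[0]);
        # Copy the captured groups before running any other regex.
        my @group = map {
            defined $-[$_] ? substr($text, $-[$_], $+[$_] - $-[$_]) : ''
        } 0 .. $#+;
        # Expand $N and ${N} in the replacement template.
        (my $filled = $replace) =~ s/\$(?:\{(\d+)\}|(\d+))/
            my $n = defined $1 ? $1 : $2;
            defined $group[$n] ? $group[$n] : '';
        /gex;
        $out .= substr($text, $pos, $from - $pos) . $filled;
        $pos = $to;
        $count++;
    }
    $out .= substr($text, $pos);
    return ($out, $count);
}

# The three expressions from the example, applied in the order added.
my @expressions = (
    { search => '1234 Easy St.', replace => '456 Hard Way', flags => 'i' },
    { search => '(Dept\.|Department)(\s+)(\d+)', replace => '$1$2$3', flags => 'im' },
    { search => 'Portland, OR(.*?)97212', replace => 'Vicksburg, MI${1}49097', flags => 'is' },
);

my $text = "1234 EASY ST.\nDept. 42\nPortland, OR\n97212\n";
for my $expr (@expressions) {
    my $count;
    ($text, $count) = apply_expression($text, @{$expr}{qw(search replace flags)});
    print "$expr->{search}: $count match(es)\n";
}
print $text;
```

Because the 'place' expression is run with the s modifier, (.*?) can span the newline between the city and the ZIP code.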

Expression options can be set in two ways:

# as a single string
...add_expression(..., options=> 'ismx');

# as named parameters
...add_expression(..., singleline=>1, multiline=>1, case_insensitive=>1, extended=>1);
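The two styles presumably end up as the same set of regex modifiers. As a minimal sketch (the helper normalize_options and the name-to-letter table are assumptions, not taken from the module's source), the named parameters could be folded into an options string like this:

```perl
use strict;
use warnings;

# Assumed mapping from the named parameters to regex modifier letters.
my %FLAG_FOR = (
    case_insensitive => 'i',
    singleline       => 's',
    multiline        => 'm',
    extended         => 'x',
);

# Hypothetical helper: accept either options => 'ismx' or named flags
# and return a normalized modifier string in the order i, s, m, x.
sub normalize_options {
    my (%arg) = @_;
    my %seen;
    if (defined $arg{options}) {
        $seen{$_} = 1 for grep { /\A[ismx]\z/ } split //, $arg{options};
    }
    for my $name (keys %FLAG_FOR) {
        $seen{ $FLAG_FOR{$name} } = 1 if $arg{$name};
    }
    return join '', grep { $seen{$_} } qw(i s m x);
}

print normalize_options(options => 'im'), "\n";
print normalize_options(singleline => 1, case_insensitive => 1), "\n";

# The resulting string can be used as inline modifiers in a pattern.
my $flags = normalize_options(case_insensitive => 1);
my $re    = qr/(?$flags)easy st\./;
print "matched\n" if '1234 Easy St.' =~ $re;
```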

# Run search

$search->start;

Expanded Functionality

For expanded FUN-ctionality, register subroutine references to be called at two points: on_file_match runs when a file match is encountered, and on_expression_match runs when a search expression matches.

$search->on_file_match(sub{
   my ($file) = @_;
   return 0 unless $file->writable_r; # writable by real id?
   return 0 unless $file->stats->size_bytes < 100;
   chmod(0777, $file->path);
   return 1;
});
# alternatively
# $search->on_file_match(\&my_sub);

on_file_match receives a file object with the following property methods: path, readable_e, writable_e, executable_e, readable_r, writable_r, executable_r, owned_e, owned_r, exist, exist_non_zero, zero_size, file, directory, link_, pipe_, socket_, block, character, setuid_bit, setgid_bit, sticky_bit, opened_tty, text, binary.

If the entry is a file, the object also has stats methods: device_code, inode_number, mode_flags, link_cnt, user_id, group_id, device_type, size_bytes, time_access_seconds, time_modified_seconds, time_status_seconds, block_system, block_file, time_access_string, time_modified_string, time_status_string, mode_string.

The callback returns 1 to continue processing the file (i.e. look for matches to expressions) or 0 to move to the next file.
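The property names appear to mirror Perl's built-in file tests (the _e and _r suffixes suggest effective and real ids). As a rough illustration, the callback above can be written against a plain path with core Perl. This is a sketch, not the module's file object: wants_file is a hypothetical name, and the pairing of -W, -T and File::stat's size with writable_r, text and size_bytes is an assumption.

```perl
use strict;
use warnings;
use File::stat;
use File::Temp qw(tempfile);

# Hypothetical stand-in for an on_file_match callback, written against
# a plain path instead of the module's file object.
sub wants_file {
    my ($path) = @_;
    return 0 unless -W $path;           # writable by real uid (cf. writable_r)
    my $st = stat($path) or return 0;   # File::stat object
    return 0 unless $st->size < 100;    # cf. stats->size_bytes
    return 0 unless -T $path;           # heuristic text test (cf. text)
    return 1;
}

# Demo on a small temporary file.
my ($fh, $small) = tempfile(UNLINK => 1);
print {$fh} "short text file\n";
close $fh or die "close failed: $!";

print "small file accepted: ", wants_file($small), "\n";
```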

$search->on_expression_match(sub{
   my ($match, $expression) = @_;
   return -100 if scalar($expression->files_replaced) > 7;
   return -10 if length($match->post) < 120;
   return 1 if $match->match =~ /special(.*?)case/;
   return 10 unless $match->contents =~ /special/;
   # this is sort of what this module does, but, hey!
   my $contents = $match->contents;
   # search, replace and options live on the expression object
   my ($find, $replace, $options) =
      ($expression->search, $expression->replace, $expression->options);
   # built as a string so that $1, $2 ... in the replacement are expanded
   eval "\$contents =~ s/\$find/$replace/g$options;";
   return $contents;
});

# alternatively
# $search->on_expression_match(\&my_sub);

on_expression_match receives two arguments: a match object with methods (match, pre, post, last, start_offset, end_offset, contents) and the expression object, which gives access to the expression's settings and results (search, replace, options, %replacements, %matches, @files_replaced).

returns -100 to ignore the expression and not search for it again in any file
returns -10 to skip to the next file
returns -1 to skip to the next match (possibly in the next file)
returns 1 to process the match (as specified in the $search object)
returns 10 to process all matches in the file
returns 100 to process all occurrences in all files
returns $contents (a scalar holding file contents) to overwrite the contents (written to the file only if specified) and move to the next file
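One way to picture these return values is as a small dispatch table. The sketch below is not how the module is implemented: the action names and classify_result are hypothetical, and it assumes that any defined value other than the six numeric codes is treated as replacement file contents, and that an undefined value is treated like -1.

```perl
use strict;
use warnings;

# Hypothetical action names for the documented return codes.
my %ACTION_FOR = (
    -100 => 'drop_expression',   # never search for this expression again
    -10  => 'next_file',
    -1   => 'next_match',
    1    => 'process_match',
    10   => 'process_file',      # all matches in this file
    100  => 'process_all',       # all occurrences in all files
);

# Hypothetical helper: turn a callback result into an action name.
sub classify_result {
    my ($result) = @_;
    return 'next_match' unless defined $result;    # assumption, see above
    return $ACTION_FOR{$result} if exists $ACTION_FOR{$result};
    return 'replace_contents';                     # scalar of new contents
}

for my $result (-100, -1, 10, "new file text\n") {
    print classify_result($result), "\n";
}
```

Under that assumption, a callback that wants to write back contents consisting only of the text "1" or "10" would be misread as a code.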

Reporting

To see what happened, query the results for the search as a whole and for each expression.

# search results reports

@files_matched = $search->files_matched;
print "Files Matched\n";
print "\t" . join("\n\t", @files_matched) . "\n";
print "Text Files:\t" . $search->file_text_cnt . "\n";
print "Binary Files:\t" . $search->file_binary_cnt . "\n";
print "Unknown Files:\t" . $search->file_unknown_cnt . "\n";
print "Total Files:\t" . $search->file_cnt . "\n";
print "Directories:\t" . $search->dir_cnt . "\n";
print "Hard Links:\t" . $search->link_cnt . "\n";
print "Sockets:\t" . $search->socket_cnt . "\n";
print "Pipes:\t" . $search->pipe_cnt . "\n";
print "Unknown Entries:\t" . $search->unknown_cnt . "\n";
print "\n";

# expression results reports


foreach my $expression (@{$search->get_expressions}){

   my @files_replaced = $search->expression($expression)->files_replaced;
   my %matches = $search->expression($expression)->matches;
   my %replacements = $search->expression($expression)->replacements;

   print "Search/Replace:\t" .
      $search->expression($expression)->search .
      "\t" . $search->expression($expression)->replace . "\n";

   print "\tNo Replacements Made\n" and next if @files_replaced < 1;
   print "\tFile\t\t\t\t\tMatches\tReplacements\n";

   foreach my $file (@files_replaced){
      print "\t$file\t\t$matches{$file}\t$replacements{$file}\n";
   }
   print "\n";
}
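Tab-separated columns drift out of alignment when file paths vary in length. As an optional variation on the loop above, the per-file rows can be laid out with sprintf. This is a sketch only: format_report is a hypothetical helper, and it assumes %matches and %replacements map file paths to counts, as the loop above suggests.

```perl
use strict;
use warnings;

# Hypothetical helper: build an aligned report from two hashes that map
# file paths to match and replacement counts.
sub format_report {
    my ($matches, $replacements) = @_;
    my @files = sort keys %$matches;

    # Size the first column to the longest path (at least as wide as "File").
    my $width = length 'File';
    for my $file (@files) {
        $width = length($file) if length($file) > $width;
    }

    my $out = sprintf "%-${width}s  %7s  %12s\n", 'File', 'Matches', 'Replacements';
    for my $file (@files) {
        $out .= sprintf "%-${width}s  %7d  %12d\n",
            $file, $matches->{$file}, $replacements->{$file} || 0;
    }
    return $out;
}

# Demo with made-up counts.
my %matches      = ('index.cgi' => 3, 'sub/form.cgi' => 1);
my %replacements = ('index.cgi' => 3);
print format_report(\%matches, \%replacements);
```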

CAVEATS

Super complex regular expressions probably won't work the way you think they will.

BUGS

Let me know...

TO DO

  • More advanced functionality

  • More reporting (line numbers, etc.)

  • Maybe get rid of Class::Generate

SEE ALSO

File::Searcher::Interactive, File::Find, File::Copy, File::Flock, Class::Struct::FIELDS, Class::Generate, Cwd, Time::localtime, Archive::Tar

COPYRIGHT

Copyright 2000, Adam Stubbs. This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself. Please email me if you find this module useful.

AUTHOR

Adam Stubbs, astubbs@advantagecommunication.com

Version 0.91, Last Updated Tue Sep 25 23:08:50 EDT 2001