NAME

File::Split

SYNOPSIS

Splits files.

my $fs = File::Split->new({keepSource=>'1'});

my $files_out = $fs->split_file({'parts' => 10},'filepath');

Creates ten files named 'filepath.1','filepath.2',...,'filepath.10'.



DESCRIPTION

By default, File::Split removes the source file once it has been split. Pass keepSource to the constructor to keep it:

my $fs = File::Split->new({keepSource=>'1'});

Split the file into ten equal-sized parts named filepath.1, filepath.2, ..., filepath.10:

my $files_out = $fs->split_file({'parts' => 10},'filepath');

Split the file into parts of at most 1000 lines each:

my $files_out = $fs->split_file({'lines' => 1000},'filepath');
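
To make the "1000 lines or less" behaviour concrete, here is a minimal sketch of line-based chunking in plain Perl. It is not File::Split's implementation; chunk_lines is a hypothetical helper that works on an in-memory list of lines and only illustrates that the last chunk may be shorter than the rest.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of File::Split: group a list of lines
# into chunks holding at most $max lines each.
sub chunk_lines {
    my ($max, @lines) = @_;
    die "chunk size must be positive\n" unless $max > 0;
    my @chunks;
    while (@lines) {
        # splice removes up to $max lines from the front of @lines
        push @chunks, [ splice @lines, 0, $max ];
    }
    return @chunks;
}

my @chunks = chunk_lines(1000, map { "line $_\n" } 1 .. 2500);
printf "%d chunks, last has %d lines\n", scalar @chunks, scalar @{ $chunks[-1] };
# prints "3 chunks, last has 500 lines"
```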

Split the file into sections based on a substring value. This produces filepath.MB and filepath.SK:

my $files_out = $fs->split_file({'substr'=>{pos=>'10000',val=>['MB','SK']}},'filepath');

Split the file using regular expressions grouped in a hash of arrays, where each key maps to an array of regular expressions. This produces filepath.BC, filepath.AB, ...:

my $files_out = $fs->split_file({'grep'=>{
                                   'BC'=>['\t(V\d\C\d\C\d)\t'],
                                   'AB'=>['\t(T\d\C\d\C\d)\t'],
                                   'SK'=>['\t(S\d\C\d\C\d)\t'],
                                   'MB'=>['\t(R\d\C\d\C\d)\t'],
                                   'ON'=>['\t(P\d\C\d\C\d)\t','\t(N\d\C\d\C\d)\t','\t(M\d\C\d\C\d)\t','\t(L\d\C\d\C\d)\t','\t(K\d\C\d\C\d)\t'],
                                   'QC'=>['\t(G\d\C\d\C\d)\t','\t(H\d\C\d\C\d)\t','\t(J\d\C\d\C\d)\t','\t(K\d\C\d\C\d)\t','\t(S\d\C\d\C\d)\t'],
                                   'NS'=>['\t(B\d\C\d\C\d)\t'],
                                   'NB'=>['\t(E\d\C\d\C\d)\t'],
                                   'PE'=>['\t(C\d\C\d\C\d)\t'],
                                   'NL'=>['\t(A\d\C\d\C\d)\t'],
                                   'NT'=>['\t(X\d\C\d\C\d)\t'],
                                   'NU'=>[],
                                   'YT'=>['\t(Y\d\C\d\C\d)\t'],
                                       }
                               },'dat/zip411Bus040710.TXT');
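
The hash-of-arrays form can be read as a routing table from labels to patterns. The sketch below shows one plausible way to look up a label for a single line. It is an assumption for illustration, not File::Split's code: label_for_line is a hypothetical helper, and the choice to return the first matching label in sorted key order is made up here, since the documentation does not say what happens when a line matches several labels. The patterns use [A-Z] in place of \C, because \C was removed from Perl regular expressions in version 5.24.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of File::Split: return the first label
# (in sorted key order) that has a pattern matching $line, or undef
# when no pattern matches.
sub label_for_line {
    my ($rules, $line) = @_;
    for my $label (sort keys %$rules) {
        for my $pattern (@{ $rules->{$label} }) {
            return $label if $line =~ /$pattern/;
        }
    }
    return;
}

my %rules = (
    BC => ['\t(V\d[A-Z]\d[A-Z]\d)\t'],
    AB => ['\t(T\d[A-Z]\d[A-Z]\d)\t'],
    NU => [],    # no patterns, so nothing is ever routed here
);

for my $line ("Acme\tV6B1A1\tVancouver", "Bolt\tT2P1J9\tCalgary", "Cog\tX0A0H0\tIqaluit") {
    my $label = label_for_line(\%rules, $line);
    print defined $label ? "$label\n" : "unmatched\n";
}
# prints "BC", "AB", "unmatched" on separate lines
```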

Split the file on an array of regular expressions. Filename extensions are based on the matched value:

$files_out = $fs->split_file({'grep'=>[
                                   '\t(MB)\t','\t(SK)\t','\t(NB)\t','\t(NL)\t','\t(NT)\t',
                                   '\t(NS)\t','\t(YT)\t','\t(PE)\t','\t(NU)\t','\t(BC)\t',
                                   '\t(ON)\t','\t(AB)\t','\t(QC)\t',
                                       ]
                               },'dat/zip411Bus041013.TXT');

Merge all files that match 'filepath_for_reconstructed_file*':

my $out_name = $fs->merge_file('filepath_for_reconstructed_file');
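
For readers who want to see what such a merge involves, here is a rough sketch in plain Perl. It is not File::Split's implementation. merge_parts is a hypothetical helper, and it assumes that parts carry numeric suffixes and should be joined in numeric order, so that .10 follows .9. The module's own matching and ordering rules are not documented here and may differ.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);
use File::Temp qw(tempdir);

# Hypothetical helper, not part of File::Split: concatenate every
# "$base.<number>" file into $base, in numeric suffix order.
sub merge_parts {
    my ($base) = @_;
    my @parts = map  { $_->[1] }
                sort { $a->[0] <=> $b->[0] }
                map  { /\.(\d+)\z/ ? [ $1, $_ ] : () }
                bsd_glob("$base.*");
    open my $out, '>', $base or die "Cannot write $base: $!";
    for my $part (@parts) {
        open my $in, '<', $part or die "Cannot read $part: $!";
        print {$out} $_ while <$in>;
        close $in;
    }
    close $out or die "Cannot close $base: $!";
    return $base;
}

# Demonstration in a throwaway directory: write ten parts, then merge.
my $dir  = tempdir(CLEANUP => 1);
my $base = "$dir/report";
for my $n (1 .. 10) {
    open my $fh, '>', "$base.$n" or die "Cannot write $base.$n: $!";
    print {$fh} "part $n\n";
    close $fh;
}
merge_parts($base);
```

A plain string sort would place report.10 before report.2, which is why the sketch extracts the suffix and sorts numerically.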

CAVEATS

This module is not fully mature, and its interfaces may change.

File::Split will create empty files if you split an empty file. If you request five parts, you will still receive five parts, each of them empty.

File::Split will return undef if you try to split a nonexistent file.
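
Because a missing file is reported through an undef return value and not through an exception, callers may want to check the result explicitly. The wrapper below is a small sketch of that check. split_or_die is a hypothetical name, and it relies only on the split_file call and the undef return described above.

```perl
use strict;
use warnings;

# Hypothetical wrapper, not part of File::Split: call split_file and
# turn the documented undef return value into an exception.
sub split_or_die {
    my ($fs, $options, $path) = @_;
    my $files_out = $fs->split_file($options, $path);
    die "Could not split '$path' (does the file exist?)\n"
        unless defined $files_out;
    return $files_out;
}
```

With a File::Split object in $fs, calling split_or_die($fs, {'lines' => 1000}, 'filepath') would return whatever split_file returns, or die if the result is undef.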

AUTHOR

Phil Middleton