NAME

TextFileParser - a Perl extension to ease the parsing of text files. Can be used as a base class to write your own parser.

VERSION

version 0.1821900

SYNOPSIS

    use strict;
    use warnings;
    use TextFileParser;

    my $parser = TextFileParser->new;
    $parser->read(shift @ARGV);
    print $parser->get_records, "\n";

The above code reads a text file and prints the content to STDOUT.

Here is another parser, derived from TextFileParser as its base class. It shows how little code is needed to write your own parser.

    use strict;
    use warnings;
    package AnotherParser;
    use parent 'TextFileParser';
    use Exception::Class (
        'SomeException' => {
            description => 'Some description',
            alias => 'throw_some_error'
        }
    );

    sub make_sense_of_line {
        my ($self, $line) = @_;
        my (@words) = split /\s+/, $line;
        throw_some_error 
            error => $self->filename . ': expected at least one character: ' . $self->lines_parsed
                if not @words;
        $self->save_record(\@words);
    }

    package main;
    use Data::Dumper qw(Dumper);
    use Try::Tiny;

    try {
        my $a_parser = AnotherParser->new;
        $a_parser->read(shift @ARGV);
        print Dumper($a_parser->get_records);
    } catch {
        print STDERR $_, "\n";
    };

METHODS

new

Takes no arguments. Returns a blessed reference to the new object.

    my $pars = TextFileParser->new;

read

Takes zero or one string argument: the name of the file to read. Throws an exception if the named file does not exist or cannot be read for any reason.

    $pars->read($filename);

    # The above is equivalent to the following
    $pars->filename($filename);
    $pars->read();

Returns once all records have been read, or as soon as an exception is thrown for a parsing error. This method handles all open and close operations on the file, even if an exception is thrown.

    use Try::Tiny;

    try {
        $pars->read('myfile.txt');
    } catch {
        print STDERR $_, "\n";
    };

You are better off not overriding this method. Override make_sense_of_line instead. There is currently no way to intervene in the file open step; a future version will explain how to do that.

filename

Takes zero or one string argument with the name of a file. With an argument, it sets the name of the file to be read (see the example under read). Returns the name of the file that was last opened, or undef if no file has been opened. This is most useful in generating error messages.

lines_parsed

Takes no arguments. Returns the number of lines last parsed.

    print $pars->lines_parsed, " lines were parsed\n";

This is also very useful for error message generation. See example under Synopsis.

make_sense_of_line

Takes exactly one string argument. This method can be overridden in derived classes to extract the relevant information from each line and store records. In general, once the relevant data has been collected, you would call save_record. By default, this method saves the input string as the record. When you override this method, remember to either call save_record yourself or call the SUPER::make_sense_of_line method.

See the Synopsis for how a derived class can write its own method to handle data. Below is an alternative way to override the method without calling save_record directly.

    package MyParser;
    use parent 'TextFileParser';

    sub make_sense_of_line {
        my ($self, $line) = @_;
        # __extract_some_info is a placeholder for your own extraction routine
        my $data = __extract_some_info($line);
        $self->SUPER::make_sense_of_line($data);
    }

Here's another example when you want to append the data from current line to the data collected from a previous line. In the below example, if a line starts with a '+' character then it is to be treated as a continuation of the previous line. Otherwise, any line that begins with anything other than '+' is to be treated as a new record.

    use strict;
    use warnings;
    package MultilineParser;
    use parent 'TextFileParser';
    use Exception::Class (
        'NothingOnLine', 
    );

    sub make_sense_of_line {
        my ($self, $line) = @_;
        my (@words) = split /\s+/, $line;
        $self->__throw_an_exception if not @words;
        my $method = ($words[0] eq '+') ? '__append_last_record' : '__save_new_record';
        $self->$method(@words);
    }

    sub __append_last_record {
        my ($self, $plus, @words) = @_;
        $self->__throw_an_exception if not @words;
        my $last_rec = $self->last_record;
        push @{$last_rec}, @words;
    }

    sub __save_new_record {
        my ($self, @words) = @_;
        $self->save_record(\@words);
    }

    sub __throw_an_exception {
        my $self = shift;
        NothingOnLine->throw(error => 'Nothing on line: ' . $self->lines_parsed);
    }

save_record

Takes exactly one argument and saves it as a record. You can save any data as long as you pack it into a single variable: a plain scalar, or a reference to an array, a hash, or whatever you like. There must be exactly one unit of data; if save_record is called with no arguments, it saves undef.

See the examples above for how this can be used.
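As a minimal sketch of packing several fields into one unit of data, the helper below turns a "key = value" line into a single hash reference. The name line_to_record and the "key = value" line format are assumptions made for illustration; they are not part of TextFileParser.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of TextFileParser): pack the fields of a
# "key = value" line into one hash reference, i.e. a single unit of data.
sub line_to_record {
    my ($line) = @_;
    chomp $line;
    # Split on the first '=' only, dropping the whitespace around it
    my ($key, $value) = split /\s*=\s*/, $line, 2;
    return { key => $key, value => $value };
}

# Inside an overridden make_sense_of_line this could be used as:
#     $self->save_record( line_to_record($line) );
my $record = line_to_record("color = blue\n");
print "$record->{key} => $record->{value}\n";
```

Because the whole line is reduced to one reference, it satisfies the single-argument requirement of save_record.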

get_records

Takes no arguments. Returns an array containing all the records that were read by the parser.

record_list_pointer

Takes no arguments and returns the reference to the array containing all the records. This may be useful if you want to re-order the records in some way.
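The re-ordering mentioned above could look like the following sketch. So that it runs on its own, a hand-made array reference stands in for the value returned by record_list_pointer; the sample records are made up, and the sketch assumes each record is an array reference as in the Synopsis.

```perl
use strict;
use warnings;

# Stand-in for the value of $pars->record_list_pointer: a reference to an
# array of records. The sample data is invented for illustration.
my $records = [ [ 'pear', 3 ], [ 'apple', 7 ], [ 'fig', 1 ] ];

# Assigning to the dereferenced array re-orders it in place, so every
# holder of the same reference sees the new order. Sort by first field.
@{$records} = sort { $a->[0] cmp $b->[0] } @{$records};

print join(' ', @{$_}), "\n" for @{$records};
```

With a real parser, the first assignment would presumably be replaced by my $records = $pars->record_list_pointer; and the rest would stay the same.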

last_record

Takes no arguments and returns the last saved record.

AUTHOR

Balaji Ramasubramanian <balajiram@cpan.org>

COPYRIGHT AND LICENSE

This software is copyright (c) 2018 by Balaji Ramasubramanian.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.