Text::Parser::Multiline - Adds multi-line support to the Text::Parser object.
version 0.925
    use Text::Parser;

    my $parser = Text::Parser->new(multiline_type => 'join_last');
    $parser->read('filename.txt');
    print $parser->get_records();
    print scalar($parser->get_records()), " records were read although ",
        $parser->lines_parsed(), " lines were parsed.\n";
Some text formats allow line-wrapping with a continuation character, usually to improve human readability. To handle these types of text formats with the native Text::Parser class, the derived class would need to have a save_record method that would:
Detect if the line is wrapped or is part of a wrapped line. To do this the developer has to implement a function named is_line_continued.
Join any wrapped lines to form a single line. For this, the developer has to implement a function named join_last_line.
With these two things, the developer can implement their save_record assuming that the line is already unwrapped.
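The two steps above can be sketched in plain Perl, independent of Text::Parser. This is only an illustration, assuming a format in which a trailing back-slash marks a wrapped line. The helper unwrap_lines and the two stand-in subs are hypothetical names written for this sketch; they are not the module's own implementation.

```perl
use strict;
use warnings;

# Hypothetical stand-in: true if the line ends with a back-slash.
sub is_line_continued {
    my ($line) = @_;
    return $line =~ /\\\s*$/ ? 1 : 0;
}

# Hypothetical stand-in: strip the trailing back-slash (and the
# whitespace around it) from the previous text, then append the line.
sub join_last_line {
    my ($last, $line) = @_;
    $last =~ s/\s*\\\s*$//;
    return "$last $line";
}

# Hypothetical helper: collapse wrapped lines into logical lines.
# A dangling continuation at the end of the input is silently dropped
# here; error handling is outside the scope of this sketch.
sub unwrap_lines {
    my @logical;
    my $pending;    # wrapped text still waiting for its continuation
    for my $line (@_) {
        my $text = defined $pending ? join_last_line($pending, $line) : $line;
        if (is_line_continued($text)) {
            $pending = $text;
        }
        else {
            push @logical, $text;
            $pending = undef;
        }
    }
    return @logical;
}

my @records = unwrap_lines('alpha \\', 'beta \\', 'gamma', 'delta');
print "$_\n" for @records;
```

Code that consumes the returned list only ever sees complete logical lines, which is the position a save_record implementation is in once unwrapping is handled for it.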
This role may be composed into an object of the Text::Parser class. To use it, set the multiline_type attribute. A derived class may set this in its constructor (or in BUILDARGS if you use Moose). When the attribute is set, the developer should redefine the is_line_continued and join_last_line methods.
The following error conditions are also checked (see Text::Parser::Errors):
If the end of file is reached, and the line is expected to be still continued, an exception of Text::Parser::Errors::UnexpectedEof is thrown.
It is impossible for the first line in a text input to be wrapped from a previous line. So if this condition occurs, an exception of Text::Parser::Errors::UnexpectedCont is thrown.
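The two conditions can be illustrated with a small standalone sketch. It makes several assumptions: the first check uses a trailing back-slash as the continuation marker, the second uses a hypothetical format in which a leading + joins a line to the previous one, and both subs are hypothetical names that die with plain strings rather than the exception classes from Text::Parser::Errors.

```perl
use strict;
use warnings;

# Assumed convention: a trailing back-slash means "continues on the
# next line", so the last line of the input must not end with one.
sub check_unexpected_eof {
    my @lines = @_;
    die "unexpected EOF: last line is still continued\n"
        if @lines and $lines[-1] =~ /\\\s*$/;
    return 1;
}

# Assumed convention: a leading '+' means "continues the previous
# line", so the first line of the input must not start with one.
sub check_unexpected_cont {
    my @lines = @_;
    die "unexpected continuation on the first line\n"
        if @lines and $lines[0] =~ /^\s*\+/;
    return 1;
}

# Run each check against an input that violates it.
for my $case (
    [ \&check_unexpected_eof,  'one \\', 'two \\' ],
    [ \&check_unexpected_cont, '+ one',  'two' ],
) {
    my ($check, @lines) = @$case;
    eval { $check->(@lines); 1 } or print "caught: $@";
}
```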
These methods must be implemented by the developer in the derived class. There are default implementations provided in Text::Parser but they may not handle your target text format.
$parser->is_line_continued($line)
Takes a string argument containing the current line (also available through the this_line method) as input. Your implementation should return a boolean that indicates if the current line is wrapped.
    sub is_line_continued {
        my ($self, $line) = @_;
        chomp $line;
        $line =~ /\\\s*$/;
    }
The above example method checks if a line is being continued by using a back-slash character (\).
$parser->join_last_line($last_line, $current_line)
Takes two string arguments. The first is the previously read line, which is wrapped onto the next line (the second argument). The second argument should be identical to the return value of this_line. Neither argument will be undef. Your implementation should join the two strings, stripping any continuation character(s), and return the resulting string.
Here is an example implementation that joins the previous line terminated by a back-slash (\) with the present line:
    sub join_last_line {
        my $self = shift;
        my ($last, $line) = (shift, shift);
        $last =~ s/\\\s*$//g;
        return "$last $line";
    }
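To see how the two example methods behave on concrete strings, they can be tried outside of Text::Parser. The sketch below places equivalent methods in a hypothetical stand-in class, My::Demo, which is not a Text::Parser subclass and exists only so that the methods can be called directly.

```perl
use strict;
use warnings;

package My::Demo;    # hypothetical stand-in class for this sketch only

sub new { return bless {}, shift }

# True if the line (ignoring its newline) ends with a back-slash.
sub is_line_continued {
    my ($self, $line) = @_;
    chomp $line;
    return $line =~ /\\\s*$/ ? 1 : 0;
}

# Strip the trailing back-slash from the previous line and append the
# current line, separated by a single space.
sub join_last_line {
    my ($self, $last, $line) = @_;
    $last =~ s/\\\s*$//;
    return "$last $line";
}

package main;

my $demo = My::Demo->new;
print $demo->is_line_continued("name = \\\n") ? "continued\n" : "complete\n";
print $demo->is_line_continued("name = x\n")  ? "continued\n" : "complete\n";
print $demo->join_last_line("name = \\\n", "value\n");
```

Whitespace that precedes the back-slash is kept by this substitution, so the joined string can contain two consecutive spaces. A format that cares about this could strip the leading whitespace as well.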
Text::Parser
Text::Parser::Errors
Please report any bugs or feature requests on the bugtracker website http://github.com/balajirama/Text-Parser/issues
When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature.
Balaji Ramasubramanian <balajiram@cpan.org>
This software is copyright (c) 2018-2019 by Balaji Ramasubramanian.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.