Zoidberg::StringParser - Simple string parser
    my $base_gram = {
        esc    => '\\',
        quotes => {
            q{"} => q{"},
            q{'} => q{'},
        },
    };

    my $parser = Zoidberg::StringParser->new($base_gram);

    my @blocks = $parser->split(
        qr/\|/,
        qq{ls -al | cat > "somefile with a pipe | in it"}
    );

    # @blocks now is:
    # ('ls -al ', ' cat > "somefile with a pipe | in it"');
    # So it worked like split, but it respected quotes
This module is a simple syntax parser. It was originally designed to work like the built-in split function, but to respect quotes. The current version is a little more advanced: it uses user-defined grammars to deal with delimiters, an escape char, quotes and braces.
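To make the idea of "split, but respecting quotes" concrete, here is a minimal sketch in plain Perl. It is not the module's implementation: the helper name split_respecting_quotes is hypothetical, it only handles a single-character delimiter, and it hard-codes a backslash escape with single and double quotes.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Zoidberg::StringParser.
# Splits $string on the single character $delim, ignoring delimiters
# that appear inside quotes or directly after a backslash.
sub split_respecting_quotes {
    my ($delim, $string) = @_;
    my @parts = ('');
    my $quote;                          # the currently open quote char, if any
    my @chars = split //, $string;

    while (@chars) {
        my $c = shift @chars;
        if ($c eq '\\' && @chars) {
            # keep the escape char and the escaped char verbatim
            $parts[-1] .= $c . shift @chars;
        }
        elsif (defined $quote) {
            $parts[-1] .= $c;
            undef $quote if $c eq $quote;   # closing quote found
        }
        elsif ($c eq q{"} || $c eq q{'}) {
            $quote = $c;                    # opening quote
            $parts[-1] .= $c;
        }
        elsif ($c eq $delim) {
            push @parts, '';                # start a new block
        }
        else {
            $parts[-1] .= $c;
        }
    }
    return @parts;
}

my @blocks = split_respecting_quotes(
    '|', q{ls -al | cat > "somefile with a pipe | in it"}
);
print "[$_]\n" for @blocks;
```

The real module generalises this with grammars, so that delimiters, the escape char, quotes and braces are all configurable rather than hard-coded.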
Yes, I know of the existence of Text::Balanced, but I wanted to do this the hard way :)
All grammars and collections of grammars should be considered PRIVATE when used by a Z::SP object.
None by default.
TODO
FIXME
The collection hash is simply a hash of grammars with the grammar names as keys. When a collection is given, all methods can take a grammar name instead of a grammar.
This can be seen as the default grammar; to use it, leave the grammar undefined when calling a method. If this base grammar is defined and you specify a grammar in a method call, the specified grammar will overload the base grammar.
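One way to picture "overloading" is as a key-by-key overlay of two hashes. The sketch below is only an assumption about the semantics: the text does not say whether nested values such as quotes are merged or replaced wholesale, and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Base grammar, as in the synopsis.
my %base_grammar = (
    esc    => '\\',
    quotes => { q{"} => q{"}, q{'} => q{'} },
);

# Hypothetical grammar passed at a method call.
my %given_grammar = (
    quotes => { q{`} => q{`} },
);

# ASSUMPTION: a shallow overlay, where keys in the given grammar win
# and keys missing from it fall through to the base grammar.
my %effective = ( %base_grammar, %given_grammar );

print join( ', ', sort keys %effective ), "\n";    # esc, quotes
```

Under this assumption the effective grammar keeps the esc from the base grammar, while quotes comes entirely from the given grammar.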
new(\%base_grammar, \%collection, \%settings)
Simple constructor. See "Collection", "Base grammar" and "settings" for explanation of the arguments.
split($grammar, $input, $int)
Splits $input as specified by $grammar.
$input can be either a string or a reference to an array of strings. Such an array reference is used as provided, so it should be possible to use, for example, tied arrays here.
$int is an optional argument specifying the maximum number of parts the input should be split into. Remaining strings are joined and returned as the last part. If you use a grammar with named tokens, these are not counted as a part of the string.
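The description of $int resembles the LIMIT argument of the built-in split. For comparison, this is how the built-in behaves; note that this shows CORE::split only, and that the module's split() takes its arguments in the order listed above.

```perl
use strict;
use warnings;

# Built-in split with a LIMIT of 2: at most two parts are returned,
# and the remainder stays joined in the last part.
my @parts = split /\|/, 'a|b|c|d', 2;

print scalar(@parts), "\n";    # 2
print "$parts[1]\n";           # b|c|d
```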
Blocks will by default be passed as scalar refs (unless the grammar's meta function altered them) and tokens as scalars. To be a little compatible with CORE::split, all items (blocks and tokens) are passed as plain scalars if $grammar is or was a Regexp reference. (This behaviour can be faked by giving your grammar a value called 'was_regexp'.) This behaviour is turned off by the "no_split_intel" setting.
The %settings hash contains options that control the general behaviour of the parser. Supported settings are:
If this value is set, the parser will not throw an exception if, for example, an unmatched quote occurs.
Boolean that tells the parser not to remove the escape char when an escaped token is encountered. Double escapes won't be replaced either. Useful when a string needs to go through a chain of parsers.
Boolean, disables "intelligent" behaviour of split() when set.
Jaap Karssenberg || Pardus [Larus] <pardus@cpan.org>
Copyright (c) 2003 Jaap G Karssenberg. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
Contains some code derived from Tie-Hash-Stack-0.09 by Michael K. Neylon.
Zoidberg, Text::Balanced