Test::Text::Sentence - module for splitting text into sentences
use Test::Text::Sentence qw( split_sentences );
use locale;
use POSIX qw( locale_h );
setlocale( LC_CTYPE, 'iso_8859_1' );
@sentences = split_sentences( $text );
The Test::Text::Sentence module provides the function split_sentences, which splits text into its constituent sentences using a fairly approximate regex. If you set the locale before calling it, it uses locale-dependent capitalization to identify sentence boundaries. Certain well-known exceptions, such as abbreviations, may cause incorrect segmentation.
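To illustrate why abbreviations trip up a capitalization-based splitter, here is a minimal sketch. It is not the regex this module uses; naive_split is a hypothetical helper written only for this example, and it assumes a boundary is any '.', '!' or '?' followed by whitespace and an uppercase letter. With use locale in effect, the [[:upper:]] class would follow the current LC_CTYPE setting.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT the module's implementation: split wherever
# sentence-ending punctuation is followed by whitespace and an
# uppercase letter.
sub naive_split {
    my ($text) = @_;
    return split /(?<=[.!?])\s+(?=[[:upper:]])/, $text;
}

# Two ordinary sentences are separated as expected.
my @ok = naive_split('It rained. We stayed in.');

# "Dr." ends in a full stop and is followed by a capitalized name,
# so the same rule wrongly reports a sentence boundary after it.
my @bad = naive_split('I met Dr. Smith. He waved.');

print scalar(@ok),  "\n";   # prints 2
print scalar(@bad), "\n";   # prints 3: "I met Dr.", "Smith.", "He waved."
```

A splitter of this kind has no way to tell "Dr." from the end of a sentence without extra knowledge, such as a list of known abbreviations.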
Test::Text::Sentence
The split_sentences function takes a scalar containing ASCII text as an argument and returns a list of the sentences that the text has been split into.
@sentences = split_sentences( $text );
locale POSIX
https://github.com/neilb/HTML-Summary
Ave Wrigley <wrigley@cre.canon.co.uk>
Copyright (c) 1997 Canon Research Centre Europe (CRE). All rights reserved.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.
To install Test::Text, copy and paste the appropriate command into your terminal.
cpanm
cpanm Test::Text
CPAN shell
perl -MCPAN -e shell
install Test::Text