NAME
    "Text::Summarize::En" - Routine to summarize English text.
SYNOPSIS
      use strict;
      use warnings;
      use Text::Summarize::En;
      use Data::Dump qw(dump);
      my $summarizerEn = Text::Summarize::En->new();
      my $text         = 'All people are equal. All men are equal. All are equal.';
      dump $summarizerEn->getSummaryUsingSumbasic(listOfText => [$text]);
DESCRIPTION
    "Text::Summarize::En" contains routines for ranking the sentences of
    English text for inclusion in a summary using the sumBasic algorithm.
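    As a rough illustration of the ranking idea, here is a minimal
    pure-Perl sketch of a simplified sumBasic procedure. It is not the
    module's implementation: the function name
    "rankSentencesSumBasicSketch" is hypothetical, sentences are assumed
    to be already tokenized and filtered, and token weights are plain
    relative frequencies. The best-scoring sentence is picked repeatedly,
    and the weights of its tokens are squared after each pick so that
    later picks favor new material.

```perl
use strict;
use warnings;

# Simplified sumBasic sketch: each sentence is an array ref of tokens.
# Returns sentence indices in the order they would be picked.
sub rankSentencesSumBasicSketch {
    my @sentences = @_;

    # Token weight = relative frequency across all sentences.
    my (%weight, $total);
    for my $sentence (@sentences) {
        for my $token (@$sentence) {
            $weight{$token}++;
            $total++;
        }
    }
    return () unless $total;
    $_ /= $total for values %weight;

    # Only non-empty sentences can be ranked.
    my %remaining = map { $_ => 1 }
                    grep { scalar @{ $sentences[$_] } } 0 .. $#sentences;
    my @ranking;
    while (%remaining) {
        # Pick the remaining sentence with the highest average token weight;
        # ties go to the earliest sentence.
        my ($best, $bestScore);
        for my $i (sort { $a <=> $b } keys %remaining) {
            my $sum = 0;
            $sum += $weight{$_} for @{ $sentences[$i] };
            my $score = $sum / scalar @{ $sentences[$i] };
            if (!defined $bestScore || $score > $bestScore) {
                ($best, $bestScore) = ($i, $score);
            }
        }
        push @ranking, $best;
        delete $remaining{$best};

        # Square the weight of each distinct token just used.
        my %seen;
        $weight{$_} **= 2 for grep { !$seen{$_}++ } @{ $sentences[$best] };
    }
    return @ranking;
}

# Content words of the three sentences from the SYNOPSIS, chosen by hand.
my @ranking = rankSentencesSumBasicSketch(
    [qw(people equal)],
    [qw(men equal)],
    [qw(equal)],
);
print "@ranking\n";    # prints "2 0 1"
```

    The third sentence is picked first because its only token is the
    most frequent one; the other two then tie and are taken in order.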
CONSTRUCTOR
  "new"
    The method "new" creates an instance of the "Text::Summarize::En" class
    with the following parameters:
    "endingSentenceTag"
         endingSentenceTag => 'PP'
        "endingSentenceTag" is the part-of-speech tag that should be used to
        indicate the end of a sentence. The default is 'PP'. The value of
        this tag must be a tag generated by the module Lingua::EN::Tagger.
    "listOfPOSTypesToKeep"
         listOfPOSTypesToKeep => [qw(CONTENT_WORDS)]
        The sumBasic algorithm preprocesses the text so that only certain
        parts-of-speech (POS) are retained and used to rank the sentences.
        The module Lingua::EN::Tagger is used to tag the parts-of-speech of
        the text. The parts-of-speech retained can be specified by word
        types, where the types are any combination of 'ALL', 'ADJECTIVES',
        'ADVERBS', 'CONTENT_ADVERBS', 'CONTENT_WORDS', 'NOUNS',
        'PUNCTUATION', 'TEXTRANK_WORDS', and 'VERBS'. The default is
        "[qw(CONTENT_WORDS)]", which equates to "[qw(CONTENT_ADVERBS VERBS
        ADJECTIVES NOUNS)]".
    "listOfPOSTagsToKeep"
         listOfPOSTagsToKeep => [...]
        "listOfPOSTagsToKeep" provides finer control over the
        parts-of-speech to be retained when filtering the tagged text. For a
        list of all the possible tags, call "getListOfPartOfSpeechTags()".
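    The constructor parameters above might be combined as in the
    following sketch. Only the documented parameter names are used; the
    variable "%constructorArgs" and the choice of "NOUNS" and "VERBS" are
    illustrative assumptions. The module is loaded at run time so the
    script prints a warning instead of failing if it is not installed.

```perl
use strict;
use warnings;

# Constructor arguments, using only the parameter names documented above.
my %constructorArgs = (
    endingSentenceTag    => 'PP',                # the documented default
    listOfPOSTypesToKeep => [qw(NOUNS VERBS)],   # rank on nouns and verbs only
);

# Build the summarizer only if the module can be loaded.
my $summarizerEn;
if (eval { require Text::Summarize::En; 1 }) {
    $summarizerEn = Text::Summarize::En->new(%constructorArgs);
}
else {
    warn "Text::Summarize::En is not installed; skipping.\n";
}
```

    Note that the word types are separated by whitespace inside "qw()";
    commas there would become part of the type names.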
METHODS
  "getSummaryUsingSumbasic"
    "getSummaryUsingSumbasic" computes the summary of text using the
    sumBasic algorithm.
    "listOfStemmedTaggedSentences"
         listOfStemmedTaggedSentences => [...]
        "listOfStemmedTaggedSentences" is an array reference containing the
        list of stemmed and part-of-speech tagged sentences from
        Text::StemTagPos. If "listOfStemmedTaggedSentences" is not defined,
        then the text to be processed should be provided via "listOfText".
    "listOfText"
         listOfText => [...]
        "listOfText" is an array reference containing the strings of text to
        be summarized. "listOfText" is only used if
        "listOfStemmedTaggedSentences" is undefined.
    "tokenWeight"
         tokenWeight => {}
        "tokenWeight" is an optional hash reference that can provide the
        weights for the tokens provided by "listOfStemmedTaggedSentences" or
        "listOfText". If "tokenWeight" is not defined, then the weight of a
        token is just its frequency of occurrence in the filtered text. If
        "textRankParameters" is defined, then the token weights are computed
        using Text::Categorize::Textrank.
    "textRankParameters"
         textRankParameters => undef
        If "textRankParameters" is defined, then the token weights for the
        sumBasic algorithm are computed using Text::Categorize::Textrank.
        The parameters to use for Text::Categorize::Textrank, excluding the
        "listOfTokens" parameters, can be set using the hash reference
        defined by "textRankParameters". For example, "textRankParameters =>
        {directedGraph => 1}" would make the textrank weights be computed
        using a directed token graph.
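    A call that uses textrank-based token weights might look like the
    sketch below. It reuses the text from the SYNOPSIS and the
    "directedGraph" option mentioned above; the variable names are
    illustrative, and the structure of the returned value is not assumed
    here, so it is simply dumped for inspection.

```perl
use strict;
use warnings;

my @listOfText = (
    'All people are equal. All men are equal. All are equal.',
);

# Arguments for getSummaryUsingSumbasic, using the documented parameter names.
my %summaryArgs = (
    listOfText         => \@listOfText,
    textRankParameters => { directedGraph => 1 },   # directed token graph
);

# Run only if both modules can be loaded.
if (eval { require Text::Summarize::En; require Data::Dump; 1 }) {
    my $summarizerEn = Text::Summarize::En->new();
    my $summary      = $summarizerEn->getSummaryUsingSumbasic(%summaryArgs);
    print Data::Dump::dump($summary), "\n";
}
else {
    warn "Text::Summarize::En or Data::Dump is not installed; skipping.\n";
}
```

    Omitting "textRankParameters" from the arguments falls back to the
    frequency-based token weights described under "tokenWeight".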
INSTALLATION
    Use the CPAN module to install the module and all its prerequisites.
      perl -MCPAN -e shell
      >install Text::Summarize
BUGS
    Please email bug reports or feature requests to
    "bug-text-summarize@rt.cpan.org", or through the web interface at
    <http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Text-Summarize>. The
    author will be notified and you can be automatically notified of
    progress on the bug fix or feature request.
AUTHOR
     Jeff Kubina <jeff.kubina@gmail.com>
COPYRIGHT
    Copyright (c) 2009 Jeff Kubina. All rights reserved. This program is
    free software; you can redistribute it and/or modify it under the same
    terms as Perl itself.
    The full text of the license can be found in the LICENSE file included
    with this module.
KEYWORDS
    information processing, summary, summaries, summarization, summarize,
    sumbasic, textrank
SEE ALSO
    Log::Log4perl, Text::Categorize::Textrank, Text::Summarize