Changes for version 0.04 - 2011-11-28

  • Modified extensive parts of the embedded documentation.
  • Added classes L::D::Variety, L::D::SamplingScheme, and L::D::VOCD, along with corresponding test files.
  • Lingua::Diversity (major refactoring):
      • Methods measure() and measure_per_category() are no longer abstract: they perform array validation and unit recoding, then pass the results on to the new abstract private method _measure(). This private method must return an L::D::Result object, which is forwarded directly as the return value of the public methods measure() and measure_per_category(). Note that _measure() is responsible for handling both the case where it is passed a single array by measure() and the case where it is passed two arrays by measure_per_category().
      • Subroutines _validate_size() and _prepend_unit_with_category() have been removed from L::D::Internals and added to this package (L::D). Tests and exception classes have been removed, moved, or renamed accordingly.
      • Attributes min_num_items and max_num_items (with private getters and setters) have been added and can be set from within derived classes if necessary.
      • This module now uses L::D::Variety, L::D::MTLD, and L::D::VOCD.
  • L::D::MTLD:
      • Refactored the code to match the modifications of L::D.
      • Fixed bug in _measure(), namely the case of a single partial factor with a TTR of 1. It now counts as 1 factor of length 0 (which is not very satisfying, but it is hard to come up with a better alternative).
  • L::D::Utils:
      • Fixed bug in split_tagged_text() which caused tags to be used in place of lemmas.
  • L::D::Internals:
      • Added export tag 'all'.
      • Added subroutines _sample_indices(), _count_types(), _count_frequency(), _shannon_entropy(), _perplexity(), _renyi_entropy(), and _get_units_per_category() (along with documentation and tests).
      • Moved subroutines _validate_size() and _prepend_unit_with_category() to the L::D module (along with documentation and tests).
      • Fixed variance precision problem in _get_average().
      • Added shortcut in _get_average() for the case where there is only 1 value.
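
The Lingua::Diversity refactoring described above follows a template-method layout: the public methods validate and recode, and a private method supplied by each derived class does the actual measuring. The following is a rough Python sketch of that layout, not the module's Perl code. The class names Result, Diversity, and TypeCount, the tab-joined recoding, and the use of ValueError are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class Result:
    """Hypothetical stand-in for L::D::Result: holds one diversity score."""

    def __init__(self, diversity):
        self.diversity = diversity


class Diversity(ABC):
    """Base class: public methods validate and recode, then delegate."""

    # Assumed defaults; derived classes may override them.
    min_num_items = 1
    max_num_items = None  # None means "no upper bound"

    def measure(self, units):
        self._validate_size(units)
        return self._measure(units)  # single-array case

    def measure_per_category(self, units, categories):
        self._validate_size(units)
        if len(units) != len(categories):
            raise ValueError("units and categories differ in length")
        # Assumed recoding: prepend each unit with its category.
        recoded = [c + "\t" + u for u, c in zip(units, categories)]
        return self._measure(recoded, categories)  # two-array case

    def _validate_size(self, units):
        n = len(units)
        if n < self.min_num_items:
            raise ValueError("too few items")
        if self.max_num_items is not None and n > self.max_num_items:
            raise ValueError("too many items")

    @abstractmethod
    def _measure(self, units, categories=None):
        """Return a Result; must handle one array or two."""


class TypeCount(Diversity):
    """Toy derived class: counts distinct units."""

    def _measure(self, units, categories=None):
        if categories is None:
            return Result(len(set(units)))
        # Average number of distinct units per category.
        per_category = {}
        for unit, category in zip(units, categories):
            per_category.setdefault(category, set()).add(unit)
        total = sum(len(s) for s in per_category.values())
        return Result(total / len(per_category))
```

With this layout a derived class only implements _measure(), and the size checks governed by min_num_items and max_num_items are shared by every measure.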
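
The L::D::MTLD fix above concerns an edge case that can be illustrated with the usual textbook formulation of MTLD: a partial factor is valued at (1 - TTR) / (1 - threshold), so a text whose only factor is partial and has a TTR of 1 contributes zero factors. The Python sketch below shows a single forward pass and is not the module's code. The function names, the default threshold of 0.72, and the strict "less than" comparison are assumptions.

```python
def count_factors(units, threshold=0.72):
    """One forward pass of MTLD factor counting.

    Returns (full_factors, partial_factor). partial_factor is the
    fractional factor for leftover units, or 0.0 if there are none.
    """
    full_factors = 0
    types = set()
    tokens = 0
    for unit in units:
        types.add(unit)
        tokens += 1
        if len(types) / tokens < threshold:
            # TTR dropped below the threshold: close the factor.
            full_factors += 1
            types.clear()
            tokens = 0
    partial = 0.0
    if tokens > 0:
        ttr = len(types) / tokens
        partial = (1 - ttr) / (1 - threshold)
    return full_factors, partial


def mtld_forward(units, threshold=0.72):
    """Tokens per factor; undefined when the factor count is zero."""
    full, partial = count_factors(units, threshold)
    factors = full + partial
    if factors == 0:
        # The edge case from the changelog: a single partial factor
        # with a TTR of 1 yields no factors at all.
        raise ValueError("no factors: TTR never left 1")
    return len(units) / factors
```

A text made only of distinct units, such as list("abcd"), returns (0, 0.0) from count_factors(), which is why an implementation needs an explicit rule for this case.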

Modules

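Lingua::Diversity - measuring the diversity of text units
Lingua::Diversity::Internals - utility subroutines for classes derived from Lingua::Diversity
Lingua::Diversity::MTLD - 'MTLD' method for measuring diversity of text units
Lingua::Diversity::Result - storing the result of a diversity measurement
Lingua::Diversity::SamplingScheme - storing the parameters of a sampling scheme
Lingua::Diversity::Utils - utility subroutines for users of classes derived from Lingua::Diversity
Lingua::Diversity::VOCD - 'VOCD' method for measuring diversity of text units
Lingua::Diversity::Variety - measuring the variety of text units
Lingua::Diversity::X - exception classes for Lingua::Diversity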