
Changes for version 0.014 - 2022-07-08

  • isWORDCHAR_utf8_safe() / toLOWER_utf8_safe() are actually available since Perl v5.26 (Stanislaw Pusep)
  • eg/ improvements (Stanislaw Pusep)


compute cosine similarity between two documents
uses MinHash & SpeedyFx to compare large text data
efficiently count unique tokens from a file
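
The first and last of these tasks can be illustrated with a minimal sketch. It is written in Python rather than Perl and does not use SpeedyFx at all: the function names `token_counts` and `cosine_similarity` are hypothetical, and tokenization is a plain regular expression over word characters, which is only an assumption about how the real scripts split text.

```python
import math
import re
from collections import Counter


def token_counts(text):
    """Lowercase the text and count each run of word characters.

    Assumption: a token is a maximal run of \\w characters.
    """
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse token-count vectors."""
    # Only tokens present in both documents contribute to the dot product.
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


doc_a = token_counts("the cat sat on the mat")
doc_b = token_counts("the cat lay on the rug")

unique_tokens = len(doc_a)                 # number of distinct tokens in doc_a
similarity = cosine_similarity(doc_a, doc_b)
```

Counting unique tokens is then just the size of the count table, and the similarity is 1.0 for identical token distributions and 0.0 for documents with no tokens in common.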


tokenize/hash large amounts of strings efficiently