Changes for version 0.006 - 2012-12-27
- Stanislaw Pusep <creaktive@gmail.com>
- minor fixes
- documented multibyte parsing tricks
- break test
- leak test
- more optimizations
- better utf8/latin1 tests
- updated benchmark results
- added cosine_sim utility
- reallocation fixed
- graduated the uniq_wc tool
- make use of PerlIO::mmap layer
- examples cleanup
- optimizations
- implemented variable codetable (+ raw ASCII support)
- variable size arrays
- Dist::Zilla profile update
- Stanislaw Pusep <stanislav.poussep@buscapecompany.com>
- File::Slurp => File::Map
Documentation
compute cosine similarity between two documents
uses MinHash & SpeedyFx to compare large text data
efficiently count unique tokens from a file
Modules
tokenize/hash large amounts of strings efficiently