NAME

String::Fuzzy - Python-style fuzzy string matching (fuzzywuzzy port)

SYNOPSIS

use String::Fuzzy qw( extract_all extract_best fuzzy_substring_ratio partial_ratio ratio );

# Basic ratio with normalization (default)
my $score = ratio( "Hello", "hello" );  # 100 (normalized)

# Disable normalization for case-sensitive matching
my $raw_score = ratio( "Hello", "hello", normalize => 0 );  # ~80

# Find best match with index
my $best = extract_best( "cat", [ "cat", "category", "dog" ], scorer => \&partial_ratio );
print "Best: $best->[0], Score: $best->[1], Index: $best->[2]\n";

# Get all matches sorted by score
my $all = extract_all( "cat", [ "cat", "category", "dog" ] );
for ( @$all ) { print "Match: $_->[0], Score: $_->[1]\n"; }

# Practical example: Find the best vendor match with a typo
my @vendors = qw( SendGrid Mailgun SparkPost Postmark );
my $input = "SpakPost Invoice";
my $best_score = 0;
my $best_vendor;
for my $vendor ( @vendors ) {
    my $score = fuzzy_substring_ratio( $vendor, $input );
    if( $score > $best_score ) {
        $best_score = $score;
        $best_vendor = $vendor;
    }
}
if( $best_score >= 85 ) {
    print "Matched '$best_vendor' with score $best_score\n";  # SparkPost, 88.89
}

VERSION

v0.1.1

DESCRIPTION

This module provides fuzzy string matching modeled on Python's fuzzywuzzy library, porting its core functions to Perl. It supports multiple strategies for comparing strings that contain typos, extra words, or inconsistent formatting. By default, strings are normalized (lowercased, diacritics removed, punctuation stripped) before comparison; pass normalize => 0 to disable this.
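
As a rough illustration of the default normalization, the sketch below lowercases a string, removes diacritics, and strips punctuation. The helper name normalize_string is hypothetical and is not part of this module's API; the whitespace collapsing at the end is an assumption, and the module's actual rules may differ.

```perl
use strict;
use warnings;
use utf8;
use Unicode::Normalize qw( NFD );

# Hypothetical helper, NOT String::Fuzzy's own code.
sub normalize_string
{
    my $s = shift;
    $s = lc( $s );            # lowercase
    $s = NFD( $s );           # decompose accented characters into base + mark
    $s =~ s/\p{Mn}+//g;       # drop the combining marks (diacritics)
    $s =~ s/[^\w\s]+//g;      # strip punctuation
    $s =~ s/\s+/ /g;          # collapse runs of whitespace (assumption)
    $s =~ s/^ | $//g;         # trim leading and trailing space (assumption)
    return $s;
}

print normalize_string( "Crème  Brûlée!" ), "\n";   # creme brulee
```

With normalization of this kind, "Hello" and "hello" compare as identical, which matches the score of 100 shown in the SYNOPSIS.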

FUNCTIONS

All functions accept an optional normalize parameter (default: 1) to toggle string normalization.

ratio($a, $b, %opts)

Computes the Levenshtein similarity between two strings and returns it as a floating-point score from 0 (nothing in common) to 100 (identical).
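
A minimal sketch of a Levenshtein-based score follows. It assumes the score is 100 * (1 - distance / length of the longer string), which is consistent with the figure of about 80 quoted in the SYNOPSIS for "Hello" versus "hello" without normalization, but the module's exact formula is not stated here. The names levenshtein and sketch_ratio are hypothetical.

```perl
use strict;
use warnings;
use List::Util qw( min max );

# Classic dynamic-programming edit distance, keeping only two rows.
sub levenshtein
{
    my( $s, $t ) = @_;
    my @prev = ( 0 .. length( $t ) );
    for my $i ( 1 .. length( $s ) )
    {
        my @cur = ( $i );
        for my $j ( 1 .. length( $t ) )
        {
            my $cost = substr( $s, $i - 1, 1 ) eq substr( $t, $j - 1, 1 ) ? 0 : 1;
            push @cur, min(
                $prev[$j] + 1,            # deletion
                $cur[$j - 1] + 1,         # insertion
                $prev[$j - 1] + $cost,    # substitution (free if chars match)
            );
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# Assumed scoring formula; not confirmed as the module's own.
sub sketch_ratio
{
    my( $s, $t ) = @_;
    my $longest = max( length( $s ), length( $t ) );
    return 100 if $longest == 0;
    return 100 * ( 1 - levenshtein( $s, $t ) / $longest );
}

printf "%.2f\n", sketch_ratio( "Hello", "hello" );   # 80.00
printf "%.2f\n", sketch_ratio( "hello", "hello" );   # 100.00
```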

partial_ratio($a, $b, %opts)

Slides the shorter string over the longer one to find the best fixed-length match.

Returns 100 if the shorter string is fully contained in the longer one.
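
The sliding-window idea can be sketched as follows. To keep the example short, each window is scored with a simple stand-in (the percentage of positions where the characters agree) rather than the Levenshtein ratio, so intermediate scores will differ from the module's. Both function names are hypothetical.

```perl
use strict;
use warnings;

# Stand-in scorer for two equal-length strings: percentage of matching positions.
sub position_score
{
    my( $s, $t ) = @_;
    my $len = length( $s );
    return 100 if $len == 0;
    my $same = grep { substr( $s, $_, 1 ) eq substr( $t, $_, 1 ) } 0 .. $len - 1;
    return 100 * $same / $len;
}

# Slide the shorter string across the longer one and keep the best window score.
sub sketch_partial_ratio
{
    my( $s, $t ) = @_;
    my( $short, $long ) = length( $s ) <= length( $t ) ? ( $s, $t ) : ( $t, $s );
    my $n    = length( $short );
    my $best = 0;
    for my $start ( 0 .. length( $long ) - $n )
    {
        my $score = position_score( $short, substr( $long, $start, $n ) );
        $best = $score if $score > $best;
        last if $best == 100;    # exact containment, cannot improve
    }
    return $best;
}

print sketch_partial_ratio( "cat", "category" ), "\n";      # 100
print sketch_partial_ratio( "cat", "concatenate" ), "\n";   # 100
print sketch_partial_ratio( "cat", "dog" ), "\n";           # 0
```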

fuzzy_substring_ratio($needle, $haystack, %opts)

Searches for the best fuzzy match of $needle in $haystack across variable-length windows. Useful for OCR noise or embedded typos.
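
One way to picture variable-length windows is the sketch below. It tries every window whose length is within a small tolerance of the needle's length and scores each with an assumed 100 * (1 - distance / longer length) formula. The tolerance ($slack), the formula, and both function names are assumptions made for illustration; the module may choose its windows differently. Inputs are lowercased by hand here to stand in for normalization.

```perl
use strict;
use warnings;
use List::Util qw( min max );

# Two-row dynamic-programming Levenshtein distance.
sub edit_distance
{
    my( $s, $t ) = @_;
    my @prev = ( 0 .. length( $t ) );
    for my $i ( 1 .. length( $s ) )
    {
        my @cur = ( $i );
        for my $j ( 1 .. length( $t ) )
        {
            my $cost = substr( $s, $i - 1, 1 ) eq substr( $t, $j - 1, 1 ) ? 0 : 1;
            push @cur, min( $prev[$j] + 1, $cur[$j - 1] + 1, $prev[$j - 1] + $cost );
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# Hypothetical sketch: score every window of length n - slack .. n + slack.
sub sketch_substring_ratio
{
    my( $needle, $haystack, $slack ) = @_;
    $slack //= 2;    # assumed tolerance, not a documented option
    my $n    = length( $needle );
    my $h    = length( $haystack );
    my $best = 0;
    for my $len ( max( 1, $n - $slack ) .. min( $h, $n + $slack ) )
    {
        for my $start ( 0 .. $h - $len )
        {
            my $window = substr( $haystack, $start, $len );
            my $score  = 100 * ( 1 - edit_distance( $needle, $window ) / max( $n, $len ) );
            $best = $score if $score > $best;
        }
    }
    return $best;
}

printf "%.2f\n", sketch_substring_ratio( "sparkpost", "spakpost invoice" );   # 88.89
```

Under these assumptions the best window is "spakpost", one edit away from the nine-character needle, which reproduces the 88.89 shown in the SYNOPSIS.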

token_sort_ratio($a, $b, %opts)

Ignores word order by sorting tokens before comparison.
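
The token-sorting step can be sketched like this. The helper token_sort_key is hypothetical; it shows only how two strings with the same words in a different order reduce to the same key, after which any ratio function would score them as identical.

```perl
use strict;
use warnings;

# Hypothetical helper: lowercase, split on whitespace, sort, and rejoin.
sub token_sort_key
{
    my $s = shift;
    my @tokens = grep { length } split( /\s+/, lc( $s ) );
    return join( ' ', sort @tokens );
}

print token_sort_key( "New York Mets" ), "\n";   # mets new york
print token_sort_key( "Mets New York" ), "\n";   # mets new york
```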

token_set_ratio($a, $b, %opts)

Focuses on common word tokens, ignoring duplicates and order.
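
Python's fuzzywuzzy implements this by splitting the two token sets into a shared part and two remainders, building three strings from them, and taking the best pairwise ratio. The sketch below shows only that decomposition, assuming this module follows the same approach; token_set_strings is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (common, common + only-in-s, common + only-in-t).
sub token_set_strings
{
    my( $s, $t ) = @_;
    # Hash keys deduplicate the tokens.
    my %in_s = map { $_ => 1 } grep { length } split( /\s+/, lc( $s ) );
    my %in_t = map { $_ => 1 } grep { length } split( /\s+/, lc( $t ) );
    my @common = grep {  $in_t{$_} } sort keys %in_s;
    my @only_s = grep { !$in_t{$_} } sort keys %in_s;
    my @only_t = grep { !$in_s{$_} } sort keys %in_t;
    my $base   = join( ' ', @common );
    # Filtering empty parts avoids stray spaces when a remainder is empty.
    my $with_s = join( ' ', grep { length } $base, join( ' ', @only_s ) );
    my $with_t = join( ' ', grep { length } $base, join( ' ', @only_t ) );
    return( $base, $with_s, $with_t );
}

print "$_\n" for token_set_strings( "fuzzy fuzzy was a bear", "a bear was fuzzy wuzzy" );
# a bear fuzzy was
# a bear fuzzy was
# a bear fuzzy was wuzzy
```

Because the first two strings are identical, the best pairwise score for this pair would be 100 even though the second input has an extra word.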

extract_best($query, \@choices, %opts)

Returns the best match as [$string, $score, $index]. Accepts scorer (default: \&ratio) and limit (default: 1) for top-N results.

extract_all($query, \@choices, %opts)

Returns all matches as [[string, score], ...], sorted by score descending.

Accepts scorer (default: \&ratio).
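
The ranking behind extract_best and extract_all can be pictured as "score every choice, remember its index, sort by score descending". The sketch below does that with a pluggable scorer. The names rank_choices and prefix_score are hypothetical, and the toy prefix scorer merely stands in for ratio so that the example stays self-contained.

```perl
use strict;
use warnings;

# Toy stand-in scorer: shared-prefix length as a percentage of the longer string.
sub prefix_score
{
    my( $s, $t ) = @_;
    my $longest = length( $s ) > length( $t ) ? length( $s ) : length( $t );
    return 100 if $longest == 0;
    my $i = 0;
    $i++ while $i < length( $s )
            && $i < length( $t )
            && substr( $s, $i, 1 ) eq substr( $t, $i, 1 );
    return 100 * $i / $longest;
}

# Hypothetical ranking helper: returns [ [ string, score, index ], ... ],
# best score first.
sub rank_choices
{
    my( $query, $choices, %opts ) = @_;
    my $scorer = $opts{scorer} || \&prefix_score;
    my @ranked = sort { $b->[1] <=> $a->[1] }
                 map  { [ $choices->[$_], $scorer->( $query, $choices->[$_] ), $_ ] }
                 0 .. $#{$choices};
    return \@ranked;
}

my $ranked = rank_choices( "cat", [ "dog", "category", "cat" ] );
printf "%s %.1f (index %d)\n", @$_ for @$ranked;
# cat 100.0 (index 2)
# category 37.5 (index 1)
# dog 0.0 (index 0)
```

The first element of the ranked list corresponds to what extract_best describes, and the full list to what extract_all describes.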

AUTHOR

Albert (ChatGPT) from OpenAI, with enhancements by Grok 3 from xAI.

Supported by Jacques Deguest <jack@deguest.jp>.

LICENSE

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
