# NAME

Text::WagnerFischer::Amharic - The Wagner-Fischer Algorithm for Amharic.

# SYNOPSIS

```
use utf8;
use Text::WagnerFischer::Amharic qw(distance);
print distance ( "ፀሐይ", "ጸሀይ" ), "\n"; # prints "2"
print distance ( [0,2,3, 1,2,1, 1,1,1, 1], "ፀሐይ", "ጸሀይ" ), "\n"; # prints "2"
my @words = ( "ፀሐይ", "ፀሓይ", "ፀሀይ", "ፀሃይ", "ጸሐይ", "ጸሓይ", "ጸሀይ", "ጸሃይ" );
my @distances = distance ( "ፀሐይ", @words );
print "@distances\n"; # prints "0 1 1 1 1 2 2 2"
@distances = distance ( [0,2,3, 1,1,1, 1,1,1, 2], "ፀሐይ", @words );
print "@distances\n"; # prints "0 1 1 2 1 2 2 3"
```

# DESCRIPTION

This module implements the Wagner-Fischer edit distance algorithm for Ethiopic script under Amharic character classes.

The edit distance is a measure of the degree of proximity between two strings, based on "edits". Each type of edit is given its own cost (weight). In addition to the three initial Wagner-Fischer weights (a, b, c), the Amharic weight function recognizes 7 additional mismatch types (d through j):

```
         / a: x = y                          (cost for letter match)
w(x,y) = | b: x = - or y = -                 (cost for insertion/deletion operation)
         | c: x != y                         (cost for letter mismatch)
         | x =~ [#y#] and
         |   d: x =~ [=y=]                   (cost of decayed labiovelar)
         |   e: form(x) > 7 || form(y) > 7   (cost of labiovelar mismatch)
         |   f: else                         (cost of wrong form)
         | form(x) == form(y) and
         |   g: x =~ [=y=]                   (cost of grapheme mismatch)
         |   h: x is a shift-slip of y       (cost of shift key mismatch)
         |   i: else                         (cost of wrong base)
         \ j: x =~ [=y=]                     (cost of wrong grapheme and form, right phoneme)
```

These costs are given, in the order a through j, through an array reference passed as an optional first argument of the `distance` subroutine (see SYNOPSIS).

When two strings have distance 0, they are the same. Note that the distance is calculated to reach the _minimum_ cost, i.e. the most economical operation is chosen for each edit.
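
To illustrate what "minimum cost" means here, the following is a minimal sketch of the plain Wagner-Fischer recurrence using only the three initial weights (a, b, c). It is not this module's implementation: `wf_distance` is a hypothetical helper written for this example, and the Amharic-specific weights d through j are not modeled.

```perl
use strict;
use warnings;
use utf8;
use List::Util qw(min);

# Hypothetical helper for illustration only; not part of
# Text::WagnerFischer::Amharic. $w is [a, b, c]:
# match cost, insertion/deletion cost, mismatch cost.
sub wf_distance {
    my ($w, $s, $t) = @_;
    my ($match, $indel, $mismatch) = @$w;
    my @s = split //, $s;
    my @t = split //, $t;
    my ($m, $n) = (scalar @s, scalar @t);

    # $d[$i][$j] holds the minimum cost of turning the first $i
    # characters of $s into the first $j characters of $t.
    my @d;
    $d[$_][0] = $_ * $indel for 0 .. $m;
    $d[0][$_] = $_ * $indel for 0 .. $n;

    for my $i (1 .. $m) {
        for my $j (1 .. $n) {
            my $subst = $s[$i - 1] eq $t[$j - 1] ? $match : $mismatch;
            # Take the cheapest of deletion, insertion, and match/mismatch.
            $d[$i][$j] = min(
                $d[$i - 1][$j]     + $indel,
                $d[$i][$j - 1]     + $indel,
                $d[$i - 1][$j - 1] + $subst,
            );
        }
    }
    return $d[$m][$n];
}

print wf_distance([0, 1, 1], "kitten", "sitting"), "\n"; # prints "3"
print wf_distance([0, 1, 1], "ፀሐይ", "ፀሐይ"), "\n";         # prints "0"
```

Each cell takes the cheapest of the three possible operations, so the final cell is the minimum total cost. The Amharic weight function refines only the mismatch branch, replacing the single cost c with one of the costs d through j where the two characters are related.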

# BUGS

None presently known.

# AUTHOR

Daniel Yacob, Yacob@EthiopiaOnline.Net

# SEE ALSO

`Text::WagnerFischer`, `Text::Metaphone::Amharic`, `Regexp::Ethiopic`
