NAME

WebService::DetectLanguage - interface to the language detection API at DetectLanguage.com

SYNOPSIS

 use WebService::DetectLanguage;
 my $api = WebService::DetectLanguage->new(key => '...');
 my @possibilities = $api->detect("there can be only one");
 foreach my $poss (@possibilities) {
     printf "language = %s  confidence=%f\n",
            $poss->language->name,
            $poss->confidence;
 }

DESCRIPTION

This module is an interface to the DetectLanguage service, which provides an API for guessing what natural language is used in a sample of text.

This is very much a first cut at an interface, so (a) the interface may well change, and (b) contributions are welcome.

To use the API you must sign up to get an API key, at https://detectlanguage.com/plans. There is a free level which lets you make 1,000 requests per day, and you don't have to provide a card to sign up for the free level.

Example Usage

Let's say you've got a sample of text in a file. You might read it into $text using read_text() from File::Slurper.

To identify the language, you call the detect() method:

 @results = $api->detect($text);

Each result is an instance of WebService::DetectLanguage::Result. If there's only one result, you should look at the is_reliable flag to see whether the service is confident of the identification. In general, the more text you give it, the more confident it is.

 if (@results == 1) {
   $result = $results[0];
   if ($result->is_reliable) {
     printf "Language is %s!\n", $result->language->name;
   }
   else {
     # Hmm, maybe check with the user?
   }
 }

You might get more than one result though. This might happen if your sample contains words from more than one language, for example.

In that case, the is_reliable flag can be used to check if the first result is reliable enough to go with.

 if (@results > 1 && $results[0]->is_reliable) {
   # we'll go with that!
 }

At most one result will ever have is_reliable set to a true value. If you get multiple results, they're always in decreasing order of reliability.
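
The single-result and multiple-result cases above can be folded into one helper. The following is a minimal sketch and not part of the module: choose_language() is a hypothetical name, the Stub::* packages are stand-ins that only mimic the language, confidence, and is_reliable accessors so the example runs without an API key, and the sample data is made up.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the result and language objects, mimicking
# only the accessors described above, so this runs without an API key.
package Stub::Language {
    sub new  { my ($class, %args) = @_; return bless {%args}, $class }
    sub name { return $_[0]{name} }
    sub code { return $_[0]{code} }
}

package Stub::Result {
    sub new         { my ($class, %args) = @_; return bless {%args}, $class }
    sub language    { return $_[0]{language} }
    sub confidence  { return $_[0]{confidence} }
    sub is_reliable { return $_[0]{is_reliable} }
}

package main;

# Return the first result if it is flagged reliable; otherwise return
# nothing, leaving the caller to fall back (for example, ask the user).
sub choose_language {
    my @results = @_;
    return unless @results;
    return $results[0] if $results[0]->is_reliable;
    return;
}

# Made-up sample data, not real API output.
my @results = (
    Stub::Result->new(
        language    => Stub::Language->new(code => 'en', name => 'English'),
        confidence  => 12.5,
        is_reliable => 1,
    ),
    Stub::Result->new(
        language    => Stub::Language->new(code => 'fr', name => 'French'),
        confidence  => 3.2,
        is_reliable => 0,
    ),
);

my $best = choose_language(@results);
if (defined $best) {
    printf "Language is %s!\n", $best->language->name;
}
else {
    print "Not sure; maybe check with the user.\n";
}
```

With real data you would pass the list returned by detect() straight to the helper. Because only the first result can be reliable, the helper never needs to look past $results[0].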

Each result also includes a confidence value, which looks a bit like a percentage, but the service's FAQ says that it can go higher than 100.

 foreach my $result (@results) {
   my $language = $result->language;
   printf "language = %s (%s) with confidence %f\n",
       $language->name,
       $language->code,
       $result->confidence;
 }

METHODS

new

You must provide the key that you got from detectlanguage.com.

 my $api = WebService::DetectLanguage->new(
               key         => '...',
           );

detect

This method takes a UTF-8 text string, and returns a list of one or more guesses at the language.

Each guess is a data object which has attributes language, confidence, and is_reliable.

 my $text    = "It was a bright cold day in April, ...";
 my @results = $api->detect($text);

 foreach my $result (@results) {
   printf "language = %s (%s)  confidence = %f  reliable = %s\n",
       $result->language->name,
       $result->language->code,
       $result->confidence,
       $result->is_reliable ? 'Yes' : 'No';
 }

Look at the API documentation to see how to interpret each result.

multi_detect

This takes multiple strings and returns a list of arrayrefs; there is one arrayref for each string, returned in the same order as the strings. Each arrayref contains one or more language guesses, as for detect() above.

 my @strings = (
   "All happy families are alike; each unhappy family ... ",
   "This is my favourite book in all the world, though ... ",
   "It is a truth universally acknowledged, that Perl ... ",
 );

 my @results = $api->multi_detect(@strings);

 for (my $i = 0; $i < @strings; $i++) {
    print "Text: $strings[$i]\n";
    my @guesses = @{ $results[$i] };

    # ... handle @guesses as for detect() above
 }
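
Since the arrayrefs come back in the same order as the strings, each string can be paired with its top guess by index. This is a minimal sketch under stated assumptions: top_guesses() is a hypothetical helper, the Stub::* packages are stand-ins for the module's objects, and the data is made up to match the shape described above rather than taken from the API.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the module's objects, so this runs offline.
package Stub::Language {
    sub new  { my ($class, %args) = @_; return bless {%args}, $class }
    sub name { return $_[0]{name} }
}

package Stub::Result {
    sub new         { my ($class, %args) = @_; return bless {%args}, $class }
    sub language    { return $_[0]{language} }
    sub is_reliable { return $_[0]{is_reliable} }
}

package main;

# Pair each input string with the first guess from its arrayref.
# Takes two array references: the strings and the per-string results.
sub top_guesses {
    my ($strings, $results) = @_;
    return map { [ $strings->[$_], $results->[$_][0] ] } 0 .. $#{$strings};
}

# Made-up data shaped like the list of arrayrefs described above.
my @strings = ("All happy families are alike", "Bonjour tout le monde");
my @results = (
    [
        Stub::Result->new(
            language    => Stub::Language->new(name => 'English'),
            is_reliable => 1,
        ),
    ],
    [
        Stub::Result->new(
            language    => Stub::Language->new(name => 'French'),
            is_reliable => 1,
        ),
        Stub::Result->new(
            language    => Stub::Language->new(name => 'Italian'),
            is_reliable => 0,
        ),
    ],
);

for my $pair (top_guesses(\@strings, \@results)) {
    my ($text, $guess) = @{$pair};
    printf "%-30s => %s\n", $text, $guess->language->name;
}
```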

languages

This returns a list of the supported languages:

 my @languages = $api->languages;

 foreach my $language (@languages) {
   printf "%s: %s\n",
          $language->code,
          $language->name;
 }
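
If you need to map codes back to names more than once, the list can be turned into a lookup hash. A minimal sketch, assuming only the code and name accessors shown above; Stub::Language is a hypothetical stand-in for the module's language objects and the three entries are made up.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the module's language objects.
package Stub::Language {
    sub new  { my ($class, %args) = @_; return bless {%args}, $class }
    sub code { return $_[0]{code} }
    sub name { return $_[0]{name} }
}

package main;

# Made-up subset; with the real module this would be $api->languages.
my @languages = (
    Stub::Language->new(code => 'en', name => 'English'),
    Stub::Language->new(code => 'fr', name => 'French'),
    Stub::Language->new(code => 'de', name => 'German'),
);

# Build a code => name lookup table. The parentheses make it
# unambiguous to the parser that the braces are a block.
my %name_for = map { ( $_->code => $_->name ) } @languages;

for my $code (qw(fr xx)) {
    printf "%s: %s\n", $code, $name_for{$code} // '(not in the list)';
}
```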

account_status

This returns a bunch of information about your account:

 my $status = $api->account_status;

 printf "plan=%s  status=%s  requests=%d\n",
   $status->plan,
   $status->status,
   $status->requests;

For the full list of attributes, see either the API documentation or WebService::DetectLanguage::AccountStatus.

SEE ALSO

https://detectlanguage.com is the home page for the service; documentation for the API can be found at https://detectlanguage.com/documentation.

REPOSITORY

https://github.com/neilb/WebService-DetectLanguage

AUTHOR

Neil Bowers <neilb@cpan.org>

LICENSE AND COPYRIGHT

This software is copyright (c) 2019 by Neil Bowers <neilb@cpan.org>.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.