WWW::Yandex::Catalog::LookupSite - DEPRECATED
DEPRECATION NOTE: Yandex closed its Yandex.Catalog service in 2017 and replaced its Thematic Index of Citing (тИЦ) with the Quality Index of Site (ИКС) in 2018. This module is therefore deprecated.
---
Query Yandex Catalog for a website's presence, its Index of Citing, descriptions, and the list of categories it belongs to.
    use WWW::Yandex::Catalog::LookupSite;

    my $site = WWW::Yandex::Catalog::LookupSite->new();
    $site->yaca_lookup('www.slovnik.org');

    print $site->tic . "\n";
    print $site->short_description . "\n";
    print $site->long_description . "\n";
    print shift @{$site->categories};
The WWW::Yandex::Catalog::LookupSite module retrieves a website's Thematic Index of Citing, checks whether the website is present in Yandex Catalog, and retrieves its descriptions as recorded in the catalog along with the list of categories it belongs to.
This module uses LWP::UserAgent for making requests to Yandex Catalog.
Thematic Index of Citing (tIC) is a Yandex technology similar to Google's PageRank. tIC values come in steps of 10, so when a site's tIC is under 10, this module returns 0.
Each website in the Yandex Catalog has a short description.
Not every website in the Yandex Catalog has a long description.
Every website in the Yandex Catalog will belong to at least one category. It may belong to several other categories as well.
Yandex Catalog may know the website under a different URI.
The order number (position) of the site in the main category where it is listed is also available.
Creates and returns a new WWW::Yandex::Catalog::LookupSite object.
All options are passed on to LWP::UserAgent (see the LWP::UserAgent documentation for the available options).
    my $site = WWW::Yandex::Catalog::LookupSite->new();

    my $site = WWW::Yandex::Catalog::LookupSite->new(
        agent      => 'Mozilla/5.0 (Windows NT 6.0; rv:30.0) Gecko/20100101 Firefox/30.0',
        cookie_jar => {},
    );
Given a URL/URI, strips unnecessary data from it (scheme, authentication, port, and query), queries Yandex Catalog with the result, and parses the response for data.
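The stripping step described above can be illustrated with a minimal sketch in core Perl. The helper name strip_uri and the regular expressions are assumptions made for illustration, not the module's actual implementation; in particular, whether the module keeps the path is not stated here, so this sketch simply leaves it in place.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module: drop the scheme, credentials,
# port, and query string from a URL, leaving the host and any path.
sub strip_uri {
    my ($uri) = @_;
    $uri =~ s{^[a-z][a-z0-9+.\-]*://}{}i;   # scheme, e.g. "http://"
    $uri =~ s{\?.*$}{}s;                    # query string
    $uri =~ s{^[^/@]*@}{};                  # "user:password@"
    $uri =~ s{^([^/:]+):\d+}{$1};           # port after the host, e.g. ":8080"
    return $uri;
}

print strip_uri('http://user:secret@www.slovnik.org:8080/path?x=1'), "\n";
```

The query is removed before the credentials so that an "@" inside a query string is not mistaken for the end of the authentication part.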
Returns an array ref of the form [ tIC, short description, long description, [ categories ], uri, ordinal number ]. Returns undef if the URI could not be fetched.
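As a rough illustration of working with that return value, the sketch below unpacks an array ref of the documented shape. The sample data and the helper name summarize_lookup are hypothetical; a real result would come from $site->yaca_lookup( $uri ), which cannot be shown live because the service is closed.

```perl
use strict;
use warnings;

# Hypothetical sample shaped like the documented return value:
# [ tIC, short description, long description, [ categories ], uri, ordinal number ]
my $result = [ 120, 'Short text', undef, ['Reference / Dictionaries'], 'www.slovnik.org', 7 ];

# Hypothetical helper: turn a lookup result into a one-line summary.
sub summarize_lookup {
    my ($result) = @_;
    return 'lookup failed' unless defined $result;    # undef: fetch failed

    my ( $tic, $short, $long, $categories, $uri, $order ) = @{$result};
    $tic = 'unknown' unless defined $tic;             # tIC may be undef on a parse error

    # An empty category list means the site is not in the catalog.
    return "tIC $tic, not in catalog" unless @{ $categories || [] };

    return sprintf 'tIC %s, #%s in "%s" as %s', $tic, $order, $categories->[0], $uri;
}

print summarize_lookup($result), "\n";
```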
undef if there was an error getting or parsing data; otherwise a numeric string with the tIC value.
This value (zero or greater, in steps of 10) is always returned. A tIC value of zero indicates either that the site's tIC is very low (under 10) or that no such site exists.
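Because a tIC of zero is ambiguous, it can help to combine it with the catalog presence flag. The following is a minimal sketch under that assumption; interpret_tic is a hypothetical helper whose arguments stand in for the values of $site->tic and $site->is_in_catalog.

```perl
use strict;
use warnings;

# Hypothetical helper: $tic stands in for $site->tic,
# $in_catalog for $site->is_in_catalog.
sub interpret_tic {
    my ( $tic, $in_catalog ) = @_;
    return 'lookup error' unless defined $tic;    # undef: fetch or parse error
    return "tIC is $tic" if $tic > 0;
    # tIC is 0 from here on: a catalog entry at least shows the site is known.
    return 'site is in the catalog, tIC under 10' if $in_catalog;
    return 'tIC under 10, or no such site';
}

print interpret_tic( 0, 1 ), "\n";
```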
Returned only when the site is present in the Catalog (in UTF-8 encoding); undef otherwise.
Can be undef even when the site is present in the Catalog, because not all sites in the catalog have a long description. Returned in UTF-8 encoding.
An empty list is returned when the site is not present in the Catalog. The list has at least one entry when the site is present.
Strings in the array are formatted like "Auto & Moto / Motorcycles / Yamaha". The leading "Catalog / " is stripped, since there are no websites in the root of the Catalog.
Note: after a recent change, Yandex Catalog no longer provides all the categories a website is featured in; only the main category is available (though a site can still be featured in several categories).
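The category string format described above can be illustrated with a short sketch that strips the leading "Catalog / " and splits the rest into levels. The helper name category_levels is hypothetical and the regular expressions are an assumption about the format, not the module's own parsing code.

```perl
use strict;
use warnings;

# Hypothetical helper: drop a leading "Catalog / " and split the
# remaining path into its individual levels.
sub category_levels {
    my ($category) = @_;
    $category =~ s{^Catalog\s*/\s*}{};
    return split m{\s*/\s*}, $category;
}

my @levels = category_levels('Catalog / Auto & Moto / Motorcycles / Yamaha');
print join( ' > ', @levels ), "\n";
```

Strings that have already had the prefix removed pass through the substitution unchanged, so the helper can be applied to either form.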
Address as stored in the Catalog.
The URI stored in the Catalog can differ from the input URI: for example, it may add or drop the www prefix, or be a completely different address (www.narod.ru -> narod.yandex.ru). IDNs are stored in Punycode in the Catalog; they can be converted to UTF-8 with the uri_utf8() convenience method if the optional URI::UTF8::Punycode module is installed.
The listing number (ranking) in the main category (index 0 of the categories array).
Returned only when site is present in catalog; undef otherwise.
These methods can be called only after $site->yaca_lookup( $uri )
Returns 1 if any categories have been retrieved; 0 otherwise.
    print $site->tic . "\n";
    if ( $site->is_in_catalog ) {
        print $site->short_description . "\n";
        print $site->long_description . "\n";
        print "[" . $site->order_number . "] " . ( shift @{$site->categories} ) . "\n";
        print "$_\n" foreach @{$site->categories};
    }
Returns the URI in UTF-8 instead of Punycode. This method requires the optional URI::UTF8::Punycode module.
Irakliy Sunguryan, slovnik.org
Repository: https://github.com/OpossumPetya/WWW-Yandex-Catalog-LookupSite.
Please report any bugs at GitHub, RT, or via email bug-www-yandex-catalog-lookupsite at rt.cpan.org.
Copyright 2010-2014 Irakliy Sunguryan.
This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.
To install WWW::Yandex::Catalog::LookupSite, copy and paste the appropriate command into your terminal.
cpanm
cpanm WWW::Yandex::Catalog::LookupSite
CPAN shell
perl -MCPAN -e shell
install WWW::Yandex::Catalog::LookupSite
For more information on module installation, please visit the detailed CPAN module installation guide.