WWW::Wikipedia::LangTitles - get interwiki links from Wikipedia.
use utf8;
use WWW::Wikipedia::LangTitles 'get_wiki_titles';
my $title = 'Three-phase electric power';
my $links = get_wiki_titles ($title);
print "$title is '$links->{de}' in German.\n";
my $film = '東京物語';
my $flinks = get_wiki_titles ($film, lang => 'ja');
print "映画「$film」はイタリア語で「$flinks->{it}」と名付けた。\n";
produces output
Three-phase electric power is 'Dreiphasenwechselstrom' in German.
映画「東京物語」はイタリア語で「Viaggio a Tokyo」と名付けた。
(This example is included as synopsis.pl in the distribution.)
This documents version 0.04 of WWW::Wikipedia::LangTitles corresponding to git commit cd5d0156c401472bc424421159fca7d3c0f769fe released on Thu Jul 20 13:15:53 2017 +0900.
This module retrieves the Wikipedia interwiki link titles from the web site wikidata.org. It can be used, for example, to translate a term in English into other languages, or to get near equivalents.
my $ref = get_wiki_titles ('Helium');
Given the title of a Wikipedia article as its argument, get_wiki_titles returns a hash reference whose keys are language codes and whose values are the titles of the equivalent Wikipedia articles in those languages. For example, in the above case of Helium, $ref->{th} will be equal to ฮีเลียม, the Thai title of the Wikipedia article on helium.
The language of the original page can be specified like this:
use utf8;
my $from_th = get_wiki_titles ('ฮีเลียม', lang => 'th');
The URL is encoded using "uri_escape_utf8" in URI::Escape, so pass character strings rather than byte strings as titles (for example, put "use utf8;" in a script which contains literal non-ASCII titles).
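The difference between character and byte strings can be seen in the following minimal sketch. It is an illustration only, not code from the module: it assumes the title arrives as UTF-8 bytes (as it might from @ARGV or an undecoded file handle) and uses decode_utf8 from Encode to turn it into a character string before escaping.

```perl
use strict;
use warnings;
use utf8;
use Encode 'decode_utf8';
use URI::Escape 'uri_escape_utf8';

# With "use utf8;", a literal in the source is already a character string.
my $literal = 'ฮีเลียม';

# The same title as raw UTF-8 bytes, for example as read from @ARGV.
my $bytes = "\xE0\xB8\xAE\xE0\xB8\xB5\xE0\xB9\x80\xE0\xB8\xA5"
    . "\xE0\xB8\xB5\xE0\xB8\xA2\xE0\xB8\xA1";

# Decode the bytes into characters before using them as a title.
my $chars = decode_utf8 ($bytes);

# Escaping the character string gives the percent-encoding of the
# UTF-8 bytes, as seen in the Wikidata URLs in this documentation.
my $good = uri_escape_utf8 ($chars);

# Escaping the undecoded bytes treats each byte as a separate
# character and encodes it a second time.
my $bad = uri_escape_utf8 ($bytes);

print "characters: $good\n";
print "bytes:      $bad\n";
```

If the two printed lines differ, the byte string was encoded twice, which is the mistake that decoding the input first is meant to avoid.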
As of version 0.04, get_wiki_titles deletes the non-encyclopedia sites like Wikiquote and Wikiversity from the list of returned values.
my $url = make_wiki_url ('helium');
Make a URL for the Wikidata page. You will then need to retrieve the page and parse the JSON yourself. Use a second argument to specify the language of the page:
use utf8;
use WWW::Wikipedia::LangTitles 'make_wiki_url';
print make_wiki_url ('ฮีเลียม', 'th'), "\n";
https://www.wikidata.org/w/api.php?action=wbgetentities&sites=thwiki&titles=%E0%B8%AE%E0%B8%B5%E0%B9%80%E0%B8%A5%E0%B8%B5%E0%B8%A2%E0%B8%A1&props=sitelinks/urls|datatype&format=json
(This example is included as thai-url.pl in the distribution.)
If no language is specified, the default is en for English.
This method was added in version 0.02 of the module.
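For readers who retrieve the page themselves, the parsing step might look like the following sketch. It is not the module's own implementation: titles_from_json is a hypothetical helper, the sample JSON is a made-up, cut-down response with a placeholder entity id, and the rule that Wikipedia sites are the sitelink keys ending in "wiki" is an assumption. The core module JSON::PP is used here so that the sketch runs without extra dependencies.

```perl
use strict;
use warnings;
use JSON::PP 'decode_json';

# Hypothetical helper, not part of WWW::Wikipedia::LangTitles: given
# the JSON text of a "wbgetentities" response, return a hash
# reference mapping language codes to article titles.
sub titles_from_json
{
    my ($json) = @_;
    my $data = decode_json ($json);
    my %titles;
    for my $entity (values %{$data->{entities} || {}}) {
        # Entities for missing pages have no "sitelinks" entry.
        my $sitelinks = $entity->{sitelinks} or next;
        for my $site (keys %$sitelinks) {
            # Assumption: Wikipedia sites are keyed as language code
            # plus "wiki", such as "enwiki". This skips keys like
            # "enwikiquote", but would still let through keys like
            # "commonswiki", which need separate handling.
            if ($site =~ /^(.+)wiki$/) {
                $titles{$1} = $sitelinks->{$site}{title};
            }
        }
    }
    return \%titles;
}

# Made-up sample response; "QXXX" is a placeholder entity id.
my $sample = <<'EOF';
{"entities":{"QXXX":{"sitelinks":{
"enwiki":{"site":"enwiki","title":"Helium"},
"dewiki":{"site":"dewiki","title":"Helium"},
"enwikiquote":{"site":"enwikiquote","title":"Helium"}}}}}
EOF

my $titles = titles_from_json ($sample);
for my $lang (sort keys %$titles) {
    print "$lang: $titles->{$lang}\n";
}
```

With a live request, the JSON text would instead be the undecoded body of the response fetched from the URL returned by make_wiki_url, for example using LWP::UserAgent.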
Locale::Codes::Language enables one to convert the language codes which this module returns as keys into the English-language names of the languages.
use utf8;
use FindBin '$Bin';
use WWW::Wikipedia::LangTitles 'get_wiki_titles';
use Locale::Codes::Language;
my $article = 'King Kong';
my $titles = get_wiki_titles ($article);
for my $lang (keys %$titles) {
    my $l2c = code2language ($lang);
    if (! $l2c) {
        $l2c = $lang;
    }
    my $name = $titles->{$lang};
    if ($name ne $article) {
        print "$name in $l2c.\n";
    }
}
produces output
king.kong in jbo.
קינג קונג in Hebrew.
Кинг Конг in Bulgarian.
キングコング in Japanese.
كينغ كونغ in Arabic.
Кінг-Конг in Ukrainian.
King Kong (hahmo) in Finnish.
金剛 (怪獸) in Chinese.
Քինգ Քոնգ in Armenian.
คิงคอง in Thai.
کینگ کونگ in Persian.
Кинг-Конг in Russian.
킹콩 in Korean.
კინგ კონგი in Georgian.
(This example is included as locale-codes.pl in the distribution.)
Carp is used to report errors.
LWP::UserAgent is used to retrieve the data from Wikidata.
JSON::Parse is used to parse the JSON data from Wikidata.
URI::Escape is used to make the URLs for Wikidata from the input titles.
Nothing is exported by default. The export tag ':all' exports all the functions of the module.
use WWW::Wikipedia::LangTitles ':all';
The default tests of the module do not attempt to connect to the internet. To test using an internet connection, run xt/scrape.t like this:
prove -I lib xt/scrape.t
from the top directory of the distribution.
This module was a collection of small scripts I had been using to scrape multilingual article names related to physics from Wikipedia. I made the scripts into a CPAN module because I thought it could be useful to other people. Specifically, I used my scripts to add some Japanese element names to Chemistry::Elements, and I thought this method might be useful for someone else.
Version 0.02 added "make_wiki_url" for people who want to retrieve and parse the output themselves.
Ben Bullock, <bkb@cpan.org>
This package and associated files are copyright (C) 2016-2017 Ben Bullock.
You can use, copy, modify and redistribute this package and associated files under the Perl Artistic Licence or the GNU General Public Licence.