NAME
WWW::Wappalyzer - Perl port of Wappalyzer (https://wappalyzer.com)
DESCRIPTION
Uncovers the technologies used on websites: detects content management systems, web shops, web servers, JavaScript frameworks, analytics tools and many more.
Supports only the `scriptSrc`, `scripts`, `html`, `meta`, `headers`, `cookies` and `url` patterns of the Wappalyzer specification. For the sake of speed, `version`, `implies` and `excludes` are not supported.
Categories: https://github.com/wappalyzer/wappalyzer/blob/master/src/categories.json
Technologies: https://github.com/wappalyzer/wappalyzer/tree/master/src/technologies
More info on Wappalyzer: https://github.com/wappalyzer/wappalyzer
SYNOPSIS
    use WWW::Wappalyzer;
    use LWP::UserAgent;
    use List::Util 'pairmap';

    my $response = LWP::UserAgent->new->get( 'http://www.drupal.org' );
    my %detected = WWW::Wappalyzer->new->detect(
        html    => $response->decoded_content,
        headers => { pairmap { $a => [ $response->headers->header($a) ] } $response->headers->flatten },
    );

    # %detected = (
    #     'Font scripts'    => [ 'Google Font API' ],
    #     'Caching'         => [ 'Varnish' ],
    #     'CDN'             => [ 'Fastly' ],
    #     'CMS'             => [ 'Drupal' ],
    #     'Video players'   => [ 'YouTube' ],
    #     'Tag managers'    => [ 'Google Tag Manager' ],
    #     'Reverse proxies' => [ 'Nginx' ],
    #     'Web servers'     => [ 'Nginx' ],
    # );
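The synopsis passes headers as a hash of array refs built from an LWP response. When headers come from another HTTP client as a flat name/value list, the same shape can be built by hand. The sketch below is only an illustration: `headers_to_hash` is a hypothetical helper, not part of WWW::Wappalyzer, and the header values are invented.

```perl
use strict;
use warnings;

# Hypothetical helper (not provided by WWW::Wappalyzer): groups a flat
# name/value list into { Name => [ value, ... ] }, so that repeated
# fields such as Set-Cookie keep all of their values.
sub headers_to_hash {
    my @pairs = @_;
    my %headers;
    while ( my ( $name, $value ) = splice @pairs, 0, 2 ) {
        push @{ $headers{$name} }, $value;
    }
    return \%headers;
}

# Invented sample headers.
my $headers = headers_to_hash(
    'Server'     => 'nginx',
    'Set-Cookie' => 'a=1; path=/',
    'Set-Cookie' => 'b=2; path=/',
);
```

The resulting hash ref can be passed as the `headers` parameter of `detect`. Names are grouped exactly as given, so differently cased spellings of one field end up under separate keys.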
EXPORT
None by default.
SUBROUTINES/METHODS
new
    my $wappalyzer = WWW::Wappalyzer->new( %params )
Constructor.
Available parameters:
categories - optional array ref of paths to additional categories files (see 'add_categories_files' below)
technologies - optional array ref of paths to additional technologies files (see 'add_technologies_files' below)
Returns a WWW::Wappalyzer instance.
detect
    my %detected = $wappalyzer->detect( %params )
Tries to detect the CMS, framework, etc. from the given HTML code, HTTP headers and URL.
Available parameters:
html - HTML code of the web page.
headers - Hash ref of HTTP headers. Each value may be a plain string, or an array ref
    of strings for a multi-valued field.
    Cookies should be passed in the 'Set-Cookie' header.
url - URL of the web page.
cats - Array ref of category names to check; defaults to all.
    Fewer categories means less CPU usage.
Returns a hash of detected applications, keyed by category:
    (
        CMS                     => [ 'Joomla' ],
        'Javascript frameworks' => [ 'jQuery', 'jQuery UI' ],
    )
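As a minimal sketch of the parameters described above, the call below passes a plain-string header, a multi-valued header as an array ref, and a `cats` list that limits the work to two categories. The HTML snippet, URL and header values are invented for illustration, and no particular result is implied.

```perl
use strict;
use warnings;
use WWW::Wappalyzer;

my $wappalyzer = WWW::Wappalyzer->new;

# All input values below are made up for the example.
my %detected = $wappalyzer->detect(
    html    => '<html><head><meta name="generator" content="Joomla!"></head></html>',
    url     => 'http://example.com/',
    headers => {
        'Server'     => 'nginx',                              # plain string
        'Set-Cookie' => [ 'a=1; path=/', 'b=2; path=/' ],     # multi-valued field
    },
    cats => [ 'CMS', 'Web servers' ],                         # check only these
);

# Each key is a category name, each value an array ref of application names.
for my $category ( sort keys %detected ) {
    print "$category: ", join( ', ', @{ $detected{$category} } ), "\n";
}
```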
get_categories_names
    my @cats = $wappalyzer->get_categories_names()
Returns a list of all application category names.
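One possible use of this list, sketched under the assumption that the names it returns are the same ones `detect` accepts in `cats`, is to filter a requested category list down to names that exist. The requested names here are arbitrary examples.

```perl
use strict;
use warnings;
use WWW::Wappalyzer;

my $wappalyzer = WWW::Wappalyzer->new;

# Build a lookup table of every known category name.
my %known = map { $_ => 1 } $wappalyzer->get_categories_names;

# Arbitrary example input; the last entry is deliberately bogus.
my @requested = ( 'CMS', 'Web servers', 'No such category' );
my @valid     = grep { $known{$_} } @requested;

print "Will check: @valid\n";
```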
add_categories_files
    $wappalyzer->add_categories_files( @filepaths )
Adds more categories files to the list of categories files to be processed. See lib/WWW/wappalyzer_src/categories.json for a sample of the format.
add_technologies_files
    $wappalyzer->add_technologies_files( @filepaths )
Adds more technologies files to the list of technologies files to be processed. See lib/WWW/wappalyzer_src/technologies/a.json for a sample of the format.
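A rough sketch of registering a custom technologies file follows. The entry is hypothetical and modelled on the upstream Wappalyzer JSON layout (a name mapped to `cats`, `html` and `website`). It assumes that category id 1 is CMS, as in the upstream categories.json. Check the bundled sample file for the exact format before relying on it.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use JSON::PP;
use WWW::Wappalyzer;

# Hypothetical technology definition; "cats" holds category ids
# (1 is assumed to be CMS), "html" is a regular expression.
my %custom = (
    'My In-House CMS' => {
        cats    => [1],
        html    => '<meta name="generator" content="My In-House CMS',
        website => 'http://example.com',
    },
);

# Write the definition to a temporary JSON file.
my ( $fh, $path ) = tempfile( SUFFIX => '.json', UNLINK => 1 );
print {$fh} JSON::PP->new->pretty->encode( \%custom );
close $fh or die "Cannot close $path: $!";

my $wappalyzer = WWW::Wappalyzer->new;
$wappalyzer->add_technologies_files($path);

# Invented page fragment that the pattern above is meant to match.
my %detected = $wappalyzer->detect(
    html => '<meta name="generator" content="My In-House CMS 2.0">',
);
```

The same path could instead be passed to the constructor through the `technologies` parameter.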
reload_files
    $wappalyzer->reload_files()
Reloads data from the additional categories and technologies files, which may have changed at runtime.
AUTHOR
Alexander Nalobin, <alexander at nalobin.ru>
BUGS
Please report any bugs or feature requests to bug-www-wappalyzer at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=WWW-Wappalyzer. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
    perldoc WWW::Wappalyzer
You can also look for information at:
GitHub
RT: CPAN's request tracker (report bugs here)
AnnoCPAN: Annotated CPAN documentation
CPAN Ratings
Search CPAN
ACKNOWLEDGEMENTS
LICENSE AND COPYRIGHT
Copyright 2013-2015 Alexander Nalobin.
This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.
See http://dev.perl.org/licenses/ for more information.