Sport::Analytics::NHL::Scraper - Scrape and crawl the NHL website for data
    use Sport::Analytics::NHL::Scraper;

    my $schedules = crawl_schedule({
        start_season => 2016,
        stop_season  => 2017
    });
    ...
    my $contents = crawl_game(
        { season => 2011, stage => 2, season_id => 0001 }, # game 2011020001 in NHL accounting
        { game_files => [qw(BS PL)], retries => 2 },
    );
The variable @GAME_FILES contains the definitions specific to each report type. Right now only the boxscore JavaScript has any meaningful non-default definitions; the PB feed seems to have become unavailable.
scrape
A wrapper around the LWP::Simple::get() call for retrying and control.

Arguments: hash reference containing
 * url => URL to access
 * retries => Number of retries
 * validate => sub reference to validate the download

Returns: the content if both download and validation are successful, undef otherwise.
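The retry-and-validate behaviour described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the name fetch_with_retries and the fetcher option are hypothetical (the latter exists only so the example runs without network access), and treating retries as "one initial attempt plus N retries" is an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-in for scrape(); the real implementation may differ.
sub fetch_with_retries {
    my ($opts) = @_;

    my $retries  = $opts->{retries} // 0;            # assumed: extra attempts after the first
    my $validate = $opts->{validate} || sub { 1 };   # accept anything if no validator is given
    # 'fetcher' is an illustrative option; by default fall back to LWP::Simple::get.
    my $fetcher  = $opts->{fetcher} || do {
        require LWP::Simple;
        \&LWP::Simple::get;
    };

    for my $attempt (0 .. $retries) {
        my $content = $fetcher->($opts->{url});
        # Succeed only when the download worked and the validator approves it.
        return $content if defined $content && $validate->($content);
    }
    return undef;    # every attempt failed the download or the validation
}
```

Passing the fetcher as a code reference keeps the retry loop separate from the HTTP call, which also makes the loop easy to exercise with a stub.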
crawl_schedule
Crawls the NHL schedule. The schedule is accessed through a minimalistic live API first (which only works for post-2010 seasons), then through the general /api/
Arguments: hash reference containing
 * start_season => the first season to crawl
 * stop_season => the last season to crawl

Returns: hash reference of seasonal schedules where seasons are the keys, and decoded JSONs are the values.
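The shape of the return value, and the idea of trying one source before falling back to another, can be illustrated with a small sketch. Everything here is an assumption made for illustration: crawl_schedule_sketch and the two stub sources are hypothetical, the JSON payloads are invented placeholders, and the real function's URLs and decoding details are not shown in this documentation.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical sketch: try each source in order for every season and keep
# the first one that returns something; keys are seasons, values are
# decoded JSON structures.
sub crawl_schedule_sketch {
    my ($opts, @sources) = @_;

    my %schedules;
    for my $season ($opts->{start_season} .. $opts->{stop_season}) {
        for my $source (@sources) {
            my $json = $source->($season);
            next unless defined $json;          # this source has nothing, try the next
            $schedules{$season} = decode_json($json);
            last;
        }
    }
    return \%schedules;
}

# Stub sources standing in for the two endpoints (placeholder data only).
my $live    = sub { my ($season) = @_; return $season > 2010 ? '{"source":"live"}' : undef };
my $general = sub { return '{"source":"general"}' };

my $schedules = crawl_schedule_sketch(
    { start_season => 2010, stop_season => 2011 },
    $live, $general,
);
for my $season (sort keys %$schedules) {
    print "$season: $schedules->{$season}{source}\n";
}
```

The result is consumed the same way as the documented return value: iterate over the season keys and read the decoded structure for each.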
get_game_url_args
Sets the arguments to populate the game URL for a given report type and game.

Arguments:
 * document name, currently one of qw(BS PB RO ES GS PL)
 * game hashref containing
   * season => YYYY
   * stage => 2|3
   * season_id => NNNN

Returns: a configured list of arguments for the URL.
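The synopsis notes that season 2011, stage 2, season ID 0001 corresponds to game 2011020001. The sketch below shows how the three fields of the game hashref can be combined into that ten-digit form. The helper name compose_game_id is hypothetical, and this is not a claim about what get_game_url_args returns.

```perl
use strict;
use warnings;

# Hypothetical helper: zero-pad season (4), stage (2) and season ID (4)
# and concatenate them into the ten-digit game number.
sub compose_game_id {
    my ($game) = @_;
    return sprintf '%04d%02d%04d',
        $game->{season}, $game->{stage}, $game->{season_id};
}

my $id = compose_game_id({ season => 2011, stage => 2, season_id => 1 });
print "$id\n";    # prints 2011020001
```

One thing to watch when writing the season ID as a bare literal: Perl reads a number with a leading zero as octal, so 0001 is still 1, but 0010 would be 8. Passing a plain decimal (10) or a string ('0010') avoids the surprise.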
crawl_game
Crawls the data for the given game.

Arguments:
 * game data as hashref:
   * season => YYYY
   * stage => 2|3
   * season_id => NNNN
 * options hashref:
   * game_files => hashref of types of reports that are requested
   * force => 0|1 force overwrite of files already present in the system
   * retries => N number of retries for every get call
More Hockey Stats, <contact at morehockeystats.com>
Please report any bugs or feature requests to contact at morehockeystats.com, or through the web interface at https://rt.cpan.org/NoAuth/ReportBug.html?Queue=Sport::Analytics::NHL::Scraper. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
You can find documentation for this module with the perldoc command.
perldoc Sport::Analytics::NHL::Scraper
You can also look for information at:
RT: CPAN's request tracker (report bugs here)
https://rt.cpan.org/NoAuth/Bugs.html?Dist=Sport::Analytics::NHL::Scraper
AnnoCPAN: Annotated CPAN documentation
http://annocpan.org/dist/Sport::Analytics::NHL::Scraper
CPAN Ratings
https://cpanratings.perl.org/d/Sport::Analytics::NHL::Scraper
Search CPAN
https://metacpan.org/release/Sport::Analytics::NHL::Scraper