linktractor - extract links from HTML
% linktractor fileA.html fileB.html
% linktractor -f=http://www.perl.com
% lwp-request http://www.example.com | linktractor
% lwp-request http://www.example.com | linktractor -b=http://www.example.com
This is a small script that uses HTML::SimpleLinkExtor to pull all of the links out of the input HTML. It reads its input from files you name on the command line, from standard input, or from a URL that it fetches itself.
-b
Set the base URL used to resolve relative URLs in the input.
-f
Instead of reading from files specified on the command line or from standard input, fetch this URL and use it as input.
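To illustrate what the base URL given with -b does, here is a minimal sketch of extracting links with HTML::SimpleLinkExtor and resolving relative ones against a base. It is not the source of linktractor itself. The sample HTML, the base URL, and the variable names are assumptions made for the example.

```perl
use strict;
use warnings;

use HTML::SimpleLinkExtor;

# Hypothetical input: one relative anchor link and one relative image link.
my $html = <<'HTML';
<html><body>
<a href="/about.html">About</a>
<img src="images/logo.png">
</body></html>
HTML

# Assumed base URL, playing the role of the value passed with -b.
my $base = 'http://www.example.com/';

# Passing a base URL to new() asks the extractor to resolve
# relative links against it.
my $extor = HTML::SimpleLinkExtor->new($base);
$extor->parse($html);

# links() returns every link found, in list context.
my @links = $extor->links;

print "$_\n" for @links;
```

Constructing the object without a base URL should leave the links as they appear in the HTML, which is roughly what running the script without -b corresponds to.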
This source is part of a SourceForge project which always has the latest sources in CVS, as well as all of the previous releases.
http://sourceforge.net/projects/brian-d-foy/
If, for some reason, I disappear from the world, one of the other members of the project can shepherd this module appropriately.
brian d foy, <bdfoy@cpan.org>
Copyright (c) 2007 brian d foy. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
You may use HTML::SimpleLinkExtor under the same terms as Perl itself.