URL::RegexMatching - A library of utility methods for matching URLs with regex patterns.
    #!/usr/bin/perl
    use strict;
    use warnings;

    use URL::RegexMatching qw(url_match_regex http_url_match_regex);

    my $text = <<SAMPLE;
    This is some sample text with links like <http://foo.com/blah_blah/>
    and others like WWW.EXAMPLE.COM and bit.ly/foo. And what about something
    like a mailto:name\@example.com pattern?
    SAMPLE

    my $url_regex  = url_match_regex;
    my $http_regex = http_url_match_regex;

    print "Using this sample text:\n";
    print "$text\n";

    print "These strings are probably links:\n";
    while ($text =~ m{$url_regex}g) {
        print "\t$1\n";
    }

    print "\nWeb URLs:\n";
    while ($text =~ m{$http_regex}g) {
        print "\t$1\n";
    }

    $text =~ s{$http_regex}{<a href="$1">$1</a>}g;

    print "\n\n";
    print "Convert only HTTP links to HTML links using http_url_match_regex:\n";
    print "$text\n";
This package is based on regular expression patterns originally developed by John Gruber of Daring Fireball. This module simply packages his work to make it easier for the Perl community to use.
This method takes no arguments and returns a compiled regular expression pattern. The pattern liberally matches strings that appear to be HTTP, HTTPS, or mailto URLs, and makes a best attempt to identify relative URLs.
This method can be exported by request.
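As a minimal sketch of how the returned pattern might be applied, the helper below collects the first capture of each match, in order and without duplicates. The name extract_links is hypothetical and not part of this module. It takes the compiled pattern as an argument and reads $1 inside a while loop, as the synopsis does, rather than matching in list context.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of URL::RegexMatching): collect the first
# capture ($1) of every match of $regex in $text, in order of appearance
# and without duplicates.
sub extract_links {
    my ($text, $regex) = @_;
    my (@links, %seen);
    while ($text =~ m{$regex}g) {
        my $link = $1;    # save $1 before any other match can overwrite it
        push @links, $link unless $seen{$link}++;
    }
    return @links;
}
```

With the module loaded, a call such as extract_links($text, url_match_regex) would be expected to return the strings printed under "These strings are probably links" in the synopsis.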
This method takes no arguments and returns a compiled regular expression pattern. The pattern liberally matches only web URLs: http, https, and relative forms such as www.example.com.
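Because relative forms such as www.example.com are matched without a scheme, using the capture directly as an href produces a link that a browser resolves relative to the current page. The sketch below shows one possible way to handle that. The name linkify is hypothetical, and the choice to prepend "http://" is an assumption rather than anything this module does.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of URL::RegexMatching): wrap each match of
# $regex in an anchor tag, prepending "http://" to the href when the matched
# text does not already start with http:// or https://.
sub linkify {
    my ($text, $regex) = @_;
    $text =~ s{$regex}{
        my $link = $1;    # save $1 before the scheme check below resets it
        my $href = $link =~ /\Ahttps?:\/\//i ? $link : "http://$link";
        sprintf '<a href="%s">%s</a>', $href, $link;
    }ge;
    return $text;
}
```

A call such as linkify($text, http_url_match_regex) would then replace the plain s{}{}g substitution in the synopsis. The sketch does not HTML-escape the matched text, which real input would probably need.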
Both regular expression patterns are known to fail against URL strings such as:
The http_url_match_regex pattern is also likely to match strings whose domain and file path look like a web URL but that use a different protocol. For example, given 'ftp://www.example.com/foo.txt', the match would capture everything except the 'ftp://' part.
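One possible workaround for the protocol limitation is to inspect the text immediately before each match and skip the match when it is preceded by a non-web scheme. The following is a sketch only. The name web_links_only is hypothetical, and the scheme test is an assumption about what counts as a non-web protocol.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of URL::RegexMatching): return the first
# capture of each match of $regex in $text, skipping any match that is
# immediately preceded by "scheme://" where the scheme is not http or https.
sub web_links_only {
    my ($text, $regex) = @_;
    my @links;
    while ($text =~ m{$regex}g) {
        my $link   = $1;
        my $before = substr($text, 0, $-[1]);    # $-[1] is where $1 starts
        next if $before =~ m{\b(?!https?://)[a-z][a-z0-9+.-]*://\z}i;
        push @links, $link;
    }
    return @links;
}
```

Called as web_links_only($text, http_url_match_regex), this would be expected to drop the 'www.example.com/foo.txt' remainder of an ftp:// link while keeping genuine web URLs.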
http_url_match_regex
Bugs should be reported via the GitHub project issue tracker: http://github.com/tima/perl-url-regexmatching/issues
Timothy Appnel <tima@cpan.org>
http://daringfireball.net/2010/07/improved_regex_for_matching_urls
This module is based on the work of John Gruber of Daring Fireball. John writes "this pattern is free for anyone to use, no strings attached. Consider it public domain."
The software is released under the Artistic License. The terms of the Artistic License are described at http://www.perl.com/language/misc/Artistic.html.
Except where otherwise noted, URL::RegexMatching is Copyright 2010, Timothy Appnel, tima@cpan.org. All rights reserved.