
NAME

WebService::Validator::HTML::W3C::Fast - Access the W3C's online HTML validator in a local persistent daemon

SYNOPSIS

    use WebService::Validator::HTML::W3C::Fast;

    my $v = WebService::Validator::HTML::W3C::Fast->new(
                validator_path         => '/path/to/validator/check',
                user                   => $username,
                password               => $password,
                auto_launder_validator => 1,
            );

    if ( $v->validate_markup(<<_HTML_) ) {
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head><title></title></head><body></body></html>
_HTML_
        if ( $v->is_valid ) {
            printf ("%s is valid\n", $v->uri);
        } else {
            printf ("%s is not valid\n", $v->uri);
            foreach my $error ( @{$v->errors} ) {
                printf("%s at line %d\n", $error->msg,
                                          $error->line);
            }
        }
    } else {
        printf ("Failed to validate the supplied markup: %s\n", $v->validator_error);
    }

DESCRIPTION

WebService::Validator::HTML::W3C::Fast provides access to a local version of the W3C's Markup Validator via WebService::Validator::HTML::W3C. It starts a small HTTP::Daemon server listening on a random high port on 127.0.0.1 and loads the check CGI script into a mod_perl-style persistent environment, so that large numbers of documents can be checked quickly.

When running under taint mode you will need to provide the auto_launder_validator argument; otherwise taint checking will refuse to let the module string-eval the CGI script.
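As an illustration of what laundering involves, here is a minimal sketch of the usual Perl idiom: a value becomes untainted when it is captured by a regular expression. The launder helper and the sample source string are hypothetical; this is not the module's actual code, only an outline of the general technique.

```perl
use strict;
use warnings;

# Hypothetical helper: untaint a string by capturing it through a regex.
# Under -T, a regex capture is treated as clean data.
sub launder {
    my ($tainted) = @_;
    my ($clean) = $tainted =~ /\A(.*)\z/s;   # /s lets . match newlines too
    return $clean;
}

# Stand-in for script source read from disk (which would be tainted under -T).
my $source = 'sub { return 6 * 7 }';

# String eval of the laundered source yields a code reference.
my $code = eval launder($source) or die $@;
print $code->(), "\n";
```

Capturing everything, as above, disables the protection taint mode offers for that value, so it only makes sense when the script path is trusted.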

To discourage denial-of-service attacks, the local web server is protected by HTTP basic auth. You can specify the user name and password for the server; if you do not, the module uses srand and rand to generate a simple password.
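A simple password of this kind could be generated roughly as follows. The simple_password helper, the character set and the length of 16 are all assumptions made for illustration; the module's own generator may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: build a password of $length characters
# chosen with Perl's built-in rand.
sub simple_password {
    my ($length) = @_;
    my @chars = ('a' .. 'z', 'A' .. 'Z', '0' .. '9');
    return join '', map { $chars[ int rand @chars ] } 1 .. $length;
}

srand();                              # seed the generator
my $password = simple_password(16);
print length($password), " characters generated\n";
```

Perl's rand is not a cryptographically secure generator, so passing your own user and password to new() is the safer choice where that matters.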

If validator_path is not supplied, the module will attempt to guess where the script is, first looking at '/usr/share/w3c-markup-validator/cgi-bin/check', which is the location of the CGI script in Fedora's w3c-markup-validator package. If the script is not found there (and until someone contributes the default locations used by other operating systems), the module will croak().
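The lookup described above might be sketched like this. The guess_validator_path function and its error message are hypothetical; only the Fedora path comes from this documentation.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical helper: return the first candidate that exists as a
# plain file, or croak if none of them do.
sub guess_validator_path {
    my @candidates = @_;
    for my $path (@candidates) {
        return $path if -f $path;
    }
    croak 'Unable to locate the validator check script; supply validator_path';
}

# Wrap the call in eval so that a missing script can be reported cleanly.
my $path = eval {
    guess_validator_path('/usr/share/w3c-markup-validator/cgi-bin/check');
};
print defined $path ? "Found $path\n" : "Not found: $@";
```

Passing validator_path explicitly to new() avoids the guess altogether.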

NOTE for Debian: at the moment, Debian's version of the check script depends on using the open3 function for /usr/bin/onsgmls. I have done a quick check on this and do not intend to port to Debian; rather, I intend to wait for Debian to upgrade their source. If anyone would like to fix this, I would be happy to apply supplied patches.

The local HTTP::Daemon will occasionally check that the parent program is still present. If the parent ever exits, the HTTP::Daemon will terminate as well. This prevents a build-up of HTTP::Daemons listening on high ports because a test script was aborted.
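One common way to perform such a check is to send signal 0 to the parent's process id, which tests for the process without actually signalling it. The parent_alive helper below is a hypothetical sketch of that idea and is not taken from the module's source.

```perl
use strict;
use warnings;

# Hypothetical helper: true if a process with the given pid exists
# and we are allowed to signal it.
sub parent_alive {
    my ($parent_pid) = @_;
    return kill(0, $parent_pid) ? 1 : 0;
}

# A daemon's main loop could run this check between requests
# and exit once it returns false.
my $parent_pid = getppid();
print parent_alive($parent_pid) ? "parent is running\n" : "parent has gone\n";
```

A daemon would record the parent's pid before forking, since getppid() changes once the original parent exits and the child is re-parented.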

SEE ALSO

AUTHOR

David Dick, <ddick@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2009 by David Dick

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.