W3C::LogValidator - The W3C Log Validator - Quality-focused Web Server log processing engine
Checks quality/validity of most popular content on a Web server
W3C::LogValidator is the main module for the W3C Log Validator, which combines a Web server log analysis and statistics tool with a Web content quality checker.
W3C::LogValidator
W3C::LogValidator can batch-process a number of documents through a number of quality-focused checks, such as HTML or CSS validation, or checking for broken links. It accepts several kinds of input, ranging from a simple list of URIs to log files from various Web servers. Because it orders the results by the number of times a document appears in the file or logs, it is, in practice, a useful way to spot the most popular documents that need work.
The Perl script logprocess.pl, bundled in the W3C::LogValidator distribution, is a simple way to use the features of W3C::LogValidator. Developers can also use W3C::LogValidator as a Perl module to build applications.
The homepage for the Log Validator is at: http://www.w3.org/QA/Tools/LogValidator/
The simplest way to use it is to edit the sample configuration file (samples/logprocess.conf) and run the bundled logprocess.pl script with this configuration file, for example:
logprocess.pl -f /path/to/logprocess.conf
The basic task of the W3C::LogValidator module is to parse a configuration file and process relevant logs, passed through a configuration file argument:
    use W3C::LogValidator;
    my $logprocessor = W3C::LogValidator->new("sample.conf");
    $logprocessor->process;
Alternatively, it will use a default configuration and try to process Web server logs in "well known locations":
    my $logprocessor = W3C::LogValidator->new;
    $logprocessor->process;
Constructs a new W3C::LogValidator processor. You may pass a configuration file name, as well as a reference to a hash of attribute-value pairs, as parameters to the constructor.
e.g. for mail output:
    my %conf = (
        "UseOutputModule" => "W3C::LogValidator::Output::Mail",
        "ServerAdmin"     => 'webmaster@example.com',
        "verbose"         => "3"
    );
    my $processor = W3C::LogValidator->new("path/to/config.conf", \%conf);
Or e.g. for HTML output:
    my %conf = (
        "UseOutputModule" => "W3C::LogValidator::Output::HTML",
        "OutputTo"        => 'path/to/file.html',
        "verbose"         => "0"
    );
    my $processor = W3C::LogValidator->new("path/to/config.conf", \%conf);
If given the path to a configuration file, new() will call the W3C::LogValidator::Config module to get its configuration variables. Otherwise, a default set of values is used.
Given a log record and the type of the log (common log format, flat list of URIs, etc.), extracts the remote host or IP address.
Do-it-all method: Read configuration file (if any), parse log files, run them through processing modules, send result to output module.
Creates a configuration hash for a specific module, adding module-specific configuration variables, overriding if necessary
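The override behaviour described above can be illustrated with plain hash merging. This is only a sketch: the helper name merged_module_config is hypothetical and is not part of W3C::LogValidator, and the module's own implementation may differ.

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's own code): build a per-module
# configuration by layering module-specific settings over global ones.
sub merged_module_config {
    my ($global, $module_specific) = @_;
    # In a hash constructor later keys win, so module-specific values
    # override global values that share the same name.
    return { %{$global}, %{ $module_specific || {} } };
}

my %global_conf = (
    "verbose"     => "3",
    "ServerAdmin" => 'webmaster@example.com',
);
my %module_conf = (
    "verbose"  => "0",
    "OutputTo" => 'path/to/file.html',
);

my $conf = merged_module_config(\%global_conf, \%module_conf);
print "$_ = $conf->{$_}\n" for sort keys %{$conf};
```

The global hash is copied rather than modified, so each module can receive its own view of the configuration.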
Run the data parsed off the log files through the various processing (validation) modules specified by UseValidationModule in the configuration.
Loops through and parses all log files specified in the configuration
Extracts URIs and number of hits from a given log file, and feeds it to the processor's URI/Hits table
Given a log record and the type of the log (common log format, flat list of URIs, etc), extracts the URI
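For the common log format case, extracting the remote host and the URI from a record might look like the sketch below. The function name parse_clf_record and the regular expression are assumptions made for illustration; they are not taken from the module's source, and the sample record uses a documentation-only IP address.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's actual method: pull the remote
# host and the requested path out of one Common Log Format record.
sub parse_clf_record {
    my ($record) = @_;
    # Layout: host ident user [date] "METHOD path PROTOCOL" status bytes
    if ($record =~ m/^(\S+) \S+ \S+ \[[^\]]*\] "\S+ (\S+)[^"]*"/) {
        return (host => $1, uri => $2);
    }
    return;    # empty list when the record does not look like CLF
}

my $line = '192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] '
         . '"GET /index.html HTTP/1.0" 200 2326';
my %rec = parse_clf_record($line);
print "$rec{host} requested $rec{uri}\n" if %rec;
```

A flat list of URIs needs no such parsing, since each record is already the URI.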
Given a URI, removes "directory index" suffixes such as index.html, so that http://foobar/ and http://foobar/index.html are counted as one resource.
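A minimal sketch of this normalization follows. The helper name strip_directory_index and the default list of index file names are assumptions; the module may use a different or configurable list.

```perl
use strict;
use warnings;

# Hypothetical helper; the default index file names are an assumption.
sub strip_directory_index {
    my ($uri, @index_names) = @_;
    @index_names = ('index.html', 'index.htm') unless @index_names;
    for my $name (@index_names) {
        # Strip only when the name is the entire last path segment,
        # so that e.g. "myindex.html" is left alone.
        return $uri if $uri =~ s{/\Q$name\E\z}{/};
    }
    return $uri;
}

print strip_directory_index('http://foobar/index.html'), "\n";
print strip_directory_index('http://foobar/'), "\n";
```

Both calls print the same URI, so hits on either form would accumulate under one key.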
Add a URI to the processor's URI/Hits table
Returns the list of URIs in the processor's table, sorted by popularity (hits)
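Sorting by popularity amounts to ordering the table's keys by their hit counts, highest first. The sketch below uses a plain hash as a stand-in for the processor's URI/Hits table; the function name uris_by_popularity and the alphabetical tie-break are assumptions, not documented behaviour.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the processor's URI/Hits table.
my %hits = (
    'http://example.org/'           => 42,
    'http://example.org/about.html' => 7,
    'http://example.org/news.html'  => 19,
);

# Return URIs ordered by descending hit count; ties are broken
# alphabetically so the order is deterministic (an assumption).
sub uris_by_popularity {
    my ($table) = @_;
    return sort { $table->{$b} <=> $table->{$a} or $a cmp $b }
           keys %{$table};
}

print "$_ ($hits{$_} hits)\n" for uris_by_popularity(\%hits);
```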
Tests whether a given URI contains a CGI query string
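One simple way to perform such a test is to look for a question mark in the URI. This sketch is an assumption about how the check could be done, with a hypothetical function name; the module's actual criterion may be stricter or looser.

```perl
use strict;
use warnings;

# Hypothetical helper: treat any URI containing "?" as carrying a
# query string. Returns 1 or 0.
sub has_query_string {
    my ($uri) = @_;
    return $uri =~ m/\?/ ? 1 : 0;
}

for my $uri ('http://example.org/search?q=css', 'http://example.org/page.html') {
    printf "%s: %s\n", $uri, has_query_string($uri) ? 'query string' : 'no query string';
}
```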
Returns the number of hits for a given URI. Basically a "public" method accessing $hits{$uri}.
Public bug-tracking interface at http://www.w3.org/Bugs/Public/
Olivier Thereaux <ot@w3.org> for The World Wide Web Consortium
Up-to-date information on the Log Validator at:
http://www.w3.org/QA/Tools/LogValidator/
Several articles have been written within the W3C Quality Assurance Interest Group on the topic of improving the quality of Web sites, notably by using a step-by-step approach and relying upon the Log Validator to help find the areas to fix in priority.
Available at http://www.w3.org/QA/2002/04/Web-Quality
or how to improve your Web site easily.
Available in several languages at: http://www.w3.org/QA/2003/03/web-kit
Available at http://www.w3.org/QA/2002/09/Step-by-step