NAME
simple_scan - scan a set of Web pages for strings present/absent
SYNOPSIS
simple_scan [--generate] [--run] {file file file ...}
USAGE
# Run the tests in the files supplied on the command line.
# --run (or -run; we're flexible) is assumed if you give no switches.
% simple_scan file1 file2 file3
# Generate a set of tests and save them, then run them.
% <complex pipe> | simple_scan --generate > pipe_scan.t
# Run one simple test
% echo "http://yahoo.com yahoo Y Look for yahoo.com" | simple_scan -run
DESCRIPTION
simple_scan reads either files supplied on the command line or standard input. It creates a Test::WWW::Simple test for the criteria supplied to it, then runs the test, prints it, or both.
simple_scan's input should be in the following format:
<URL> <pattern> <Y|N> <comment>
The URL is any URL; pattern is a Perl regular expression, delimited by slashes; Y|N is Y if the pattern should match, or N if the pattern should not match; and comment is any arbitrary text you like (as long as it's all on the same line as everything else).
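As a rough illustration of the format above, here is a minimal Python sketch that splits one input line into its four fields. It is not how simple_scan itself is implemented; the names SPEC and parse_spec are hypothetical, and the sketch assumes the slash-delimited pattern form described above, with any slash inside the pattern escaped as \/.

```python
import re

# Hypothetical helper, not part of simple_scan: one regex for the
# "<URL> <pattern> <Y|N> <comment>" layout described above.
SPEC = re.compile(r"""
    ^\s*(?P<url>\S+)\s+                  # URL: first run of non-blank characters
    /(?P<pattern>(?:\\.|[^/\\])*)/\s+    # pattern between slashes (escaped slashes allowed)
    (?P<expect>[YN])\s*                  # Y: should match, N: should not match
    (?P<comment>.*)$                     # everything else on the line
""", re.VERBOSE)


def parse_spec(line):
    """Return (url, pattern, should_match, comment), or None if the line does not fit."""
    m = SPEC.match(line)
    if m is None:
        return None
    return (m["url"], m["pattern"], m["expect"] == "Y", m["comment"].strip())


if __name__ == "__main__":
    print(parse_spec("http://es.mysite.com/ /blargh/ Y look for blargh"))
```

Returning None for a line that does not fit keeps the sketch simple; a real tool might prefer to report the offending line.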
COMMAND-LINE SWITCHES
We use Getopt::Long to get the command-line options, so we're really very flexible as to how they're entered. You can use either one dash (as in -foo) or two (as in --bar). You only need to enter the minimum number of characters to match a given switch.
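The abbreviation rule can be pictured as unique-prefix matching. The following Python sketch is a simplification for illustration only, not Getopt::Long's actual algorithm; resolve_switch and KNOWN_SWITCHES are hypothetical names, and the switch list is limited to the two switches documented here.

```python
import re

# The two switches documented in this page.
KNOWN_SWITCHES = ("run", "generate")


def resolve_switch(arg, known=KNOWN_SWITCHES):
    """Map an argument such as -r or --gen to a full switch name.

    Returns None when the argument is not a switch, or when the prefix
    matches no switch or more than one.
    """
    m = re.fullmatch(r"--?([A-Za-z]+)", arg)  # one or two leading dashes
    if m is None:
        return None
    candidates = [name for name in known if name.startswith(m.group(1))]
    return candidates[0] if len(candidates) == 1 else None


if __name__ == "__main__":
    for arg in ("-r", "--gen", "-run", "file1"):
        print(arg, "->", resolve_switch(arg))
```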
--run
--run tells simple_scan to immediately run the tests it's created. Can be abbreviated to -r. This option is most useful for one-shot tests that you're not planning to run repeatedly.
--generate
--generate tells simple_scan to print the test it's generated on the standard output. This option is useful to build up a test suite to be reused later.
Both -r and -g can be specified at the same time to run a test and print it simultaneously; this is useful when you want to save a test to be run later as well as right now without having to regenerate the test.
PRAGMAS
Pragmas are ways to influence what simple_scan does when generating tests. They don't output anything themselves.
Pragmas are specified with %% in column 1 and the pragma name immediately following. Any arguments are supplied after a colon, like this:
%%foo: bar baz
This invokes the foo pragma with the argument bar baz.
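To make the pragma syntax concrete, here is a minimal Python sketch that recognizes a pragma line and separates its name from its argument. The function name parse_pragma is hypothetical, and the sketch only mirrors the rules stated above (%% in column 1, name immediately after, argument after a colon); it is not taken from simple_scan's source.

```python
def parse_pragma(line):
    """Return (name, argument) for a pragma line, or None for any other line."""
    if not line.startswith("%%"):      # %% must be in column 1
        return None
    name, _, argument = line[2:].partition(":")
    return name.strip(), argument.strip()


if __name__ == "__main__":
    print(parse_pragma("%%foo: bar baz"))
    print(parse_pragma("http://example.com/ /x/ Y not a pragma"))
```

A line with leading whitespace before %% is deliberately not treated as a pragma, in keeping with the column 1 requirement.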
xx
The xx pragma allows for very simple-minded internationalization. It assumes that you want to substitute each of a list of two-character country codes into a string (most likely somewhere in the URL, but possibly in the comment too). simple_scan will do this for you, creating a test for each country code you specify. For instance:
%%xx: es au my jp
http://>xx<.mysite.com/ /blargh/ Y look for blargh (>xx<)
This would generate 4 tests, for es.mysite.com, au.mysite.com, my.mysite.com, and jp.mysite.com, all looking to match blargh somewhere on the page.
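The substitution described above can be sketched in a few lines of Python. This only illustrates the idea (replace every >xx< marker with each country code in turn) and is not simple_scan's own code; expand_xx is a hypothetical name.

```python
def expand_xx(spec_line, codes):
    """Return one test specification per country code, with >xx< filled in."""
    return [spec_line.replace(">xx<", code) for code in codes]


if __name__ == "__main__":
    codes = "es au my jp".split()   # the argument of the %%xx pragma
    template = "http://>xx<.mysite.com/ /blargh/ Y look for blargh (>xx<)"
    for spec in expand_xx(template, codes):
        print(spec)
```

Because str.replace substitutes every occurrence, the marker is filled in both in the URL and in the comment.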
agent
The agent pragma allows you to switch user agents during the test. Test::WWW::Simple's default is Windows IE 6, but you can switch it to any of the other user agents supported by WWW::Mechanize.
http://gemal.dk/browserspy/basic.html /Explorer/ Y Should be Explorer
%%agent: Mac Safari
http://gemal.dk/browserspy/basic.html /Safari/ Y Should be Safari
AUTHOR
Joe McMahon <mcmahon@yahoo-inc.com>
COPYRIGHT AND LICENSE
Copyright (c) 2005 by Yahoo!
This script is free software; you can redistribute it or modify it under the same terms as Perl itself, either Perl version 5.6.1 or, at your option, any later version of Perl 5 you may have available.