App::SimpleScan::Cookbook
This is a documentation-only module that describes how to use simple_scan, and outlines some techniques you can use for some common Web testing problems.
simple_scan
simple_scan reads test specifications from standard input and generates Perl code based on these specifications. It can either
execute them immediately,
print them on standard output without executing them,
or do both: execute them and then print the generated code on standard output.
Test specifications describe
where the page is that you want to check,
some content (in the form of a Perl regular expression) that you want to look for,
a "success code", which defines whether or not the regex should match, and optionally allows you to run the test as a TODO, or skip it altogether.
and a comment about why you care
simple_scan always uses an HTTP GET to access the URL; if you need to do stuff like log in, or other setup that requires any other HTTP action, you'll need to use a plugin (see below).
Note that TODO tests get run whether or not they will pass; we just don't mind if they currently fail. Skipped tests are not run at all. Use skipped tests if you want to save time; use TODO tests if you want to be alerted to a change (from passing to failing or vice versa).
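The same TODO/skip distinction exists in Test::More. The following is a minimal hand-written sketch of it, not the code simple_scan actually generates; the page content and test names are made up for illustration.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Stand-in for fetched page content (hypothetical).
my $page = 'Perl mentioned here';

# Ordinary test: a failure here fails the whole run.
like($page, qr/Perl/, 'Perl mentioned');

# TODO test: still executed, but a failure is reported
# without failing the run.
TODO: {
    local $TODO = 'feature not deployed yet';
    like($page, qr/Python/, 'Python mentioned');
}

# Skipped test: the check inside the block is never executed.
SKIP: {
    skip 'saving time on a slow page', 1;
    like($page, qr/never checked/, 'not run at all');
}
```

The TODO test runs and fails, yet the run as a whole still passes; the skipped test costs nothing because its check is never evaluated.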
Some example test specs:
    http://foobar.com?q=zorch&xx=yy /\d+ foobars found/ Y Check zorch query
    http://perl.org/ /Perl/ Y Perl mentioned here
    http://python.org/ /Perl/ N Not mentioned here
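To make the four fields concrete, here is a rough sketch that splits a spec line into URL, regex, success code, and comment. The parse_spec helper is hypothetical and is not how simple_scan parses its input; it assumes the regex is written between slashes and the success code contains no whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper: split one test spec line into its four fields.
# Returns a hash reference, or nothing if the line does not look
# like a test spec.
sub parse_spec {
    my ($line) = @_;
    my ($url, $regex, $code, $comment) =
        $line =~ m{^(\S+)\s+/(.*?)/\s+(\S+)\s+(.*)$}
        or return;
    return {
        url     => $url,
        regex   => $regex,
        code    => $code,
        comment => $comment,
    };
}

for my $line (
    'http://perl.org/ /Perl/ Y Perl mentioned here',
    'http://python.org/ /Perl/ N Not mentioned here',
) {
    my $spec = parse_spec($line) or next;
    print join(' | ', @{$spec}{qw(url regex code comment)}), "\n";
}
```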
You can use pragmas to control how tests are executed. Pragmas start with '%%' at the beginning of the line, followed by a pragma name and arguments if the pragma takes any. simple_scan itself provides 3 pragmas:
%%agent
This tells simple_scan what User-Agent string to use. Because remembering all the fiddly bits is a pain, you can simply use shortcut names, like "Safari", "Mozilla", or "IE"; the actual list is the one supported by WWW::Mechanize's agent_alias() method.
%%cache
Stacks a call to cache() in the tests built by simple_scan. This tells simple_scan to hang onto the last copy of the page fetched from every URL; if the URL is hit multiple times during a test, simple_scan fetches it only once and then reuses the cached copy for further tests.
%%nocache
Turns off caching by stacking a nocache() call in the tests built. simple_scan will always refetch every URL when nocache() is in force.
Here's a sample test using the base pragmas:
    %%agent Safari
    http://apple.com/html5 /download Safar/ N No Safari warning with Safari
    %%cache
    %%agent IE
    http://apple.com/html5 /download Safar/ Y Safari warning with IE
    http://apple.com/html5 /HTML5 Showcase/ Y uses cached copy
    %%nocache
    http://apple.com/html5 /takes a few/ Y uses a new copy
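The effect of %%cache and %%nocache can be pictured with a small sketch. Everything here (get_page, fetch_page, the $caching flag) is hypothetical and only models the behavior described above; it is not the cache() and nocache() code that simple_scan stacks.

```perl
use strict;
use warnings;

my %page_cache;        # URL => last fetched copy
my $caching     = 0;   # toggled the way %%cache / %%nocache would
my $fetch_count = 0;

# Hypothetical stand-in for a real HTTP GET.
sub fetch_page {
    my ($url) = @_;
    $fetch_count++;
    return "content of $url (fetch #$fetch_count)";
}

# Return the page for a URL, reusing a cached copy when caching is on.
sub get_page {
    my ($url) = @_;
    return fetch_page($url) unless $caching;
    $page_cache{$url} = fetch_page($url) unless exists $page_cache{$url};
    return $page_cache{$url};
}

$caching = 1;                          # like %%cache
get_page('http://apple.com/html5');    # fetched
get_page('http://apple.com/html5');    # served from the cache
$caching = 0;                          # like %%nocache
get_page('http://apple.com/html5');    # fetched again
print "fetches: $fetch_count\n";       # prints "fetches: 2"
```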
You can define substitutions much like you'd use a pragma:
    %%site apple
    %%subsite html5
    http://<site>.com/<subsite> /HTML5 Showcase/ Y Apple's HTML5 showcase
The twist with simple_scan variables is that they can have multiple values:
    %%query foo bar baz
    http://foobar.com/q=<query> /<query> found/ Y Found <query>
This causes simple_scan to generate code to run three tests, one for each of the values of the 'query' variable. Notice that we can substitute into any part of the test specification; in this case we didn't substitute into the test type, but it's as valid as any other part of the line.
If you have multiple variables with multiple values, simple_scan will generate the Cartesian product of them:
    %%foo one two three four
    %%bar alpha beta gamma delta epsilon
    %%baz now is the time for all good men
    http://sample-site.org?q=<foo><bar><baz> /Found:/ Y Looking for <foo>, <bar>, <baz>
This generates 4 * 5 * 8 = 160 tests in just 4 lines.
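As a rough illustration of where the 160 comes from, the sketch below expands a spec line over every combination of variable values. The expand_spec function is hypothetical and written only for this example; simple_scan's own expansion code may work quite differently.

```perl
use strict;
use warnings;

# Hypothetical expander: substitute every combination of values
# into the template, one variable at a time.
sub expand_spec {
    my ($template, %vars) = @_;
    my @lines = ($template);
    for my $name (sort keys %vars) {
        my @next;
        for my $line (@lines) {
            if ($line =~ /<\Q$name\E>/) {
                # One copy of the line per value of this variable.
                for my $value (@{ $vars{$name} }) {
                    (my $copy = $line) =~ s/<\Q$name\E>/$value/g;
                    push @next, $copy;
                }
            }
            else {
                push @next, $line;   # variable not used on this line
            }
        }
        @lines = @next;
    }
    return @lines;
}

my @tests = expand_spec(
    'http://sample-site.org?q=<foo><bar><baz> /Found:/ Y Looking for <foo>, <bar>, <baz>',
    foo => [qw(one two three four)],
    bar => [qw(alpha beta gamma delta epsilon)],
    baz => [qw(now is the time for all good men)],
);
print scalar(@tests), " tests generated\n";   # prints "160 tests generated"
```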
Pragmas may expand into other pragmas; the previous example could have been written as
    %%foo one two three four
    %%bar alpha beta gamma delta epsilon
    %%baz now is the time for all good men
    %%query <foo><bar><baz>
    http://sample-site.org?q=<query> ...
In this case, the 'query' variable would have been assigned all 160 values, and anything that used the 'query' variable would be expanded with all of them.
Caution is urged in creating complex nested expansions; making these too complicated can make your generated scripts very hard to debug, as there's currently no easy way to track the expansions and debug them.
First, be sure that the non-ASCII character you're seeing on the screen is actually present in the HTML source. You could be looking at an HTML entity that gets rendered as the character in question. For instance, a degree symbol may actually be &#xB0; in the source.
You can match a specific entity with its actual text:
/&#x[bB]0;/
(Note that we've made sure that it will work whether the hex "digits" are upper or lowercase.) Or you can match an arbitrary entity:
/&.*?;/
This one will also match things like &amp; and &brkbar; - with great power comes relative imprecision. There's a handy table of Latin-1 entities at http://www.ramsch.org/martin/uni/fmi-hp/iso8859-1.html.
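A short sketch of both patterns at work, using the numeric form &#xB0; for the degree sign. The sample HTML strings are invented for illustration.

```perl
use strict;
use warnings;

# Made-up page source containing entities.
my $html = 'Today: 21&#xB0;C &amp; sunny; tomorrow 19&#xb0;C';

# Specific entity, upper- or lowercase hex digits.
my @degrees = $html =~ /(&#x[bB]0;)/g;

# Arbitrary entity: anything from an ampersand to the next semicolon.
my @entities = $html =~ /(&.*?;)/g;

print "degrees:  @degrees\n";    # prints "degrees:  &#xB0; &#xb0;"
print "entities: @entities\n";   # prints "entities: &#xB0; &amp; &#xb0;"

# The loose pattern also matches text that is not an entity at all.
my @loose = 'fish & chips; please' =~ /(&.*?;)/g;
print "loose:    @loose\n";      # prints "loose:    & chips;"
```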
In some cases (e.g., Yahoo!'s fr.search search results), there will actually be non-Latin1 characters that are not HTML encoded. This is probably not good practice, but it still exists here and there. To deal with pages like this, copy and paste the exact text from a "view source" into the regex you want to use.
Newer versions of simple_scan handle such data smoothly without any special action on your part, even if the encoding's off a bit.
Plugins are Perl modules that extend simple_scan's abilities without modification of the core code.
Create a pragmas method in your plugin that returns pairs of pragma names and methods to be called to process the pragma.
    sub pragmas {
        return (['mypragma' => \&do_my_pragma],
                ['another'  => \&another]);
    }

    sub do_my_pragma {
        my ($app, $args) = @_;
        # Parse the arguments. You have access to
        # all of the methods in App::SimpleScan as
        # well as any subs defined here. You may
        # want to export methods to the App::SimpleScan
        # namespace in your import() method.
    }
    ...
Create an options method in your plugin that returns a hash of options and variables to capture their values in. You will also want to export accessors for these variables to the App::SimpleScan namespace in your import.
    my $myoption;   # holds the captured option value

    sub import {
        no strict 'refs';
        # Export the accessor into the caller's namespace.
        *{caller() . '::myoption'} = \&myoption;
    }

    sub options {
        return ('myoption' => \$myoption);
    }

    sub myoption {
        my ($self, $value) = @_;
        $myoption = $value if defined $value;
        $myoption;
    }
Create a test_modules method that returns a list of module names to be used by the generated test program.
    sub test_modules {
        return ('Test::Foo', 'Blortch::Zonk');
    }
Create a per_test subroutine. This method gets called with the current App::SimpleScan::TestSpec object.
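    sub per_test {
        # The current App::SimpleScan::TestSpec object is passed in;
        # it is taken here as the last argument.
        my $self = $_[-1];
        $self->app->_stack_test(qq(fail "forced failure accessing bad.com";\n))
            if $self->uri =~ /bad\.com/;
    }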
Create a filter subroutine. This will get called with an array of strings corresponding to the code that's about to be stacked; you can do whatever additions or alterations you like. Just return your altered code as an array of strings; if you've added any tests to it, use the test_count() method in the app() object to up the test count appropriately.
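A minimal sketch of what such a filter might look like follows. The calling convention is an assumption: the sketch takes an object with an app() accessor as its first argument and treats test_count() as a combined getter/setter. Mock::App and My::FilterSketch are stand-ins invented so the example runs on its own; check the App::SimpleScan source for the real interface.

```perl
use strict;
use warnings;

package Mock::App;   # stand-in for the real application object
sub new { return bless { test_count => 0 }, shift }
sub test_count {
    my $self = shift;
    $self->{test_count} = shift if @_;
    return $self->{test_count};
}

package My::FilterSketch;   # hypothetical plugin
sub new { my ($class, $app) = @_; return bless { app => $app }, $class }
sub app { return $_[0]{app} }

# Assumed signature: invocant first, then the lines of code
# that are about to be stacked.
sub filter {
    my ($self, @code) = @_;
    # Add one extra test to the generated code...
    push @code, qq(pass "filter sketch was here";\n);
    # ...and raise the test count to match.
    $self->app->test_count($self->app->test_count + 1);
    return @code;
}

package main;

my $app    = Mock::App->new;
my $plugin = My::FilterSketch->new($app);
# A made-up line of generated test code.
my @out = $plugin->filter(
    qq(page_like "http://perl.org/", qr/Perl/, "Perl mentioned here";\n)
);
print @out;
print "tests added: ", $app->test_count, "\n";   # prints "tests added: 1"
```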
Currently, there are six simple_scan plugins available on CPAN:
App::SimpleScan::Plugin::Snapshot