NAME

App::SimpleScan::Cookbook

DESCRIPTION

This is a documentation-only module that describes how to use simple_scan for some common Web testing problems.

BASICS

simple_scan reads test specifications from standard input and generates Perl test code from them. It can then

  • execute the code immediately,

  • print the code on standard output without executing it,

  • or do both: execute the code and then print it on standard output.

TEST SPECS

Test specifications describe

  • where the page is that you want to check,

  • some content (in the form of a Perl regular expression) that you want to look for

  • whether or not it should be there

  • and a comment about why you care
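To make the four parts concrete, here is a minimal sketch that splits one spec line into them. The line layout used here (URL, /regex/, a Y or N flag, then a free-text comment, separated by whitespace) and the helper name parse_spec are assumptions for illustration; this is not simple_scan's own parser.

```perl
use strict;
use warnings;

# Hypothetical helper: split one spec line into its four parts.
# Assumed layout: URL, /regex/, flag, comment (whitespace-separated).
sub parse_spec {
    my ($line) = @_;
    my ($uri, $regex, $flag, $comment) =
        $line =~ m{^(\S+)\s+/(.+)/\s+(\S+)\s+(.*)$}
        or return;    # not a spec line in the assumed layout
    return {
        uri     => $uri,           # where the page is
        regex   => qr/$regex/,     # content to look for
        flag    => $flag,          # whether it should be there
        comment => $comment,       # why you care
    };
}

my $spec = parse_spec(
    'http://example.com/ /Example Domain/ Y Front page shows its title');
print "$_: $spec->{$_}\n" for qw(uri regex flag comment);
```

A line that does not fit the assumed layout makes parse_spec return nothing, so a caller can skip it or report it.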

Matching non-ASCII Latin-1 characters

First: be sure that the non-ASCII character you're seeing on the screen is actually present in the HTML source. You could be looking at an HTML entity that gets rendered as the character in question. For instance, a degree symbol may actually be &#xB0; in the source.

You can match a specific entity with its actual text:

  /&#x[bB]0;/

(Note that we've made sure that it will work whether the hex "digits" are upper or lowercase.) Or you can match an arbitrary entity:

  /&.*?;/

This one will also match things like &amp; and &brkbar; - with great power comes relative imprecision. There's a handy table of Latin-1 entities at http://www.ramsch.org/martin/uni/fmi-hp/iso8859-1.html.
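The trade-off between the two patterns can be checked with a short script. This sketch assumes the degree sign is written as the hexadecimal character reference &#xB0;; the sample strings and the helper names has_degree_entity and has_any_entity are made up for illustration.

```perl
use strict;
use warnings;

# Specific: only the degree-sign entity, hex digits in either case.
sub has_degree_entity { return $_[0] =~ /&#x[bB]0;/ ? 1 : 0 }

# Arbitrary: anything that starts with '&' and later reaches a ';'.
sub has_any_entity    { return $_[0] =~ /&.*?;/     ? 1 : 0 }

for my $html ('72&#xB0;F', '72&#xb0;F', 'AT&amp;T',
              'fish & chips; peas', 'plain text') {
    printf "%-20s degree=%d any=%d\n",
        $html, has_degree_entity($html), has_any_entity($html);
}
```

The 'fish & chips; peas' sample contains no entity at all, yet the arbitrary pattern still matches it, which is the imprecision mentioned above.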

In some cases (e.g., Yahoo!'s fr.search search results), there will actually be non-ASCII characters that are not encoded as HTML entities. This is probably not good practice, but it still exists here and there. To deal with pages like this, copy and paste the exact text from a "view source" into the regex you want to use. simple_scan will try to spot all of the non-ASCII characters and add special tests for them.

    Note to advanced regex wielders: using capturing parentheses along with non-ASCII characters will cause test failures if the capturing parens appear in the pattern before the non-ASCII character(s). This is because the pattern transformation needed to accurately match the non-ASCII characters requires that we replace them with capturing parentheses to, well, capture the non-ASCII characters and test them directly with eq. If you add your own capturing parentheses before the non-ASCII characters, you throw the capture numbering off, and the comparison will fail.

    So: either break the test up into multiple parts, with the parens you add to the pattern in one part and the non-ASCII characters in another, or use non-capturing parentheses for grouping:

      (?:foo|bar|baz)

    This allows you to group alternatives without capturing anything, thus keeping simple_scan's head on straight when it comes to non-ASCII characters.
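The numbering problem is easy to reproduce in plain Perl. The sketch below is only a stand-in for the idea described above: the patterns are written by hand rather than produced by simple_scan, and the helper first_capture is hypothetical. It shows that a user-supplied capturing group placed first takes over $1, while a non-capturing group leaves it alone.

```perl
use strict;
use warnings;

# Return the contents of the first capture group, or undef on no match.
sub first_capture {
    my ($text, $pattern) = @_;
    return $text =~ $pattern ? $1 : undef;
}

my $e_acute = chr(0xE9);                  # a non-ASCII Latin-1 character
my $page    = "caf${e_acute} ouvert";

# Hand-written stand-ins: the non-ASCII character has been replaced by
# a capture group whose contents are then compared with eq.
my @cases = (
    [ 'no extra parens' => qr/caf(.) ouvert/ ],
    [ 'capturing group' => qr/(caf|bar)(.) ouvert/ ],    # shifts $1
    [ 'non-capturing'   => qr/(?:caf|bar)(.) ouvert/ ],  # $1 unchanged
);

for my $case (@cases) {
    my ($label, $pattern) = @$case;
    my $got = first_capture($page, $pattern);
    printf "%-16s %s\n", $label, ($got eq $e_acute ? 'ok' : 'not ok');
}
```

Only the middle case reports "not ok": there $1 holds "caf" rather than the non-ASCII character, so the eq comparison fails.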

The best solution? Find the developer concerned and have a chat about encoding entities.

PLUGINS

Plugins are Perl modules that extend simple_scan's abilities without modification of the core code.

Installing a new pragma

Create a pragmas method in your plugin that returns pairs of pragma names and methods to be called to process the pragma.

  sub pragmas {
    return (['mypragma' => \&do_my_pragma],
            ['another'  => \&another]);
  }

  sub do_my_pragma {
    my ($app, $args) = @_;
    # Parse the arguments. You have access to
    # all of the methods in App::SimpleScan as
    # well as any subs defined here. You may 
    # want to export methods to the App::SimpleScan
    # namespace in your import() method.
  }

  ...
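To see how name/handler pairs like these can be consumed, here is a self-contained sketch of a dispatcher. The dispatch_pragma sub, the stub handlers, and the 'fake-app' placeholder are all hypothetical; how App::SimpleScan actually registers and invokes pragmas may differ.

```perl
use strict;
use warnings;

sub pragmas {
    return (['mypragma' => \&do_my_pragma],
            ['another'  => \&another]);
}

# Stub handlers that just report what they were given.
sub do_my_pragma { my ($app, $args) = @_; return "mypragma saw: $args" }
sub another      { my ($app, $args) = @_; return "another saw: $args" }

# Hypothetical host side: turn the pairs into a lookup table and call
# the handler registered for the named pragma, if there is one.
sub dispatch_pragma {
    my ($app, $name, $args) = @_;
    my %handler_for = map { $_->[0] => $_->[1] } pragmas();
    my $handler = $handler_for{$name} or return;
    return $handler->($app, $args);
}

print dispatch_pragma('fake-app', 'mypragma', 'foo bar'), "\n";
```

An unknown pragma name yields no handler, so dispatch_pragma simply returns nothing in that case.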

Installing new command-line options

Create an options method in your plugin that returns a hash mapping option names to references to the variables that capture their values. You will also want to export accessors for these variables to the App::SimpleScan namespace in your import method.

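  my $myoption;   # holds the value of the option

  sub import {
    no strict 'refs';
    # Install the accessor in the caller's namespace.
    *{caller() . '::myoption'} = \&myoption;
  }

  sub options {
    return ('myoption' => \$myoption);
  }

  # Combined getter/setter for the option's value.
  sub myoption {
    my ($self, $value) = @_;
    $myoption = $value if defined $value;
    $myoption;
  }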
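A hash of this shape is what Getopt::Long accepts, so the sketch below feeds it to GetOptionsFromArray to show the variable being filled in. That simple_scan processes plugin options this way is an assumption here, and parse_args is a hypothetical helper.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

my $myoption;

sub options {
    return ('myoption' => \$myoption);
}

# Hypothetical host side: parse a list of arguments using the hash
# returned by options(), then report the captured value.
sub parse_args {
    my @args = @_;
    $myoption = undef;    # start fresh on every call
    GetOptionsFromArray(\@args, options())
        or die "could not parse options\n";
    return $myoption;
}

my $value = parse_args('--myoption');
print "myoption is ", (defined $value ? $value : 'unset'), "\n";
```

Because the option name carries no type suffix, Getopt::Long treats it as a simple flag; a spec such as 'myoption=s' would be needed for an option that takes a string value.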

Installing other modules via plugins

Create a test_modules method that returns a list of module names to be used by the generated test program.

  sub test_modules {
    return ('Test::Foo', 'Blortch::Zonk');
  }
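One way to picture the effect: each returned name becomes a use line at the top of the generated test program. The sketch below illustrates that idea only; the use_lines helper is hypothetical and the code simple_scan really emits may look different.

```perl
use strict;
use warnings;

sub test_modules {
    return ('Test::Foo', 'Blortch::Zonk');
}

# Hypothetical generator step: one "use" statement per module name.
sub use_lines {
    return map { "use $_;\n" } @_;
}

print use_lines(test_modules());
```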

Adding extra code to the test output stack in a plugin

Create a per_test subroutine. This method gets called with the current App::SimpleScan::TestSpec object.

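  sub per_test {
    my ($self) = @_;   # the current App::SimpleScan::TestSpec object
    # qq() keeps the code as one string and interpolates \n;
    # qw() would split it on whitespace.
    $self->app->_stack_test(qq(fail "forced failure accessing bad.com";\n))
      if $self->uri =~ /bad\.com/;
  }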