Daisuke Maki
Web::Scraper::Config - Run Web::Scraper From Config Files


  scraper:
    - process:
      - td>ul>li
      - trailers[]
      - scraper:
        - process_first:
          - li>b
          - title
          - TEXT
        - process_first:
          - ul>li>a[href]
          - url
          - '@href'
        - process:
          - ul>li>ul>li>a
          - movies[]
          - __callback(process_movie)__

  my $scraper = Web::Scraper::Config->new(
      $config,    # first argument: config hashref or config filename
      {
        callbacks => {
          process_movie => sub {
            my $elem = shift;
            return {
              text => $elem->as_text,
              href => $elem->attr('href'),
            };
          },
        },
      },
  );


Web::Scraper::Config allows you to harness the power of Web::Scraper from a config file.

The config files can be written in any format that Config::Any understands, as long as they conform to this module's rules.



Creates a new Web::Scraper::Config instance.

The first argument is either a hashref that represents a config, or the filename of a config file. The config file can be in any format that Config::Any understands, as long as it returns a hash that conforms to the Web::Scraper::Config rules.
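Because the first argument may be a hashref, the same rules can be written directly in Perl. The following is a minimal sketch of a hashref shaped like part of the YAML example; the top-level "scraper" key and the commented-out constructor call are assumptions about how the pieces fit together, not something this documentation states.

```perl
use strict;
use warnings;

# Hashref counterpart of a small YAML config. The nesting (a "scraper" key
# holding a list of one-key rule hashes) is an assumption based on the
# YAML layout shown in this document.
my $config = {
    scraper => [
        {
            process => [
                'td>ul>li',
                'trailers[]',
                {
                    scraper => [
                        { process_first => [ 'li>b',          'title', 'TEXT'  ] },
                        { process_first => [ 'ul>li>a[href]', 'url',   '@href' ] },
                    ],
                },
            ],
        },
    ],
};

# With the module installed, the hashref would be passed as the first argument:
#   my $scraper = Web::Scraper::Config->new($config);

# The outer "process" rule carries three arguments: selector, key, nested scraper.
print scalar(@{ $config->{scraper}[0]{process} }), "\n";
```

Building the config in Perl avoids a file on disk, at the cost of losing the format independence that Config::Any provides.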

The second argument (options) is optional, and is currently only used to provide callbacks to be called from the scraper. When Web::Scraper::Config encounters an element in the form of:

  __callback(function_name)__

then that is replaced by the corresponding callback specified in the options hash.
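The substitution described above can be pictured with a small standalone sketch. The helper resolve_callbacks below is hypothetical and is not part of Web::Scraper::Config; it only illustrates one way a placeholder string could be swapped for the matching code reference, and the module's real implementation may differ.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT the module's API: walks a config structure and
# replaces each "__callback(name)__" string with the code reference of the
# same name taken from the callbacks hash.
sub resolve_callbacks {
    my ($node, $callbacks) = @_;

    if (ref($node) eq 'ARRAY') {
        return [ map { resolve_callbacks($_, $callbacks) } @$node ];
    }
    if (ref($node) eq 'HASH') {
        my %copy;
        $copy{$_} = resolve_callbacks($node->{$_}, $callbacks) for keys %$node;
        return \%copy;
    }
    if (defined $node && $node =~ /\A__callback\((\w+)\)__\z/) {
        my $cb = $callbacks->{$1}
            or die "No callback named '$1' was supplied\n";
        return $cb;
    }
    return $node;    # selectors and key names pass through unchanged
}

my $rules = {
    process => [ 'ul>li>ul>li>a', 'movies[]', '__callback(process_movie)__' ],
};
my $resolved = resolve_callbacks($rules, { process_movie => sub { 'called' } });

# The third element is now a code reference instead of a placeholder string.
print ref($resolved->{process}[2]), "\n";
```

In this sketch an unknown callback name is treated as a fatal error; whether the module itself does so is not stated here.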


Starts scraping. The semantics are exactly the same as Web::Scraper::scrape.


