NAME

WWW::Crawler::Mojo::Job - Single crawler job

SYNOPSIS

    my $job1 = WWW::Crawler::Mojo::Job->new;
    $job1->url('http://example.com/');
    my $job2 = $job1->child;

DESCRIPTION

This class represents a single crawler job.

ATTRIBUTES

context

Either a Mojo::DOM or a Mojo::URL instance by which the job was referred.

    $job->context($dom);
    say $job->context;

closed

A flag that indicates whether the job is closed.

    $job->closed(1);
    say $job->closed;

depth

The depth of the job in the referrer series. A job created with new has depth 0, and each child is one level deeper than its parent.

    my $job1 = WWW::Crawler::Mojo::Job->new;
    my $job2 = $job1->child;
    my $job3 = $job2->child;
    say $job1->depth; # 0
    say $job2->depth; # 1
    say $job3->depth; # 2
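As a rough illustration of how depth can follow from child, here is a minimal sketch built on a hypothetical Sketch::Job class. It is a stand-in written for this example only, not the module's actual implementation, and it assumes that child simply records the parent as referrer and adds one to the parent's depth.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a job; NOT the code of WWW::Crawler::Mojo::Job.
package Sketch::Job {
    sub new {
        my ($class, %args) = @_;
        # Defaults first, so caller-supplied values override them.
        return bless {depth => 0, referrer => undef, %args}, $class;
    }
    sub depth    { $_[0]{depth} }
    sub referrer { $_[0]{referrer} }

    # Assumption: a child points back at its parent and sits one level deeper.
    sub child {
        my ($self, %args) = @_;
        return ref($self)->new(%args, referrer => $self, depth => $self->depth + 1);
    }
}

my $root       = Sketch::Job->new;
my $grandchild = $root->child->child;
print $grandchild->depth, "\n";    # 2
```

Walking referrer from $grandchild back to $root takes exactly as many steps as the depth reports.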

literal_uri

A Mojo::URL instance of the literal URL that has appeared in the referrer document.

    $job1->literal_uri('./index.html');
    say $job1->literal_uri; # './index.html'

referrer

The job instance that referred to the URL.

    $job1->referrer($job);
    my $job2 = $job1->referrer;

redirect_history

An array reference containing the URLs in the redirect history.

    $job1->redirect_history([$url1, $url2, $url3]);
    my $history = $job1->redirect_history;

url

A Mojo::URL instance of the resolved URL.

    $job1->url('http://example.com/');
    say $job1->url; # 'http://example.com/'

method

HTTP request method such as GET or POST.

    $job1->method('GET');
    say $job1->method; # GET

tx_params

A hash reference that contains parameters for Mojo::Transaction.

    $job1->tx_params({foo => 'bar'});
    my $params = $job1->tx_params;

METHODS

clone

Clones the job.

    my $job2 = $job1->clone;

close

Closes the job and cuts the referrer series.

    $job->close;

child

Instantiates a child job from the parent job. The parent job is set as the child's referrer.

    my $job1 = WWW::Crawler::Mojo::Job->new(url => 'http://example.com/1');
    my $job2 = $job1->child(url => 'http://example.com/2');
    say $job2->referrer->url; # 'http://example.com/1'

digest

Generates a digest string from the url, method, and tx_params attributes.

    say $job->digest;
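One way such a digest could be derived is sketched below. The helper sketch_digest is hypothetical, and both the use of MD5 and the way the fields are joined are assumptions made for illustration; the hashing scheme the module really uses is not documented here.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper; the module's real digest scheme may differ.
sub sketch_digest {
    my (%job) = @_;
    my $params = $job{tx_params} || {};

    # Sort the keys so equal parameter sets always serialize identically.
    my $serialized = join '&',
        map { join('=', $_, $params->{$_}) } sort keys %$params;

    return md5_hex(join "\n", $job{url}, $job{method} || '', $serialized);
}

my $d1 = sketch_digest(url => 'http://example.com/', method => 'GET');
my $d2 = sketch_digest(url => 'http://example.com/', method => 'GET');
my $d3 = sketch_digest(url => 'http://example.com/', method => 'POST',
                       tx_params => {foo => 'bar'});

print $d1 eq $d2 ? "same job\n" : "different job\n";
print $d1 eq $d3 ? "same job\n" : "different job\n";
```

Two jobs with the same URL, method, and parameters yield the same string, which makes a digest like this usable as a key for spotting duplicate jobs.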

redirect

Replaces the resolved URL and history at once.

    my $job = WWW::Crawler::Mojo::Job->new;
    $job->url($url1);
    $job->redirect($url2, $url3);
    say $job->url; # $url2
    say $job->redirect_history; # [$url1, $url3]

original_url

Returns the original URL of a redirected job. If the job has been redirected, returns the last element of the redirect_history attribute; otherwise returns the url attribute.

    $job1->redirect_history([$url1, $url2, $url3]);
    my $url4 = $job1->original_url; # $url4 is $url3
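The selection rule described above can be sketched in a few lines. The function sketch_original_url and the example URLs are hypothetical; the sketch only mirrors the documented behaviour and takes plain strings rather than Mojo::URL instances.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the documented rule:
# last history element if any redirect was recorded, else the current URL.
sub sketch_original_url {
    my ($url, $history) = @_;
    return @$history ? $history->[-1] : $url;
}

# No redirects recorded: the current URL is the original one.
my $direct = sketch_original_url('http://example.com/a', []);

# Redirects recorded: the last history entry is treated as the original.
my $redirected = sketch_original_url(
    'http://example.com/c',
    ['http://example.com/b', 'http://example.com/a'],
);

print "$direct\n";
print "$redirected\n";
```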

upgrade

Instantiates a job from a string or a Mojo::URL instance.

AUTHOR

Keita Sugama, <sugama@jamadam.com>

COPYRIGHT AND LICENSE

Copyright (C) Keita Sugama.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.