NAME

HTTP::MultiGet - Run many HTTP requests at once!

SYNOPSIS

  use Modern::Perl;
  use HTTP::MultiGet;
  use HTTP::Request;

  my $getter=HTTP::MultiGet->new;

  my @requests=(map { HTTP::Request->new(GET=>"http://localhost$_") } (1 .. 1000));
  my @responses=$getter->run_requests(@requests);

  my $id=0;
  foreach my $response (@responses) {
    my $request=$requests[$id++];
    print "Results for: ".$request->uri."\n";
    if($response->is_success) {
      print $response->decoded_content;
    } else {
      print $response->status_line,"\n";
    }
  }

Handling Multiple Large Downloads

  use Modern::Perl;
  use HTTP::MultiGet;
  use HTTP::Request;

  my $req=HTTP::Request->new(GET=>'http://some.realy.big/file/to/download.gz');
  my $req_b=HTTP::Request->new(GET=>'http://some.realy.big/file/to/download2.gz');

  # create a callback 
  my $code=sub {
    my ($getter,$request,$headers,$chunk)=@_;
      # 0: Current HTTP::MultiGet instance
      # 1: HTTP::Request object
      # 2: HTTP::Headers object
      # 3: Chunk of data being downloaded
    if($headers->header('Status')==200) {
      # do something with $chunk
    } else {
      # handle the error; $chunk holds part of the response body
    }
  };
  my $getter=HTTP::MultiGet->new;
  my ($result,$result_b)=$getter->run_requests([$req,on_body=>$code],[$req_b,on_body=>$code]);

The on_body=>$code callback is called for each chunk downloaded. $result is created when the download completes, but $result->decoded_content will be empty, since each chunk was passed to the callback instead.
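The callback signature shown above lends itself to streaming each chunk to disk rather than holding it in memory. The following is a minimal sketch, not part of the module: the helper name make_chunk_writer is hypothetical, and it assumes the headers object exposes the status through header('Status') as in the example above.

```perl
use strict;
use warnings;
use IO::Handle;

# Hypothetical helper: returns an on_body callback that appends every
# chunk of a successful response to the file at $path.
sub make_chunk_writer {
  my ($path)=@_;
  open(my $fh,'>:raw',$path) or die "Cannot open $path: $!";
  $fh->autoflush(1);

  return sub {
    # Argument order as documented for the on_body callback
    my ($getter,$request,$headers,$chunk)=@_;

    # Only keep data from successful responses
    if($headers->header('Status')==200) {
      print {$fh} $chunk or die "Cannot write to $path: $!";
    }

    # AnyEvent::HTTP aborts a download when its on_body callback
    # returns false; whether the HTTP::MultiGet wrapper forwards this
    # value is an assumption, so return true to be safe.
    return 1;
  };
}
```

Each request would get its own writer, for example [$req,on_body=>make_chunk_writer('download.gz')], so that chunks from parallel downloads never share a file handle.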

DESCRIPTION

A wrapper for AnyEvent::HTTP that provides a more LWP-like feel.

Moo Stuff

This is a Moo class. The object constructor takes the arguments listed below, and the class consumes the following roles.

Role List:

  Log::LogMethods
  Data::Result::Moo

Arguments and object accessors:

  logger:          DOES(Log::Log4perl::Logger)
  request_opts:    See AnyEvent::HTTP params for details
  timeout:         Global timeout for everything ( default 300 )
  max_que_count:   How many requests to run at once ( default 20 )
  max_retry:       How many times to retry if we get a connection/negotiation error 
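As an illustration of the public arguments above, a constructor call might look like the following. This is a sketch only: the values are arbitrary examples rather than recommendations, and it assumes the arguments are passed as a plain key/value list, as is usual for Moo classes.

```perl
use strict;
use warnings;
use HTTP::MultiGet;

# Illustrative values only; the documented defaults are
# timeout=>300 and max_que_count=>20.
my $getter=HTTP::MultiGet->new(
  timeout       => 60,  # global timeout for everything
  max_que_count => 5,   # run at most 5 requests at once
  max_retry     => 2,   # retry connection/negotiation errors twice
);
```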

For internal use only:

  in_control_loop: true when in the control loop
  stack:           Data::Queue object 
  que_count:       Total number of elements active in the queue
  retry:           Anonymous hash used to map ids to retry counts

UNIT TESTING

Arguments intended for unit testing:

  on_create_request_cb: Anonymous code ref to be called 
    when a new request object has been created
        sub { my ($id,$request)=@_ }

Arguments for the callback:

  id:  number for the object
  req: a new instance of $self->SENDER_CLASS

Internal blocking control variables

  loop_control: AnyEvent->condvar object
  false_id: internal false id tracker
  fake_jobs: Internal object for handling fake results

Class constants

  • my $class=$self->RESPONSE_CLASS

    Returns the http response class, typically AnyEvent::HTTP::Response.

  • my $class=$self->HEADER_CLASS

    Returns the header class, typically HTTP::Headers.

OO Methods

  • my $id=$self->next_que_id

    Returns the next id.

  • my @ids=$self->add(@requests)

    Adds @requests to the stack. @ids maps to @requests as id=>request.

  • my @ids=$self->add([$request,key=>value]);

    Wrapping the request in an array reference, [$request,key=>value], allows passing additional key/value pairs to AnyEvent::HTTP::Request, with one exception: on_body=>$code is wrapped in an additional callback.

  • my $id=$self->add_by_id($id=>$request);

    Adds the request with an explicit id.

  • $self->run_fake_jobs

    Runs all current fake jobs

  • $self->run_next

    Internal function, used to run the next request from the stack.

  • my $id=$self->add_result($cb)

    Internal function, added for HTTP::MultiGet::Role

  • if($self->has_fake_jobs) { ... }

    Checks if any fake jobs exist

  • my $result=$self->has_request($id)

    Returns a Charter::Result Object.

    When true it contains a string showing where that id is in the list.

    Values are:

      complete: Has finished running
      running:  The request has been sent, waiting on the response
      in_que:   The request has not been run yet

  • if($self->has_running_or_pending) { ... }

    Returns a Charter::Result object

      true:  it contains how many objects are not yet complete.
      false: no incomplete jobs exist

  • if($self->has_any) { ... }

    Returns a Charter::Result object

      true:  contains how many total jobs of any state there are
      false: there are no jobs in any state

  • my $result=$self->block_for_ids(@ids)

    Returns a Charter::Result object

    When true, some results were found:

      while(my ($id,$result)=each %{$result->get_data}) {
        if($result) {
          my $response=$result->get_data;
          print "Got response: $id, code was: ",$response->code,"\n";
        } else {
          print "Failed to get: $id error was: $result\n";
        }
      }

  • my $class=$self->SENDER_CLASS

    $class is the class used to send requests.

  • my $req=$self->create_request($req,$id)

    Internal method. Returns a new instance of $self->SENDER_CLASS for the given $request and $id.

  • my $code=$self->que_function($req,$id);

    Internal method. Creates a code reference for use in the queue process.

  • $self->block_loop

    Internal Function. Does a single timed pass against the current set of data, stops when the requests complete.

  • my @responses=$self->run_requests(@requests);

    Queues, runs, and blocks for all HTTP requests, then returns the result objects.

    Arguments:

      @requests: list of HTTP::Request Objects

    Responses:

      @responses: list of HTTP::Response objects in the order of the requests.

  • $self->clean_results

    Used to remove any results that are unclaimed ( Use to prevent memory leaks! ).

  • my @responses=$self->block_for_results_by_id(@ids)

    Blocks on the @ids list and returns a list of HTTP::Response objects.

  • my @results=$self->get_results(@ids);

    Does not block; returns a list of HTTP::Response objects based on @ids.
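The queue-style methods above can be combined without going through run_requests. The following is a minimal sketch rather than the module's own recipe: the helper name collect_by_id is hypothetical, and it assumes the per-id results behave as in the block_for_ids example (true on success with get_data returning the response, false with the error available by stringification).

```perl
use strict;
use warnings;

# Hypothetical helper: queue requests, wait for them, and split the
# outcomes into successes and failures keyed by id.
# $getter is expected to behave like an HTTP::MultiGet instance.
sub collect_by_id {
  my ($getter,@requests)=@_;

  # add() returns one id per queued request
  my @ids=$getter->add(@requests);

  # Block until results for the queued ids are available
  my $result=$getter->block_for_ids(@ids);

  my (%ok,%failed);
  if($result) {
    while(my ($id,$item)=each %{$result->get_data}) {
      if($item) {
        $ok{$id}=$item->get_data;  # the response object
      } else {
        $failed{$id}="$item";      # the error, as a string
      }
    }
  }

  return (\%ok,\%failed);
}
```

Given a getter and a list of HTTP::Request objects, my ($ok,$failed)=collect_by_id($getter,@requests) yields two hash references keyed by id. Ids missing from both hashes were not reported by block_for_ids.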

Using with AnyEvent

See AnyEvent::HTTP::MultiGet

AUTHOR

Mike Shipper <AKALINUX@CPAN.ORG>