NAME

POE::Component::Client::UserAgent - LWP and LWP::Parallel based user agent

SYNOPSIS

    use POE;
    use POE::Component::Client::UserAgent;
    use HTTP::Request;

    POE::Component::Client::UserAgent -> new;

    $postback = $session -> postback ('response');

    $request = HTTP::Request -> new (GET => $url);

    $poe_kernel -> post (useragent => request =>
        request => $request, response => $postback);

    sub response
    {
        my ($request, $response, $entry) = @{$_[ARG1]};
        print $response -> status_line;
        $_[KERNEL] -> post (useragent => 'shutdown');
    }

DESCRIPTION

Note: POE::Component::Client::UserAgent dependencies frequently have problems installing. This module is difficult to maintain when the latest dependencies don't work. As a result, we prefer to maintain and recommend POE::Component::Client::HTTP. That client has fewer, more actively maintained dependencies, and it tends to work better.

POE::Component::Client::UserAgent is based on LWP and LWP::Parallel. It lets other tasks run while making a request to an Internet server and waiting for a response, and it lets several requests run in parallel.

A PoCoCl::UserAgent session is created using the spawn or new method. The two methods are equivalent. They take a few named parameters:

alias

alias sets the name by which the session will be known. If no alias is given, it defaults to useragent. The alias lets several client sessions interact with the UserAgent component without keeping (or even knowing) hard references to them. It is possible to create several UserAgent components with different names.

timeout

The component will return an error response if a connection is inactive for timeout seconds. The default value is 180 seconds (3 minutes).

The rest of the parameters correspond to various properties of LWP::UserAgent and LWP::Parallel::UserAgent. For details please refer to those modules' documentation.

agent
from
redirect
duplicates
in_order
remember_failures
proxy
parse_head
max_size
max_hosts
max_req
delay

The delay parameter is currently not used.
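
As a rough sketch of how these parameters might be combined, the following creates two components under different aliases. It assumes the constructor accepts a flat list of name/value pairs, the way the SYNOPSIS passes named parameters to the request event. The aliases, agent string, and e-mail address are hypothetical.

    use strict;
    use POE;
    use POE::Component::Client::UserAgent;

    # Assumption: named parameters are given as a flat key => value list.
    # 'fetcher' and 'checker' are hypothetical aliases.
    POE::Component::Client::UserAgent -> new (
        alias   => 'fetcher',
        timeout => 60,                      # error response after 60 idle seconds
        agent   => 'ExampleFetcher/0.1',    # hypothetical User-Agent string
        from    => 'webmaster@example.com', # hypothetical address
    );

    # A second, independent component with its own alias and settings.
    POE::Component::Client::UserAgent -> spawn (
        alias    => 'checker',
        redirect => 0,                      # do not follow redirects by default
    );

    # Client sessions would post their requests to 'fetcher' or 'checker'
    # here, in place of the default 'useragent' alias.

    # Release both aliases, then run the kernel until everything is done.
    $poe_kernel -> post ($_ => 'shutdown') for qw(fetcher checker);
    $poe_kernel -> run;

Giving each component its own alias lets different client sessions use different default settings without sharing a reference to either component.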

Client sessions communicate asynchronously with PoCoCl::UserAgent by using an alias and posting events to the component. When a request is complete, the component posts back a response event using a postback the client provided when it made the request.

Requests are posted via the component's request event. The event takes a few named parameters:

request

request is a reference to an HTTP::Request object that the client sets up with all the information needed to initiate the request.

response

response is the postback the component will use to post back a response event. The postback is created by the client using POE::Session's postback() method.

filename

filename is an optional file name. If it is specified, the response will be stored in the file with that name.

callback

callback is an optional subroutine reference. If it is specified, the subroutine will be called as chunks of the response are received.

chunksize

chunksize is an optional number giving a hint for the appropriate chunk size to be passed to the callback subroutine. It should not be specified unless callback is also specified.

redirect

redirect is an optional value specifying the redirection behavior for this particular request. A true value will make the UserAgent follow redirects. A false value will instruct the UserAgent to pass redirect responses back to the client session just like any other responses. If the redirect value is not specified, the default value passed to the UserAgent's constructor will be used. That in turn defaults to following redirects.
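
To illustrate the optional parameters, here is a minimal sketch of a client that stores the response in a file and asks for redirect responses to be handed back rather than followed. It passes the parameters as a hash reference, as the EXAMPLE section does. The URL and file name are hypothetical.

    use strict;
    use POE;
    use HTTP::Request;
    use POE::Component::Client::UserAgent;

    POE::Session -> create (
        inline_states => {
            _start   => \&start,
            response => \&response,
        },
    );
    $poe_kernel -> run;

    sub start
    {
        POE::Component::Client::UserAgent -> new;
        $_[KERNEL] -> post (
            useragent => 'request',
            {
                request  => HTTP::Request -> new (GET => 'http://www.example.com/'),
                response => $_[SESSION] -> postback ('response'),
                # hypothetical path: the response is stored in this file
                filename => '/tmp/example.html',
                # pass redirect responses back instead of following them
                redirect => 0,
            }
        );
        $_[KERNEL] -> post (useragent => 'shutdown');
    }

    sub response
    {
        my ($request, $response, $entry) = @{$_[ARG1]};
        print $request -> uri, ': ', $response -> status_line, "\n";
    }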

When a request has completed, whether successfully or not, the UserAgent component calls the postback that was supplied along with the request. Calling the postback results in posting an event to the session it was created on, which normally is the session that posted the request.

The postback event parameter with the index ARG0 is a reference to an array containing any extra values passed to the postback() method when creating the postback. This allows the client session to pass additional values to the response event for each request.

The postback event parameter with the index ARG1 is a reference to an array containing three object references that are passed back by the UserAgent session. These objects are:

HTTP::Request

This is the object that was passed to the request event.

HTTP::Response

This is an object containing the response to the request.

LWP::Parallel::UserAgent::Entry

This is an object containing additional information about the request processing. For details please see the LWP::Parallel::UserAgent module and its documentation.
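
The two array references can be unpacked side by side. The sketch below posts several requests at once and tags each postback with a label, so the response handler can tell them apart. The labels and the first URL are hypothetical. Responses may arrive in any order.

    use strict;
    use POE;
    use HTTP::Request;
    use POE::Component::Client::UserAgent;

    # Hypothetical labels and URLs, for illustration only.
    my %pages = (
        example => 'http://www.example.com/',
        poe     => 'http://poe.perl.org/',
    );

    POE::Session -> create (
        inline_states => {
            _start   => \&start,
            response => \&response,
        },
    );
    $poe_kernel -> run;

    sub start
    {
        POE::Component::Client::UserAgent -> new;
        for my $label (sort keys %pages)
        {
            $_[KERNEL] -> post (
                useragent => 'request',
                {
                    request  => HTTP::Request -> new (GET => $pages{$label}),
                    # $label is the extra value; it comes back in ARG0
                    response => $_[SESSION] -> postback ('response', $label),
                }
            );
        }
        $_[KERNEL] -> post (useragent => 'shutdown');
    }

    sub response
    {
        # values given to postback() after the event name
        my ($label) = @{$_[ARG0]};
        # values passed back by the component
        my ($request, $response, $entry) = @{$_[ARG1]};
        if ($response -> is_error)
        {
            print "$label failed: ", $response -> status_line, "\n";
        }
        else
        {
            print "$label: ", $response -> code, ' from ', $request -> uri, "\n";
        }
    }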

When the client is done posting request events to the component, it should post a shutdown event, indicating that the component can release its alias. The component will continue to operate until it returns all responses to any pending requests.

EXAMPLE

    #!/usr/bin/perl -w
    # should always use -w flag!

    # this is alpha software, it needs a lot of testing!
    sub POE::Kernel::ASSERT_DEFAULT() { 1 }
    sub POE::Kernel::TRACE_DEFAULT() { 1 }

    use strict;
    use POE;    # import lots of constants
    use POE::Component::Client::UserAgent;
    use HTTP::Request;

    # more debugging stuff
    my $debuglevel = shift || 0;
    POE::Component::Client::UserAgent::debug $debuglevel => 'logname';

    # create client session
    POE::Session -> create (
        inline_states => {
            _start => \&_start,
            response => \&response
        },
    );

    # now run POE!
    $poe_kernel -> run;

    # this is the first event to arrive
    sub _start
    {
        # create the PoCoCl::UserAgent session
        POE::Component::Client::UserAgent -> new;
        # hand it our request
        $_[KERNEL] -> post (
            # default alias is 'useragent'
            useragent => 'request',
            {
                # request some worthless web page
                request => HTTP::Request -> new (GET => 'http://www.hotmail.com/'),
                # let UserAgent know where to deliver the response
                response => $_[SESSION] -> postback ('response')
            }
        );
        # Once we are done posting requests, we can post a shutdown event
        # to the PoCoCl::UserAgent session. Responses will still be returned.
        $_[KERNEL] -> post (useragent => 'shutdown');
    }

    # Here is where the response arrives. In this example we would
    # actually get more than one response, as the hotmail home page is
    # a mere redirect to some address at passport.com. The component
    # processes the redirect automatically by default.
    sub response
    {
        # @{$_[ARG0]} is the list we passed to postback()
        # after the event name, empty in this example
        # @{$_[ARG1]} is the list PoCoCl::UserAgent is passing back to us
        my ($request, $response, $entry) = @{$_[ARG1]};
        print "Successful response arrived!\n"
            if $response -> is_success;
        print "PoCoCl::UserAgent is automatically redirecting the request\n"
            if $response -> is_redirect;
        print "The request failed.\n"
            if $response -> is_error;
    }

DEBUGGING

PoCoCl::UserAgent has a class method called debug. It can also be called as an object method, but the settings will affect all instances.

The method accepts two parameters. The first parameter is the debug level, ranging from 0 for no debug information to 9 for when you want to fill up your disk quota real quick.

Levels 3 and up enable PoCoCl::UserAgent's debugging output. Levels 5 and up additionally enable LWP's +debug debugging option. Levels 7 and up additionally enable LWP's +trace debugging option. Levels 9 and up additionally enable LWP's +conns debugging option.

The second parameter, if it is specified and the first parameter is greater than 0, names the file to which the debug output is written. Otherwise the output is sent to standard error.

Additionally you may want to enable POE's own debugging output, using the constant sub declarations shown in the example above. So far I couldn't figure out how to affect it using the debug level parameter. The POE output will also go to the log file you specify.

SEE ALSO

POE

POE or http://poe.perl.org/

LWP

LWP or http://www.linpro.no/lwp/

LWP::Parallel

LWP::Parallel or http://www.inf.ethz.ch/~langhein/ParallelUA/

Also see the test programs in the PoCoCl::UserAgent distribution for examples of its usage.

BUGS

All requests containing a host name block while resolving the host name.

FTP requests block for the entire duration of command connection setup, file request and data connection establishment.

At most one request is sent and one response is received over each TCP connection.

All of the above problems are unlikely to be solved within the current LWP framework. The solution would be to rewrite LWP and make it POE friendly.

The RobotUA variety of UserAgent is not yet implemented.

LWP::Parallel often cannot install due to feature mismatches with recent versions of LWP. This interferes with our ability to maintain and test this module. Please see POE::Component::Client::HTTP, which does not rely on LWP::Parallel.

BUG TRACKER

https://rt.cpan.org/Dist/Display.html?Status=Active&Queue=POE-Component-Client-UserAgent

REPOSITORY

http://github.com/rcaputo/poe-component-client-useragent

http://gitorious.org/poe-component-client-useragent

OTHER RESOURCES

http://search.cpan.org/dist/POE-Component-Client-UserAgent/

AUTHOR AND COPYRIGHT

Copyright 2001-2010 Rocco Caputo.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.