NAME

WWW::Mechanize::FireFox - use FireFox as if it were WWW::Mechanize

SYNOPSIS

  use WWW::Mechanize::FireFox;
  my $mech = WWW::Mechanize::FireFox->new();
  $mech->get('http://google.com');

This module lets you automate Firefox through the mozrepl plugin, which must be installed in your Firefox.

METHODS

`$mech->new( ARGS )`

Creates a new instance and connects it to Firefox.

Note that Firefox must already be running and must have the mozrepl extension installed.

The following options are recognized:

tab - regex for the title of the tab to reuse. If no matching tab is found, the constructor dies.

log - array reference to log levels, passed through to MozRepl::RemoteObject

events - the set of default Javascript events to listen for while waiting for a reply

repl - a premade MozRepl::RemoteObject instance

pre_events - the events that are sent to an input field before its value is changed. By default this is [focus].

post_events - the events that are sent to an input field after its value is changed. By default this is [blur, change].
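As a rough sketch, several of these options can be combined in one constructor call. The helper names mech_options and connect_to_tab are hypothetical, and the choice of DOMContentLoaded as the event to wait for is an assumption made for illustration. The module is loaded lazily so that the snippet compiles even on a machine without Firefox.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the constructor options described above.
sub mech_options {
    my ($title_re) = @_;
    return (
        tab         => $title_re,             # reuse the tab whose title matches
        events      => ['DOMContentLoaded'],  # assumed event to wait for
        pre_events  => ['focus'],             # the documented default
        post_events => ['blur', 'change'],    # the documented default
    );
}

# Hypothetical helper: connect to an already running Firefox with mozrepl.
sub connect_to_tab {
    my ($title_re) = @_;
    require WWW::Mechanize::FireFox;          # loaded only when connecting
    return WWW::Mechanize::FireFox->new( mech_options($title_re) );
}
```

A call such as connect_to_tab(qr/Google/) would then die if no tab title matches, as described for the tab option.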

`$mech->addTab( OPTIONS )`

Creates a new tab. The tab will be automatically closed upon program exit.

If you want the tab to remain open, pass a false value to the autoclose option.

`$mech->tab`

Gets the object that represents the FireFox tab used by WWW::Mechanize::FireFox.

This method is special to WWW::Mechanize::FireFox.

`$mech->repl`

Gets the MozRepl::RemoteObject instance that is used.

This method is special to WWW::Mechanize::FireFox.

`$mech->events`

Sets or gets the set of Javascript events that WWW::Mechanize::FireFox will wait for after requesting a new page. Returns an array reference.

This method is special to WWW::Mechanize::FireFox.

`$mech->get(URL)`

Loads the given URL into the tab.

It returns a faked HTTP::Response object for interface compatibility with WWW::Mechanize. It does not yet support the additional parameters that WWW::Mechanize supports for saving a file etc.

Currently, the response will only have the status codes of 200 for a successful fetch and 500 for everything else.
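Because only 200 and 500 are reported, checking for success is about all a caller can do with the response. A minimal sketch, assuming $mech is a connected WWW::Mechanize::FireFox object and that the faked response supports the usual HTTP::Response methods is_success and code; the helper name get_or_die is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: fetch a URL and die unless the fetch succeeded.
# $mech is assumed to be a connected WWW::Mechanize::FireFox object.
sub get_or_die {
    my ($mech, $url) = @_;
    my $res = $mech->get($url);
    # Only 200 (success) and 500 (everything else) are currently reported.
    die "Could not fetch $url (status " . $res->code . ")\n"
        unless $res->is_success;
    return $res;
}
```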

`$mech->synchronize( $event, $callback )`

Wraps a synchronization semaphore around the callback and waits until the event $event fires on the browser. If you want to wait for one of multiple events to occur, pass an array reference as the first parameter.

Usually, you want to use it like this:

  my $l = $mech->xpath('//a[@onclick]');
  $mech->synchronize('DOMFrameContentLoaded', sub {
      $l->__click()
  });

It is necessary to synchronize with the browser whenever a click performs an action that takes longer and fires an event on the browser object.

The DOMFrameContentLoaded event is fired by FireFox when the whole DOM and all iframes have been loaded. If your document doesn't have frames, use the DOMContentLoaded event instead.

If you leave out $event, the value of ->events() will be used instead.
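Waiting for one of several events, as mentioned above, might look like the following sketch. The helper name click_and_wait is hypothetical; $mech is assumed to be a connected WWW::Mechanize::FireFox object and $node a DOM node such as one returned by ->xpath.

```perl
use strict;
use warnings;

# Hypothetical helper: click a node and wait until either event has fired.
sub click_and_wait {
    my ($mech, $node) = @_;
    $mech->synchronize(
        [ 'DOMFrameContentLoaded', 'DOMContentLoaded' ],  # any one of these
        sub { $node->__click() },
    );
    return;
}
```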

`$mech->document`

Returns the DOM document object.

This is WWW::Mechanize::FireFox specific.

`$mech->docshell`

Returns the docShell Javascript object.

`$mech->content`

Returns the current content of the tab as a scalar.

This is likely not binary-safe.

It also currently only works for HTML pages.

`$mech->update_html $html`

Writes $html into the current document. This is mostly implemented as a convenience method for HTML::Display::MozRepl.

`$mech->res` / `$mech->response`

Returns the current response as a HTTP::Response object.

`$mech->success`

Returns a boolean telling whether the last request was successful. If there hasn't been an operation yet, returns false.

This is a convenience function that wraps $mech->res->is_success.

`$mech->status`

Returns the HTTP status code of the response. This is a 3-digit number like 200 for OK, 404 for not found, and so on.

Currently, this can only return 200 (for OK) and 500 (for error).

`$mech->reload BYPASS_CACHE`

Reloads the current page. If BYPASS_CACHE is a true value, the browser is not allowed to use a cached page. This is the difference between pressing F5 (cached) and shift-F5 (uncached).

Returns the (new) response.

`$mech->back`

Goes one page back in the page history.

Returns the (new) response.

`$mech->forward`

Goes one page forward in the page history.

Returns the (new) response.

`$mech->uri`

Returns the current document URI.

`$mech->base`

Returns the URL base for the current page.

The base is either specified through a base tag or is the current URL.

This method is specific to WWW::Mechanize::FireFox

`$mech->content_type`

Returns the content type of the currently loaded document

`$mech->title`

Returns the current document title.

`$mech->links`

Returns all links in the document.

Currently accepts no parameters.

`$mech->find_link_dom OPTIONS`

A method to find links, like WWW::Mechanize's ->find_link method.

Returns the DOM object as MozRepl::RemoteObject::Instance.

The supported options are:

text - the text of the link

id - the id attribute of the link

name - the name attribute of the link

url - the URL attribute of the link (href, src or content).

class - the class attribute of the link

n - the (1-based) index. Defaults to returning the first link.

single - If true, ensure that only one element is found.

The method croaks if no link is found. If the single option is true, it also croaks when more than one link is found.
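Since the method croaks instead of returning nothing, callers that only want to probe for a link may wish to trap the exception. A small sketch using the text and single options; the helper name unique_link_or_undef is hypothetical and $mech is assumed to be a connected WWW::Mechanize::FireFox object.

```perl
use strict;
use warnings;

# Hypothetical helper: return the single link with the given text,
# or undef if no link or more than one link matches.
sub unique_link_or_undef {
    my ($mech, $text) = @_;
    # find_link_dom croaks on failure, so trap that with eval.
    my $link = eval { $mech->find_link_dom( text => $text, single => 1 ) };
    return $link;
}
```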

`$mech->find_link OPTIONS`

A method quite similar to WWW::Mechanize's method.

Returns a WWW::Mechanize::Link object.

`$mech->find_all_links OPTIONS`

Finds all links in the document.

Returns them as list or an array reference, depending on context.

`$mech->find_all_links_dom OPTIONS`

Finds all matching linky DOM nodes in the document.

Returns them as list or an array reference, depending on context.

`$mech->click`

Has the effect of clicking a button on the current form. The first argument is the name of the button to be clicked. The second and third arguments (optional) allow you to specify the (x,y) coordinates of the click.

If there is only one button on the form, $mech->click() with no arguments simply clicks that one button.

Returns a HTTP::Response object.

`$mech->follow_link`

Follows the given link. Takes the same parameters that find_link uses.

`$mech->set_visible @values`

This method sets fields of the current form without having to know their names. So if you have a login screen that wants a username and password, you do not have to fetch the form and inspect the source (or use the mech-dump utility, installed with WWW::Mechanize) to see what the field names are; you can just say

  $mech->set_visible( $username, $password );

and the first and second fields will be set accordingly. The method is called set_visible because it acts only on visible fields; hidden form inputs are not considered.

The specifiers that are possible in WWW::Mechanize are not yet supported.

`$mech->value NAME [, VALUE] [,PRE EVENTS] [,POST EVENTS]`

Sets the field with the name to the given value. Returns the value.

Note that this uses the name attribute of the HTML, not the id attribute.

By passing the array reference PRE EVENTS, you can indicate which Javascript events you want to be triggered before setting the value. POST EVENTS contains the events you want to be triggered after setting the value.

By default, the events set in the constructor for pre_events and post_events are triggered.

Set a value without triggering events

  $mech->value( 'myfield', 'myvalue', [], [] );

`$mech->clickables`

Returns all clickable elements, that is, all elements with an onclick attribute.

`$mech->xpath QUERY, %options`

Runs an XPath query in FireFox against the current document.

The options allow the following keys:

node - node relative to which the code is to be executed

Returns the matched nodes.

This is a method that is not implemented in WWW::Mechanize.

In the long run, this should go into a general plugin for WWW::Mechanize.
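The node option of ->xpath makes it possible to run a second query relative to each result of a first one. The following sketch assumes that ->xpath returns its matches as a list in list context, which the text above does not state explicitly; the helper name cells_by_row and the queries are illustrative.

```perl
use strict;
use warnings;

# Hypothetical helper: for every row matched by $row_query, collect its td cells.
# $mech is assumed to be a connected WWW::Mechanize::FireFox object.
sub cells_by_row {
    my ($mech, $row_query) = @_;
    my @rows;
    for my $row ( $mech->xpath($row_query) ) {
        # './td' is evaluated relative to $row because of the node option.
        push @rows, [ $mech->xpath( './td', node => $row ) ];
    }
    return @rows;
}
```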

`$mech->selector css_selector, %options`

Returns all nodes matching the given CSS selector.

In the long run, this should go into a general plugin for WWW::Mechanize.

`$mech->cookies`

Returns a HTTP::Cookies object that was initialized from the live FireFox instance.

Note: neither ->set_cookie nor saving the cookie jar is implemented yet.
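The returned jar can be inspected read-only with the standard HTTP::Cookies ->scan method. A sketch that lists the cookie names for one domain; the helper name cookie_names_for is hypothetical and $mech is assumed to be a connected WWW::Mechanize::FireFox object.

```perl
use strict;
use warnings;

# Hypothetical helper: names of all cookies stored for exactly $domain.
sub cookie_names_for {
    my ($mech, $domain) = @_;
    my @names;
    # HTTP::Cookies->scan calls the callback once per cookie; the second
    # argument is the cookie name and the fifth is its domain.
    $mech->cookies->scan( sub {
        my ($version, $key, $val, $path, $cookie_domain) = @_;
        push @names, $key if $cookie_domain eq $domain;
    } );
    my @sorted = sort @names;
    return @sorted;
}
```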

`$mech->content_as_png [TAB, COORDINATES]`

Returns the given tab or the current page rendered as PNG image.

This is specific to WWW::Mechanize::FireFox.

Currently, the data transfer between FireFox and Perl is done Base64-encoded. It would be beneficial to find what's necessary to make JSON handle binary data more gracefully.

If the coordinates are given, that rectangle will be cut out. The coordinates should be a hash with the four usual entries, left,top,width,height.

Save the top left corner of the current page as PNG

  my $rect = {
    left  =>    0,
    top   =>    0,
    width  => 200,
    height => 200,
  };
  my $png = $mech->content_as_png(undef, $rect);
  open my $fh, '>', 'page.png'
      or die "Couldn't save to 'page.png': $!";
  binmode $fh;
  print {$fh} $png;
  close $fh;

`$mech->element_as_png $element`

Returns PNG image data for a single element.
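Saving the image data for one element works the same way as for a whole page. A minimal sketch; the helper name save_element_png is hypothetical, $mech is assumed to be a connected WWW::Mechanize::FireFox object, and $element a DOM node obtained for example through ->xpath or ->selector.

```perl
use strict;
use warnings;

# Hypothetical helper: render one element and write the PNG data to $file.
# Returns the number of bytes written.
sub save_element_png {
    my ($mech, $element, $file) = @_;
    my $png = $mech->element_as_png($element);
    open my $fh, '>', $file
        or die "Couldn't save to '$file': $!";
    binmode $fh;                       # PNG is binary data
    print {$fh} $png;
    close $fh
        or die "Couldn't close '$file': $!";
    return length $png;
}
```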

`$mech->element_coordinates $element`

Returns the page-coordinates of the $element in pixels as a hash with four entries, left, top, width and height.

This function might get moved into another module more geared towards rendering HTML.

`$mech->highlight_node NODES`

Convenience method that marks all nodes in the arguments with

  background: red;
  border: solid black 1px;
  display: block; /* if the element was display: none before */

This is convenient if you need visual verification that you've got the right nodes.

There currently is no way to restore the nodes to their original visual state except reloading the page.
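For visual verification, the result of a CSS query can be passed straight on to ->highlight_node. This sketch assumes that ->selector returns its matches as a list in list context; the helper name highlight_matches is hypothetical and $mech is assumed to be a connected WWW::Mechanize::FireFox object.

```perl
use strict;
use warnings;

# Hypothetical helper: highlight everything matching a CSS selector
# and return the number of nodes that were marked.
sub highlight_matches {
    my ($mech, $css) = @_;
    my @nodes = $mech->selector($css);
    $mech->highlight_node(@nodes) if @nodes;
    return scalar @nodes;
}
```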

`$mech->allow OPTIONS`

Enables or disables browser features for the current tab. The following options are recognized:

plugins - Whether to allow plugin execution.

javascript - Whether to allow Javascript execution.

metaredirects - Attribute stating if refresh based redirects can be allowed.

frames, subframes - Attribute stating if it should allow subframes (framesets/iframes) or not.

images - Attribute stating whether or not images should be loaded.

Options not listed remain unchanged.

Disable Javascript

  $mech->allow( javascript => 0 );

`$mech->eval_in_page STR`

Evaluates the given Javascript fragment in the context of the web page. Returns a pair of value and Javascript type.

This allows access to variables and functions declared "globally" on the web page.

The returned result needs to be treated with extreme care because it might lead to Javascript execution in the context of your application instead of the context of the webpage. This should be evident for functions and complex data structures like objects. When working with results from untrusted sources, you can only safely use simple types like string.

This method is special to WWW::Mechanize::FireFox.

Also, using this method opens a potential security risk.
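In line with the warning above, a cautious caller can refuse anything that is not a plain string. The sketch below assumes that the second return value is the Javascript typeof name, such as 'string'; this is an assumption, not something the text above confirms. The helper name page_string is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: evaluate $js in the page and return the result
# only if it is a plain string; return undef for everything else.
# $mech is assumed to be a connected WWW::Mechanize::FireFox object.
sub page_string {
    my ($mech, $js) = @_;
    my ($value, $type) = $mech->eval_in_page($js);
    # Assumption: $type holds the Javascript typeof name.
    return unless defined $type && $type eq 'string';
    return $value;
}
```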

`$mech->unsafe_page_property_access ELEMENT`

Allows you unsafe access to properties of the current page. Using such properties is an incredibly bad idea.

This is why the function dies. If you really want to use this function, edit the source code.

COOKIE HANDLING

Firefox cookies are read through HTTP::Cookies::MozRepl. This is currently relatively slow.

INCOMPATIBILITIES WITH WWW::Mechanize

As this module is in a very early stage of development, there are many incompatibilities. The main thing is that only the most needed WWW::Mechanize methods have been implemented by me so far.

Link attributes

In FireFox, the name attribute always seems to be present on links, even if it's empty. This differs from WWW::Mechanize, where the name attribute can be undef.

Unsupported Methods

->find_all_inputs

This function is likely best implemented through $mech->selector.

->find_all_submits

This function is likely best implemented through $mech->selector.

->images

This function is likely best implemented through $mech->selector.

->find_image

This function is likely best implemented through $mech->selector.

->find_all_images

This function is likely best implemented through $mech->selector.

->forms

This function is likely best implemented through $mech->selector.

->form_number

This function is likely best implemented through $mech->xpath.

->form_name

This function is likely best implemented through $mech->selector.

->form_id

This one certainly would be easier done by $mech->xpath

->form_with_fields

->field

->select

->set_fields

This is basically a loop over $mech->value.

->tick

->untick

->click

->submit
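Until these methods exist, the suggestions in the list above can be followed by hand. The sketch below shows ->set_fields as a loop over ->value and an ->images stand-in built on ->selector. Both helper names are hypothetical, $mech is assumed to be a connected WWW::Mechanize::FireFox object, and the second helper returns whatever ->selector returns rather than the image objects WWW::Mechanize would provide.

```perl
use strict;
use warnings;

# Hypothetical stand-in for ->set_fields: a loop over ->value.
sub set_fields_via_value {
    my ($mech, %fields) = @_;
    $mech->value( $_, $fields{$_} ) for sort keys %fields;
    return;
}

# Hypothetical stand-in for ->images: all img nodes via a CSS selector.
sub all_image_nodes {
    my ($mech) = @_;
    return $mech->selector('img');
}
```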

Functions that will likely never be implemented

These functions are unlikely to be implemented because they make little sense in the context of FireFox.

->add_header

->delete_header

->clone

->credentials( $username, $password )

->get_basic_credentials( $realm, $uri, $isproxy )

->clear_credentials()

->put

I have no use for it

->post

I have no use for it

TODO

Implement autodie

Implement "reuse tab if exists, otherwise create new"

Rip out parts of Test::HTML::Content and graft them onto the links() and find_link() methods here. FireFox is a conveniently unified XPath engine.

Preferably, there should be a common API between the two.

Spin off XPath queries (->xpath) and CSS selectors (->selector) into their own Mechanize plugin(s).

REPOSITORY

The public repository of this module is http://github.com/Corion/www-mechanize-firefox.

AUTHOR

Max Maischein corion@cpan.org

COPYRIGHT (c)

LICENSE

This module is released under the same terms as Perl itself.

