The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Mojo::DOM::Role::Analyzer - miscellaneous methods for analyzing a DOM

SYNOPSIS

  use strict;
  use warnings;
  use Mojo::DOM;

  my $html = '<html><head></head><body><p class="first">A paragraph.</p><p class="last">boo<a>blah<span>kdj</span></a></p><h1>hi</h1></body></html>';
  my $analyzer = Mojo::DOM->with_roles('+Analyzer')->new($html);

  # return the number of elements inside a dom object
  my $count = $analyzer->at('body')->element_count;

  # get the smallest containing dom object that contains all the paragraph tags
  my $containing_dom = $analyzer->common_ancestor('p');

  # compare DOM objects to see which comes first in the document
  my $tag1 = $analyzer->at('p.first');
  my $tag2 = $analyzer->at('p.last');
  my $result = $analyzer->compare($tag1, $tag2);

  # ALTERNATIVELY

  $analyzer->at('p.first')->compare('p.last');    # 'p.last' is relative to root

  # get the depth level of a dom object relative to root
  # root node returns '1'
  my $depth = $analyzer->at('p.first')->depth;

  # get the deepest depth of the documented
  my $deepest = $analyzer->deepest;

  # SEE DESCRIPTION BELOW FOR MORE METHODS

DESCRIPTION

Operators

cmp

  my $result = $dom1 cmp $dom2;

Compares the selectors of two $dom objects to determine which comes first in the dom. See compare method below for return values.

Methods

closest_down

  my $closest_down_dom = $dom->at('h1')->closest_down('p');

Returns the node closest to the tag node of interest by searching downward through the DOM.

Note that "closest" is defined as the node highest in the DOM that is still below the tag node of interest (or, in the case of closeest_up lowest in the DOM but still above the tag node of interest), not by the shortest distance (number of "hops") to the other node.

For example, in the code below, the <h1> tag containing "Heading 1" is five hops away from the <p> tag, while the other <h1> tag is only two hops away. But despite being more hops away, the <h1> tag containing "Header 1" is considered to be closer.

    <p>Paragraph</p>
    <div><div><div><div><h1>Heading 1</h1></div></div></div></div>
    <h1>Heading 2</h2>

closest_up

  my $closest_up_dom = $dom->at('p')->closest_up('h1');

Returns the node closest to the tag node of interest by searching upward through the DOM.

See the closest_down method for the meaning of the "closest" node and how it is calculated.

common

$dom->at($tag1)->common($tag2)

$dom->common($dom1, $dom2)

$dom->common($selector_str1, $selector_str2)

$dom->at($tag1)->common

  # Find the common ancestor for two nodes
  my $common $dom->at('div.bar')->common('div.foo');    # 'div.foo' is relative to root

  # OR

  # Pass in two $dom objects
  my $dom1 = $dom->at('div.bar');
  my $dom2 = $dom->at('div.foo');
  my $common = $dom->common($dom1, $dom2);

  # OR

  # Pass in two selectors
  my $common = $dom->common($dom->at('p')->selector, $dom->at('h1')->selector);

  # OR

  # Find the common ancestor for all paragraph nodes with class "foo"
  # This syntax is a wrapper for the Mojo::Collection::Role::Extra->common method
  my $common = $dom->at('p.foo')->common;

Returns the lowest common ancestor node between two nodes or between a node and a group of nodes sharing the same selector.

See "common" in Mojo::Collection::Role::Extra for a similar method that invoked on Mojo::Collection objects.

compare

$dom->at($tag1)->compare($tag2)

compare($dom1, $dom2)

$dom1 cmp $dom2

  $dom->at('p.first')->compare('p.last');    # 'p.last' is relative to root

  # OR

  my $dom1 = $dom->at('p.first');
  my $dom2 = $dom->at('p.last');
  my $result = $dom->compare($dom1, $dom2);

  # OR with overloaded 'cmp' operator

  my $result = $dom1 cmp $dom2;

Compares the selectors of two $dom objects to see which comes first in the DOM.

  • Returns a value of '-1' if the first argument comes before (is less than) the second.

  • Returns a value of '0' if the first and second arguments are the same.

  • Returns a value of '1' if the first argument comes after (is greater than) the second.

deepest

  my $deepest_depth = $dom->deepest;

Finds the deeepest nested level within a node.

depth

  my $depth = $dom->at('p.first')->depth;

Finds the nested depth level of a node. The root node returns 1.

distance

$dom->at($selector)->distance($selector)

$dom->at($selector)->distance($dom)

$dom->distance($dom1, $dom2)

Returns the distance, aka number of "hops," between two nodes.

The value is calculated by first finding the lowest common ancestor node for the two nodes and then getting the distance between the lowest common ancestor node and each of the two nodes. The two distances are then added togethr to determine the total distance between the two nodes.

element_count

  $count = $dom->element_count;

Returns the number of elements in a dom object, including children of children of children, etc.

is_ancestor_to

  $is_ancestor = $s->at('h1')->is_ancestor_to('p.foo');

Returns true if a node is an ancestor to another node, false otherwise.

tag_analysis

  @enclosing_tags = $dom->tag_analysis('p');

Searches through a DOM for tag nodes that enclose tags matching the given selector (see common_ancestor method) and returns an array of hash references with the following information for each of the enclosing nodes:

  {
    "all_tags_have_same_depth" => 1,   # whether enclosed tags within the enclosing node have the same depth
    "avg_tag_depth" => 8,              # average depth of the enclosed tags
    "selector" => "body:nth-child(2)", # the selector for the enclosing tag
    "size" => 1                        # total number of tags of interest that are descendants of the enclosing tag
  }

VERSION

version 0.015

SUPPORT

Perldoc

You can find documentation for this module with the perldoc command.

  perldoc Mojo::DOM::Role::Analyzer

Websites

The following websites have more information about this module, and may be of help to you. As always, in addition to those websites please use your favorite search engine to discover more resources.

Source Code

The code is open to the world, and available for you to hack on. Please feel free to browse it and play with it, or whatever. If you want to contribute patches, please send me a diff or prod me to pull from your repository :)

https://github.com/sdondley/Mojo-DOM-Role-Analyzer

  git clone git://github.com/sdondley/Mojo-DOM-Role-Analyzer.git

SEE ALSO

Mojo::DOM Mojo::Collection::Role::Extra

AUTHOR

Steve Dondley <s@dondley.com>

COPYRIGHT AND LICENSE

This software is copyright (c) 2020 by Steve Dondley.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.