NAME

xpathify - output HTML document as a flat XPath/content list

VERSION

version 0.019

SYNOPSIS

    xpathify [options] (HTML file | URL | -)

DESCRIPTION

Represents a typical HTML document in a very verbose two-column format. The first column is an XPath expression that locates each element inside the HTML tree. The second column is the corresponding content (if any).

    /html/head/title/text() test 1
    /html/body/h1/text()    test 2
    /html/body/p[1]/text()  Lorem ipsum dolor sit amet, consectetur adipiscing elit.
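
To illustrate the idea behind the two-column output, here is a minimal Python sketch. It is not how xpathify itself is implemented: it assumes well-formed markup that the standard library XML parser can read, ignores text that follows a child element, and the names `flat_xpath` and `DOC` are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def flat_xpath(root):
    """Return (xpath, text) pairs for each element that has leading text."""
    rows = []

    def walk(node, path):
        # Collapse whitespace; skip elements with no leading text.
        text = " ".join((node.text or "").split())
        if text:
            rows.append((f"{path}/text()", text))
        # Add a positional index only when a tag repeats among siblings.
        totals = Counter(child.tag for child in node)
        seen = Counter()
        for child in node:
            seen[child.tag] += 1
            step = child.tag
            if totals[child.tag] > 1:
                step += f"[{seen[child.tag]}]"
            walk(child, f"{path}/{step}")

    walk(root, "/" + root.tag)
    return rows


# Hypothetical sample document, not taken from the xpathify test suite.
DOC = """<html><head><title>test 1</title></head>
<body><h1>test 2</h1><p>first</p><p>second</p></body></html>"""

if __name__ == "__main__":
    for xpath, text in flat_xpath(ET.fromstring(DOC)):
        print(f"{xpath}\t{text}")
```

Real-world HTML is rarely well-formed XML, so a tolerant HTML parser would be needed in practice.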

OPTIONS

--help

Print this help text.

--encoding=name

Specify the HTML document encoding (latin1, utf8). UTF-8 is assumed by default.

--[no]color

Enable syntax highlighting for XPath. By default, it is enabled automatically on interactive terminals.

--16

Use the 16 system colors. By default, the 256-color ANSI palette is tried first.

--[no]html

Disable the --color option and highlight using HTML/CSS instead.

--[no]shrink

Shrink the XPath to the minimal unique identifier. For example:

    /html/body[@id='cpansearch']/form[@class='searchbox']/input[@name='query']

could be shortened to:

    //input[@name='query']

The shrinking is enabled by default.
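
One way to picture shrinking is to try ever longer suffixes of the full path until one matches exactly one element. The Python sketch below shows that approach only; the algorithm xpathify actually uses is not described here and may differ. The function `shrink` and the sample document are hypothetical, and the limited XPath support of `xml.etree.ElementTree` stands in for a full XPath engine.

```python
import xml.etree.ElementTree as ET


def shrink(root, steps):
    """Return the shortest '//'-anchored suffix of steps that is unique.

    steps holds the full path as a list of XPath steps, root element first.
    """
    # Try the last step alone, then the last two steps, and so on.
    for start in range(len(steps) - 1, 0, -1):
        candidate = "/".join(steps[start:])
        if len(root.findall(".//" + candidate)) == 1:
            return "//" + candidate
    # No suffix is unique: fall back to the full absolute path.
    return "/" + "/".join(steps)


# Hypothetical sample document mirroring the example above.
DOC = """<html><body id="cpansearch">
<form class="searchbox"><input name="query"/><input name="lucky"/></form>
</body></html>"""

STEPS = [
    "html",
    "body[@id='cpansearch']",
    "form[@class='searchbox']",
    "input[@name='query']",
]

if __name__ == "__main__":
    print(shrink(ET.fromstring(DOC), STEPS))
```

When the last step alone is ambiguous, the loop keeps adding parent steps until the match count drops to one.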

--[no]strict

Strict mode disables grouping by id, class or name attributes. The grouping is enabled by default.

--[no]weight

Print the XPath weight in a second column.

EXAMPLES

    xpathify http://metacpan.org
    curl http://www.msn.com | xpathify -c --strict -
    xpathify --nocolor --noshrink t/test.html

AUTHOR

Stanislaw Pusep <stas@sysd.org>

COPYRIGHT AND LICENSE

This software is copyright (c) 2014 by Stanislaw Pusep.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.