NAME

WWW::Crawl4AI::Result - normalized result of a WWW::Crawl4AI strategy chain

VERSION

version 0.001

SYNOPSIS

my $result = $crawler->markdown('https://example.com');

if ( $result->ok ) {
  print $result->markdown;
  print $result->backend;     # which strategy won
  print $result->cost_class;  # cheap / browser / stealth / paid
}
else {
  warn $result->why_failed;   # e.g. 'bot_wall_detected'
  print $result->attempts_json;
}

DESCRIPTION

The single, uniform object that every WWW::Crawl4AI crawl returns, regardless of which backend won or whether all of them failed. It carries the winning content plus the full WWW::Crawl4AI::Attempt history, so callers (especially agents) can see why a backend was chosen or why everything failed.

ok

True if at least one strategy in the chain produced a usable page.

url

The original requested URL.

final_url

The URL after redirects, when known.

status

HTTP status code of the winning (or last) attempt.

markdown

Markdown content of the winning page.

html

Raw HTML of the winning page.

title

Page title of the winning page.

backend

Name of the strategy that won, e.g. crawl4ai_stealth.

cost_class

Cost tier of the winning backend: cheap, browser, stealth, paid.
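A caller that wants to flag expensive fetches could rank the four tiers. The sketch below assumes the tiers are listed from cheapest to most expensive, which the documentation suggests but does not state. %TIER_RANK and exceeds_budget are hypothetical names, not part of WWW::Crawl4AI.

```perl
use strict;
use warnings;

# Assumed ordering: cheap < browser < stealth < paid.
my %TIER_RANK = ( cheap => 0, browser => 1, stealth => 2, paid => 3 );

# Returns 1 when $cost_class is more expensive than $budget, else 0.
# Undefined or unknown tiers count as over budget so they get noticed.
sub exceeds_budget {
    my ( $cost_class, $budget ) = @_;
    die "unknown budget tier '$budget'\n" unless exists $TIER_RANK{$budget};
    return 1 unless defined $cost_class && exists $TIER_RANK{$cost_class};
    return $TIER_RANK{$cost_class} > $TIER_RANK{$budget} ? 1 : 0;
}

# With a real result: exceeds_budget( $result->cost_class, 'browser' )
print exceeds_budget( 'stealth', 'browser' ) ? "over budget\n" : "within budget\n";
```

Treating an unknown tier as over budget is a deliberate choice here: a tier added in a later release would then surface in logs rather than pass silently.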

signals

The detection signals for the winning (or last) page, as described under "signals" in WWW::Crawl4AI::Detect.

why_failed

When ok is false, the failure token of the last attempt.

error

A WWW::Crawl4AI::Error object or a plain string, set when the chain failed outright.

attempts

Arrayref of WWW::Crawl4AI::Attempt objects, in execution order.

links

The links Crawl4AI extracted from the winning page, as { internal => [...], external => [...] }. Each entry is a hashref with href, text and title. For just the URLs, use "urls".
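The structure described above can be walked like any nested Perl data structure. In this sketch, $links is hand-written sample data in the documented shape, standing in for $result->links; the hrefs and texts are placeholders.

```perl
use strict;
use warnings;

# Sample data in the documented shape; a real value would come from $result->links.
my $links = {
    internal => [
        { href => '/about', text => 'About', title => 'About us' },
        { href => '/blog',  text => 'Blog',  title => '' },
    ],
    external => [
        { href => 'https://example.org/', text => 'Partner', title => undef },
    ],
};

# Build one "kind<TAB>href<TAB>text" line per link, internal links first.
my @lines;
for my $kind (qw(internal external)) {
    for my $link ( @{ $links->{$kind} || [] } ) {
        my $text = defined $link->{text} ? $link->{text} : '';
        push @lines, join "\t", $kind, $link->{href}, $text;
    }
}

print "$_\n" for @lines;
```

The `|| []` fallback guards against a missing key, in case a page has no links of one kind.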

from_attempt

WWW::Crawl4AI::Result->from_attempt($attempt, attempts => \@all)

Builds a result from a winning attempt, copying its page content over.

attempt_count

Number of attempts made.

internal_links

Arrayref of the winning page's same-site links (each { href, text, title }).

external_links

Arrayref of the winning page's off-site links (each { href, text, title }).

urls

The deduplicated, absolute http/https URLs found on the winning page, internal links first then external. Relative hrefs are resolved against "final_url"; javascript:, mailto:, tel:, data: and bare anchors are dropped. This is the list to feed back into a crawl to go deeper.
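One way to feed "urls" back into a crawl is a bounded breadth-first loop. This is a sketch, not part of the module: crawl_breadth_first is a hypothetical helper, and it assumes $crawler->markdown($url) returns a result object as in the SYNOPSIS. Because the documentation does not say whether "urls" returns a list or an arrayref, the helper accepts either. The Stub::Crawler and Stub::Result packages exist only to make the example runnable offline.

```perl
use strict;
use warnings;

# Hypothetical helper: visit at most $max_pages pages, starting at $start.
sub crawl_breadth_first {
    my ( $crawler, $start, $max_pages ) = @_;
    my @queue = ($start);
    my %seen  = ( $start => 1 );
    my @visited;

    while ( @queue && @visited < $max_pages ) {
        my $url    = shift @queue;
        my $result = $crawler->markdown($url);
        push @visited, $url;
        next unless $result->ok;

        # Accept either a list or a single arrayref (the return shape is an assumption).
        my @found = $result->urls;
        @found = @{ $found[0] } if @found == 1 && ref $found[0] eq 'ARRAY';

        for my $next (@found) {
            next if $seen{$next}++;    # skip URLs already queued or visited
            push @queue, $next;
        }
    }
    return @visited;
}

# Offline stand-ins for the real crawler and result objects.
{
    package Stub::Result;
    sub new  { my ( $class, %args ) = @_; return bless { %args }, $class }
    sub ok   { $_[0]{ok} }
    sub urls { $_[0]{urls} }    # returns an arrayref

    package Stub::Crawler;
    my %SITE = (
        'https://example.com/'  => [ 'https://example.com/a', 'https://example.com/b' ],
        'https://example.com/a' => [ 'https://example.com/',  'https://example.com/c' ],
    );
    sub new { return bless {}, shift }
    sub markdown {
        my ( $self, $url ) = @_;
        return Stub::Result->new( ok => 1, urls => $SITE{$url} || [] );
    }
}

my @visited = crawl_breadth_first( Stub::Crawler->new, 'https://example.com/', 10 );
print "$_\n" for @visited;
```

The %seen hash keeps the loop from revisiting pages that link back to each other, and $max_pages bounds the total number of fetches.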

to_hash

TO_JSON

JSON-safe plain-hash view, including every attempt via "to_hash" in WWW::Crawl4AI::Attempt.

attempts_json

The attempt history encoded as a JSON string.
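Since "attempts_json" returns a JSON string, it can be decoded with any JSON module, for example the core JSON::PP. The sample string below is invented for illustration. It assumes the string encodes an array with one object per attempt; the keys backend and why_failed are guesses modelled on the Result accessors, and plain_http is a made-up backend name. The real per-attempt fields are whatever "to_hash" in WWW::Crawl4AI::Attempt produces.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Invented sample; a real string would come from $result->attempts_json.
my $attempts_json = '[{"backend":"plain_http","why_failed":"bot_wall_detected"},'
                  . '{"backend":"crawl4ai_stealth","why_failed":null}]';

my $attempts = decode_json($attempts_json);

# Summarise each attempt on one line, in execution order.
my @summary = map {
    my $reason = defined $_->{why_failed} ? $_->{why_failed} : 'ok';
    "$_->{backend}: $reason";
} @$attempts;

print "$_\n" for @summary;
```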

SUPPORT

Issues

Please report bugs and feature requests on GitHub at https://github.com/Getty/p5-www-crawl4ai/issues.

CONTRIBUTING

Contributions are welcome! Please fork the repository and submit a pull request.

AUTHOR

Torsten Raudssus <torsten@raudssus.de> https://raudss.us/

COPYRIGHT AND LICENSE

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.
