HTML::ExtractContent - An HTML content extractor with scoring heuristics

HTML::ExtractContent is a module for extracting content from HTML using scoring heuristics. It guesses which block of HTML looks like content by scoring each block on the number of punctuation marks and the length of its non-tag text. It also gu...
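The scoring idea in the description above can be illustrated with a minimal Python sketch. This is not the module's actual algorithm or API: the function names `score_block` and `best_block`, the set of block-level tags, the punctuation set, and the weight of 10 per punctuation mark are all assumptions chosen for illustration.

```python
import re

# Hypothetical choices for this sketch, not taken from HTML::ExtractContent.
TAG_RE = re.compile(r"<[^>]+>")
BLOCK_SPLIT_RE = re.compile(r"</?(?:div|td|p|section|article)[^>]*>", re.IGNORECASE)
PUNCT_RE = re.compile(r"[.,!?;:]")
PUNCT_WEIGHT = 10  # assumed weight per punctuation mark


def plain_text(fragment):
    """Strip tags and collapse whitespace to get the non-tag text."""
    return " ".join(TAG_RE.sub("", fragment).split())


def score_block(block_html):
    """Score a block by non-tag text length plus weighted punctuation count."""
    text = plain_text(block_html)
    return len(text) + PUNCT_WEIGHT * len(PUNCT_RE.findall(text))


def best_block(html):
    """Split on block-level tags and return the text of the top-scoring block."""
    blocks = [b for b in BLOCK_SPLIT_RE.split(html) if b.strip()]
    if not blocks:
        return ""
    return plain_text(max(blocks, key=score_block))


if __name__ == "__main__":
    page = (
        '<div><a href="/">Home</a> <a href="/about">About</a></div>'
        "<div><p>This is the article. It has sentences, commas, and more text.</p></div>"
    )
    print(best_block(page))
```

Link-heavy navigation blocks tend to contain short text with little punctuation, so they score low, while prose paragraphs score high on both measures.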

TARAO/HTML-ExtractContent-0.12 - 30 Nov 2015 08:32:54 GMT

HTML::ExtractMain - Extract the main content of a web page

ANIRVAN/HTML-ExtractMain-0.63 - 19 May 2013 15:39:27 GMT
