WebSource::Extract - Extract parts of the input
An Extract operator extracts subparts of its input. Several flavors of this operator exist; the main one queries the input using an XPath expression.
Such an operator is described by a DOM node of the following form:
  <ws:extract name="opname" forward-to="ops">
    <path>//an/xpath/expression</path>
  </ws:extract>
The operator queries each input with the expression found in the path sub-element and returns the results found.
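The querying step can be illustrated outside of WebSource. The following is a minimal sketch that assumes XML::LibXML as the XPath engine; the sample document and the helper name extract_by_xpath are made up for illustration, and the code does not reproduce the module's internals.

```perl
use strict;
use warnings;
use XML::LibXML;

# Sample input document (made up for illustration).
my $input = XML::LibXML->load_xml(string => <<'XML');
<catalog>
  <item><title>First</title></item>
  <item><title>Second</title></item>
</catalog>
XML

# Hypothetical helper: apply an XPath expression to a node and return the
# matching nodes, as the operator is described as doing with the content
# of its <path> sub-element.
sub extract_by_xpath {
    my ($node, $xpath) = @_;
    my @found = $node->findnodes($xpath);
    return @found;
}

my @titles = map { $_->textContent } extract_by_xpath($input, '//item/title');
print "$_\n" for @titles;
```

Each matching node is returned as a separate result, so an expression that matches several elements yields several outputs.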
To use a different flavor of the Extract operator (for example xslt), add a type attribute to the ws:extract element. The parameters (sub-elements of ws:extract) depend on the type of operator used.
Each flavor of the Extract operator is implemented by a Perl module named WebSource::Extract::flavor (e.g. WebSource::Extract::xslt). See the corresponding man page for a full description.
Currently available flavors include:
- xslt : apply an XSL stylesheet to the input
- form : extract form data
- regexp : extract data using a regular expression
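The naming convention for flavor modules can be sketched as follows. The helper flavor_module is hypothetical and only illustrates how a type attribute value maps to a module name; falling back to WebSource::Extract when no type is given is an assumption based on the XPath flavor being documented here, not a description of the actual dispatch code.

```perl
use strict;
use warnings;

# Hypothetical helper: map the value of the type attribute to the name of
# the module documented as implementing that flavor.
sub flavor_module {
    my ($type) = @_;
    # No type attribute: assume the default XPath flavor of the base module.
    return 'WebSource::Extract' unless defined $type && length $type;
    return "WebSource::Extract::$type";
}

print flavor_module($_), "\n" for qw(xslt form regexp);
```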
  use WebSource::Extract;
  my $exop = WebSource::Extract->new(wsdnode => $desc);  # $desc: DOM node describing the operator
See also: WebSource, WebSource::Extract::xslt, WebSource::Extract::form, WebSource::Extract::regexp