DocSet::Doc - A Base Document Class
  # e.g. a subclass would do
  use DocSet::Doc::HTML2HTML ();
  my $doc = DocSet::Doc::HTML2HTML->new(%args);
  $doc->scan();
  my $meta = $doc->meta();
  my $toc  = $doc->toc();
  $doc->render();

  # internal methods
  $doc->src_read();
  $doc->src_filter();
This superclass implements the core methods for scanning a single document of a given format and rendering it into another format. It provides sub-classes with hooks that can change the default behavior. Note that this class cannot be used as is: you have to subclass it and implement the required methods listed later.
scan() scans the document into a parsed tree and retrieves its meta and toc data if possible.
render() renders the output document and writes it to its final destination.
src_read() fetches the source of the document. The source can be read from different media, e.g. file://, http://, a relational DB or OCR :) (but these are left for subclasses to implement :)
A subclass may implement a "source" filter. For example, if the source document is written in an extended POD, the source filter may convert it into standard POD. If the source includes template directives, these can be pre-processed as well.
The document's content leaves this class ready to be parsed and converted into other formats.
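The idea of a source filter can be sketched in a few lines of standalone Perl. This is only an illustration, not the DocSet implementation: the function name filter_source, the "=section" directive and the "[% name %]" placeholder syntax are all assumptions made up for the example.

```perl
use strict;
use warnings;

# Hypothetical source filter. The "=section" directive and the
# "[% name %]" template syntax are invented for this sketch.
sub filter_source {
    my ($src, $vars) = @_;

    # Rewrite the made-up extended directive as standard POD.
    $src =~ s/^=section\b/=head1/mg;

    # Expand template placeholders found in %$vars; unknown
    # placeholders are left untouched.
    $src =~ s/(\[%\s*(\w+)\s*%\])/exists $vars->{$2} ? $vars->{$2} : $1/ge;

    return $src;
}

my $filtered = filter_source(
    "=section Intro\n\nVersion [% version %] by [% author %]\n",
    { version => '1.0' },
);
print $filtered;
```

A subclass would presumably apply a transformation of this kind to the source it has just read, before parsing starts.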
meta() is a simple get/set accessor for the meta attribute.
toc() is a simple get/set accessor for the toc attribute.
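For readers unfamiliar with the combined get/set idiom, a minimal sketch follows. My::Doc is a hypothetical stand-in class, not DocSet::Doc itself, and the real accessors may be written differently.

```perl
use strict;
use warnings;

package My::Doc;    # hypothetical class used only for this sketch

sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}

# Sets the attribute when called with an argument and always
# returns the current value.
sub meta {
    my $self = shift;
    $self->{meta} = shift if @_;
    return $self->{meta};
}

sub toc {
    my $self = shift;
    $self->{toc} = shift if @_;
    return $self->{toc};
}

package main;

my $doc = My::Doc->new;
$doc->meta({ title => 'Example' });    # set
print $doc->meta->{title}, "\n";       # get
```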
my $doc_src_path = $self->transform_src_doc($path);
Search for the source document with the path $path in the search paths defined by the configuration file's search_paths attribute (similar to the @INC search in Perl) and, if found, resolve it to a path relative to abs_doc_root and return it. If not found return the
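The @INC-style lookup described above can be sketched as follows. This is a hedged illustration rather than the actual transform_src_doc() code: find_src_doc is a hypothetical helper, the directory layout is created only for the demonstration, and returning undef when nothing is found is an assumption of this sketch.

```perl
use strict;
use warnings;
use File::Spec;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Hypothetical helper: try each search directory in order and return
# the first existing match as a path relative to $abs_doc_root.
sub find_src_doc {
    my ($path, $search_paths, $abs_doc_root) = @_;
    for my $dir (@$search_paths) {
        my $candidate = File::Spec->catfile($dir, $path);
        return File::Spec->abs2rel($candidate, $abs_doc_root)
            if -e $candidate;
    }
    return undef;    # assumption: signal "not found" with undef
}

# Demonstration layout in a temporary directory.
my $root    = tempdir(CLEANUP => 1);
my $lib_dir = File::Spec->catdir($root, 'lib');
my $doc_dir = File::Spec->catdir($root, 'src', 'docs');
make_path($lib_dir, $doc_dir);

open my $fh, '>', File::Spec->catfile($doc_dir, 'intro.pod')
    or die "cannot create test file: $!";
close $fh;

my $found = find_src_doc('intro.pod', [ $lib_dir, $doc_dir ], $root);
print defined $found ? "$found\n" : "not found\n";
```

As with @INC, the order of the search directories matters: the first directory that contains the file wins.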
These methods must be implemented by the sub-classes:
Retrieve the meta data that describes the input document and store it in the meta object attribute. Different document formats may provide different meta information. The only required meta field is title.
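As a rough sketch of what such a method has to produce, the following standalone function pulls a title out of an HTML source and refuses documents without one. The name extract_meta and the regex-based extraction are assumptions for illustration; a real subclass would more likely use a proper parser for its input format.

```perl
use strict;
use warnings;

# Hypothetical extraction step for an HTML source. Only the required
# "title" field is collected here.
sub extract_meta {
    my ($html) = @_;
    my %meta;

    # Capture the text between <title> and </title>, trimming whitespace.
    ($meta{title}) = $html =~ m{<title>\s*(.*?)\s*</title>}is;

    die "document has no title\n"
        unless defined $meta{title} && length $meta{title};

    return \%meta;
}

my $meta = extract_meta(
    "<html><head><title> My Doc </title></head><body></body></html>"
);
print $meta->{title}, "\n";
```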
These methods can be implemented by the sub-classes:
A subclass may want to preprocess the source document before it is processed further. This method is called after the source has been read. By default nothing happens.
Stas Bekman <stas (at) stason.org>