NAME

OODoc::Parser::Markov - Parser for the MARKOV syntax

INHERITANCE

 OODoc::Parser::Markov
   is a OODoc::Parser
   is a OODoc::Object

SYNOPSIS

DESCRIPTION

The Markov parser is named after the author, because the author likes to invite other people to write their own parsers as well: everyone has not only their own coding style, but also their own documentation wishes.

The task of the parser is to split Perl package files into a code part and a documentation tree. The code is written to a directory where the module distribution is built; the documentation tree is later formatted into manual pages.
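
The split into code and documentation can be pictured with a small stand-alone sketch. It is not the parser's implementation: the function name split_code_and_doc is hypothetical, and the sketch assumes that any line starting with "=" plus a letter opens documentation and that "=cut" closes it.

```perl
use strict;
use warnings;

# Hypothetical helper: separate code lines from documentation lines.
# Assumption: "=<letter>..." opens a doc block, "=cut" closes it.
sub split_code_and_doc {
    my ($text) = @_;
    my (@code, @doc);
    my $in_doc = 0;

    for my $line (split /\n/, $text) {
        if ($line =~ /^=cut\b/) {           # end of a documentation block
            $in_doc = 0;
        }
        elsif ($line =~ /^=[a-zA-Z]/) {     # a tag opens or continues doc
            $in_doc = 1;
            push @doc, $line;
        }
        elsif ($in_doc) { push @doc,  $line }
        else            { push @code, $line }
    }
    return (\@code, \@doc);
}

my $source = join "\n",
    'package Demo;',
    '=chapter NAME',
    'Demo - a hypothetical module',
    '=cut',
    'sub hello { "hello" }',
    '1;';

my ($code, $doc) = split_code_and_doc($source);
print "code: $_\n" for @$code;
print "doc:  $_\n" for @$doc;
```

In the real parser the code part goes to the distribution directory and the doc part is turned into a tree of manual objects; the sketch only shows the separation step.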

OVERLOADED

METHODS

Constructors

OODoc::Parser::Markov->new(OPTIONS)

     Option            Defined in  Default
     additional_rules              []     

    . additional_rules ARRAY

      Reference to an array which contains references to match-action pairs, as accepted by rule(). These rules get preference over the existing rules.

Inheritance knowledge

$obj->extends([OBJECT])

Parsing a file

$obj->currentManual([MANUAL])

    Returns the manual object which is currently being filled with data. With a new MANUAL, a new one is set.

$obj->findMatchingRule(LINE)

    Checks whether this LINE matches one of the registered rules. The rules are evaluated in order. Returned are the matched string and the required action. If the line fails to match anything, an empty list is returned.

    Example:

      if(my($match, $action) = $parser->findMatchingRule($line))
      {  # do something with it
         $action->($parser, $match, $line);
      }

$obj->inDoc([BOOLEAN])

    When a BOOLEAN is specified, the status changes. It returns the current status of the document reader.

$obj->parse(OPTIONS)

     Option        Defined in       Default   
     distribution                   <required>
     input                          <required>
     output                         devnull   
     version                        <required>

    . distribution STRING

    . input FILENAME

    . output FILENAME

    . version STRING

$obj->rule((STRING|REGEX), (METHOD|CODE))

    Register a rule which will be applied to a line in the input file. When a STRING is specified, it must start at the beginning of the line to be selected. You may also specify a regular expression which will match on the line.

    The second argument is the action which will be taken when the line is selected. Either the named METHOD or the CODE reference will be called. Their arguments are:

     $parser->METHOD($match, $line, $file, $linenumber);
     CODE->($parser, $match, $line, $file, $linenumber);
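
    The matching behaviour described for rule() and findMatchingRule() can be sketched without OODoc installed. The names add_rule and find_matching_rule are hypothetical stand-ins, and the actions here receive only the match and the line, not the parser object, file and line number.

```perl
use strict;
use warnings;

my @rules;   # ordered list of [ matcher, action ] pairs

# A plain STRING must be found at the start of the line;
# a compiled regular expression is simply matched against the line.
sub add_rule {
    my ($match, $action) = @_;
    push @rules, [ $match, $action ];
    return;
}

# Returns (matched string, action) for the first matching rule,
# or an empty list when no rule matches.
sub find_matching_rule {
    my ($line) = @_;
    for my $rule (@rules) {
        my ($match, $action) = @$rule;
        if (ref $match eq 'Regexp') {
            next unless $line =~ $match;
            return (substr($line, $-[0], $+[0] - $-[0]), $action);
        }
        return ($match, $action) if index($line, $match) == 0;
    }
    return ();
}

add_rule('=cut'    => sub { my ($match, $line) = @_; "end of documentation" });
add_rule(qr/^=\w+/ => sub { my ($match, $line) = @_; "tag $match" });

for my $line ('=cut', '=chapter NAME', 'plain text') {
    if (my ($match, $action) = find_matching_rule($line)) {
        print "$line: ", $action->($match, $line), "\n";
    }
    else {
        print "$line: no rule\n";
    }
}
```

    Because evaluation is ordered, the string rule for "=cut" wins over the more general regular expression that is registered after it.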

$obj->setBlock(REF-SCALAR)

    Set the scalar where the next documentation lines should be collected in.
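
    The idea of collecting lines through a scalar reference can be sketched as follows. The helpers set_block and add_line are hypothetical and only illustrate the REF-SCALAR mechanism; they are not the parser's own code.

```perl
use strict;
use warnings;

my $block;    # reference to the scalar which is currently being filled

sub set_block { ($block) = @_; return }
sub add_line  { $$block .= "$_[0]\n" if defined $block; return }

my ($description, $example) = ('', '');

set_block(\$description);            # following lines go to the description
add_line('Returns the number of characters.');

set_block(\$example);                # from here on, lines go to the example
add_line('  my $n = count("monkey");');

print "description: $description";
print "example:     $example";
```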

Formatting text pieces

$obj->cleanup(FORMATTER, MANUAL, STRING)

Producing manuals

$obj->cleanupPod(FORMATTER, MANUAL, STRING)

$obj->cleanupPodL(FORMATTER, MANUAL, LINK)

    The L markups for OODoc::Parser::Markov have the same syntax as in standard POD; however, most standard POD translators do not accept links in verbatim blocks. Therefore, in such a case the links have to be translated into their text. The translation itself is done by this method.

$obj->cleanupPodM(FORMATTER, MANUAL, LINK)

$obj->decomposeL(MANUAL, LINK)

    Decompose the L-tags. These tags are described in perlpod, but they will not refer to items: only headers.

$obj->decomposeM(MANUAL, LINK)

Commonly used functions

$obj->cleanupHtml(FORMATTER, MANUAL, STRING, [IS_HTML])

    Some changes will not be made when IS_HTML is true. For instance, a "<" will stay that way and is not translated into "&lt;".
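
    The effect of the IS_HTML flag can be pictured with a minimal sketch. The function escape_text is hypothetical and covers only the escaping of special characters; the real method does more than that.

```perl
use strict;
use warnings;

# Hypothetical helper: escape text for HTML unless it already is HTML.
sub escape_text {
    my ($string, $is_html) = @_;
    return $string if $is_html;      # leave existing HTML untouched

    for ($string) {                  # works on our private copy
        s/&/&amp;/g;
        s/</&lt;/g;
        s/>/&gt;/g;
    }
    return $string;
}

print escape_text('a < b'), "\n";          # plain text gets escaped
print escape_text('<b>bold</b>', 1), "\n"; # HTML is passed through
```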

$obj->cleanupHtmlL(FORMATTER, MANUAL, LINK)

$obj->cleanupHtmlM(FORMATTER, MANUAL, LINK)

$obj->filenameToPackage(FILENAME)

OODoc::Parser::Markov->filenameToPackage(FILENAME)

$obj->mkdirhier(DIRECTORY)

OODoc::Parser::Markov->mkdirhier(DIRECTORY)

Manual Repository

$obj->addManual(MANUAL)

$obj->mainManual(NAME)

$obj->manual(NAME)

$obj->manuals

$obj->manualsForPackage(NAME)

$obj->packageNames

DIAGNOSTICS

Warning: =cut does not terminate any doc in $file line $number

There is no document to end here.

Warning: Debugging remains in $filename line $number

The author's way of debugging is to put warn/die/carp etc. at the first position of a line. Other lines in a method are always indented, which means that these debugging lines are clearly visible. You may simply ignore this warning.
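
A check of this kind could look like the sketch below. The function name looks_like_debugging and the exact keyword list are assumptions; the parser may recognise more keywords.

```perl
use strict;
use warnings;

# Hypothetical check: a debugging statement starts at the first
# position of the line; indented statements are regular code.
sub looks_like_debugging {
    my ($line) = @_;
    return $line =~ /^(?:warn|die|carp)\b/ ? 1 : 0;
}

for my $line ('warn "got here";', '    warn "regular code";') {
    printf "%-28s %s\n", $line,
        looks_like_debugging($line) ? 'debugging remains' : 'ok';
}
```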

Warning: Manual $manual links to unknown entry "$item" in $manual

Error: The formatter type $class is not known for cleanup

Text blocks have to get the finishing touch in the final formatting phase. The parser has to fix the text block segments to create a formatter dependent output. Only a few formatters are predefined.

Warning: You may have accidentally captured code in doc file $fn line $number

Some keywords on the first position of a line are very common for code. However, code within doc should start with a blank to indicate pre-formatted lines. This warning may be false.

Error: cannot read document from $input: $!

The document file can not be processed because it can not be read. Reading is required to be able to build a documentation tree.

Error: chapter $name before package statement in $file line $number

A package file can contain more than one package: more than one name space. The docs are sorted by name space. Therefore, each chapter must be preceded by a package statement in the file to be sure that the correct name space is used.

Error: compilation problems for module $link in $module: $@

If the report is about a syntax error involving 'require', then you may have created a link to a module with a name which is not acceptable to Perl. It is not easy to find the location of that problem.

Error: default for option $name outside subroutine in $file line $number

A default is set, however there is no subroutine in scope (yet). It is plausible that the option does not exist either, but that will be checked later.

Warning: default line incorrect in $file line $number: $line

The shown $line is not in the right format: it should contain at least two words being the option name and the default value.
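
The required format can be illustrated with a small sketch. The function parse_default_line is hypothetical and not part of the parser; it only shows the "at least two words" rule.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (name, value) for a well-formed
# "=default NAME VALUE" line, or an empty list otherwise.
sub parse_default_line {
    my ($line) = @_;
    return $line =~ /^=default\s+(\S+)\s+(.+?)\s*$/ ? ($1, $2) : ();
}

for my $line ("=default charset 'us-ascii'", '=default charset') {
    if (my ($name, $value) = parse_default_line($line)) {
        print "option $name defaults to $value\n";
    }
    else {
        print "default line incorrect: $line\n";
    }
}
```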

Error: diagnostic $type outside subroutine in $file line $number

It is unclear to which subroutine this diagnostic message belongs.

Warning: doc did not end in $input

When the whole $input was parsed, the documentation part was still open. Probably you forgot to terminate it with a =cut.

Warning: empty L link in $manual

Error: example outside chapter in $file line $number

An example can belong to a subroutine, chapter, section, and subsection. Apparently, this example was found before the first chapter started in the file.

Error: manual definition requires manual object

A call to addManual() expects a new manual object (an OODoc::Manual), however an incompatible thing was passed. Usually, a call to manualsForPackage() or mainManual() was intended.

Warning: module $name is not on your system, but linked to in $manual

The module can not be found. This may be an error on your part (usually a typo) or you didn't install the module on purpose. This message will also be produced if some defined package is stored in one file together with another module or when compilation errors are encountered.

Warning: no diagnostic message supplied in $file line $number

The start of a diagnostics message was indicated, however the message itself was not provided on the same line.

Error: no input file to parse specified

The parser needs the name of a file to be read, otherwise it can not work.

Warning: no manual for $package (correct casing?)

The manual for $package cannot be found. If you have a module named this way, this may indicate that the NAME chapter of the manual page in that module differs from the package name. Often, this is a typo in the NAME, probably a difference in upper and lower case.

Warning: option "$name" unknow for $name() in $package, found in $manual

Error: option $name outside subroutine in $file line $number

An option is set, however there is no subroutine in scope (yet).

Warning: option line incorrect in $file line $number: $line

The shown $line is not in the right format: it should contain at least two words being the option name and an abstract description of possible values.

Error: section $name outside chapter in $file line $number

Sections must be contained in chapters.

Warning: subroutine $name is not defined by $package, found in $manual

Error: subroutine $name outside chapter in $file line $number

Subroutine descriptions (method, function, tie, ...) can only be used within a restricted set of chapters. You have not started any chapter yet.

Error: subsection $name outside section in $file line $number

Subsections are only allowed in a chapter when they are nested within a section.

Warning: unknown markup in $file line $number: $line

The standard pod and the extensions made by this parser define a long list of markup keys, but yours is not one of these predefined names.

DETAILS

General Description

The Markov parser has a lot in common with the standard POD syntax. You can use the same tags as are defined by POD, however these tags are "visual style", which means that OODoc can not treat them in a smart way. The Markov parser adds many logical markups which will produce nicer pages.

Furthermore, the parser will remove the documentation from the source code, because otherwise the package installation would fail: Perl's default installation behavior will extract POD from packages, but the markup is not really POD, which will cause many complaints.

The version of the module is defined by the OODoc object which creates the manual page. Therefore, $VERSION will be added to each package automatically.

Disadvantages

The Markov parser removes all raw documentation from the package files, which means that people sending you patches will base them on the processed source: the line numbers will be wrong. Usually, it is not much of a problem to manually process the patch: you have to check the correctness anyway.

A second disadvantage is that you have to back up your sources separately: the sources differ from what is published on CPAN, so CPAN is not your backup anymore.

Structural tags

Heading

 =chapter    STRING
 =section    STRING
 =subsection STRING

These text structures are used to group descriptive text and subroutines. You can use any name for a chapter, but the formatter expects certain names to be used: if you use a name which is not expected by the formatter, that documentation will be ignored.

Subroutines

Perl has many kinds of subroutines, which are distinguished in the logical markup. The output may be different per kind.

 =i_method  NAME PARAMETERS   (instance method)
 =c_method  NAME PARAMETERS   (class method)
 =ci_method NAME PARAMETERS   (class and instance method)
 =method    NAME PARAMETERS   (short for i_method)
 =function  NAME PARAMETERS
 =tie       NAME PARAMETERS
 =overload  STRING

The NAME is the name of the subroutine, and the PARAMETERS an argument indicator.

The general description of the subroutine follows the tag line. After that general description, you can use the following tags:

 =option    NAME PARAMETERS
 =default   NAME VALUE
 =requires  NAME PARAMETERS

If you have defined an =option, you have to provide a =default for this option somewhere. Use of =default for an option on a higher level will overrule the one in a subclass.
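
A sketch of how an =option and its =default sit together in a package file is given below. The package Hypothetical::Counter and its count method are made up for illustration. Perl itself skips everything from the first tag up to =cut, so the file runs as it is.

```perl
package Hypothetical::Counter;
use strict;
use warnings;

=chapter METHODS

=method count STRING, OPTIONS
Returns the number of characters in STRING.

=option  trim BOOLEAN
=default trim 0
Strip leading and trailing blanks before counting.

=cut

sub count {
    my ($self, $string, %options) = @_;
    $string =~ s/^\s+|\s+$//g if $options{trim};
    return length $string;
}

print Hypothetical::Counter->count('  monkey ', trim => 1), "\n";
```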

Include examples

Examples can be added to chapters, sections, subsections and subroutines. They run until the next markup line, so they can only come at the end of a documentation piece.

 =example
 =examples

Include diagnostics

A subroutine description can also contain error or warning descriptions. These diagnostics are usually collected into a special chapter of the manual page.

 =error this is very wrong
 Of course this is not really wrong, but only as an example
 how it works.

 =warning wrong, but not sincerely
 Warning message, which means that the program can create correct output
 even though it found something wrong.

Compatibility

For comfort, all POD markups are supported as well:

 =head1 Heading Text          (same as =chapter)
 =head2 Heading Text          (same as =section)
 =head3 Heading Text          (same as =subsection)
 =head4 Heading Text
 =over indentlevel
 =item stuff
 =back
 =cut
 =pod
 =begin format
 =end format
 =for format text...

Text markup

Next to the structural markup, there is textual markup. This markup is the same as POD defines in the perlpod manual page. For instance, C<some code> can be used to create visual markup as a code fragment.

One kind is added to the standard list: the M.

The M-link can not be nested inside other text markup items. It is used to refer to manuals, subroutines, and options. You can use an L-link to manuals as well, however then the POD output filter will modify the manual page while converting it to other manual formats.

Syntax of the M-link:

 M<OODoc::Object>
 M<OODoc::Object::new()>
 M<OODoc::Object::new(verbose)>
 M<new()>
 M<new(verbose)>

These links refer to a manual page, a subroutine within a manual page, and an option of a subroutine respectively. And then two abbreviations are shown: they refer to subroutines of the same manual page, in which case you may refer to inherited documentation as well.
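
How the content of such a link falls apart into a manual, a subroutine and an option can be sketched as follows. The function decompose_m_link is hypothetical; it is not the implementation of decomposeM() and does not resolve inherited documentation.

```perl
use strict;
use warnings;

# Hypothetical helper: split the text between M< and > into
# manual, subroutine and option (each may be undef).
sub decompose_m_link {
    my ($link) = @_;
    my ($manual, $sub, $option);

    if ($link =~ /^(?:(.*)::)?(\w+)\((.*)\)$/) {
        ($manual, $sub, $option) = ($1, $2, $3);
        $option = undef if $option eq '';   # "new()" names no option
    }
    else {
        $manual = $link;                    # no parentheses: a manual page
    }
    return { manual => $manual, subroutine => $sub, option => $option };
}

for my $link ('OODoc::Object', 'OODoc::Object::new(verbose)', 'new()') {
    my $parts = decompose_m_link($link);
    print "$link:",
        (map { " $_=" . ($parts->{$_} // '-') } qw/manual subroutine option/),
        "\n";
}
```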

The standard POD defines a L markup tag. This can also be used with this Markov parser.

The following syntaxes are supported:

 L<manual>
 L<manual/section>
 L<manual/"section">
 L<manual/subsection>
 L<manual/"subsection">
 L</section>
 L</"section">
 L</subsection>
 L</"subsection">
 L<"section">
 L<"subsection">
 L<unix-manual>
 L<url>

In the above, manual is the name of a manual, section the name of any section (in that manual, by default the current manual), and subsection a subsection (in that manual, by default the current manual).

The unix-manual MUST be formatted with its chapter number, for instance cat(1), otherwise a link will be created. See the following examples in the html version of these manual pages:

 M<perldoc>               illegal: not in distribution
 L<perldoc>               manual perldoc
 L<perldoc(1perl)>        manual perldoc(1perl)

 M<OODoc::Object>         OODoc::Object
 L<OODoc::Object>         OODoc::Object
 L<OODoc::Object(3pm)>    manual OODoc::Object(3pm)
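
The supported L-link forms can be told apart with a few patterns. The sketch below uses a hypothetical function decompose_l_link; it is not the implementation of decomposeL(), and the patterns for URLs and unix manuals are assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: classify the text between L< and >.
sub decompose_l_link {
    my ($link) = @_;

    return { url         => $link } if $link =~ m!^[a-z]+://!i;
    return { unix_manual => $link } if $link =~ m!^[\w:.-]+\(\d\w*\)$!;

    my ($manual, $section);
    if    ($link =~ m!^"(.*)"$!)             { $section = $1 }
    elsif ($link =~ m!^([^/]*)/"?(.*?)"?$!)  { ($manual, $section) = ($1, $2) }
    else                                     { $manual = $link }

    # an empty manual name means: the current manual
    $manual = undef if defined $manual && $manual eq '';

    # "section" holds the name of a section or a subsection
    return { manual => $manual, section => $section };
}

for my $link ('OODoc::Object/"General Description"', '/Caveats', 'cat(1)') {
    my $parts = decompose_l_link($link);
    print "$link:", (map { " $_=" . ($parts->{$_} // '-') } sort keys %$parts), "\n";
}
```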

Grouping subroutines

Subroutine descriptions can be grouped in a chapter, section, or subsection. It is very common to have a large number of subroutines, so some structure has to be imposed here.

If you document the same routine in more than one manual page with an inheritance relationship, the documentation location shall not conflict. You do not need to give the same level of detail about the exact location of a subroutine, as long as it is not conflicting. This relative freedom is created to be able to regroup existing documentation without too much effort.

For instance, in the code of OODoc itself (which is of course documented with OODoc), the following happens:

 package OODoc::Object;
 ...
 =chapter METHODS
 =section Initiation
 =c_method new OPTIONS

 package OODoc;
 use base 'OODoc::Object';
 =chapter METHODS
 =c_method new OPTIONS

As you can see in the example, in the higher level of inheritance, the new method is not put in the Initiation section explicitly. However, it is located in the METHODS chapter, which is required to correspond to the base class. The generated documentation will show new in the Initiation section in both manual pages.

Caveats

The Markov parser does not require blank lines before or after tags, as POD does. This means that the chance of getting into parsing problems has increased: lines within here documents which start with a = will cause confusion. However, in these cases you can usually simply add a backslash in front of the printed =, which will disappear once printed.
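
The backslash trick can be sketched as follows, assuming an interpolating (double-quoted) here document; the generated module name is made up.

```perl
use strict;
use warnings;

# No line of this source file starts with "=", so nothing here is
# mistaken for a tag.  In a double-quoted here document "\=" is
# printed as a plain "=".
my $template = <<"END_TEMPLATE";
\=head1 NAME

Generated::Module - produced by a hypothetical generator

\=cut
END_TEMPLATE

print $template;
```

In a single-quoted here document the backslash would be kept in the output, so the trick only applies to interpolating ones.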

Examples

You may also take a look at the raw code archive for OODoc (the text as is before it was processed for distribution).

Example: how subroutines are documented

 =chapter FUNCTIONS

 =function countCharacters FILE|STRING, OPTIONS
 Returns the number of bytes in the FILE or STRING,
 or undef if the string is undef or the character
 set unknown.

 =option  charset CHARSET
 =default charset 'us-ascii'
 Characters in, for instance, utf-8 or unicode encoding
 require variable number of bytes per character.  The
 correct CHARSET is needed for the correct result.

 =examples

   my $count = countCharacters("monkey");
   my $count = countCharacters("monkey",
       charset => 'utf-8');

 =error unknown character set $charset

 The character set you can use is limited by the sets
 defined by manual Encode.  The characters of the input can
 not be separated from each other without this definition.

 =cut

 # now the coding starts
 sub countCharacters($@) {
    my ($input, %options) = @_;   # a function, so there is no $self
    ...
 }

REFERENCES

See the OODoc website at http://perl.overmeer.net/oodoc/ for more details.

COPYRIGHTS

Module version 0.90. Written by Mark Overmeer (mark@overmeer.net). See the ChangeLog for other contributors.

Copyright (c) 2003 by the author(s). All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.