Treex::PML::Schema - Perl implementation of a PML schema.
This class implements PML schemas. A PML schema consists of a set of type declarations of several kinds, represented by objects inheriting from a common base class, Treex::PML::Schema::Decl.
This class inherits from Treex::PML::Schema::Template.
Some methods use so-called 'attribute paths' to navigate through nested and referenced type declarations. An attribute path is a '/'-separated sequence of steps, where each step can be one of the following:
!type-name
'!' followed by the name of a named type (this step can only occur as the very first step)
name
the name of a member of a structure, an element of a sequence, or an attribute of a container, specifying the type declaration of the named component
#content
the string '#content', specifying the content type declaration of a container
LM
specifying the type declaration of a list
AM
specifying the type declaration of an alt
[NNN]
where NNN is a decimal number (ignored); equivalent to LM or AM
Steps of the form LM, AM, and [NNN] (except when occurring at the end of an attribute path) may be omitted.
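The omission rule can be illustrated with a small, self-contained sketch. The helper strip_optional_steps below is hypothetical and not part of Treex::PML; it only manipulates the path string and assumes that no member is itself named LM or AM.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of Treex::PML): drop the optional
# LM, AM and [NNN] steps from an attribute path, but keep such a
# step when it is the very last one, where it may not be omitted.
sub strip_optional_steps {
    my ($path) = @_;
    my @steps = split m{/}, $path;
    my $last  = $#steps;
    my @kept  = grep {
        $_ == $last or $steps[$_] !~ /^(?:LM|AM|\[\d*\])$/
    } 0 .. $last;
    return join '/', @steps[@kept];
}

# The type and member names used here are made up for illustration.
print strip_optional_steps('!a-node.type/a/LM/[2]/form/AM'), "\n";
# prints "!a-node.type/a/form/AM"
```

Under the rule above, both the long and the shortened form would be expected to address the same declaration; the sketch does not consult any schema to check that.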
This module exports constants for declaration types.
Export constant symbols (exported by default, too).
See Treex::PML::Schema::Constants.
NOTE: Don't call this constructor directly, use Treex::PML::Factory->createPMLSchema() instead!
Parses an XML representation of a PML Schema from a string, filehandle, local file, or URL, processing the modular instructions as described in http://ufal.mff.cuni.cz/jazz/PML/doc/pml_doc.html#processing and returns the corresponding Treex::PML::Schema object.
One of the following options must be given:
string
an XML string to parse
filename
a file name or URL
fh
a file-handle (IO::File, IO::Pipe, etc.) open for reading
The following options are optional:
base_url
base URL for referred schemas (useful when parsing from a file-handle or a string)
use_resources
if this option is used with a true value, the parser will attempt to locate referred schemas also in Treex::PML resource paths.
revision
minimal_revision
maximal_revision
constraints on the revision number of the schema.
validate
if this option is used with a true value, the parser will validate the schema on the fly using a RelaxNG grammar given using the relaxng_schema parameter; if relaxng_schema is not given, the file 'pml_schema_inline.rng' searched for in Treex::PML resource paths is assumed.
relaxng_schema
a particular RelaxNG grammar to validate against. The value may be a URL or filename for the grammar in the RelaxNG XML format, or an XML::LibXML::RelaxNG object representation. The compact format is not supported.
An obsolete alias for Treex::PML::Schema->new({%$opts, filename=>$filename}).
This method serializes the Treex::PML::Schema object to XML. See Treex::PML::Schema::XMLNode->write for implementation.
IMPORTANT: The resulting schema is simplified; that is, all modular instructions are processed and removed from it. See http://ufal.mff.cuni.cz/jazz/PML/doc/pml_doc.html#processing
a scalar reference to which the XML is to be stored as a string
a file name
a file-handle (IO::File, IO::Pipe, etc.) open for writing
The following options are optional:
no_backups
if this option is used with a true value, the writer will not attempt to create backup (tilde) files when overwriting an existing file.
no_indent
if this option is used with a true value, the writer will not add additional newlines and indentation white-space to the resulting XML.
Return location of the PML schema file.
Set location of the PML schema file.
Return PML version the schema conforms to.
Return PML schema revision.
Return PML schema description.
Return the root type declaration (see Treex::PML::Schema::Root).
Like $schema->get_root_decl->get_content_decl.
Return the constant PML_SCHEMA_DECL (for compatibility with the Treex::PML::Schema::Decl interface).
Return the string 'schema' (for compatibility with the Treex::PML::Schema::Decl interface).
Return name of the root element for PML instance.
Return names of all named type declarations.
This method returns a list of HASHrefs containing information about named references to PML instances (each hash currently has the keys 'name' and 'readas').
This method retrieves information about a specific named instance reference as a hash (currently with keys 'name' and 'readas').
This function compares two schema revision strings according to the rules described in the PML specification. Returns -1 if revision $A precedes revision $B, 0 if the revisions are equal (equivalent), and 1 if revision $A follows revision $B.
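As a rough illustration of the return convention, here is a self-contained sketch. It assumes that a revision is a dot-separated sequence of decimal numbers compared numerically from left to right, with missing components treated as 0; this is an assumption made for the example, and the PML specification remains authoritative. The function name cmp_revisions_sketch is hypothetical and is not the library method.

```perl
use strict;
use warnings;

# Hypothetical stand-in, NOT the Treex::PML implementation.
# Assumption: revisions look like "1.2.10" and are compared
# component by component as numbers; a missing component is 0.
sub cmp_revisions_sketch {
    my ($rev_a, $rev_b) = @_;
    my @left  = split /\./, $rev_a;
    my @right = split /\./, $rev_b;
    while (@left or @right) {
        my $x = @left  ? shift @left  : 0;
        my $y = @right ? shift @right : 0;
        my $cmp = $x <=> $y;
        return $cmp if $cmp;    # first differing component decides
    }
    return 0;                   # all components equal
}

print cmp_revisions_sketch('1.2', '1.10'), "\n";   # prints -1
print cmp_revisions_sketch('1.0', '1'),    "\n";   # prints 0
print cmp_revisions_sketch('2',   '1.9'),  "\n";   # prints 1
```

Numeric comparison is what makes 1.2 precede 1.10 here; a plain string comparison would order them the other way round.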
This method traverses all nested declarations and sub-declarations and calls a given subroutine passing the sub-declaration object as a parameter.
Check that the schema revision satisfies the given constraints. The following options are supported:
revision: exact revision number to match
minimal_revision: minimal revision number to match
maximal_revision: maximal revision number to match
revision error: an optional error message format string with %f mark for the schema filename or URL and %e for the error string. Defaults to 'Error: wrong schema revision of %f: %e'.
Compatibility method building the schema object from a nested hash structure created by XML::Simple which was used in older implementations. This is useful for upgrading objects stored in old binary dumps.
Locate a declaration specified by attribute-path starting from declaration decl. If decl is undefined, the root type declaration is used. (Note that attribute paths starting with '/' are always evaluated starting from the root declaration, and paths starting with '!' followed by the name of a named type are evaluated starting from that type.) All references to named types are transparently resolved in each step.
The caller should pass a true value in noresolve to force Member, Attribute, Element, Type, or Root declaration objects to be returned rather than the declarations of their content.
An attribute path is a '/'-separated sequence of steps (member, attribute, or element names, or strings matching [\d*]) that identifies a certain nested type declaration. A step of the aforementioned form [\d*] matches the content declaration of a List or Alt. Note, however, that named steps dive into List or Alt declarations automatically, too.
Return a list of declarations (objects derived from Treex::PML::Schema::Decl) that have role equal to role.
If start_decls is specified, it must be an ARRAY reference of declarations; in that case, only declarations nested below the listed ones are considered.
WARNING: this function can be very slow, especially if the type declarations are recursive.
Return a list of attribute paths leading to nested type declarations of decl with role equal to role.
This is equivalent to
$schema->find_decl($decl,sub{ $_[0]->{role} eq $role },$opts);
Please see the documentation for find_decl for more information.
Return a list of attribute paths leading to nested type declarations of decl for which a given callback returns a true value. The tested type declaration is passed to the callback as the first (and only) argument.
If start_decls is specified, it must be an ARRAY reference of declarations; in that case, only declarations nested or referred to from the listed ones are considered.
In array context, all matching nested declarations are returned. In scalar context, only the first one is returned (with early stopping).
The last argument opts can be used to pass some flags to the algorithm. Currently only the flag no_childnodes is available. If true, then the function never recurses into content declaration of declarations with the role #CHILDNODES.
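The callback-driven search with the no_childnodes flag can be pictured with the following toy sketch. It works on plain nested hashes, not on Treex::PML::Schema::Decl objects; the data layout (role and members keys) and the function find_paths_sketch are hypothetical and only mimic the described behaviour in list context.

```perl
use strict;
use warnings;

# Toy declaration tree made of plain hashes (an assumption for this
# example; real declarations are Treex::PML::Schema::Decl objects).
my $tree = {
    role    => '#NODE',
    members => {
        id       => { role => '#ID' },
        children => {
            role    => '#CHILDNODES',
            members => { child => { role => '#NODE' } },
        },
    },
};

# Hypothetical helper: collect '/'-separated paths to nested
# declarations for which the callback returns a true value.
sub find_paths_sketch {
    my ($decl, $callback, $opts, $prefix) = @_;
    $opts ||= {};
    my @found;
    for my $name (sort keys %{ $decl->{members} || {} }) {
        my $child = $decl->{members}{$name};
        my $path  = defined $prefix ? "$prefix/$name" : $name;
        push @found, $path if $callback->($child);
        # With no_childnodes set, never recurse below #CHILDNODES.
        next if $opts->{no_childnodes}
            and ($child->{role} // '') eq '#CHILDNODES';
        push @found, find_paths_sketch($child, $callback, $opts, $path);
    }
    return @found;
}

my $is_node = sub { ($_[0]{role} // '') eq '#NODE' };
print join(',', find_paths_sketch($tree, $is_node)), "\n";
print scalar(find_paths_sketch($tree, $is_node, { no_childnodes => 1 })), "\n";
```

The toy tree is finite, so the sketch needs no cycle detection; recursive type declarations in a real schema would require it.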
Return a list of all type declarations with the role #NODE.
Return the declaration of the named type with a given name (see Treex::PML::Schema::Type).
Validates the data content of the given object against a specified type declaration. The type_decl argument must either be an object derived from the Treex::PML::Schema::Decl class or the name of a named type.
An array reference may be passed as the optional 3rd argument log to obtain a detailed report of all validation errors.
The flags argument can specify flags that influence the validation. The following constants can be binary-OR'ed to obtain the flags:
PML_VALIDATE_NO_TREES - do not validate nested data with roles #CHILDNODES or #TREES and do not require that objects with the role #NODE implement the Treex::PML::Node role.
PML_VALIDATE_NO_CHILDNODES - do not validate nested data with the role #CHILDNODES.
Returns: 1 if the content conforms, 0 otherwise.
This method is similar to validate_object, but in this case the validation is restricted to the data substructure of object specified by the attr-path argument.
type is the type of object specified either by the name of a named type, or as a Treex::PML::Type, or a type declaration.
This method returns a list of all non-periodic canonical paths leading from given types to atomic values. Currently only the following options are supported:
no_childnodes => $bool
If true, the method does not descend to member types with the role #CHILDNODES.
no_nodes => $bool
If true, the method does not descend to member types with the role #NODE (except for the starting types).
with_LM => $bool
If true, the paths will include an LM step for each List type on the path.
with_AM => $bool
If true, the paths will include an AM step for each Alt type on the path.
with_Seq_brackets => $bool
If true, the paths will append a [0] after each step representing a sequence element.
This function tries to emulate the behavior of Treex::PML::FSFormat->attributes to some extent.
Return attribute paths to all atomic subtypes of given type declarations. If no type declaration objects are given, then types with role #NODE are assumed. This function never descends to subtypes with role #CHILDNODES.
Auxiliary method used internally by the PML Schema parser. It simplifies the schema and for each declaration object creates back references to its parent declaration and schema and pre-computes the type attribute path returned by $decl->get_decl_path().
Prague Markup Language (PML) format: http://ufal.mff.cuni.cz/jazz/PML/
Tree editor TrEd: http://ufal.mff.cuni.cz/tred
Related packages: Treex::PML, Treex::PML::Schema::Template, Treex::PML::Schema::Decl, Treex::PML::Instance
Copyright (C) 2006-2010 by Petr Pajas
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.2 or, at your option, any later version of Perl 5 you may have available.