NAME

ODF::lpOD::Element - Common features available with any ODF element

DESCRIPTION

This manual page describes the odf_element class.

Note that odf_element is the super-class of various specialized objects, so the present manual page introduces only the basic methods that apply to every odf_element. The odf_element class is not always explicitly used in a typical application; it's generally used through more specialized, user-oriented objects.

odf_element is an alias for the ODF::lpOD::Element package.

Every XML element (loaded from an existing ODF document or created by any lpOD-based application) is an odf_element. This class is the base class for any document element; its features are inherited by other, more specialized element classes.

An element may be explicitly created using the odf_create_element class constructor (or the constructor of any derivative of odf_element, such as odf_paragraph, odf_table, etc.), then inserted somewhere in a document. However, in most cases, elements are either retrieved in the existing structure or implicitly created and put in place as children of existing elements through various element-based set_xxx methods (where "xxx" depends on the kind of newly created data).

Among the odf_element methods, we distinguish element methods from context methods, although both kinds of methods belong to odf_element objects. An element method gets or sets one or more properties of the calling element, while a context method uses the calling element as its operating context and may produce effects regarding other elements located somewhere in the hierarchy of the calling element (generally below, and sometimes above). As examples, set_attribute is an element method (it changes an attribute of the current element), while get_element (in its element-based version, which is not the same as its part-based one) is a context method (it retrieves an element somewhere below the current one).

Constructor and retrieval tools

odf_create_element(data)

Creates an odf_element from an argument which may be an arbitrary tag, a well-formed XML string, a file handle, or a web link. Beware that it's a generic, low level constructor, allowing the user to create arbitrary elements. Its explicit use should be exceptional.

The following instruction creates a regular XML element whose tag is "foo":

        $e = odf_create_element('foo');

If the given argument is a string whose first character is "<", it's parsed as an XML element definition. As an example, the instruction below creates a "foo" element with a "bar" attribute whose value is "baz" while the text of the element is "xyz":

        $e = odf_create_element('<foo bar="baz">xyz</foo>');

If the given string starts with 'http:', then it's regarded as the URL of an XML resource available through the web. If LWP::Simple is installed and if the remote resource is available, then this resource is parsed as XML.

If the argument is a reference (and not a string), it's regarded as a text file handle; if it's really a file handle, the content of this file is loaded and parsed as XML.

The use of file handles or HTTP links allows the applications to easily import ODF element definitions from remote locations and/or to reuse element definitions stored in application-specific XML files or databases.

The new element is not attached to a document; it's free for later use.

odf_create_element() is an alias for one of the following instructions, which are equivalent:

        odf_element->create();
        ODF::lpOD::Element->create();

The same principle applies to every subclass of ODF::lpOD::Element; as a consequence, whatever "xxx" in the name of a particular lpOD element constructor, the two following instructions are equivalent:

        $e = odf_create_xxx(@args);
        $e = odf_xxx->create(@args);

get_element(tag [options])

This method returns the first element (if any) matching the given XML tag. It's the most generic context-based retrieval method.

The given tag may be replaced by a regular expression, so the search space will include all the elements whose tags match the expression.

For example, the following instruction (assuming $context is a previously retrieved element) returns the first element that is either a paragraph or a heading (knowing that the corresponding tags are text:p and text:h):

        my $text_element = $context->get_element(qr'text:(p|h)');

The allowed options are:

  • position: The sequential zero-based position of the element among the set of elements matching the given tag; negative positions are counted backward from the end.

  • attribute: The name of an attribute used as a selection criterion; if this option is set, the value option is required.

  • value: The value of the selection attribute.

  • content: A search string (or a regexp) restricting the search space to the elements with matching content.

The example below (which combines the attribute, value and position options) returns the 4th level 1 heading before the end of the current context:

        $context->get_element(
                'text:h',
                attribute       => 'outline level',
                value           => 1,
                position        => -4
                );

Caution: the get_element method of odf_part is not the same as the get_element method of odf_element.
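
The way the options narrow the search can be pictured with a small, self-contained model. The sketch below does not call lpOD: it assumes a flat list of hashes standing for elements, and find_element() is a hypothetical helper written for illustration only. Attribute names are given in their full XML form to keep the model simple.

```perl
use strict;
use warnings;

# Hypothetical flat model: each element is a hash with a tag, attributes and text.
my @elements = (
    { tag => 'text:h', attr => { 'text:outline-level' => 1 }, text => 'Intro'   },
    { tag => 'text:p', attr => {},                            text => 'Hello'   },
    { tag => 'text:h', attr => { 'text:outline-level' => 2 }, text => 'Details' },
    { tag => 'text:h', attr => { 'text:outline-level' => 1 }, text => 'Summary' },
);

# find_element() is an illustrative helper, not an lpOD function.
sub find_element {
    my ($list, $tag, %opt) = @_;
    # The tag is either an exact name or a compiled regular expression.
    my @found = grep {
        ref($tag) eq 'Regexp' ? $_->{tag} =~ $tag : $_->{tag} eq $tag
    } @$list;
    # attribute/value keep only the elements carrying this attribute value.
    if (defined $opt{attribute}) {
        @found = grep {
            defined $_->{attr}{$opt{attribute}}
                && $_->{attr}{$opt{attribute}} eq $opt{value}
        } @found;
    }
    # content keeps only the elements whose text matches.
    if (defined $opt{content}) {
        @found = grep { $_->{text} =~ /$opt{content}/ } @found;
    }
    # position is zero-based; a negative value counts back from the end.
    my $position = $opt{position} // 0;
    return $found[$position];
}

my $last_level1 = find_element(
    \@elements, 'text:h',
    attribute => 'text:outline-level',
    value     => 1,
    position  => -1
);
print $last_level1->{text}, "\n";    # Summary
```

In this model, as in the description above, the position is applied last, so it counts only the elements that passed the other filters.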

get_elements(tag)

Returns the full list of elements matching the given tag, or whose tags match the given regexp.

The attribute and value options are allowed in order to restrict the search.

The next example returns the list of paragraphs whose style is "Standard":

        my @std_paragraphs = $context->get_elements(
                'text:p',
                attribute       => 'style name',
                value           => 'Standard'
                );

get_parent

This method returns the immediate parent of the calling element. Of course, it returns undef if the context element is itself a root, or if it's not included yet in a document.

get_root

Returns the top level element of the document part that contains the calling element.

get_document

Returns the odf_document instance to which the element belongs. Returns undef if the element is not attached to an odf_document.

Context import

The import_children() method allows the user to directly append elements coming from an external context. As an example, the following instruction appends all the children of a given $source element in the context of the calling $destination element:

        $destination->import_children($source);

In this example, both $destination and $source are elements. The source context may belong to the same document as the destination context, or not, so this method may be used for content replication across documents. Because the imported elements are clones of the original ones, the source context is unchanged. The imported elements are appended at the end of the calling context in the same order as in the source context.

An optional filter may be specified as the 2nd argument, in order to import only a particular kind of element from the source context. So, the instruction below imports only the paragraphs of the source context (remember that 'text:p' is the ODF tag for paragraphs):

        $destination->import_children($source, 'text:p');

The substitute_children() method, that takes the same arguments as import_children(), removes all the children of the calling element (if any), then imports the children of a source element. The following sequence replaces all the content of a given document body by the content of another document body (see ODF::lpOD::Document for details about get_body()):

        $doc_destination->get_body->substitute_children(
                $doc_source->get_body
                );
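
The difference between the two methods, and the fact that the source is left untouched, can be illustrated without lpOD. The sketch below is only a model under stated assumptions: a context is a plain hash holding a list of child hashes, and model_import_children() and model_substitute_children() are hypothetical helpers that mimic the behaviour described above.

```perl
use strict;
use warnings;
use Storable qw(dclone);

# Hypothetical model: a context is a hash holding a list of child hashes.
# These helpers only mimic the documented behaviour; they are not lpOD code.
sub model_import_children {
    my ($destination, $source, $filter) = @_;
    for my $child (@{ $source->{children} }) {
        # The optional filter keeps only the children with the given tag.
        next if defined $filter && $child->{tag} ne $filter;
        # Append a deep copy, so the source children are never shared.
        push @{ $destination->{children} }, dclone($child);
    }
    return $destination;
}

sub model_substitute_children {
    my ($destination, $source, $filter) = @_;
    $destination->{children} = [];    # drop the existing children first
    return model_import_children($destination, $source, $filter);
}

my $source = { children => [
    { tag => 'text:h', text => 'Title' },
    { tag => 'text:p', text => 'Body'  },
] };
my $destination = { children => [ { tag => 'text:p', text => 'Old' } ] };

model_import_children($destination, $source, 'text:p');
print join(', ', map { $_->{text} } @{ $destination->{children} }), "\n";  # Old, Body

model_substitute_children($destination, $source);
print join(', ', map { $_->{text} } @{ $destination->{children} }), "\n";  # Title, Body
```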

Top level contexts

As introduced in ODF::lpOD::Document, the odf_part handlers provide methods that automatically return high level elements that may be the preferred contexts in most cases. The most common one is the root element; its context is the whole document part. The body element, which is sometimes the same as the root one, is a bit more restricted in the document content part (it includes only the content objects, and excludes other objects such as style definitions). Both the root and the body may be obtained using the part-based get_root and get_body methods.

The following sequence, starting from the creation of a document instance, selects a part, then the root element of the part, then selects the list of table styles defined in the part:

        my $doc = odf_document->get("/home/jmg/report.odt");
        my $content = $doc->get_part(CONTENT);
        my $context = $content->get_root;
        my @table_styles = $context->get_element_list(
                'style:style',
                attribute       => 'family',
                value           => 'table'
                );

Note that the sequence above is shown in order to illustrate a principle but that it should not be needed in a real application, knowing that lpOD provides more user-friendly style retrieval tools.

Child element creation methods

The methods described in this section allow the user to insert elements (previously existing or not) as children of the calling element.

insert_element(element [options])

Inserts the given odf_element at a given position, which is defined according to a position parameter, whose possible values are:

  • FIRST_CHILD: the odf_element will be the first child (default).

  • LAST_CHILD: the odf_element will be the last child.

  • NEXT_SIBLING: the odf_element will be inserted just after.

  • PREV_SIBLING: the odf_element will be inserted just before.

  • WITHIN: the odf_element will be inserted as a child within the text content; if position is WITHIN, the insertion point is given by the offset parameter.

The following additional options are allowed:

  • offset: specifies the position in the text of the context element where the new child element must be inserted (the position is zero-based).

  • before: the value of this option, if set, must be another child odf_element of the calling one; the new element will be inserted as the previous sibling of this child element.

  • after: like before, but the new element will be inserted after the value of this option.

The WITHIN option splits the text content of the container in two parts and inserts the element between them, at a given offset. So if position is WITHIN, the optional offset parameter is used. By default, if no offset argument is provided, or if the calling element doesn't contain any text, WITHIN produces the same result as FIRST_CHILD. The offset argument must be an integer; it specifies the position of the inserted child element within the text content of the calling element. A zero offset means that the element must be inserted before the 1st character. A negative offset value means that the insert position must be counted down from the end of the text, knowing that -1 is the position just before the last character. Of course, if the insertion must be done after the end of the text, the simplest way is to select LAST_CHILD instead of WITHIN.
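
The offset rule happens to match the way Perl's substr counts positions, so it can be sketched on a plain string. The code below does not use lpOD; split_at_offset() is a hypothetical helper that only shows where the text would be cut, and it makes no attempt to handle offsets outside the text.

```perl
use strict;
use warnings;

# split_at_offset() is an illustrative helper, not part of lpOD.
# It returns the text located before and after the insertion point.
sub split_at_offset {
    my ($text, $offset) = @_;
    $offset //= 0;    # no offset: the cut is before the 1st character
    return (substr($text, 0, $offset), substr($text, $offset));
}

my ($before, $after) = split_at_offset('The quick fox', 4);
print "[$before][$after]\n";    # [The ][quick fox]

# -1 is the position just before the last character.
($before, $after) = split_at_offset('The quick fox', -1);
print "[$before][$after]\n";    # [The quick fo][x]
```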

If before or after is provided, the other options are ignored. Of course, before and after are mutually exclusive.

The following example inserts a previously existing element between the 4th and the 5th characters of the text of the calling element:

        $context->insert_element(
                $alien_element,
                position        => WITHIN,
                offset          => 4
                );

The next example inserts a new empty paragraph before the last paragraph of the calling context:

        my $last_p = $context->get_element('text:p', position => -1);
        $context->insert_element(
                'text:p',
                before          => $last_p
                );

(Note that smarter methods, described elsewhere, may produce the same results).

insert_element(tag)

Like the first version of insert_element, but the argument is an XML tag (i.e. technically a text string instead of an odf_element instance); in such a case a new element is created then inserted according to the same rules and options.

append_element(element/tag)

Like insert_element, but without options; appends the element as the last child of the calling element. So these two lines are equivalent:

        $context->insert_element($elt, position => LAST_CHILD);
        $context->append_element($elt);

Element methods

The methods introduced in this section are accessors that get or set the calling element's own properties. However, in some cases they may have indirect consequences on other elements.

Note that odf_element is a subclass of XML::Twig::Elt. So, beyond the methods specifically described below, programmers familiar with XML::Twig are allowed to directly (but cautiously) call any XML::Twig::Elt method from any element, if the methods described below don't meet all their needs.

clear

Erases the text of an element and all its children. Beware that this method is overridden by some specialized element classes.

clone

Returns a copy of the calling element, with all its attributes, its text, and its children elements. Allows the user to copy a high-level structured element (like a section or a table) as well as a single paragraph. The copy is a free element, that may be inserted somewhere in the same document as the prototype, or in another document.

delete

Removes the calling element with all its descendants.

del_attribute(name)

Deletes the attribute whose name is given as the argument. Nothing is done if the attribute doesn't exist. The argument may be the exact XML name of the attribute, or an "approximate" name according to the same logic as get_attribute below.

get_attribute(name)

Returns the string value of the attribute having this name. The argument may be the exact XML name of the attribute. However, if a name without name space prefix is provided, the prefix is automatically assumed to be the same as the prefix of the context element. In addition, any white space or underscore character in the given name is interpreted as a "-". As a consequence, some attributes may be designated without regard to the exact XML syntax. As an example, assuming $p is a paragraph, the two instructions below are equivalent, knowing that the name space prefix of a paragraph is 'text':

        $style = $p->get_attribute('text:style-name');
        $style = $p->get_attribute('style name');

The attribute values are returned in a character set that depends on the global configuration. See ODF::lpOD::Common for details about the character set handling.
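
The name resolution rule described above can be sketched as a standalone function. This is only a model of the documented rule, not the code used by lpOD; normalize_attribute_name() and its two arguments are hypothetical names chosen for the illustration.

```perl
use strict;
use warnings;

# normalize_attribute_name() is an illustrative helper, not an lpOD function.
sub normalize_attribute_name {
    my ($name, $element_tag) = @_;
    # White space and underscores are read as hyphens.
    (my $xml_name = $name) =~ s/[\s_]/-/g;
    # A name that already carries a prefix is left alone.
    return $xml_name if $xml_name =~ /:/;
    # Otherwise borrow the prefix of the element's own tag, if it has one.
    my ($prefix) = $element_tag =~ /^([^:]+):/;
    return defined($prefix) ? "$prefix:$xml_name" : $xml_name;
}

print normalize_attribute_name('style name', 'text:p'), "\n";       # text:style-name
print normalize_attribute_name('text:style-name', 'text:p'), "\n";  # text:style-name
```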

get_attributes

Returns all the attributes of the calling element as a hash ref where keys are the full XML names of the attributes.

get_style

Returns the name of the style used by the calling element (this accessor makes sense for objects that may be displayed according to a layout). Returns undef if no style is used.

Note: if your style names contain non-ASCII characters and if your preferred output character set is not utf8, see ODF::lpOD::Common for details about character set handling.

get_tag

Returns the XML tag of the element with its name space prefix.

get_text(recursive => FALSE)

Returns the text content of the element as a string. By default this method is not recursive (i.e. it just returns the element's own text, not the text belonging to children and descendant elements). However, if the optional recursive parameter is provided and set to TRUE, then the method returns the concatenated contents of all the descendants of the given element.

In a default configuration, the character set of the output is utf8. If that is not convenient for you, see the character set handling section in ODF::lpOD::Common.
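
The effect of the recursive option can be pictured with a small model of mixed content. The sketch below does not use lpOD and makes its own assumptions: an element is a hash whose content list mixes plain strings (the element's own text) and child hashes, and model_get_text() is a hypothetical helper.

```perl
use strict;
use warnings;

# model_get_text() is an illustrative helper, not lpOD code.
sub model_get_text {
    my ($element, %opt) = @_;
    my $text = '';
    for my $node (@{ $element->{content} }) {
        if (ref $node) {
            # Child element: only visited in recursive mode.
            $text .= model_get_text($node, %opt) if $opt{recursive};
        }
        else {
            # Plain string: text owned by the element itself.
            $text .= $node;
        }
    }
    return $text;
}

my $paragraph = {
    tag     => 'text:p',
    content => [
        'Hello ',
        { tag => 'text:span', content => ['big'] },
        ' world',
    ],
};

print model_get_text($paragraph), "\n";                   # Hello  world
print model_get_text($paragraph, recursive => 1), "\n";   # Hello big world
```

In the non-recursive call the text of the span is skipped, which is why the two spaces around it end up side by side.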

get_url

Returns the URL if the element owns a hyperlink property, or undef otherwise.

serialize

Returns an XML export of the calling element, allowing the lpOD applications to store and/or transmit particular pieces of documents, and not only full documents. The pretty option is allowed, like with the serialize method of odf_part objects, described in ODF::lpOD::Document.

Note that this XML export is not affected by the content encoding/decoding mechanism that works for user content, so its character set doesn't depend on the custom text output character set possibly selected through the set_output_charset() method introduced in ODF::lpOD::Common.

set_attribute(attribute => value)

Sets or changes an attribute of the calling element. The attribute is created if it didn't exist. If the provided value is undef, the attribute is deleted (if it didn't exist, nothing is done). The attribute name may be specified according to the same rules as with get_attribute.

About the character set of the input values, the same rules as with any text input apply; see the character set handling section in ODF::lpOD::Common.

set_attributes(attr_hash_ref)

Sets several attributes at a time. The attributes to change or create must be passed as a hash ref (like the hash ref returned by get_attributes). The attribute names may be provided in simplified form like with set_attribute.

set_comment(text)

Intended for debugging purposes, this method puts an XML comment before the calling element. This comment produces a "<!--xyz-->" tag, where "xyz" is the given text, in the XML output if the document is later serialized. Beware that such comments are not always preserved if the document is changed by office application software.

set_child(tag, text, attributes)

Synonym of set_first_child() (see below).

set_first_child(tag, text, attributes)

Makes sure that the calling element contains at least one element with the given XML tag. If there is no compliant child, a new element is created with the given tag and inserted as the first child of the calling element. If one or more compliant children exist, the first one is selected and its text content (if any) is deleted.

The second argument (optional) is a string that becomes the new text content of the created or selected child. The remainder of the argument list, if any, is a hash specifying attribute/value pairs for this element.

The return value is the selected or created element.

set_last_child(tag, text, attributes)

Same as set_first_child() but, in case of creation, the new element is inserted as the last child. If compliant children already exist, the result is the same as set_first_child().
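
The "select or create" logic shared by set_first_child() and set_last_child() can be sketched on a plain data structure. The code below is a model only, written under stated assumptions: an element is a hash with a tag, a text, attributes and a list of children, and model_set_child() (with its 'first'/'last' argument) is a hypothetical helper, not an lpOD function.

```perl
use strict;
use warnings;

# Hypothetical model of an element: a hash with tag, text, attr and children.
# model_set_child() mimics the documented rules; it is not lpOD code.
sub model_set_child {
    my ($parent, $where, $tag, $text, %attributes) = @_;
    # Look for the first existing child with the given tag.
    my ($child) = grep { $_->{tag} eq $tag } @{ $parent->{children} };
    if ($child) {
        $child->{text} = '';    # an existing child is reused, its text is erased
    }
    else {
        $child = { tag => $tag, text => '', attr => {}, children => [] };
        if ($where eq 'first') { unshift @{ $parent->{children} }, $child }
        else                   { push    @{ $parent->{children} }, $child }
    }
    $child->{text} = $text if defined $text;
    $child->{attr}{$_} = $attributes{$_} for keys %attributes;
    return $child;
}

my $parent = {
    tag      => 'text:section',
    children => [ { tag => 'text:p', text => 'Old' } ],
};

# No heading yet: one is created and appended as the last child.
my $h = model_set_child($parent, 'last', 'text:h', 'Title', 'text:outline-level' => 1);
# A paragraph already exists: it is reused and its text is replaced.
my $p = model_set_child($parent, 'last', 'text:p', 'New');

print join(' | ', map { $_->{tag} . '=' . $_->{text} } @{ $parent->{children} }), "\n";
# text:p=New | text:h=Title
```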

set_parent(tag, text, attributes)

Makes sure that the current element is a child of an element whose tag is specified by the first argument. If the calling element is free or if its immediate parent doesn't have the given tag, a new element with the given tag is inserted at the same place as the calling element, which becomes the first child of the new element.

The return value is the new element.

The other arguments are the same as with set_first_child().

set_style(style_name)

Changes or sets the style name of the calling object. Caution: a lot of ODF elements should not have any style, so don't use this accessor unless you know that the calling object needs a style.

Note: if your style names contain non-ASCII characters and if your preferred input character set is not utf8, see ODF::lpOD::Common for details about character set handling.

set_tag(new_tag)

Changes the XML tag of the calling element. Not for usual business; it's a low level technical feature.

set_text(text_string)

Sets the text content of the calling element, replacing any previous content. Example:

        my $paragraph = $context->get_element('text:p', position => 15);
        $paragraph->set_text("The new content");

The character set of the provided string must comply with the currently active input character set (default is utf8). See the character set handling section in ODF::lpOD::Common if you have trouble with text encoding.

If set_text is called with an empty string, its effect is the same as clear.

set_url(url)

Sets a URL attribute (xlink:href) with the argument.

Custom element classes

The ODF::lpOD::Element package provides an associate_tag class method allowing developers to create custom subclasses and associate them with particular elements. The code example below defines CustomParagraph as a subclass of odf_paragraph (introduced in ODF::lpOD::TextElement) and specifies that every text:p XML element must be mapped to this new class instead of ODF::lpOD::Paragraph:

        package CustomParagraph;
        use ODF::lpOD;
        use base 'ODF::lpOD::Paragraph';
        __PACKAGE__->associate_tag('text:p');

        sub custom_method {
                #...
        }

        1;

This extensibility mechanism must be used very cautiously if the specified tag is already associated with an lpOD class, knowing that a wrongly overridden method could produce destructive side effects.

AUTHOR/COPYRIGHT

Developer/Maintainer: Jean-Marie Gouarne http://jean.marie.gouarne.online.fr

Contact: jmgdoc@cpan.org

Copyright (c) 2010 Ars Aperta, Itaapy, Pierlis, Talend. Copyright (c) 2011 Jean-Marie Gouarné.

This work was sponsored by the Agence Nationale de la Recherche (http://www.agence-nationale-recherche.fr).

License: GPL v3, Apache v2.0 (see LICENSE).
