NAME

Email::MIME::XPath - access MIME documents via XPath queries

VERSION

Version 0.005

SYNOPSIS

  use Email::MIME;
  use Email::MIME::XPath;

  my $email = Email::MIME->new($data);

  # find just the first text/plain node, no matter how many there are
  my ($part) = $email->xpath_findnodes('//plain');

  # find the only text/html node, and die if there is more than one
  $part = $email->xpath_findnode('//html');

  # look for a png by filename
  $part = $email->xpath_findnode('//png[@filename="image.png"]');

  # retrieve a part by previously-stored address
  my $address = $part->xpath_address;
  # ... later ...
  $part = $email->xpath_findnode(qq{//*[@address="$address"]});

DESCRIPTION

Dealing with MIME messages can be complicated. Frequently you want to display certain parts of a message, while alluding to (linking, summarizing, whatever) other parts in a way that makes them easy to get to later. Sometimes this can go several levels deep, if you're dealing with forwarded messages, bounces, or reports of some kind.

Referring back to sub-parts of an arbitrarily deep MIME message is especially tedious, and that is what this module attempts to make easier.

Most of this module's functionality is provided by Tree::XPathEngine. Refer to its documentation for details. In particular, each of the following methods is just a wrapper around the Tree::XPathEngine method of the same name with the xpath_ prefix removed:

xpath_findnodes

xpath_findnodes_as_string

xpath_findvalue

xpath_exists

xpath_matches

xpath_find
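
As a rough illustration of how these wrappers are called, the sketch below parses a small hand-written message and uses two of them. The message text, the addresses in it, and the variable names are invented for the example, and it assumes the arguments are passed through to Tree::XPathEngine with the message as the context node.

```perl
use strict;
use warnings;
use Email::MIME;
use Email::MIME::XPath;

# A made-up message: one text/plain body and one image/png attachment.
my $data = <<'END_MESSAGE';
From: alice@example.com
To: bob@example.com
Subject: Quarterly report
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="frontier"

--frontier
Content-Type: text/plain; charset="us-ascii"

See the attached chart.
--frontier
Content-Type: image/png
Content-Disposition: attachment; filename="image.png"
Content-Transfer-Encoding: base64

iVBORw0KGgo=
--frontier--
END_MESSAGE

my $email = Email::MIME->new($data);

# xpath_findnodes returns every matching part as a list.
my @plain = $email->xpath_findnodes('//plain');

# xpath_exists only reports whether anything matched.
my $has_html = $email->xpath_exists('//html');

printf "text/plain parts found: %d\n", scalar @plain;
print "this message has no text/html part\n" unless $has_html;
```

xpath_exists is convenient when the parts themselves are not needed, for example to decide whether to offer an HTML view at all.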

Two other useful methods are made available by Email::MIME::XPath:

xpath_findnode

This is a wrapper around xpath_findnodes that dies if more than one node is matched.

TODO: should this also die if no nodes are found?
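
Since xpath_findnode dies on an ambiguous query, callers that cannot rule out several matches may want to trap the exception. The following is a minimal sketch under that assumption; the message and the fallback policy (take the first match) are invented for the example. Because the behavior for zero matches is still an open question, the sketch also checks whether the result is defined.

```perl
use strict;
use warnings;
use Email::MIME;
use Email::MIME::XPath;

# A made-up message with two text/plain parts, so '//plain' is ambiguous.
my $data = <<'END_MESSAGE';
From: alice@example.com
To: bob@example.com
Subject: Two notes
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="frontier"

--frontier
Content-Type: text/plain; charset="us-ascii"

First note.
--frontier
Content-Type: text/plain; charset="us-ascii"

Second note.
--frontier--
END_MESSAGE

my $email = Email::MIME->new($data);

# Run the strict lookup inside eval so that the exception can be handled.
my $part;
my $ok = eval { $part = $email->xpath_findnode('//plain'); 1 };

my $was_ambiguous = $ok ? 0 : 1;
if ($was_ambiguous) {
    # Example policy only: fall back to the first match.
    ($part) = $email->xpath_findnodes('//plain');
}

if (defined $part) {
    print "using a text/plain part\n";
}
else {
    print "no text/plain part found\n";
}
```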

xpath_address

This method returns a per-message unique address for a particular part. This address is also available as the 'address' attribute in XPath queries; see "Attributes".

DOM

XPath expects to work on a tree that is DOM-like. MIME documents are trees, and this module fakes up enough structure to make XPath useful.

Elements (MIME parts) are given a name that corresponds to the subtype (the second part) of their Content-Type, e.g.

  multipart/mixed = 'mixed'
  text/plain      = 'plain'

I am open to changing this. In particular, I would have just used the entire Content-Type, but using '/' in names would have been problematic and I didn't want to replace it with something else. Most names should be unique anyway; I've never seen 'multipart/png' or 'image/html'. Feel free to enlighten me.
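
To make the naming rule concrete, here is a small stand-alone sketch of the mapping. The helper element_name_for is hypothetical and is not part of Email::MIME::XPath; it also assumes names are lowercased and that Content-Type parameters such as charset are ignored, neither of which is stated above.

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's own code): keep only the subtype
# of a Content-Type value and use it as the element name.
sub element_name_for {
    my ($content_type) = @_;

    # Drop parameters such as '; charset="us-ascii"'.
    my ($media_type) = split /;/, $content_type;
    $media_type =~ s/\s+//g;

    # 'text/plain' splits into type 'text' and subtype 'plain'.
    my (undef, $subtype) = split m{/}, $media_type, 2;

    # Lowercasing is an assumption made for this sketch.
    return defined $subtype ? lc $subtype : undef;
}

for my $content_type (
    'multipart/mixed; boundary="frontier"',
    'text/plain; charset="us-ascii"',
    'image/png',
) {
    printf "%-40s => %s\n", $content_type, element_name_for($content_type);
}

# The trade-off: two types that share a subtype, such as image/png and the
# never-seen multipart/png, would end up with the same element name.
```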

Attributes

subject

from

to

cc

content_type

All of these attributes are pulled directly from the headers.

filename

For parts with a Content-Disposition header, the filename is pulled from it.

address

This attribute is assigned by Email::MIME::XPath as it crawls through the MIME structure (see "GUTS"). For any given top-level MIME document, the address attribute for each subpart will be stable over time. If you do your XPath queries from somewhere other than the top-level MIME part, the addresses will be different and probably not very useful.

Do not depend on any particular value for any particular address; it should only be used for temporary reference, not permanent storage. In particular, it may change between versions of Email::MIME::XPath, though such changes will be announced ahead of time. In the future, it may be possible to specify how addresses should be assigned on a per-application basis; presumably then they could be depended on.
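
The sketch below combines two of these attributes: it picks a part by filename, remembers its address, and then looks the part up again in a fresh parse of the same message. The message and variable names are invented for the example. It relies only on the statement above that addresses are stable for a given top-level document, and it keeps the address in memory for a short time rather than storing it permanently.

```perl
use strict;
use warnings;
use Email::MIME;
use Email::MIME::XPath;

# A made-up message: a text/plain body plus a text/plain attachment.
my $data = <<'END_MESSAGE';
From: alice@example.com
To: bob@example.com
Subject: Meeting notes
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="frontier"

--frontier
Content-Type: text/plain; charset="us-ascii"

Notes are attached.
--frontier
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: attachment; filename="notes.txt"

Item one. Item two.
--frontier--
END_MESSAGE

# First pass: select the attachment by its filename attribute and keep
# its address for temporary reference.
my $email   = Email::MIME->new($data);
my $part    = $email->xpath_findnode('//plain[@filename="notes.txt"]');
my $address = $part->xpath_address;

# Later: parse the same top-level message again and query from the top,
# using the remembered address.
my $again = Email::MIME->new($data);
my $same  = $again->xpath_findnode(qq{//*[@address="$address"]});

print "found the attachment again: ", $same->filename, "\n";
```

Both queries start from the top-level message. An address obtained this way is not expected to be meaningful if the later query is run against a sub-part instead.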

GUTS

This module does a few odd things to work around unfriendly behavior in Email::MIME. For example, Email::MIME lets MIME parts be used in several larger MIME documents at once. Not only do individual parts not know what their parent is, they *can't* know, because a single part could be in multiple trees at once. Email::MIME::XPath tries to impose a tree structure on relevant MIME objects without getting in the way, but there are undoubtedly bugs and unexpected behavior that will arise.
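
One way to picture "imposing a tree structure without getting in the way" is a lookup table kept outside the parts themselves. The sketch below is only an illustration of that idea and is not taken from the module's source; build_parent_map is a hypothetical helper, and the message is invented. Because the table belongs to one top-level message, a part shared between several messages could have a different parent recorded in each message's table.

```perl
use strict;
use warnings;
use Email::MIME;
use Scalar::Util qw(refaddr);

# Hypothetical helper: walk the parts below $root and record each part's
# parent in a table keyed by the part's reference address.
sub build_parent_map {
    my ($root) = @_;
    my %parent_of;
    my @queue = ($root);
    while (my $node = shift @queue) {
        for my $child ($node->subparts) {
            $parent_of{ refaddr($child) } = $node;
            push @queue, $child;
        }
    }
    return \%parent_of;
}

# A made-up nested message: multipart/mixed holding a multipart/alternative
# (plain and html) and a plain attachment.
my $data = <<'END_MESSAGE';
From: alice@example.com
To: bob@example.com
Subject: Nested example
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="outer"

--outer
Content-Type: multipart/alternative; boundary="inner"

--inner
Content-Type: text/plain

Plain body.
--inner
Content-Type: text/html

<p>HTML body.</p>
--inner--
--outer
Content-Type: text/plain
Content-Disposition: attachment; filename="notes.txt"

Attached notes.
--outer--
END_MESSAGE

my $email   = Email::MIME->new($data);
my $parents = build_parent_map($email);

# Look up the parent of the innermost text/plain part.
my ($alternative) = $email->subparts;
my ($plain)       = $alternative->subparts;
my $parent        = $parents->{ refaddr($plain) };

print "parent content type: ", $parent->content_type, "\n";
```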

TODO

Some of the XPath supported by Tree::XPathEngine doesn't work yet, in particular anything involving siblings. Other syntax may work, but in general it is not yet thoroughly tested.

SEE ALSO

Tree::XPathEngine, Email::MIME

AUTHOR

Hans Dieter Pearcey, <hdp at cpan.org>

BUGS

Please report any bugs or feature requests to bug-email-mime-xpath at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Email-MIME-XPath. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc Email::MIME::XPath

ACKNOWLEDGEMENTS

Thanks to Listbox.com, who sponsored the development of this module.

COPYRIGHT & LICENSE

Copyright 2007 Hans Dieter Pearcey, all rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.