NAME

HTML::Microformats::Documentation::Notes - misc usage and design notes

NOTES

Byzantine Internals

The internals of HTML::Microformats are pretty complicated, so it is best to steer clear of them. Here are two usage patterns that avoid dealing with the internals:

  • Parse a page and use it as a single RDF graph.

    A page can be parsed into an RDF::Trine::Model and queried using SPARQL.

            use HTML::Microformats;
            use LWP::Simple qw[get];
            use RDF::Query;
            
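            my $page  = 'http://example.net/';
            
            # get() returns undef on failure, so stop before parsing nothing.
            my $html  = get($page)
                    or die "Could not fetch $page\n";
            
            # Parse every microformat on the page into one RDF::Trine::Model.
            my $graph = HTML::Microformats
                           ->new_document($html, $page)
                           ->assume_all_profiles
                           ->parse_microformats
                           ->model;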
            
            my $query = RDF::Query->new(<<SPARQL);
            PREFIX foaf: <http://xmlns.com/foaf/0.1/>
            SELECT DISTINCT ?friendname ?friendpage
            WHERE {
                    <$page> ?p ?friendpage .
                    ?person foaf:name ?friendname ;
                            foaf:page ?friendpage .
                    FILTER (
                            isURI(?friendpage)
                            && isLiteral(?friendname) 
                            && regex(str(?p), "^http://vocab.sindice.com/xfn#(.+)-hyperlink")
                    )
            }
            SPARQL
            
            my $results = $query->execute($graph);
            while (my $result = $results->next)
            {
                    printf("%s <%s>\n",
                            $result->{friendname}->literal_value,
                            $result->{friendpage}->uri,
                            );
            }

  • Use the data method on each object.

    The data method on microformat objects returns a hashref of useful data.

            use HTML::Microformats;
            use LWP::Simple qw[get];
            
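            my $page     = 'http://example.net/';
            
            # get() returns undef on failure, so stop before parsing nothing.
            my $html     = get($page)
                    or die "Could not fetch $page\n";
            
            # Parse the page, then keep only the XFN objects found on it.
            my @xfn_objs = HTML::Microformats
                           ->new_document($html, $page)
                           ->assume_all_profiles
                           ->parse_microformats
                           ->objects('XFN');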
            
            while (my $xfn = shift @xfn_objs)
            {
                    printf("%s <%s>\n",
                            $xfn->data->{title},
                            $xfn->data->{href},
                            );
            }

    (If you're wondering why the second example is simpler, it's because it returns somewhat dumber data: a plain hashref per object rather than a graph you can query.)

Things that would be nice

It would be nice to convert an hCard to a vCard, an hCalendar to iCalendar, hAtom to Atom, and so forth.

The ideal way would be to create a separate vCard-RDF to vCard module, and then have HTML::Microformats::Format::hCard hook into that, with equivalent modules for the other formats.

Stuff that's b0rked

The get_foo, set_foo, add_foo, clear_foo methods defined in HTML::Microformats::Format work unreliably and are poorly documented. You're better off using the data method and inspecting the returned structure for the data you need. This will be fixed in the future.
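If you're not sure what the structure returned by data looks like, something along these lines may help. It is only a sketch: summarise_data is a hypothetical helper (not part of HTML::Microformats), and the hashref below is a hand-written stand-in for what $xfn->data might return. Only its title and href keys are taken from the XFN example above; the values are made up.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper, not part of HTML::Microformats.
# Returns one "key: kind" line per top-level key of a data hashref,
# sorted by key, so the available fields can be seen at a glance.
sub summarise_data
{
        my ($data) = @_;
        return map {
                my $kind = ref($data->{$_}) || 'scalar';
                "$_: $kind";
        } sort keys %$data;
}

# Stand-in for the hashref returned by $xfn->data (an assumption made
# for illustration; in real use, pass the result of the data method).
my $data = {
        title => 'Alice Example',
        href  => 'http://alice.example.net/',
};

# Quick overview of the top-level fields.
print "$_\n" for summarise_data($data);

# Full dump of the structure, with keys in a stable order.
$Data::Dumper::Sortkeys = 1;
print Dumper($data);
```

In real code you would replace the stand-in with the hashref from each object, for example summarise_data($xfn->data) inside the loop from the second usage pattern.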

Here be monsters

There are several parts of the code which are incredibly complicated and desperately need refactoring. This will be done at some point, so don't rely too much on the current behaviour of:

  • stringify and _stringify_helper in HTML::Microformats::Utilities.

  • The whole of HTML::Microformats::Mixin::Parser.

SEE ALSO

HTML::Microformats.

AUTHOR

Toby Inkster <tobyink@cpan.org>.

COPYRIGHT

Copyright 2008-2010 Toby Inkster

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.