NAME

Google::SiteMap - Perl extension for managing Google SiteMaps

SYNOPSIS

  use Google::SiteMap;

  my $map = Google::SiteMap->new(file => 'sitemap.gz');

  # Main page, changes a lot because of the blog
  $map->add(Google::SiteMap::URL->new(
    loc        => 'http://www.jasonkohles.com/',
    lastmod    => '2005-06-03',
    changefreq => 'daily',
    priority   => 1.0,
  ));

  # Top level directories, don't change as much, and have a lower priority
  $map->add({
    loc        => "http://www.jasonkohles.com/$_/",
    changefreq => 'weekly',
    priority   => 0.9, # lower priority than the home page
  }) for qw(
    software gpg hamradio photos scuba snippets tools
  );

  $map->write;

DESCRIPTION

The Sitemap Protocol allows you to inform search engine crawlers about URLs on your Web sites that are available for crawling. A Sitemap consists of a list of URLs and may also contain additional information about those URLs, such as when they were last modified, how frequently they change, etc.

This module allows you to create and modify sitemaps.

METHODS

new()

Creates a new Google::SiteMap object.

  my $map = Google::SiteMap->new(
    file        => 'sitemap.gz',
  );

read([$file])

Read a sitemap into this object. If a filename is specified, the sitemap is read from that file; otherwise it is read from the file that was specified with the file() method. Compressed files are read automatically if the filename ends with .gz.

write([$file])

Write the sitemap out to a file. If a filename is specified, the sitemap is written to that file; otherwise it is written to the file that was specified with the file() method. The output is compressed automatically if the filename ends with .gz.
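
The .gz suffix rule shared by read() and write() can be illustrated outside the module. The following is only a minimal sketch, not Google::SiteMap's own code: write_text() and read_text() are hypothetical helpers, and using the core IO::Compress::Gzip and IO::Uncompress::Gunzip modules is an assumption about how the compression could be handled.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Hypothetical helper (not part of Google::SiteMap): write $text to $file,
# compressing it when the filename ends with .gz.
sub write_text {
    my ($file, $text) = @_;
    if ($file =~ /\.gz\z/) {
        gzip(\$text => $file) or die "gzip failed: $GzipError";
    }
    else {
        open my $fh, '>', $file or die "Cannot write $file: $!";
        print {$fh} $text;
        close $fh or die "Cannot close $file: $!";
    }
    return $file;
}

# Hypothetical helper: read $file back, decompressing it when the
# filename ends with .gz.
sub read_text {
    my ($file) = @_;
    my $text;
    if ($file =~ /\.gz\z/) {
        gunzip($file => \$text) or die "gunzip failed: $GunzipError";
    }
    else {
        open my $fh, '<', $file or die "Cannot read $file: $!";
        local $/;    # slurp the whole file in one read
        $text = <$fh>;
        close $fh;
    }
    return $text;
}
```

Keying the decision on the filename alone means the caller never has to pass a separate compression flag, which matches the behaviour described for read() and write().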

urls()

Return the Google::SiteMap::URL objects that make up the sitemap.

add(item,[item...])

Add the Google::SiteMap::URL items listed to the sitemap.

If you pass hashrefs instead of Google::SiteMap::URL objects, add() will turn them into objects for you. If the first item you pass is a simple scalar that matches \w, the arguments are treated as a hash describing a single object. If the first item matches m{^\w+://} (i.e. it looks like a URL), all the arguments are treated as URLs and a Google::SiteMap::URL object is constructed for each of them, with only the loc field populated.

This means you can do any of these:

  # create the Google::SiteMap::URL object yourself
  my $url = Google::SiteMap::URL->new(loc => 'http://www.jasonkohles.com/');
  $map->add($url);

  # or
  $map->add(
    { loc => 'http://www.jasonkohles.com/' },
    { loc => 'http://www.jasonkohles.com/software/google-sitemap/' },
    { loc => 'http://www.jasonkohles.com/software/geo-shapefile/' },
  );

  # or
  $map->add(
    loc       => 'http://www.jasonkohles.com/',
    priority  => 1.0,
  );

  # or even something funkier
  $map->add(qw(
    http://www.jasonkohles.com/
    http://www.jasonkohles.com/software/google-sitemap/
    http://www.jasonkohles.com/software/geo-shapefile/
    http://www.jasonkohles.com/software/text-fakedata/
  ));
  foreach my $url ($map->urls) { $url->changefreq('daily') }
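
The argument rules described for add() can be sketched as a small dispatch routine. This is an illustration under stated assumptions, not the module's actual code: normalize_add_args() is a hypothetical name, and it returns plain hashrefs where the real method would build Google::SiteMap::URL objects. Because a URL string also matches \w, the sketch tests for the URL form first.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical helper (not part of Google::SiteMap): turn the argument
# forms accepted by add() into a list with one entry per URL.
sub normalize_add_args {
    my @args = @_;
    return () unless @args;

    my $first = $args[0];
    if (!ref $first) {
        # A list of URL strings: one entry per URL, only loc populated.
        # This check comes first because a URL also matches \w.
        if ($first =~ m{^\w+://}) {
            return map { +{ loc => $_ } } @args;
        }
        # A flat key/value list describing a single URL.
        if ($first =~ /\w/) {
            return +{ @args };
        }
        die "Unrecognised argument: '$first'";
    }

    # Objects pass through unchanged; hashrefs are shallow-copied.
    return map { blessed($_) ? $_ : +{ %$_ } } @args;
}
```

With this sketch, normalize_add_args(loc => 'http://www.jasonkohles.com/', priority => '1.0') yields a single entry, while passing two URL strings yields two entries that each carry only a loc key.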
    
xml()

Return the XML representation of the sitemap.

file()

Get or set the filename associated with this object. If you call read() or write() without a filename, this is the default.

xmlns()

Get or set the XML namespace to be used for the urlset. Default is http://www.google.com/schemas/sitemap/0.84

pretty()

Set this to a true value to enable 'pretty-printing' of the XML output. If false (the default), the XML will be more compact but not as easily readable for humans (Google and other computers won't care what you set this to).
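
To show how the namespace and pretty-printing settings affect the output, here is a hand-rolled sketch of a urlset serializer. It is an assumption-laden illustration rather than what xml() does internally: render_urlset() and xml_escape() are hypothetical helpers, URLs are plain hashrefs, and the two-space indent is an arbitrary choice. The element names (urlset, url, loc, lastmod, changefreq, priority) come from the Sitemap protocol.

```perl
use strict;
use warnings;

my %ESCAPE = (
    '&' => '&amp;',  '<' => '&lt;', '>' => '&gt;',
    '"' => '&quot;', "'" => '&apos;',
);

# Hypothetical helper: escape the five XML special characters.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/([&<>"'])/$ESCAPE{$1}/g;
    return $text;
}

# Hypothetical helper (not part of Google::SiteMap): serialize an arrayref
# of URL hashrefs. Options: xmlns (namespace URI), pretty (boolean).
sub render_urlset {
    my ($urls, %opt) = @_;
    my $xmlns  = $opt{xmlns} || 'http://www.google.com/schemas/sitemap/0.84';
    my $nl     = $opt{pretty} ? "\n" : '';
    my $indent = $opt{pretty} ? '  ' : '';

    my $xml = qq{<?xml version="1.0" encoding="UTF-8"?>\n};
    $xml .= sprintf qq{<urlset xmlns="%s">%s}, xml_escape($xmlns), $nl;
    for my $url (@$urls) {
        $xml .= "$indent<url>$nl";
        for my $field (qw(loc lastmod changefreq priority)) {
            next unless defined $url->{$field};    # skip unset fields
            $xml .= sprintf '%s<%s>%s</%s>%s',
                $indent x 2, $field, xml_escape($url->{$field}), $field, $nl;
        }
        $xml .= "$indent</url>$nl";
    }
    $xml .= "</urlset>\n";
    return $xml;
}
```

Calling render_urlset(\@urls, pretty => 1) puts each element on its own indented line, while the default keeps everything after the XML declaration on a single line. Both forms carry the same information, which is why crawlers do not care which one is used.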

SEE ALSO

https://www.google.com/webmasters/sitemaps/docs/en/protocol.html

AUTHOR

Jason Kohles, <email@jasonkohles.com>

COPYRIGHT AND LICENSE

Copyright (C) 2005 by Jason Kohles

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.