NAME

Mail::Webmail::Yahoo - Enables bulk download of yahoo.com-based webmail.

SYNOPSIS

  use Mail::Webmail::Yahoo;
  $yahoo = Mail::Webmail::Yahoo->new(%options);
  @folders = $yahoo->get_folder_list();
  @messages = $yahoo->get_mail_messages('Inbox', 'all');
  # Write messages to disk here, or do something else.

DESCRIPTION

This module grew out of the need to download a large archive of web mail in bulk. As of the module's creation Yahoo did not provide a simple method of performing bulk operations.

This module is intended to make up for that shortcoming.

METHODS

$yahoo = new Mail::Webmail::Yahoo(...)

Creates a new Mail::Webmail::Yahoo object. Pass parameters in key => value form. At a minimum, these must include:

  username
  password

You may also pass an optional cookie file as cookie_file => '/path/to/file'.
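As a minimal sketch of the parameter list described above, the following checks that the two required keys are present before the options are handed to the constructor. The helper name check_options and the placeholder values are hypothetical and are not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Mail::Webmail::Yahoo): verify that the
# documented minimum parameters are present and pass the options through.
sub check_options {
    my (%opt) = @_;
    my @missing = grep { !defined $opt{$_} } qw(username password);
    die "missing required option(s): @missing\n" if @missing;
    return %opt;
}

my %options = check_options(
    username    => 'example_user',        # placeholder value
    password    => 'example_password',    # placeholder value
    cookie_file => '/path/to/file',       # optional
);

# With the module installed, the object would then be created with:
#   my $yahoo = Mail::Webmail::Yahoo->new(%options);
```

Validating up front keeps a missing credential from surfacing later as a confusing login failure.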

$yahoo->connect();

Connects the application with the site. Really this is not necessary, but it's in here for hysterical raisins.

$yahoo->login();

Performs the 'login' stage of connecting to the site. This method can take a while to complete, since logging in to Yahoo involves several redirects.

Returns 0 if already logged in, 1 if successful, otherwise sets $@ and returns undef.
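Because 0 and undef are both false in Perl, a caller has to test definedness first to tell "already logged in" apart from a failure. The sketch below shows one way to interpret the three documented outcomes. The helper login_outcome is hypothetical and is not provided by the module.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the documented return values of login()
# (undef on error, 0 if already logged in, 1 on success) into a message.
sub login_outcome {
    my ($ret, $err) = @_;
    unless (defined $ret) {
        $err = 'unknown error' unless defined $err && length $err;
        return "login failed: $err";
    }
    return $ret ? 'logged in' : 'already logged in';
}

# Intended use, assuming $yahoo is a Mail::Webmail::Yahoo object:
#   my $ret = $yahoo->login();
#   print login_outcome($ret, $@), "\n";
```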

@headers = $yahoo->get_mail_headers($folder);

***REMOVED***

@messages = $yahoo->get_mail_messages($folder);

Returns an array of message headers for the $folder folder. These are mostly in Mail::Internet format, which is nice but involves constructing them from what Yahoo provides -- which ain't much. When an individual message is requested, we can get more info by turning on the headers, so this method requests each message in turn (caching for future use, unless cache_messages is turned off) and builds a Mail::Internet object from each message.

You can get the 'raw' headers from get_folder_index().

Note that for reasons of efficiency this method collects both the headers and the full text of each message, and caches them to avoid going back to the network each time. To force a refresh, turn the object's caching off with

  $yahoo->cache_messages(0);
  $yahoo->cache_headers(0);

Note: There used to be a $callback parameter to this method, but since it was never used it has been removed.
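The SYNOPSIS leaves writing messages to disk to the caller. A minimal sketch of that step follows, assuming each returned message is a Mail::Internet object, whose print method writes the message to a filehandle. The helper save_messages and the msg-NNNNN.eml file naming are hypothetical.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: write each message to its own numbered file in $dir.
# Any object with a print($fh) method works, as Mail::Internet objects do.
sub save_messages {
    my ($dir, @messages) = @_;
    my $count = 0;
    for my $msg (@messages) {
        my $name = sprintf 'msg-%05d.eml', ++$count;
        my $path = File::Spec->catfile($dir, $name);
        open my $fh, '>', $path or die "cannot write $path: $!\n";
        $msg->print($fh);
        close $fh or die "cannot close $path: $!\n";
    }
    return $count;    # number of files written
}

# Intended use, assuming $yahoo is a logged-in Mail::Webmail::Yahoo object:
#   my @messages = $yahoo->get_mail_messages('Inbox');
#   save_messages('/path/to/archive', @messages);
```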

my $msg = $yahoo->_process_message($page, $yahoo_msg_id);

Extracts and returns as a Mail::Internet object the headers and message body from the provided HTML ($page).

my $msg = $yahoo->_extract_headers($page, $yahoo_msg_id);

Performs the actual extraction of the message headers from the given HTML in $page. Pushes the $yahoo_msg_id into the headers as 'X-Yahoo-MsgId'. Also adds a version header.

my $ok = $yahoo->_extract_body($mhdr, $page);

Extracts the message body, including any attachments parsed out of $page, and adds it to the Mail::Internet object in $mhdr. Returns 1 to indicate success, although no error conditions are currently checked for or handled.

$page = $yahoo->download_attachment($download_uri, $mailmsg);

Downloads an attachment from the specified URI. $mailmsg is a reference to a Mail::Internet object. The downloaded attachment is added to $mailmsg via add_attachment_to_message().

$yahoo->add_attachment_to_message($msg, $attachment, $filename);

Adds the $attachment to $msg, adjusting Content-Type and MIME-Version as necessary.

$yahoo->make_multipart_boundary()

Currently does nothing useful. So far all messages have had correct types.

$yahoo->get_folder_action_link($mbox, $linktype, $force);

Returns and stores the 'action link' for the given $linktype. This is a URI that will cause an action to be performed on a message set, such as DELETE or MOVE.

@message_headers = $yahoo->get_folder_index($folder);

Returns a list of all the messages in the specified folder. These messages are stored as URIs. Logs the user in if necessary.

@messages = $yahoo->_get_message_links($page)

(Private instance method)

Returns the actual links (as an array) needed to pull down the messages. This method is used internally and is not intended to be used from applications, since the messages returned are not in a very friendly form. This method returns only the messages referenced on a given page, and is called from get_folder_index() to build up a complete list of all messages in a folder.

@folders = $yahoo->get_folder_list();

Returns a list of folders in the account. Logs the user in if necessary. Also stores the two special folders ('Trash' and 'Bulk') so they can be emptied later.

$ok = $yahoo->send($to, $subject, $body, $cc, $bcc, $flags);

Attempts to send a message to the recipients listed in $to, $cc, and $bcc, with the specified subject and body text. $to, $cc, and $bcc can be scalars or arrayrefs containing lists of recipients.

Logs the user in if necessary.

$flags may contain any combination of the constants exported by this package. Currently, these constants are:

  SAVE_COPY_TO_SENT_FOLDER  :    saves a copy of a sent message
  ATTACH_SIG                :    attaches the sender's Yahoo signature
  SEND_AS_HTML              :    sends the message in HTML format.

cc and bcc come after subject and body in the parameter list (instead of with 'to') since it is expected that

  send(to, subject, body)

will be more common than sending to Cc or Bcc recipients - at least, this is how it is in my experience.

As of this version, address-book lookups are not supported.

As of this version, mail attachments are not supported.
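The sketch below illustrates the two conventions described for send(): combining flags, and accepting recipients as either a scalar or an arrayref. It assumes the constants are bit flags combined with bitwise OR, which the text above does not state, and the numeric values are stand-ins rather than the module's real values. The helper recipient_list is hypothetical.

```perl
use strict;
use warnings;

# ASSUMPTION: stand-in values. The real constants are exported by
# Mail::Webmail::Yahoo and may differ in value.
use constant {
    SAVE_COPY_TO_SENT_FOLDER => 1,
    ATTACH_SIG               => 2,
    SEND_AS_HTML             => 4,
};

# ASSUMPTION: flags combine with bitwise OR.
my $flags = SAVE_COPY_TO_SENT_FOLDER | SEND_AS_HTML;

# Hypothetical helper: normalise a scalar-or-arrayref recipient argument
# into a plain list, treating undef as "no recipients".
sub recipient_list {
    my ($recipients) = @_;
    return () unless defined $recipients;
    return ref($recipients) eq 'ARRAY' ? @$recipients : ($recipients);
}

# Intended use, assuming $yahoo is a Mail::Webmail::Yahoo object:
#   $yahoo->send('a@example.com', 'Subject', 'Body text');
#   $yahoo->send(['a@example.com', 'b@example.com'], 'Subject', 'Body text',
#                undef, undef, $flags);
```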

$resp = $yahoo->_get_a_page($uri, $method, $params);

(Private instance method)

Requests and returns a page found at the specified $uri via the specified $method. If $params (an arrayref) is present it will be formatted according to the method.

If method is empty or undefined, it defaults to GET. The ordering of the parameters, while seemingly counter-intuitive, allows one of the great virtues of programming (laziness) by not requiring that the method be passed for every call.

Returns the response object if no error occurs, undef on error.
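The defaulting behaviour described above can be sketched as follows. This is an illustration of the argument handling only, not the module's actual code. The function normalise_request is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the argument handling: $method falls back to
# GET when empty or undefined, and $params falls back to an empty arrayref.
sub normalise_request {
    my ($uri, $method, $params) = @_;
    $method = 'GET' unless defined $method && length $method;
    $params = [] unless ref($params) eq 'ARRAY';
    return ($uri, $method, $params);
}

# Because $method comes after $uri, the common case needs only one argument:
my ($uri, $method, $params) = normalise_request('http://example.com/');
```

Placing the optional arguments last is what lets most call sites pass the URI alone.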

$current_trace_level = $yahoo->trace($new_trace_level);

If $new_trace_level is given, sets the new level for tracing the operation of the object. Returns the trace level that was in effect before the call.

Trace levels are:

   0   no tracing output; warning messages only.
 > 0   informative messages ("what I am doing")
 > 1   URIs being fetched
 > 2   request response codes
 > 3   request parameters
 > 4   any other 'extra' debugging info.
 > 9   request response content

$yahoo->debug(...);

Sends debugging messages to STDERR, with a newline appended.
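To make the trace() and debug() semantics concrete, here is a small stand-alone sketch of a getter/setter that returns the previous level, plus a threshold test matching the "> N" notation in the table. The class Trace::Sketch and its wants method are hypothetical and only mimic the documented behaviour.

```perl
use strict;
use warnings;

package Trace::Sketch;    # hypothetical class, for illustration only

sub new { my ($class) = @_; return bless { trace => 0 }, $class }

# Sets a new level if one is given; always returns the previous level.
sub trace {
    my ($self, $new_level) = @_;
    my $previous = $self->{trace};
    $self->{trace} = $new_level if defined $new_level;
    return $previous;
}

# Writes the message to STDERR, followed by a newline.
sub debug {
    my ($self, @msg) = @_;
    print STDERR @msg, "\n";
    return;
}

# True when the current level is strictly greater than $threshold.
sub wants {
    my ($self, $threshold) = @_;
    return $self->{trace} > $threshold ? 1 : 0;
}

package main;

my $obj = Trace::Sketch->new;
my $old = $obj->trace(2);                            # raise the level
$obj->debug('fetching http://example.com/') if $obj->wants(1);   # emitted
$obj->debug('response code 200')            if $obj->wants(2);   # suppressed
$obj->trace($old);                                   # restore the old level
```

Returning the previous level makes it easy to raise tracing around one call and then restore it.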

$yahoo->make_host($uri) or Yahoo::make_host($uri)

Returns a string consisting of just the scheme, host, and port parts of the URI. The URI::URL::as_string method returns the full URI (including path) but leaves out the port number, which is why it's unsuitable here.
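For illustration, the result described above can be approximated without any URI library. The following is a dependency-free sketch, not the module's implementation. The function make_host_sketch and its default-port table are hypothetical, and URIs containing user information (user:pass@host) are not handled.

```perl
use strict;
use warnings;

# Default ports for the schemes this sketch knows about.
my %default_port = (http => 80, https => 443);

# Hypothetical function: return "scheme://host:port" for a URI string,
# filling in the scheme's default port when none is given.
# Returns nothing (undef in scalar context) if the string does not parse.
sub make_host_sketch {
    my ($uri) = @_;
    my ($scheme, $host, $port) =
        $uri =~ m{^([a-z][a-z0-9+.\-]*)://([^/:?#]+)(?::(\d+))?}i
        or return;
    $scheme = lc $scheme;
    $port = $default_port{$scheme} unless defined $port;
    return defined $port ? "$scheme://$host:$port" : "$scheme://$host";
}

print make_host_sketch('http://mail.yahoo.com/ym/login'), "\n";
# prints "http://mail.yahoo.com:80"
```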

EXPORTS

Nothing but a few constants. The module is intended to be object-based, and functions should be called as such.

CAVEATS

There is an issue somewhere that prevents https redirects from succeeding. Until this is fixed, the login procedure WILL expose the username and password in plain text over the network.

The user interface of Yahoo webmail is fairly configurable. It is possible the module may not work out-of-the-box with some configurations. It should, however, be possible to tweak the settings at the top of the file to allow conformance to any configuration.

AUTHOR

  Simon Drabble  <sdrabble@cpan.org>

SEE ALSO