Tim Skirvin


News::Web - a News<->Web gateway, for a web-based newsreader


  use News::Web;

See 'news.cgi', included with this distribution, to see how this actually works.


News::Web is the basis for web-based newsreaders. It's essentially a collection of functions for CGI scripts to call as needed.


Basic Functions

These functions create and deal with the object itself.

new ( ITEM )

ITEM is a reference to an object similar to Net::NNTP - that is, this was designed to work directly with Net::NNTP, but will also work with other similar classes such as News::Archive. Returns a reference to the new object.

connect ()
nntp ()

These functions return the actual NNTP connection (or whatever was passed into new()), to be manipulated directly by other functions.

HTML functions

These functions create the HTML tables used by the various CGI scripts.

html_article ( ARGHASH )

Returns an HTML-formatted version of a news article. The article can be passed in whole, identified by its Message-ID, or identified by a newsgroup and article number. The output also includes several linkback()'d sets of actions we can perform on the article.

Arguments we take from ARGHASH:

  article       A full news article, read directly into the 
                News::Article object.  
  mid           The Message-ID of an article to read.
  group    \    Together, the group and article number to read in
  number   /    from the NNTP connection to get the article.
  fullhead      Should we use full headers, or a limited set (as 
                specified in @News::Web::DEFAULTHEAD)?  Defaults to 0.
  clean         Just print the article, and not the linkbacks.  
                Defaults to 0.
  plaintext     Should we print this as plaintext or as HTML?  
                Defaults to 0 (HTML).
  default       A hashref of defaults; this isn't actually used here,
                but is passed on where necessary to other functions.

Current linkbacks (we'll work out a way to better format and select these later).

  Follow Up        Respond to this article (with html_makearticle()).
  Full Headers     Show all of the headers, not just the limited set.
  Original Format  The message in its original format (only if we're 
                   not already doing so).
  First in Thread  The first article in the thread (only if there 
                   is one).
  Immediate Parent The article that this message responded to (if
                   there is one).

  Next Article     \  Links to the next/previous article in each 
  Previous Article /  group (not based on thread, sadly)

I'm not entirely happy with the format of this yet, but it works for now. I would also like to add "next/previous in thread", rot13, lists of children, and so forth. And the linkback list may at some point be more consistent and available programmatically, which would let it be tied into other functions, such as moderation 'bots.

html_makearticle ( ARGHASH )

Creates an HTML form to write new articles, based on a previous article if the proper information is passed in with ARGHASH. This is put into a table that has three major sections - the header section, the body, and the signature.

If a previous article is indicated (with 'mid'), then we base the new message off of that article with News::Article::Response.

Arguments we take from ARGHASH:

  mid           The Message-ID of a message we're responding to.
  group         The newsgroup we're posting to.
  prefix        A prefix to the new message-ID (see News::Article).
  domain        The domain of the new message-ID (see News::Article).
  columns       The number of columns to format the body and 
                signature.  Defaults to $News::Web::COLUMNS or 80.
  rows          The number of rows for the textarea box of the body; 
                defaults to $News::Web::ROWS or 30.
  sigrows       The number of rows for the textarea box of the 
                signature box; defaults to $News::Web::SIGROWS or 30.
  nosignature   Don't offer the signature box.
  wraptype      How should we wrap the quoted material?  See 
                News::Article::Reference.  Defaults to 'overflow'.
  params        A hashref of extra parameters to pass into html_post().
  default       A hashref of defaults; this isn't actually used here,
                but is passed on where necessary to other functions.

Current linkbacks:

  Post          Meant to invoke html_post()
  Preview       Meant to invoke html_post() with the preview flag 

We're not really using CSS at all yet, which is a mistake.

html_post ( ARGHASH )

Actually posts the message. Gets the article from passed in arguments through ARGHASH, adds some headers, and does the work.

Arguments we take from ARGHASH:

  params        CGI parameters that were passed in
    header_*    The headers of the message
    body        The body of the message, separated by newlines
    signature   The signature of the message

  trace         The content to set 'X-Local-Trace' to, which is 
                currently set by the CGI (but should probably be 
                done locally).
  default       A hashref of defaults; this isn't actually used here,
                but is passed on where necessary to other functions.

Extra headers are pulled out of the first lines of the body of the message. Adds 'X-Newsreader' and 'X-Local-Trace', drops 'Approved' and 'Date'. Runs html_makearticle() if necessary because the article didn't, or wouldn't, post.
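
The header handling described above can be sketched roughly as follows. This is only an illustration, not the module's code: the function name split_extra_headers() is hypothetical, and the rule used here (leading "Name: value" lines, optionally followed by one blank line) is an assumption about what "the first lines of the body" means.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of News::Web.  Assumes extra headers
# are leading "Name: value" lines at the very top of the body.
sub split_extra_headers {
    my ($body) = @_;
    my @lines = split /\n/, $body, -1;
    my (%headers, $found);

    # Peel "Name: value" lines off the front of the body.
    while (@lines && $lines[0] =~ /^([A-Za-z][\w-]*):\s+(.*)$/) {
        $headers{$1} = $2;
        shift @lines;
        $found++;
    }
    # Assumption: one blank line may separate the headers from the text.
    shift @lines if $found && @lines && $lines[0] eq '';

    # 'Approved' and 'Date' are dropped, as described above.
    delete @headers{ grep { /^(?:approved|date)$/i } keys %headers };

    return (\%headers, join("\n", @lines));
}

my ($extra, $text) =
    split_extra_headers("Followup-To: poster\nApproved: me\n\nHello\n");
print join(', ', sort keys %$extra), "\n";
print $text;
```

A real implementation would need a stricter test than this, since an ordinary first line such as "Note: see below" would also be treated as a header here.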

html_overview ( ARGHASH )

Generate an HTML-formatted table of the overview entries of a given newsgroup (see News::Overview). This table consists of nexttable(), tableheaders(), lines for each entry, then nexttable() again.

COUNT is the number of articles we should get; it should be the number of articles that we actually return, but this isn't done yet. The subject is linkback()'d to the actual message.

Arguments we take in ARGHASH:

  count         The number of articles that we should return.  
                Currently, this is actually the number that we ask 
  last          The last article we should get.  With 'count', FIRST =
                LAST - COUNT + 1.  
  first         The first article we should get.  With 'count' and no 
                'last', LAST = first + count - 1
  sort          The sorting method for the articles, as set in 
  fields        The fields from the overview DB to add columns for; 
                defaults to News::Overview's defaults.  These
  default       A hashref of defaults; this isn't actually used here,
                but is passed on where necessary to other functions.
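
The relationship between 'first', 'last' and 'count' can be sketched as below. This is a minimal illustration under stated assumptions, not the module's own code: overview_range() is a hypothetical name, and 'last' with 'count' is read as counting back from the last article so that it mirrors the 'first' case.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of News::Web: fill in whichever end
# of the article range was not given, using 'count'.
sub overview_range {
    my (%args) = @_;
    my ($first, $last, $count) = @args{qw(first last count)};

    if (defined $count) {
        if (defined $last && !defined $first) {
            $first = $last - $count + 1;    # count back from 'last'
        }
        elsif (defined $first && !defined $last) {
            $last = $first + $count - 1;    # count forward from 'first'
        }
    }
    return ($first, $last);
}

my ($from, $to) = overview_range(last => 500, count => 20);
print "$from-$to\n";    # prints "481-500"
```

Both cases give a range of exactly 'count' article numbers; whether that many articles actually exist on the server is a separate question.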

Creates the 'next table' bits for html_overview(). GROUPINFO is an arrayref that is the response of Net::NNTP->group(), and is used to determine what articles exist so we know what to link to.

PARAMHASH and DEFAULTHASH are passed to linkback() (with different 'sort' options).

We don't have any CSS hooks right now, again a mistake.

Returns as an array of HTML lines.


Creates the table headers for html_overview(). FIELDARRAYREF is the list of fields that will be printed in the table body. Each of these is printed as two linkback()s, one to sort based on this field and the other to sort the same but backwards. These links are parsed by html_overview(). PARAMHASH and DEFAULTHASH are passed to linkback() (with different 'sort' options).

Stylesheet hooks:

  groupinfo     TH style to describe the headers

Returns as an array of <th> lines.

html_grouplist ( [PATTERN] )

Lists all of the active newsgroups, based on PATTERN (defaults to '*'), with descriptions. Returns the text to be printed, joined by newlines.

If PATTERN is not passed in, then we will instead get the default list of groups out of 'subscriptions'.

Possible refinements: should we list the number of messages (estimated or real)? The posting status of the group (moderated, no-posting, etc)?

Stylesheet hooks:

  grouplist_head        TR and TD style, for the headers of the table.
  grouplist             TR style, for the actual table content lines.
  grouplist_1           TD style, alternating between the two styles, 
  grouplist_2           to allow the lines to look different (and 
                        therefore be easier to follow).

html_hierarchies ( LEVELS, PATTERN )

Gives a set of linkback()'d group listings, based on the newsgroups available on the server. PATTERN is the WILDMAT pattern to decide which groups to match; LEVELS defines how many levels down to go when matching the pattern ('news' would be one level, 'news.admin' would be two, etc). Doesn't match actual newsgroups, just hierarchies.

Returns the list of linkbacks in an array context, or a line of them combined with ' | ' as a scalar.
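
As a rough sketch of the idea, the following reduces a list of group names to hierarchy names and shows the list/scalar return behaviour. It is an illustration only: hierarchies_sketch() is a hypothetical name, it takes the group names directly rather than a WILDMAT pattern and a server, it returns plain names rather than linkbacks, and it assumes LEVELS means "every hierarchy up to LEVELS components deep".

```perl
use strict;
use warnings;

# Hypothetical helper, not part of News::Web.
sub hierarchies_sketch {
    my ($levels, @groups) = @_;
    my %seen;
    foreach my $group (@groups) {
        my @parts = split /\./, $group;
        # The full group name is not a hierarchy, so stop one short.
        my $depth = @parts - 1 < $levels ? @parts - 1 : $levels;
        $seen{ join('.', @parts[0 .. $_ - 1]) } = 1 for 1 .. $depth;
    }
    my @list = sort keys %seen;
    # A list in list context, a ' | '-joined line in scalar context.
    return wantarray ? @list : join(' | ', @list);
}

my $line = hierarchies_sketch(2,
    'news.admin.misc', 'news.groups', 'comp.lang.perl.misc');
print "$line\n";    # prints "comp | comp.lang | news | news.admin"
```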


Returns an HTML link back to the same program, based on the hash references HASHREF and DEFAULT. TEXT is the string that appears in the link. The key/value pairs in HASHREF are the options passed in the URL; however, if a value in HASHREF matches the corresponding value in DEFAULT, then we assume that we don't need that argument and leave it out (we should try to keep the URL short anyway).

This probably needs more refinement, but it more or less works.
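
The default-pruning behaviour can be sketched like this. It is a stand-in written for illustration, not the module's linkback(): the name linkback_sketch(), the url_escape() helper, the extra SCRIPT argument and the argument order are all assumptions.

```perl
use strict;
use warnings;

# Percent-encode anything outside the unreserved URL characters
# (good enough for ASCII parameter values).
sub url_escape {
    my ($string) = @_;
    $string =~ s/([^A-Za-z0-9_.~-])/sprintf('%%%02X', ord($1))/ge;
    return $string;
}

# Hypothetical stand-in for linkback().
sub linkback_sketch {
    my ($text, $params, $default, $script) = @_;
    $default ||= {};
    $script  ||= 'news.cgi';

    my @pairs;
    foreach my $key (sort keys %$params) {
        my $value = $params->{$key};
        next unless defined $value;
        # Leave out anything that already matches the default.
        next if defined $default->{$key} && $default->{$key} eq $value;
        push @pairs, url_escape($key) . '=' . url_escape($value);
    }
    my $url = @pairs ? "$script?" . join('&amp;', @pairs) : $script;
    return qq(<a href="$url">$text</a>);
}

print linkback_sketch('Next',
    { group => 'news.admin', sort => 'date', count => 20 },
    { sort  => 'date',       count => 50 }), "\n";
# prints <a href="news.cgi?count=20&amp;group=news.admin">Next</a>
```

Here 'sort' is dropped because it matches the default, while 'count' is kept because it differs.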


Cleans up the news information for the best distribution. Mostly useful for creating new articles and parsing article information properly; not so useful for actually printing articles, where the original formatting may have been generally useful. HEADER choices that are currently supported:

  subject       Formats the Subject: line to have quote characters 
                at the start (based on the number of entries in 
                References: within ENTRY) and trim the total
                length of the string.
  from          Formats the From: line consistently; by default it 
                gets the actual author and drops the email address.
                Also trims the total length of the string (see 
                the arguments section).
  date          Format the date consistently with Date::Parse's
                str2time() function.  
  message-id    Make sure that the passed ID is properly formatted
                with '<' and '>' characters.

Arguments we take:

  subjwidth     Width to trim the Subject: line to; if less than 0, 
                then we don't trim the header at all.  Defaults to 55.
  fromwidth     Width to trim the From: line to; if less than 0, then 
                we don't trim the header at all.  Defaults to 25.
  fromtype      The formatting method for the From: line.  Possible
                options: 'name', 'nameemail', 'email', 'emailname'.
                Defaults to 'name'.

This should be replaced with something from News::Article::Ref, or moved into there.
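
Two of the simpler clean-ups described above, Message-ID formatting and width trimming, might look roughly like this. The helper names are hypothetical and the details (whitespace stripping, plain truncation without an ellipsis) are assumptions, not the module's behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper: make sure a Message-ID is wrapped in '<' and '>'.
sub clean_message_id {
    my ($id) = @_;
    $id =~ s/^\s+|\s+$//g;              # strip surrounding whitespace
    $id = "<$id" unless $id =~ /^</;
    $id = "$id>" unless $id =~ />$/;
    return $id;
}

# Hypothetical helper: trim a header value to $width characters.
# A width below 0 means "do not trim", as with subjwidth/fromwidth.
sub trim_header {
    my ($text, $width) = @_;
    return $text if $width < 0 || length($text) <= $width;
    return substr($text, 0, $width);
}

print clean_message_id('abc123@example.com'), "\n";  # <abc123@example.com>
print trim_header('Re: a rather long subject line', 10), "\n";  # Re: a rath
```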

html_clean ( LINE [, LINE [...]] )

Cleans up LINE(s) for the web - i.e., escapes special HTML characters, and turns http:// and ftp:// URLs into links. Returns a string containing the modified lines, joined with newlines.

There's probably a lot more that can be done here.
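
A minimal sketch of this kind of clean-up is shown below. It is not the module's html_clean(): the names html_clean_sketch() and escape_html() are hypothetical, and the URL pattern is a deliberately simple assumption. URLs are split out before escaping so that an escaped character next to a URL does not end up inside the link.

```perl
use strict;
use warnings;

# Escape the characters that are special in HTML ('&' must go first).
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Hypothetical stand-in for html_clean().
sub html_clean_sketch {
    my @out;
    foreach my $line (@_) {
        # split with a capture group returns text, URL, text, URL, ...
        my @parts = split m{((?:http|ftp)://[^\s<>"]+)}, $line;
        my $html  = '';
        foreach my $i (0 .. $#parts) {
            my $safe = escape_html($parts[$i]);
            $html .= $i % 2 ? qq(<a href="$safe">$safe</a>) : $safe;
        }
        push @out, $html;
    }
    return join("\n", @out);
}

print html_clean_sketch('AT&T <ftp://example.org/file>'), "\n";
# prints AT&amp;T &lt;<a href="ftp://example.org/file">ftp://example.org/file</a>&gt;
```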

html_markup ( HEADER, TEXT [, ARGS] )

Marks up HEADER and TEXT to be printed in HTML. TEXT is put through html_clean() and an additional set of fixes:

  newsgroups            linkback() to the given group

HEADER is bolded. The final layout is "HEADER: TEXT".


News::Overview, Net::NNTP, News::Article, News::Article::Response and News::Article::Ref (both part of NewsLib), IO::File, CGI.pm, Date::Parse


News::Overview, News::Web::CookieAuth, News::Article::Ref, News::Article::Response


I'm not really done with this thing yet. This is just something that generally *works*, and has something resembling documentation. I've got a lot of work to do to make this what I really want to do, but I'm happy with the start.


Still have a ways to go with stylesheets.

Should really use the Tr() type functions for making tables.

Various user interface improvements in html_article(), as well as backend improvements.

$count should really offer the number of articles we asked for, no matter what, rather than estimating things.

mod_perl-ify this stuff.

It'd be nice if the documentation were a bit more transparent, enough so that someone could recreate the actual gateway .cgi files without having to refer to them.


Tim Skirvin <tskirvin@killfile.org>


Copyright 2003 by Tim Skirvin <tskirvin@killfile.org>. This code may be distributed under the same terms as Perl itself.
