Anders Ardö

NAME

combineRank - calculates various Ranks for a Combine crawled database

SYNOPSIS

combineRank <action> --jobname <name> --verbose

where action is one of PageRank, PageRankBL, NetLocRank, or exportLinkGraph. Results are written to STDOUT.

OPTIONS AND ARGUMENTS

jobname is used to find the appropriate configuration (mandatory)

verbose enables printing of ranks to STDOUT as SQL INSERT statements

Actions calculating variants of PageRank

PageRank

calculate standard PageRank

PageRankBL

calculate PageRanks with backlinks added for each link

NetLocRank

calculate a SiteRank for each site and a local DocRank for the documents within each site. Global ranks are then calculated as SiteRank * DocRank.

exportLinkGraph

export linkgraph from Combine database
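
The NetLocRank combination step (global rank = SiteRank * DocRank) can be illustrated with a minimal Python sketch. This is not the code combineRank runs; the function name combine_ranks, the three dictionaries, and the sample site names are assumptions made for the example, and how the SiteRank and DocRank values themselves are computed is not shown.

```python
def combine_ranks(site_rank, doc_rank, doc_site):
    """Return a global rank per document as SiteRank * DocRank.

    site_rank: site name -> SiteRank (hypothetical input)
    doc_rank:  document ID -> local DocRank within its site (hypothetical input)
    doc_site:  document ID -> name of the site the document belongs to
    """
    return {doc: site_rank[doc_site[doc]] * rank
            for doc, rank in doc_rank.items()}


# Illustrative values only, not output from a real crawl.
site_rank = {"a.example": 0.75, "b.example": 0.25}
doc_rank = {1: 0.5, 2: 0.5, 3: 1.0}
doc_site = {1: "a.example", 2: "a.example", 3: "b.example"}
global_rank = combine_ranks(site_rank, doc_rank, doc_site)
```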

DESCRIPTION

Implements calculation of different variants of PageRank.
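
As a rough illustration of what the PageRank and PageRankBL actions compute, the following Python sketch runs a textbook power iteration over a link graph. It is not the implementation used by combineRank: the damping factor 0.85, the fixed iteration count, the handling of pages without outgoing links, and the reading of "backlinks added for each link" as adding a reverse link are all assumptions, as are the function names.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict: page ID -> list of linked IDs."""
    # Collect every page that appears as a link source or a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    if not pages:
        return {}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread evenly (assumption).
        dangling = sum(rank[page] for page in pages if not links.get(page))
        new_rank = {page: (1.0 - damping) / n + damping * dangling / n
                    for page in pages}
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


def add_backlinks(links):
    """Return a copy of the graph with a reverse link added for every link."""
    result = {source: set(targets) for source, targets in links.items()}
    for source, targets in links.items():
        for target in targets:
            result.setdefault(target, set()).add(source)
    return {source: sorted(targets) for source, targets in result.items()}
```

Under these assumptions, pagerank(links) corresponds to the PageRank action and pagerank(add_backlinks(links)) to PageRankBL.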

Results are written to STDOUT and can be huge for large databases.

The linkgraph is exported in ASCII as a sparse matrix, one row per line. The first integer on a line is the ID (urlid) of a page with links. The remaining integers on the line are the IDs of the pages it links to. For example, 121 5624 23416 51423 267178 means that page 121 links to pages 5624, 23416, 51423, and 267178.
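
The link graph format described above can be read back with a few lines of Python. This is a minimal sketch, assuming whitespace-separated integers on each line; the function name parse_link_graph is made up for the example, and merging repeated source IDs and skipping blank lines are choices of the sketch, not documented behaviour.

```python
def parse_link_graph(lines):
    """Parse exported link graph lines into a dict: page ID -> list of linked IDs."""
    graph = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # First integer is the linking page, the rest are the pages it links to.
        source, *targets = (int(field) for field in fields)
        graph.setdefault(source, []).extend(targets)
    return graph


# The example row from the format description.
graph = parse_link_graph(["121 5624 23416 51423 267178"])
```

Because the function accepts any iterable of lines, an open file object holding the saved STDOUT of the exportLinkGraph action can be passed to it directly.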

EXAMPLES

combineRank --jobname aatest --verbose PageRankBL

calculate PageRank with backlinks, result on STDOUT

combineRank --jobname aatest --verbose exportLinkGraph

export the linkgraph to STDOUT

SEE ALSO

combine

Combine configuration documentation in /usr/share/doc/combine/.

AUTHOR

Anders Ardö, <anders.ardo@it.lth.se>

COPYRIGHT AND LICENSE

Copyright (C) 2006 Anders Ardö

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.

See the file LICENCE included in the distribution at http://combine.it.lth.se/
