NAME

MediaWiki::DumpFile::Compat - Compatibility with Parse::MediaWikiDump

SYNOPSIS

  use MediaWiki::DumpFile::Compat;

  $pmwd = Parse::MediaWikiDump->new;
  

ABOUT

This is a compatibility layer for Parse::MediaWikiDump: instead of "use Parse::MediaWikiDump;" you write "use MediaWiki::DumpFile::Compat;". Parse::MediaWikiDump is itself well documented, so its documentation is not reproduced here. The benefit of using the compatibility module is increased processing speed; see the main MediaWiki::DumpFile documentation for benchmark results.
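As a minimal sketch of the switch described above, the script below changes only the "use" line and then drives the familiar Parse::MediaWikiDump interface. It assumes that the dump file path is passed as the first command-line argument and that the pages(), next(), id() and title() methods behave as described in the Parse::MediaWikiDump documentation; the variable names are illustrative only.

```perl
use strict;
use warnings;

# The only change from an existing script; previously: use Parse::MediaWikiDump;
use MediaWiki::DumpFile::Compat;

# Assumption: the path to a pages dump is given on the command line.
my $file = shift(@ARGV) or die "usage: $0 <dump-file.xml>\n";

# From here on the code is written against the Parse::MediaWikiDump API.
my $pmwd  = Parse::MediaWikiDump->new;
my $pages = $pmwd->pages($file);

# Print the ID and title of every page in the dump.
while (defined(my $page = $pages->next)) {
    print $page->id, "\t", $page->title, "\n";
}
```

An existing script that already works with Parse::MediaWikiDump should need no edits beyond the "use" line, subject to the adjustments listed below.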

Compatibility is verified by using the existing Parse::MediaWikiDump test suite with the following adjustments:

Parse::MediaWikiDump::Pages

  • Parse::MediaWikiDump did not need to load all revisions of an article into memory when processing dump files that contain more than one revision per article, but this compatibility module does. The API does not change, but the memory requirements for parsing those dump files do. You are, however, unlikely to notice this: most dump files with many revisions per article are so large that Parse::MediaWikiDump could not parse them in any reasonable time frame.

  • The results from namespaces() are now sorted by namespace ID instead of being in document order.

  • The values returned by next() are now in the same order as in the SQL file.
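
The namespaces() ordering change only matters to code that relied on the position of an entry in the returned list. The sketch below uses hypothetical sample data in the [id, name] shape that the Parse::MediaWikiDump documentation describes for namespaces(). It shows the ID-sorted order and a lookup by ID that works regardless of ordering. The sample values and variable names are assumptions for illustration, not output from a real dump.

```perl
use strict;
use warnings;

# Hypothetical sample data: [id, name] pairs in an arbitrary "document order".
my $document_order = [
    [  0, ''        ],
    [ -1, 'Special' ],
    [  1, 'Talk'    ],
    [ -2, 'Media'   ],
];

# The compatibility layer returns the pairs sorted numerically by namespace ID.
my @by_id = sort { $a->[0] <=> $b->[0] } @$document_order;

# A lookup keyed on the ID does not depend on either ordering.
my %name_for = map { $_->[0] => $_->[1] } @by_id;

# Show the ID-sorted list, one "id => 'name'" pair per line.
printf "%d => '%s'\n", @$_ for @by_id;
```

Keying on the namespace ID rather than on list position keeps calling code correct under both the old and the new ordering.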

BUGS

  • The value of current_byte() wraps at around 2 gigabytes of input XML; see http://rt.cpan.org/Public/Bug/Display.html?id=56843

LIMITATIONS

  • This compatibility layer is not yet well tested.