- ONLINE CODES
- ONLINE CODES IN MORE DETAIL
- IMPLEMENTATION DETAILS
- RELEASE NOTE FOR V0.03
- SEE ALSO
- COPYRIGHT AND LICENSE
Net::OnlineCode - A rateless forward error correction scheme
  use strict;

  # Base class can export routines for doing xor
  use Net::OnlineCode ':xor';

  # Use the constructor to examine what algorithm parameters
  # would be generated for a given number of message blocks
  my $o = Net::OnlineCode->new(
      mblocks => 100_000,   # required parameter
      e       => 0.01,      # optional/default values
      q       => 3
  );

  my $e = $o->get_e;  # The 'e' and 'q' values may be
  my $q = $o->get_q;  # updated for some values of mblocks
  my $f = $o->get_f;  # calculated from 'mblocks', 'e' and 'q'

  # At this point, you would normally destroy this object and
  # re-create an Encoder and/or Decoder object using the same
  # parameters as extracted above.

  # some strings to demonstrate use of *_xor_* routines
  my @strings = ("abcde", "     ", "ABCDE", "\0\0\0\0\0");

  # xor routines take a reference to a destination string, which is
  # modified by xoring it with all the other strings passed in. The
  # calculated value is also returned.

  # "safe" xor routine is a pure Perl implementation
  my $result1 = safe_xor_strings(\$strings[0], @strings[1..3]);

  # "fast" xor routine is implemented in C
  my $result2 = fast_xor_strings(\$strings[0], @strings[1..3]);
This section will not make much sense to you unless you know what Online Codes are and how they work. If this is your first time reading this page, I advise you to skip ahead to the next section.
The base class can be used directly in two ways:

- to find out what parameters are generated for a given number of message blocks
- to access the exported XOR routines
The constructor takes a required 'mblocks' parameter and, optionally, values for 'e' and 'q' (described below). It then checks whether this combination of parameters makes sense. If it doesn't, it will change the 'e' or 'q' values to something more appropriate.
There is usually no need to call the constructor directly since it will be called during the creation of Net::OnlineCode::Encoder or Net::OnlineCode::Decoder objects. However, if you wish to examine what changes were made to the supplied parameters, it is more efficient to call the base class's constructor directly since it avoids some costly initialisation code included in the Encoder/Decoder classes.
The remaining two exported functions can be used to XOR pairs of strings. Since the Online Code scheme is based on XORing blocks of data, the base class provides two implementations to assist with this. One implementation (safe_xor_strings) is a slow, pure Perl sub, while the other (fast_xor_strings) calls a faster C routine. This module (and the derived Encoder and Decoder modules) deals only with the details necessary to implement the Online Code algorithms; it does not store or XOR blocks automatically. Those tasks are left for you to implement in whatever way you want.
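To make the behaviour concrete, here is a conceptual sketch of what a pure-Perl string XOR routine does. This is an illustration written for this page, not the module's actual source; the real safe_xor_strings may differ in details such as error handling.

```perl
#!/usr/bin/perl
# Conceptual sketch of a pure-Perl string XOR routine (not the
# module's actual source). The destination string, passed by
# reference, is xored in place with each of the other strings;
# the final value is also returned.
use strict;
use warnings;

sub xor_strings_sketch {
    my ($destref, @others) = @_;
    for my $s (@others) {
        die "length mismatch" unless length($s) == length($$destref);
        $$destref ^= $s;    # Perl's ^ does bytewise xor on strings
    }
    return $$destref;
}

my $dest = "abcde";
xor_strings_sketch(\$dest, "abcde");  # xoring a block with itself
print $dest eq "\0" x 5 ? "ok\n" : "not ok\n";
```

Note that xoring a string with itself yields all zero bytes; this self-cancelling property is what the decoding algorithm relies on.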
The remainder of this document will give a brief overview of what Online Codes are and how they work. For a programmer's view of how to use this collection of modules, refer to the man pages for the Encoder and Decoder modules.
Briefly, Online Codes are a scheme that allows a sender to break up a message (eg, a file) into a series of "check blocks" for transmission over a lossy network. Once the receiver has received and decoded enough check blocks, it can recover the original message in its entirety.
Online Codes differ from traditional "forward error correcting" schemes in two important respects:
- they are fast to calculate (on both the sending and receiving ends); and
- they are "rateless", meaning that the sender can send out a (practically) infinite stream of check blocks. The receiver typically only has to correctly receive a certain number of them (usually only a small percentage more than the number of original message blocks) in order to decode the full message.
When using a traditional error-correction scheme, the sender usually has to set up the encoder parameters to take account of the expected packet loss rate, and if the actual loss rate is greater than this, the file generally cannot be recovered at all. In contrast, with Online Codes, the sender effectively doesn't care what the packet loss rate is: it just keeps sending new check blocks until either the receiver(s) acknowledge the message as having been decoded, or until it has a reasonable expectation that the message should have been decoded.
The fundamental idea used in Online Codes is to xor some number of message blocks together to form either auxiliary blocks (which are internal to the algorithm) or check blocks (which are sent across the network). Each check block is sent along with a block ID, which encodes information about which message blocks (or auxiliary blocks) comprise that check block. Initially, the only check blocks that a receiver can use are those that are comprised of only a single message block, but as more message blocks are decoded, they can be xored (or, in algebraic terms, "substituted") into each pending (unsolved) check block. Eventually, given enough check blocks, the receiver will be able to solve each of the message blocks.
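The algebra behind "substitution" is simply that xor is its own inverse: a solved block can be cancelled out of any check block that contains it. A toy illustration, using made-up block values:

```perl
#!/usr/bin/perl
# Why substitution works: xor is its own inverse, so a solved
# message block can be cancelled out of any check block that
# contains it. The block values here are hypothetical.
use strict;
use warnings;

my $m1 = "HELLO";          # message block 1
my $m2 = "WORLD";          # message block 2

my $check  = $m1 ^ $m2;    # check block built from both blocks
my $solved = $check ^ $m1; # once m1 is known, xoring it back
                           # into the check block solves m2

print $solved eq $m2 ? "ok\n" : "not ok\n";
```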
Encoding consists of two parts:
- Before transmission begins, some number of auxiliary blocks are created by xoring a random selection of message blocks together.
- For each check block that is to be transmitted, a random selection of auxiliary and/or message blocks (collectively referred to as "composite blocks") are xored together.
Decoding follows the same steps as in the Encoder, except in reverse. Each received check block can potentially solve one unknown auxiliary or message block directly. Further, every time an auxiliary or message block becomes solved, that value can be "substituted in" to any check block that has not yet been fully decoded. Those check blocks may then be able to solve more message or auxiliary blocks.
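The cascade effect can be sketched with three hypothetical message blocks: a degree-1 check block solves the first block directly, and each newly solved block is then substituted into pending check blocks, solving further blocks in turn.

```perl
#!/usr/bin/perl
# Toy decoding cascade (hypothetical block values, not the
# module's API): each newly solved block is substituted into
# pending check blocks, which may in turn solve further blocks.
use strict;
use warnings;

# sender's message blocks
my ($m1, $m2, $m3) = ("one", "two", "six");

# received check blocks
my $c1 = $m1;              # degree-1: contains m1 only
my $c2 = $m1 ^ $m2;        # degree-2: m1 xor m2
my $c3 = $m2 ^ $m3;        # degree-2: m2 xor m3

# receiver's decoding steps
my $s1 = $c1;              # c1 solves m1 directly
my $s2 = $c2 ^ $s1;        # substituting m1 into c2 solves m2
my $s3 = $c3 ^ $s2;        # substituting m2 into c3 solves m3

print(($s1 . $s2 . $s3) eq "onetwosix" ? "ok\n" : "not ok\n");
```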
When the receiver receives a check block, it needs to know which message and/or auxiliary blocks it is composed of. Likewise, it needs to know which message blocks each auxiliary block is composed of. This is achieved by having both the sender and receiver side use an identical pseudo-random number generator algorithm. Since both sides are using an identical PRNG, they can both use it to randomly select which message blocks comprise each auxiliary block, and which composite blocks comprise each check block.
Naturally, for this to work, not only should the sender and receiver both be using the same PRNG algorithm, but they also need to be using the same PRNG seeds. This is where Block IDs (and also, during the Encoder/Decoder construction phase, File IDs) come in. For check blocks, the sender picks a (truly) random Block ID and uses it to seed the PRNG. Then, using the PRNG, it pseudo-randomly selects some number of composite blocks. It then sends the Block ID along with the xor of all the selected composite blocks. The receiver then uses the Block ID to seed its own PRNG, so when it pseudo-randomly selects the list of composite blocks, it should be the same as that selected by the sender.
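A sketch of deterministic block selection from a shared seed follows. It uses a hash-based construction (Digest::SHA from the Perl core) for illustration only: the module's real PRNG is SHA-1-based, but this exact construction, and the select_blocks helper, are assumptions made for this example, not the module's actual code.

```perl
#!/usr/bin/perl
# Illustrative sketch (not the module's actual PRNG): derive
# pseudo-random block indices from a Block ID. Both sender and
# receiver run this with the same ID and get the same list.
use strict;
use warnings;
use Digest::SHA qw(sha1);

sub select_blocks {
    my ($block_id, $nblocks, $degree) = @_;
    my (%seen, @picks);
    my $counter = 0;
    while (@picks < $degree) {
        # hash the Block ID plus a counter to get fresh bits
        my $digest = sha1($block_id . ++$counter);
        my $index  = unpack("N", $digest) % $nblocks;
        next if $seen{$index}++;    # pick distinct blocks
        push @picks, $index;
    }
    return @picks;
}

my @sender   = select_blocks("block-id-42", 1000, 3);
my @receiver = select_blocks("block-id-42", 1000, 3);
print "@sender" eq "@receiver" ? "ok\n" : "not ok\n";
```

Because the selection is driven entirely by the Block ID, the sender never has to transmit the list of composite blocks itself.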
A similar scheme is used at the start of the transmission to determine which message blocks are xored to create the auxiliary blocks.
The upshot of this is that Block (and File) IDs only need to be a fixed size, regardless of how many message blocks there are, how many composite blocks are included in a check block, and so on. This makes it much easier to design a packet format and process it at the sending and receiving sides.
The module is a fairly faithful implementation of Maymounkov and Mazières's paper describing the scheme. There are some slight variations:
- in the case where the epsilon value would produce a max degree (F) value greater than the number of message blocks, it (ie, epsilon) is increased until F <= # of message blocks. The code to do this is based on Python code by Gwylim Ashley.
- the graph decoding algorithm also includes an "auxiliary rule", which allows decoding (solving) an auxiliary block that comprises only solved message blocks
- message blocks are not decoded immediately: rather, the calling application makes the choice about when to do the xors (this may be more time-efficient, particularly if the application is storing check blocks to secondary storage, but in any event, decoupling the Online Code algorithm from the XOR step allows much more flexibility)
Also, the original paper does not specify a PRNG algorithm. This module implements one based on SHA-1, since it is portable across platforms and readily available.
This version of the code is a major advance over the previous version. Most of the algorithms and API are the same, but this version is much more optimised, both in terms of speed and memory usage. The various changes that have been implemented are documented in the RELEASES file at the top level of the distribution tarball.
I consider this version to be stable. While performance has been improved considerably, it may still be too slow for some applications. As a result, I intend to re-implement critical parts of it in C for the next release.
Github repository for Gwylim Ashley's Online Code implementation (various parts of my code are based on this)
There are various test/demo programs in the tests directory in this distribution that can serve as a reference for how to use these modules. In particular, the following can be useful examples to base your own code on:
- mindecoder.pl is a minimal decoder that doesn't do any XORing of data
- codec.pl does both encoding and decoding of a real message string
- probdist.pl shows the effects that various message block counts have on the 'e' and 'q' parameters
This module is part of the GnetRAID project. For the project development page, see:
I am also moving over to using github for all future development:
Declan Malone, <email@example.com>
Copyright (C) 2013-2015 by Declan Malone
This package is free software; you can redistribute it and/or modify it under the terms of the "GNU General Public License" ("GPL").
The C code at the core of this Perl module can additionally be redistributed and/or modified under the terms of the "GNU Library General Public License" ("LGPL"). For the purpose of that license, the "library" is defined as the unmodified C code in the clib/ directory of this distribution. You are permitted to change the typedefs and function prototypes to match the word sizes on your machine, but any further modification (such as removing the static modifier for non-exported function or data structure names) is not permitted under the LGPL, so the library will revert to being covered by the full version of the GPL.
Please refer to the files "GNU_GPL.txt" and "GNU_LGPL.txt" in this distribution for details.
This package is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the "GNU General Public License" for more details.