NAME

AI::NeuralNet::SOM - Perl extension for Kohonen Maps

SYNOPSIS

  use AI::NeuralNet::SOM::Rect;
  my $nn = AI::NeuralNet::SOM::Rect->new (output_dim => "5x6",
                                          input_dim  => 3);
  $nn->initialize;
  $nn->train (30,
    [ 3, 2, 4 ],
    [ -1, -1, -1 ],
    [ 0, 4, -3 ]);

  print $nn->as_data;

  use AI::NeuralNet::SOM::Hexa;
  my $nn = AI::NeuralNet::SOM::Hexa->new (output_dim => 6,
                                          input_dim  => 4);
  $nn->initialize ( [ 0, 0, 0, 0 ] );  # all neurons get this value
  $nn->value (3, 2, [ 1, 1, 1, 1 ]);   # change the value for one neuron

DESCRIPTION

This package is a stripped-down implementation of Kohonen maps (self-organizing maps). It is NOT meant as a demonstration or for use together with visualisation software. And while it is not (yet) optimized for speed, some care has been taken that it is not overly slow.

Particular emphasis has been put on making the package play nicely with others. So there is no use of files, no arcane dependencies, etc.

Scenario

The basic idea is that the neural network consists of a 2-dimensional array of N-dimensional vectors. When training starts, these vectors may be completely random, but over time the network learns from the sample data, which consists of N-dimensional vectors as well.

Slowly, the vectors in the network will try to approximate the sample vectors fed in. If the sample vectors contain clusters, these clusters will become neighbourhoods within the rectangle.

Technically, you have reduced the dimensionality of your data from N to 2.
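
The update step behind this can be sketched in plain Perl. This is a simplified illustration of the general SOM algorithm, not this module's actual code; the Gaussian neighborhood function and the hard-coded rectangular grid are assumptions made for the sketch:

```perl
use strict;
use warnings;
use List::Util qw(sum);

# One SOM training step for a single sample: every grid vector is pulled
# toward the sample, weighted by its grid distance to the best matching unit.
sub som_step {
    my ($grid, $sample, $rate, $sigma) = @_;   # $grid->[$x][$y] = [ vector ]

    # find the best matching unit (smallest squared Euclidean distance)
    my ($bx, $by, $best) = (0, 0, 9e99);
    for my $x (0 .. $#$grid) {
        for my $y (0 .. $#{ $grid->[$x] }) {
            my $v = $grid->[$x][$y];
            my $d = sum map { ($v->[$_] - $sample->[$_]) ** 2 } 0 .. $#$sample;
            ($bx, $by, $best) = ($x, $y, $d) if $d < $best;
        }
    }

    # pull every vector toward the sample, scaled by a Gaussian neighborhood
    for my $x (0 .. $#$grid) {
        for my $y (0 .. $#{ $grid->[$x] }) {
            my $dist2 = ($x - $bx) ** 2 + ($y - $by) ** 2;
            my $theta = exp(-$dist2 / (2 * $sigma ** 2));
            my $v     = $grid->[$x][$y];
            $v->[$_] += $rate * $theta * ($sample->[$_] - $v->[$_])
                for 0 .. $#$sample;
        }
    }
    return ($bx, $by);
}
```

Repeating this step over many samples, while shrinking $rate and $sigma, is what produces the clustering described above.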

INTERFACE

Constructor

The constructor takes the following arguments:

input_dim : (mandatory, no default)

A positive integer specifying the dimension of the sample vectors (and hence that of the vectors in the grid).

learning_rate: (optional, default 0.1)

This is a magic number which influences how strongly the vectors in the grid are moved. Larger movements can mean faster learning if the clusters are very pronounced; if they are not, the movement behaves like noise and convergence suffers. To mediate that effect, the learning rate is reduced over the iterations.

sigma0: (optional, defaults to radius)

A non-negative number representing the start value of the learning radius. The radius starts big and narrows as learning progresses, which makes sure that the network finally makes only localized changes.

Subclasses will (re)define some of these parameters and add others.

Example:

    my $nn = AI::NeuralNet::SOM::Rect->new (output_dim => "5x6",
                                            input_dim  => 3);
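
How the learning rate and the radius shrink over the iterations can be illustrated with an exponential decay schedule. This is a common choice for SOMs and an assumption here; the module's exact schedule is not documented in this section:

```perl
use strict;
use warnings;

# Exponential decay: the value starts at $start and shrinks toward zero
# as the iteration count grows, controlled by a time constant.
sub decayed {
    my ($start, $iteration, $time_constant) = @_;
    return $start * exp(-$iteration / $time_constant);
}

my $sigma0 = 3;     # start radius
my $rate0  = 0.1;   # start learning rate (the default above)
my $tau    = 100;   # time constant, an illustrative value

printf "iter %3d: sigma %.3f, rate %.4f\n",
       $_, decayed($sigma0, $_, $tau), decayed($rate0, $_, $tau)
    for 0, 50, 100;
```

Early iterations therefore reorganize large parts of the map, while late iterations only fine-tune small neighbourhoods.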

Methods

initialize

$nn->initialize

You need to initialize all vectors in the map before training. There are several options for how this can be done:

providing data vectors

If you provide a list of vectors, these will be used in turn to seed the neurons. If the list is shorter than the number of neurons, the list will be started over. That way it is trivial to zero everything:

  $nn->initialize ( [ 0, 0, 0 ] );

providing no data

Then all vectors will get randomized values (in the range [ -0.5 .. 0.5 ]).

TODO: Eigenvectors
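
The cycling behaviour of the seeded initialization can be sketched in plain Perl. This is an illustration of the behaviour described above, not the module's code; seed_neurons and its flat neuron list are made up for the sketch:

```perl
use strict;
use warnings;

# Assign the provided seed vectors to the neurons in turn, starting the
# list over when it runs out; each neuron gets its own copy of the seed.
sub seed_neurons {
    my ($n_neurons, @seeds) = @_;
    return map { [ @{ $seeds[$_ % @seeds] } ] } 0 .. $n_neurons - 1;
}

# neurons 0, 2, 4 get the first seed; neurons 1, 3 get the second
my @neurons = seed_neurons(5, [ 1, 0, 0 ], [ 0, 1, 0 ]);
```

With a single seed vector, every neuron receives that value, which is why a one-element list zeroes the whole map.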

train

$nn->train ( $epochs, @samples )

The training uses the sample vectors to make the network learn. Each vector is simply a reference to an array of values.

The epoch parameter controls how many times this process is repeated.

Example:

   $nn->train (30, 
               [ 3, 2, 4 ],
               [ -1, -1, -1 ], 
               [ 0, 4, -3 ]);

TODO: expose the training error

bmu

($x, $y, $distance) = $nn->bmu ($vector)

This method finds the best matching unit, i.e. the neuron whose vector is closest to the vector passed in. The method returns the coordinates of that neuron and the actual distance.
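
What bmu computes can be sketched in plain Perl as a linear scan for the smallest Euclidean distance. This is an illustration, not the module's implementation; find_bmu and its grid layout are assumptions of the sketch:

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Scan the whole grid and return the coordinates of the vector closest
# to the input vector, together with the Euclidean distance itself.
sub find_bmu {
    my ($grid, $vector) = @_;       # $grid->[$x][$y] = [ vector ]
    my ($bx, $by, $best);
    for my $x (0 .. $#$grid) {
        for my $y (0 .. $#{ $grid->[$x] }) {
            my $d = sqrt( sum( map { ($grid->[$x][$y][$_] - $vector->[$_]) ** 2 }
                               0 .. $#$vector ) );
            ($bx, $by, $best) = ($x, $y, $d) if !defined $best || $d < $best;
        }
    }
    return ($bx, $by, $best);
}
```

A linear scan like this visits every neuron once, so its cost grows with the size of the map.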

neighbors

$ns = $nn->neighbors ($sigma, $x, $y)

Finds all neighbors of the neuron at (X, Y) with a distance smaller than SIGMA. Returns a reference to a list of (X, Y, distance) triples.
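
For a rectangular grid, the neighbour search can be sketched as follows. This is an illustration under the assumption of Euclidean grid distance; the Hexa subclass would measure grid distance differently:

```perl
use strict;
use warnings;

# Collect every [ X, Y, distance ] triple on a max_x x max_y grid whose
# Euclidean grid distance to the centre (cx, cy) is smaller than sigma.
sub find_neighbors {
    my ($sigma, $cx, $cy, $max_x, $max_y) = @_;
    my @ns;
    for my $x (0 .. $max_x - 1) {
        for my $y (0 .. $max_y - 1) {
            my $d = sqrt(($x - $cx) ** 2 + ($y - $cy) ** 2);
            push @ns, [ $x, $y, $d ] if $d < $sigma;
        }
    }
    return \@ns;
}
```

Note that the centre neuron itself is included, since its distance of zero is always smaller than a positive sigma.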

radius

$radius = $nn->radius

Returns the radius of the map. Different topologies interpret this differently.

map

$m = $nn->map

This method returns a reference to the map data. See the appropriate subclass for the data representation.

value

$nn->value ($x, $y [, $value ])

Set or get the current vector value for a particular neuron. The neuron is addressed via its coordinates.

as_string

print $nn->as_string

This method creates a pretty-printed version of the current vectors.

as_data

print $nn->as_data

This method creates a string containing the raw vector data, row by row. This can be fed into gnuplot, for instance.

SEE ALSO

http://www.ai-junkie.com/ann/som/som1.html

AUTHOR

Robert Barta, <rho@devc.at>

COPYRIGHT AND LICENSE

Copyright (C) 2007 by Robert Barta

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.