NAME

Algorithm::LibLinear::FeatureScaling

SYNOPSIS

use Algorithm::LibLinear::DataSet;
use Algorithm::LibLinear::FeatureScaling;

my $scale = Algorithm::LibLinear::FeatureScaling->new(
  data_set => Algorithm::LibLinear::DataSet->new(...),
  lower_bound => -10,
  upper_bound => 10,
);
# Alternatively, restore scaling parameters that were saved earlier.
my $scale = Algorithm::LibLinear::FeatureScaling->load(
  filename => '/path/to/file',
);

my $scaled_feature = $scale->scale(feature => +{ 1 => 30, 2 => -25, ... });
my $scaled_labeled_data = $scale->scale(
  labeled_data => +{ feature => +{ 1 => 30, ... }, label => 1 },
);
my $scaled_data_set = $scale->scale(
  data_set => Algorithm::LibLinear::DataSet->new(...),
);

say $scale->as_string;
$scale->save(filename => '/path/to/another/file');

DESCRIPTION

Support vector classification essentially computes the inner product of a feature vector and the normal vector of the separating hyperplane. If some elements of the feature vectors have a wider dynamic range than others, they can have a stronger influence on the final result.

For example, suppose the normal vector is { 1 1 1 } and the feature vectors to be classified are { -2 10 5 }, { 5 -50 0 } and { 10 100 10 }. The inner products of the normal vector with these feature vectors are 13, -45 and 120 respectively. The 2nd element of the feature vectors clearly has a wider dynamic range than the others and dominates the result.
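The arithmetic in the example above can be checked with a few lines of plain Perl. This is only an illustrative sketch: the inner_product helper is a hypothetical name introduced here, not part of this module.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: inner product of two equal-length vectors
# given as array references.
sub inner_product {
    my ($u, $v) = @_;
    return sum(map { $u->[$_] * $v->[$_] } 0 .. $#$u);
}

my @normal   = (1, 1, 1);
my @features = ([-2, 10, 5], [5, -50, 0], [10, 100, 10]);

# One inner product per feature vector.
my @products = map { inner_product(\@normal, $_) } @features;
print "@products\n";  # prints "13 -45 120"
```

The 2nd element alone contributes 10, -50 and 100 to these sums, which is why it dominates the result.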

To avoid this problem, it is important to normalize the range of each element of the feature vectors. This module provides that scaling functionality. It can be seen as a library version of the svm-scale command from the LIBSVM distribution.

METHODS

new(data_set => $data_set | min_max_values => \@min_max_values [, lower_bound => 0.0] [, upper_bound => 1.0])

Constructor. It accepts the named parameters below. Either data_set or min_max_values is required.

data_set

An instance of Algorithm::LibLinear::DataSet. It is used to compute the dynamic range of each vector element.

min_max_values

Pre-calculated dynamic ranges of each vector element, structured like this:

my @min_max_values = (
  [ -10, 10 ],  # Dynamic range of 1st elements of vectors.
  [ 0, 1 ],     # 2nd.
  [ -1, 1 ],    # 3rd.
  ...
);
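If the ranges are not already known, a structure of this shape can be derived from raw feature vectors. The sketch below is not the module's implementation. It assumes sparse feature vectors keyed by 1-based index (as in the SYNOPSIS), that array slot i holds the range for element i + 1, and that only values actually present are considered. compute_min_max_values is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: derive per-element [min, max] pairs from sparse
# feature vectors (hash references keyed by 1-based element index).
# Assumption: absent elements are ignored rather than treated as 0.
sub compute_min_max_values {
    my (@features) = @_;
    my @min_max_values;
    for my $feature (@features) {
        for my $index (keys %$feature) {
            my $value = $feature->{$index};
            # Start the range at the first value seen for this element.
            my $range = $min_max_values[$index - 1] //= [ $value, $value ];
            $range->[0] = $value if $value < $range->[0];
            $range->[1] = $value if $value > $range->[1];
        }
    }
    return @min_max_values;
}

my @min_max_values = compute_min_max_values(
    +{ 1 => -2, 2 => 10,  3 => 5 },
    +{ 1 => 5,  2 => -50, 3 => 0 },
    +{ 1 => 10, 2 => 100, 3 => 10 },
);
# @min_max_values is now ([-2, 10], [-50, 100], [0, 10]).
```

Elements that never occur in the input leave an undefined slot in the array, so real code would need to decide how to handle them.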

lower_bound

upper_bound

The lower and upper bounds (inclusive) of the range that elements are scaled into. The defaults are 0.0 and 1.0 respectively.
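To make the role of the two bounds concrete, here is a sketch of plain min-max scaling, the linear mapping used by svm-scale. It is an assumption that this module computes exactly the same thing. scale_feature is a hypothetical helper, and dropping elements whose range is constant is a choice made for this sketch only.

```perl
use strict;
use warnings;

# Hypothetical helper: min-max scale one sparse feature vector into
# [lower_bound, upper_bound] using pre-calculated [min, max] pairs.
sub scale_feature {
    my (%args) = @_;
    my $lower = $args{lower_bound} // 0.0;
    my $upper = $args{upper_bound} // 1.0;
    my %scaled;
    for my $index (keys %{ $args{feature} }) {
        my ($min, $max) = @{ $args{min_max_values}[$index - 1] };
        next if $min == $max;  # Assumption: skip constant elements.
        # Position of the value within its observed range, from 0 to 1.
        my $ratio = ($args{feature}{$index} - $min) / ($max - $min);
        $scaled{$index} = $lower + ($upper - $lower) * $ratio;
    }
    return \%scaled;
}

my $scaled = scale_feature(
    feature        => +{ 1 => 10, 2 => 25, 3 => 0 },
    min_max_values => [ [ -2, 10 ], [ -50, 100 ], [ 0, 10 ] ],
    lower_bound    => -10,
    upper_bound    => 10,
);
# $scaled is { 1 => 10, 2 => 0, 3 => -10 }.
```

Values outside the observed range map outside the bounds under this formula, so the bounds are only guaranteed for data resembling what the ranges were computed from.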

load(filename => $path | fh => \*FH | string => $content)

Class method. Creates a new instance from dumped scaling parameters, given as a file name, a file handle or a string.

Please note that this method can parse only a subset of svm-scale's file format at present.

as_string

Dumps the scaling parameters in svm-scale's format.
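For orientation, svm-scale's parameter file for feature scaling consists of a line containing x, a line with the lower and upper bounds, and then one "index min max" line per element. The sketch below produces that layout in plain Perl. dump_scaling_parameter is a hypothetical helper, and whether as_string formats numbers identically is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: render scaling parameters in svm-scale's layout.
sub dump_scaling_parameter {
    my (%args) = @_;
    my @lines = ('x', "$args{lower_bound} $args{upper_bound}");
    my $index = 0;
    for my $range (@{ $args{min_max_values} }) {
        $index++;  # Element indices are 1-based.
        push @lines, join ' ', $index, @$range;
    }
    return join '', map { "$_\n" } @lines;
}

my $dump = dump_scaling_parameter(
    lower_bound    => -10,
    upper_bound    => 10,
    min_max_values => [ [ -10, 10 ], [ 0, 1 ], [ -1, 1 ] ],
);
print $dump;
```

With the values above this prints five lines: x, then "-10 10", then "1 -10 10", "2 0 1" and "3 -1 1".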

save(filename => $path | fh => \*FH)

Writes the result of as_string to a file.

scale(data_set => $data_set | feature => \%feature | labeled_data => \%labeled_data)

Scales the given feature, labeled data or data set and returns the scaled result.
