analogize - classify data with AM from the command line
version 3.12
analogize --format <format> [--exemplars <file>] [--test <file>] [--project <dir>] [--print <config_info,statistical_summary,analogical_set_summary,gang_summary,gang_detailed>] [--help]
Classify data with analogical modeling from the command line. Required arguments are format and either exemplars or project. You can use old AM::Parallel projects (a directory containing data and test files) or specify individual data and test files. By default, only the accuracy of the predicted outcomes is printed. More detail may be printed using the print option.
format
specify either commas or nocommas format for the exemplar and test data files (= should be used for "null" variables). See "dataset_from_file" in Algorithm::AM::DataSet for details on the two formats.
exemplars, train
path to the file containing the exemplar/training data
project
path to an AM::Parallel-style project (the 'outcome' file is ignored). This should be a directory containing a file called data, which holds the known exemplars, and a file called test, which holds the test exemplars. If the test file does not exist, leave-one-out testing is performed on the exemplars in the data file.
test
path to the file containing the test data. If none is specified, leave-one-out classification is performed with the exemplar set.
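The leave-one-out case can be sketched as follows. This is a minimal illustration, assuming the sample data in datasets/soybean that ships with the distribution and that analogize is on your PATH; the variable names are made up for the example, and the guard simply skips the run when the tool or the file is missing.

```shell
# Path to the sample exemplars that ship with the distribution.
exemplars="datasets/soybean/data"

# No --test option is given, so each exemplar is classified
# against the remaining ones (leave-one-out).
loo_cmd="analogize --exemplars $exemplars --format commas"
echo "$loo_cmd"

# Run the command only if the tool and the data file are available.
if command -v analogize >/dev/null 2>&1 && [ -f "$exemplars" ]; then
    $loo_cmd
fi
```

Adding --test with a file path to the same command switches from leave-one-out to classifying only the items in that file.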
print
reports to print, separated by commas (be careful not to add spaces between report names!). For example, --print analogical_set_summary,gang_summary would print analogical sets and gang summaries.
Available reports are:
config_info
Describes the configuration used and some simple information about the data, such as cardinality.
statistical_summary
A statistical summary of the classification results, including all predicted outcomes with their scores and percentages and the total score for all outcomes. Whether the predicted class is correct, incorrect, or a tie is also included, if the test item had a known class.
analogical_set_summary
The analogical set, showing all items that contributed to the predicted outcome, along with the amount contributed by each item (score and percentage overall).
gang_summary
A summary of the gang effects on the outcome prediction.
gang_detailed
Same as gang_summary, but also includes lists of exemplars for each gang.
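Because the report names must be joined by commas with no spaces, it can help to build the --print value programmatically. The following is a small sketch, not part of the tool itself; the variable names are illustrative and the report names are taken from the list above.

```shell
# Report names to request, separated by spaces for readability.
reports="config_info statistical_summary gang_summary"

# Replace each space with a comma, the form that --print expects.
print_arg=$(printf '%s' "$reports" | tr ' ' ',')

# Show the resulting command line without running it.
echo "analogize --project datasets/soybean --format commas --print $print_arg"
```

Quoting the value would not help if it contained spaces, since the report names themselves would then be misread; joining them up front avoids the problem.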
include_given
Allow a test item to be included in the dataset during classification. If false (default), test items will be removed from the dataset during classification.
include_nulls
Treat null variables in a test item as regular variables. If false (default), these variables will be excluded and not considered during classification.
linear
Calculate scores using occurrences (linearly) instead of using pointers (quadratically).
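The difference between the two scoring modes can be illustrated with simple arithmetic. This sketch assumes the usual analogical-modeling reading, in which a homogeneous supracontext holding n exemplars contributes n occurrences in total, or n squared pointers because each exemplar points to all n exemplars. The supracontext size is a made-up number, and the code is not the tool's internal computation.

```shell
# Hypothetical supracontext holding 3 exemplars (chosen for illustration).
n=3

# Linear: one occurrence per exemplar.
linear_score=$n

# Quadratic: each of the n exemplars points to all n exemplars.
quadratic_score=$((n * n))

echo "linear=$linear_score quadratic=$quadratic_score"
```

Under quadratic scoring, larger supracontexts therefore carry disproportionately more weight than under linear scoring.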
help, ?
print help message
This distribution comes with a sample dataset in the datasets/soybean directory. Data exemplars are in data and a single test exemplar is in test. The files are in the commas format. The following two commands are equivalent and will analyze the test exemplar and output a summary of gang effects to gang.txt:
analogize --exemplars datasets/soybean/data --test datasets/soybean/test --format commas --print gang_summary > gang.txt
analogize --project datasets/soybean --format commas --print gang_summary > gang.txt
The resulting files are best viewed in a text editor with word wrap turned off.
Theron Stanford <shixilun@yahoo.com>, Nathan Glenn <garfieldnate@gmail.com>
This software is copyright (c) 2021 by Royal Skousen.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.