# NAME

IDS::Algorithm::MM - Learn or test using a first-order Markov Model (MM).

# SYNOPSIS

A usage synopsis would go here. Since it is not here, read on.

In section 4.2 of their paper (see SEE ALSO), Kruegel and Vigna ignored the probability information that the MM provided and produced a binary result. In effect, they used the constructed MM as a {N,D}FA.

# DESCRIPTION

Someday more will be here.

Ideally, we would use the algorithm from stolcke94bestfirst (see SEE ALSO). Constructing a DFA rather than an NFA has, in effect, already performed most of the state merging that stolcke93hidden performs.

Consider also a Java or C/C++ implementation: http://www.ghmm.org/ and http://www.run.montefiore.ulg.ac.be/~francois/software/jahmm/

Useful information:

- http://www.cs.brown.edu/research/ai/dynamics/tutorial/Documents/HiddenMarkovModels.html
- http://www.comp.leeds.ac.uk/roger/HiddenMarkovModels/html_dev/main.html
- L. R. Rabiner and B. H. Juang, "An introduction to HMMs", IEEE ASSP Magazine, 3, 4-16.

printvcg
printvcg(filehandle)

Print the DFA in a form that VCG can use to draw it.

If the filehandle is specified, print there; otherwise, print to STDOUT.

This code was taken from DFA and does not know about the probabilities, so the output omits them.

load(filehandle)

Load an MM from a file. This is the inverse of "print", and the expected format is the one written by $self->print.

test(tokensref, string, instance)

Test the string of tokens and calculate the probability of that string being seen. Each transition contributes a probability p in [0,1], and the result is the product of these probabilities.

Note that if a transition cannot be made, we return a 0 probability.
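The calculation described above can be sketched as follows. This is a minimal illustration in Python, not the module's Perl implementation: the nested-dictionary layout, the function name `sequence_probability`, the `(START)` state name, and the example model are all assumptions made for this sketch.

```python
from typing import Dict, List, Tuple

# Hypothetical layout (an assumption, not the module's internal format):
# transitions[state][token] = (next_state, probability of taking that edge)
Transitions = Dict[str, Dict[str, Tuple[str, float]]]


def sequence_probability(transitions: Transitions,
                         tokens: List[str],
                         start: str = "(START)") -> float:
    """Multiply the probabilities of the edges followed by the tokens."""
    state = start
    probability = 1.0
    for token in tokens:
        edge = transitions.get(state, {}).get(token)
        if edge is None:
            # No transition on this token from the current state.
            return 0.0
        state, p = edge
        probability *= p
    return probability


# A made-up two-step model, for illustration only.
model: Transitions = {
    "(START)": {"a": ("s1", 1.0)},
    "s1": {"b": ("s2", 0.5), "c": ("s3", 0.5)},
}

print(sequence_probability(model, ["a", "b"]))  # 0.5
print(sequence_probability(model, ["a", "d"]))  # 0.0 (no edge on "d")
```

Treating any nonzero result as "accept" and zero as "reject" discards the probability information and reduces the model to a plain automaton.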

add(tokensref, string, instance)

The collection of tokens (in the list referenced by tokensref) is a complete example of a list that should be accepted by the DFA.

string and instance are IDS::Test framework arguments that we ignore because we do not need them.

We also add the transition from the last token to the '(ACCEPT)' state.
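As a rough illustration of learning from a complete example, the Python sketch below records each token-to-token step and then the final step into '(ACCEPT)'. Several things here are assumptions rather than facts about the module: states are identified with the previous token (a simplification of a first-order model), the `(START)` name is invented, and estimating probabilities by relative frequency is a common choice that the documentation does not confirm.

```python
from collections import defaultdict
from typing import DefaultDict, List

START, ACCEPT = "(START)", "(ACCEPT)"  # "(START)" is an assumed name

Counts = DefaultDict[str, DefaultDict[str, int]]


def new_counts() -> Counts:
    """Create an empty table: counts[previous][next] = times seen."""
    return defaultdict(lambda: defaultdict(int))


def add_example(counts: Counts, tokens: List[str]) -> None:
    """Record one complete example, ending with a step into ACCEPT."""
    previous = START
    for token in tokens:
        counts[previous][token] += 1
        previous = token
    counts[previous][ACCEPT] += 1


def transition_probability(counts: Counts, previous: str, nxt: str) -> float:
    """Relative-frequency estimate (an assumption of this sketch)."""
    row = counts.get(previous, {})
    total = sum(row.values())
    return row.get(nxt, 0) / total if total else 0.0


counts = new_counts()
add_example(counts, ["a", "b"])
add_example(counts, ["a", "c"])
print(transition_probability(counts, "a", "b"))     # 0.5
print(transition_probability(counts, "b", ACCEPT))  # 1.0
```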

add_transition(from, token)

Add a transition from one state to another when the specified token is received. It is not an error to try to add an existing transition; in that event, this function quietly returns. If the transition does not exist, we look for an existing transition on the same token. If there is one, we add an edge from the "from" state to the destination node of that existing edge. Finally, if there is no other choice, we create a new state and add the edge to it.
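The three cases above can be sketched in Python as follows. This is one reading of the description, not the module's code: the class `TinyModel`, its integer state numbering, the per-token destination table, and the returned destination are all assumptions made for illustration.

```python
from typing import Dict


class TinyModel:
    """Hypothetical container for states and token-labelled edges."""

    def __init__(self) -> None:
        self.edges: Dict[int, Dict[str, int]] = {0: {}}  # state -> token -> dest
        self.token_dest: Dict[str, int] = {}  # token -> dest of first edge on it
        self.next_state = 1

    def add_transition(self, from_state: int, token: str) -> int:
        out = self.edges.setdefault(from_state, {})
        # Case 1: the transition already exists; nothing to do.
        if token in out:
            return out[token]
        if token in self.token_dest:
            # Case 2: some edge on this token exists; reuse its destination.
            dest = self.token_dest[token]
        else:
            # Case 3: no other choice; create a new state.
            dest = self.next_state
            self.next_state += 1
            self.edges[dest] = {}
            self.token_dest[token] = dest
        out[token] = dest
        return dest


m = TinyModel()
first = m.add_transition(0, "GET")   # case 3: new state 1
again = m.add_transition(0, "GET")   # case 1: existing edge, still state 1
slash = m.add_transition(1, "/")     # case 3: new state 2
reuse = m.add_transition(0, "/")     # case 2: reuses state 2
print(first, again, slash, reuse)    # 1 1 2 2
```

Reusing the destination of an existing edge on the same token is what keeps the number of states small as more examples are added.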

generalize()

Reduce the number of states in the model.

Because we build a DFA rather than an NFA, most of the state merging that would otherwise have occurred has, in effect, already been performed.

XXX We should still be doing some checks for additional merge possibilities.

XXX A proof that the DFA is effectively the NFA with merged states would be useful.

# AUTHOR INFORMATION

Copyright 2005-2007, Kenneth Ingham. All rights reserved.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

Address bug reports and comments to: ids_test at i-pi.com. When sending bug reports, please provide the versions of IDS::Test.pm, IDS::Algorithm.pm, IDS::DataSource.pm, the version of Perl, and the name and version of the operating system you are using. Since Kenneth is a PhD student, the speed of the response depends on how the research is proceeding.

# BUGS

Please report them.

# SEE ALSO

Best-first Model Merging for Hidden Markov Model Induction by A. Stolcke and S. M. Omohundro, Technical Report TR-94-003, 1994. http://citeseer.ist.psu.edu/stolcke94bestfirst.html

Anomaly detection of web-based attacks by Christopher Kruegel and Giovanni Vigna in Proceedings of the 10th ACM conference on Computer and Communications Security, 2003, pages 251--261, ISBN 1-58113-738-9. http://doi.acm.org/10.1145/948109.948144