GraphViz2::Marpa::PathUtils - Provide various analyses of Graphviz dot files
All scripts and input and output files listed here are shipped with the distro.
Either pass parameters in to new():
GraphViz2::Marpa::PathUtils -> new ( input_file => 'data/clusters.in.09.gv', output_file => 'out/clusters.out.09', # Actually a prefix. See FAQ. report_clusters => 1, ) -> find_clusters;
Or call methods to set parameters:
my($parser) = GraphViz2::Marpa::PathUtils -> new; $parser -> input_file('data/clusters.in.09.gv'); $parser -> output_file('out/clusters.out.09'); # Actually a prefix. See FAQ. $parser -> report_clusters(1);
And then:
$parser -> find_clusters;
Command line usage:
shell> perl scripts/find.clusters.pl -h
Or see scripts/find.clusters.sh, which hard-codes my test data values.
GraphViz2::Marpa::PathUtils -> new ( allow_cycles => 0, input_file => 'data/fixed.length.paths.in.01.gv', output_file => 'out/fixed.length.paths.out.01.gv', path_length => 3, report_paths => 1, start_node => 'Act_1', ) -> find_fixed_length_paths;
Or call methods to set parameters;
my($parser) = GraphViz2::Marpa::PathUtils -> new; $parser -> allow_cycles(0); $parser -> input_file('data/fixed.length.paths.in.01.gv'); $parser -> output_file('out/fixed.length.paths.in.01.gv'); $parser -> path_length(3); $parser -> report_paths(1); $parser -> start_node('Act_1');
$parser -> find_fixed_length_paths;
shell> perl scripts/find.fixed.length.paths.pl -h
Or see scripts/fixed.length.paths.sh, which hard-codes my test data values.
GraphViz2::Marpa::PathUtils parses Graphviz dot files and processes the output in various ways.
Features:
Nodes within such groups are connected to each other, but have no links to nodes outside the cluster.
Graphviz uses 'cluster' is a slightly-different sense. Here, you could say 'sub-tree' instead of 'cluster'.
See "find_clusters()".
This algorithm does not handle edges into or out of subgraphs.
See "find_fixed_length_paths()".
Sample output: http://savage.net.au/Perl-modules/html/graphviz2.marpa.pathutils/index.html.
This class is a descendent of GraphViz2::Marpa, and hence inherits all its keys to new(), and all its methods.
All scripts are in the scripts/ directory. This means they do not get installed along with the package.
Input data files are in data/, output data files are in out/, and html and svg files are in html/.
This is only for use by the author.
Try shell> perl find.clusters.pl -h
This runs the "find_clusters()" method.
This runs find.clusters.pl with hard-coded parameters, and is what I use for testing new code.
This runs the "find_fixed_length_paths()" method.
Try shell> perl find.fixed.length.paths.pl -h
This runs find.fixed.length.paths.pl with hard-coded parameters, and is what I use for testing new code.
Uses the Text::Xslate template htdocs/assets/templates/graphviz2/marpa/pathutils/pathutils.tx to generate html/index.html.
Runs generate.demo.pl.
Then copies html/*.html and html/*.svg to my web server's dir, $DR/Perl-modules/html/graphviz2.marpa.pathutils/.
Converts all *.pm files to *.html, and copies them in my web server's dir structure.
See also t/test.t.
This module is available as a Unix-style distro (*.tgz).
See http://savage.net.au/Perl-modules/html/installing-a-module.html for help on unpacking and installing distros.
Install GraphViz2::Marpa::PathUtils as you would for any Perl module:
Perl
Run:
cpanm GraphViz2::Marpa::PathUtils
or run:
sudo cpan GraphViz2::Marpa::PathUtils
or unpack the distro, and then either:
perl Build.PL ./Build ./Build test sudo ./Build install
or:
perl Makefile.PL make (or dmake or nmake) make test make install
new() is called as my($obj) = GraphViz2::Marpa::PathUtils -> new(k1 => v1, k2 => v2, ...).
new()
my($obj) = GraphViz2::Marpa::PathUtils -> new(k1 => v1, k2 => v2, ...)
It returns a new object of type GraphViz2::Marpa::PathUtils.
GraphViz2::Marpa::PathUtils
This class is a descendent of GraphViz2::Marpa, and hence inherits all its parameters.
In particular, see "Constructor and Initialization" in GraphViz2::Marpa for these parameters:
See the "FAQ" for the 2 interpretations of this parameter.
Further, these key-value pairs are accepted in the parameter list (see corresponding methods for details [e.g. "path_length($integer)"]):
Specify whether or not cycles are allowed in the paths found.
Values for $integer:
This is the default.
Note: allow_cycles => 1 is not implemented yet in V 2.
Default: 0.
This option is only used when calling "find_fixed_length_paths()".
Specify the length of all paths to be included in the output.
Here, length means the number of edges between nodes.
This parameter is mandatory, and must be > 0.
Specify whether or not to print a report of the clusters found.
Default: 0 (do not print).
This option is only used when calling "find_clusters()".
Specify whether or not to print a report of the paths found.
Specify the name of the node where all paths must start from.
Default: ''.
This parameter is mandatory.
The name can be the empty string, but must not be undef.
This class is a descendent of GraphViz2::Marpa, and hence inherits all its methods.
In particular, see "Methods" in GraphViz2::Marpa for these methods:
See the "FAQ" for the 2 interpretations of this method.
Further, these methods are implemented.
Here the [] indicate an optional parameter.
Get or set the value determining whether or not cycles are allowed in the paths found.
'allow_cycles' is a parameter to "new()". See "Constructor and Initialization" for details.
Returns a hashref. The keys are integers and the values are sets of type Set::Tiny.
Each set contains the nodes of each independent cluster within the input file. Here, independent means there are no edges joining one cluster to another.
Each set is turned into a tree during a call to "find_clusters()". You can retrieve these trees by calling "cluster_trees()".
Both "find_clusters()" and "find_fixed_length_paths()" populate this hashref, but the former adds stand-alone nodes because they are clusters in their own right.
So why doesn't "find_fixed_length_paths()" add stand-alone nodes? Because you can't have a path of length 0, so stand-alone nodes cannot participate in the search for fixed length paths.
Returns a hashref. The keys are integers and the values are Tree::DAG_Node trees.
This is only populated by "find_clusters()".
The input comes from calling "cluster_sets()", so the output is one tree per cluster. This means there is 1 tree per set returned by "cluster_sets()".
This is one of the 2 methods which do all the work, and hence must be called. The other is "find_fixed_length_paths()".
The program ignores port and compass suffixes on node names. See http://savage.net.au/Perl-modules/html/graphviz2.marpa.pathutils/clusters.in.10.html for a sample.
See the "Synopsis" and scripts/find.clusters.pl.
Returns 0 for success and 1 for failure.
This is one of the 2 methods which do all the work, and hence must be called. The other is "find_clusters()".
Notes:
The code does not handle these cases, which will be difficult to implement.
Just specify the start node without the port+compass details.
See the "Synopsis" and scripts/find.fixed.length.paths.pl.
Returns the arrayref of paths found. Each element is 1 path, and paths are stored as an arrayref of objects of type Tree::DAG_Node.
See the source code of "output_fixed_length_gv()", or "report_fixed_length_paths()", for sample usage.
See "Constructor and Initialization" for details on the parameters accepted by "new()".
Output a set of files, one per cluster, if new() was called as new(output_file => '...').
Output a DOT file for given $tree, with the given $title, if new() was called as new(output_file => '...').
Get or set the length of the paths to be searched for.
'path_length' is a parameter to "new()". See "Constructor and Initialization" for details.
This prints the clusters found, if new() was called as new(report_clusters => 1).
Get or set the option which determines whether or not the clusters found are printed.
'report_clusters' is a parameter to "new()". See "Constructor and Initialization" for details.
This prints the fixed length paths found, if new() was called as new(report_paths => 1).
Get or set the option which determines whether or not the fixed length paths found are printed.
'report_paths' is a parameter to "new()". See "Constructor and Initialization" for details.
Get or set the name of the node from where all paths must start.
Specify the start node without the port+compass details.
'start_node' is a parameter to "new()". See "Constructor and Initialization" for details.
Just specify the start node without the port+compass details, and it will work.
Check GraphViz2::Marpa::PathUtils::Demo line 82. The corresponding data is data/fixed.length.paths.in.04.gv, out/* and html/* files.
If your input and output file names are Path::Tiny objects, do this:
new(input_file => "$input_file", output_file => "$output_file", ...)
output_file
This section discusses input/output DOT file naming. HTML file names follow the same system.
Each cluster's DOT syntax is written to a different file. So output_file must be the prefix of these output files.
Hence, in scripts/find.clusters.sh, GV2=clusters.out.$1 (using $1 = '09', say) means output clusters' DOT files will be called clusters.out.09.001.gv up to clusters.out.09.007.gv.
GV2=clusters.out.$1
clusters.out.09.001.gv
clusters.out.09.007.gv
So, the prefix has ".$n.gv" attached as a suffix, for clusters 1 .. N.
Since there is only ever 1 DOT file output here, output_file is the name of that file.
Hence, in scripts/find.fixed.length.path.sh, the value supplied is out/fixed.length.paths.out.$FILE.gv, where $FILE will be '01', '02' or '03'. So the output file, using $FILE='01', will be out/fixed.length.paths.out.01.gv.
out/fixed.length.paths.out.$FILE.gv
out/fixed.length.paths.out.01.gv
The most likely explanation is that you're calling find_fixed_path_lengths() and you've specified all nodes, or at least some, to have a label like "\N".
This escape sequence triggers special processing in AT&T's Graphviz to generate labels for nodes, overriding the code I use to re-name nodes in the output of find_fixed_path_lengths().
See _prepare_fixed_length_output() for the gory details.
The purpose of my re-numbering code is to allow a node to appear in the output multiple times but to stop Graphviz automatically regarding all such references to be the same node. Giving a node different names (which are un-seen) but the same label (which is seen) makes Graphviz think they are really different nodes.
The 3 samples in part 2 of the demo page should make this issue clear.
Graphviz. Marpa.
See also Jeffrey's annotated blog.
They are simply counted in the order discovered in the input file.
This is controlled by the allow_cycles option to new(), or the corresponding method "allow_cycles($integer)".
The code keeps track of the number of times each node is entered. If new(allow_cycles => 0) was called, nodes are only considered if they are entered once. If new(allow_cycles => 1) was called, nodes are also considered if they are entered a second time.
Sample code: Using the input file data/fixed.length.paths.in.01.gv we can specify various combinations of parameters like this:
allow_cycles path_length start node solutions 0 3 Act_1 4 1 3 Act_1 ? 0 4 Act_1 5 1 4 Act_1 ?
Yes, as long as they are unique in the input. Something like this produces 8 identical solutions (starting from A, of path length 3) because each node B, C, D, can be entered in 2 different ways, and 2**3 = 8.
digraph G { A -> B -> C -> D; A -> B -> C -> D; }
See data/fixed.length.paths.in.03.gv and html/fixed.length.paths.out.03.svg.
Agreed. Remember that this code calls GraphViz2::Marpa's run() method, which expects a large number of options.
Error: <stdin>:1: syntax error near line 1 context: digraph >>> Graph <<< {
Graphviz reserves some words as keywords, meaning they can't be used as an ID, e.g. for the name of the graph.
The keywords are: digraph, edge, graph, node, strict and subgraph. Compass points are not keywords.
See keywords in the discussion of the syntax of DOT for details.
So, don't do this:
strict graph graph{...} strict graph Graph{...} strict graph strict{...} etc...
Likewise for non-strict graphs, and digraphs. You can however add double-quotes around such reserved words:
strict graph "graph"{...}
Even better, use a more meaningful name for your graph...
Problems:
If the edge is pointing to a subgraph, all nodes in that subgraph are heads of the edge.
Likewise, if the start node is inside a subgraph, the edge can be outside the subgraph.
This has been ignored in the re-write for V 2.
Search the source for 'TODO' and 'allow_cycles' to find where patches must be made.
Combinatorial Algorithms for Computers and Calculators, A Nijenhuis and H Wilf, p 240.
This books very clearly explains the backtracking parser I used to process the combinations of nodes found at each point along each path. Source code in the book is in Fortran.
The book is available as a PDF from http://www.math.upenn.edu/~wilf/website/CombAlgDownld.html.
Version numbers < 1.00 represent development versions. From 1.00 up, they are production versions.
The file CHANGES was converted into Changelog.ini by Module::Metadata::Changes.
https://github.com/ronsavage/GraphViz2-Marpa-PathUtils
Email the author, or log a bug on RT:
https://rt.cpan.org/Public/Dist/Display.html?Name=GraphViz2::Marpa::PathUtils.
GraphViz2::Marpa::PathUtils was written by Ron Savage <ron@savage.net.au> in 2012.
Home page: http://savage.net.au/index.html.
Australian copyright (c) 2012, Ron Savage.
All Programs of mine are 'OSI Certified Open Source Software'; you can redistribute them and/or modify them under the terms of The Artistic License, a copy of which is available at: http://www.opensource.org/licenses/index.html
To install GraphViz2::Marpa::PathUtils, copy and paste the appropriate command in to your terminal.
cpanm
CPAN shell
perl -MCPAN -e shell install GraphViz2::Marpa::PathUtils
For more information on module installation, please visit the detailed CPAN module installation guide.