Parse::Taxonomy::Cookbook - cookbook for Parse-Taxonomy
This file is a cookbook holding usage examples -- recipes -- for various Parse::Taxonomy subclasses.
The documentation herein presumes that you have already studied the documentation in Parse::Taxonomy, Parse::Taxonomy::Path, Parse::Taxonomy::Index, etc.
You have a CSV file which you have been told is a taxonomy-by-path. You want to confirm its validity.
Let's say the file holds these records:
    $> cat ./proposed_taxonomy.csv
    "path","vertical","is_actionable"
    "|Alpha","Auto","0"
    "|Alpha|Epsilon|Kappa","Auto","0"
    "|Alpha|Epsilon|Kappa","Auto","1"
    "|Alpha|Zeta","Auto","0"
    "|Alpha|Zeta|Lambda","Auto","1"
    "|Alpha|Zeta|Mu","Auto","0"
Try to create a Parse::Taxonomy::Path object using the file interface.
    local $@;
    eval {
        $obj = Parse::Taxonomy::Path->new( {
            file => './proposed_taxonomy.csv',
        } );
    };
    print STDERR "$@\n";
If $obj is created successfully, the taxonomy meets the requirements described in Parse::Taxonomy. This particular file, however, will throw an exception. Examination of the content of $@ will show that two records have the same materialized path, i.e., the same value in the path column.
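To see which records collide before you repair the file, a standalone check along the following lines may help. This is a minimal sketch in core Perl, not the library's own validation: the helper name `duplicate_paths` is hypothetical, and the records are inlined as array references rather than parsed from the CSV file.

```perl
use strict;
use warnings;

# Hypothetical helper: return the path values that occur more than once.
# Each record is an array ref whose first element is the materialized path.
sub duplicate_paths {
    my @records = @_;
    my %seen;
    $seen{ $_->[0] }++ for @records;
    my @dupes = sort grep { $seen{$_} > 1 } keys %seen;
    return @dupes;
}

# Records copied from proposed_taxonomy.csv (header row omitted).
my @records = (
    [ '|Alpha',               'Auto', 0 ],
    [ '|Alpha|Epsilon|Kappa', 'Auto', 0 ],
    [ '|Alpha|Epsilon|Kappa', 'Auto', 1 ],
    [ '|Alpha|Zeta',          'Auto', 0 ],
    [ '|Alpha|Zeta|Lambda',   'Auto', 1 ],
    [ '|Alpha|Zeta|Mu',       'Auto', 0 ],
);

my @dupes = duplicate_paths(@records);
print "duplicate path: $_\n" for @dupes;
```

Counting occurrences in a hash keeps the check to a single pass over the records, which is usually enough to locate the offending rows by hand.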
You have a CSV file which you have been told is a taxonomy-by-index. You want to confirm its validity.
    $> cat ./proposed_taxonomy_by_index.csv
    "id","parent_id","name","vertical","is_actionable"
    "1","","Alpha","Auto","0"
    "2","1","Epsilon","Auto","0"
    "3","2","Kappa","Auto","1"
    "4","2","Kappa","Auto","1"
    "5","1","Zeta","Auto","0"
    "6","5","Lambda","Auto","1"
    "7","5","Mu","Auto","1"
Try to create a Parse::Taxonomy::Index object using the file interface.
    local $@;
    eval {
        $obj = Parse::Taxonomy::Index->new( {
            file => './proposed_taxonomy_by_index.csv',
        } );
    };
    print STDERR "$@\n";
If $obj is created successfully, the taxonomy meets the requirements described in Parse::Taxonomy. This particular file, however, will throw an exception. Examination of the content of $@ will show that two records with the same parent_id have the same name.
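The same idea applies to a taxonomy-by-index: two records collide when they share both parent_id and name. The following is a minimal standalone sketch in core Perl, not the library's own check. The helper name `duplicate_siblings` is hypothetical, and each record is reduced to the three columns that matter here.

```perl
use strict;
use warnings;

# Hypothetical helper: find (parent_id, name) pairs claimed by several ids.
# Each record is an array ref of [ id, parent_id, name ].
sub duplicate_siblings {
    my @records = @_;
    my %ids_for;
    for my $rec (@records) {
        my ($id, $parent_id, $name) = @{$rec};
        # A tab cannot occur in these sample values, so it is a safe joiner.
        push @{ $ids_for{"$parent_id\t$name"} }, $id;
    }
    my %dupes;
    for my $key (keys %ids_for) {
        $dupes{$key} = $ids_for{$key} if @{ $ids_for{$key} } > 1;
    }
    return %dupes;
}

# Records copied from proposed_taxonomy_by_index.csv (header row omitted).
my @index_records = (
    [ 1, '', 'Alpha'   ],
    [ 2, 1,  'Epsilon' ],
    [ 3, 2,  'Kappa'   ],
    [ 4, 2,  'Kappa'   ],
    [ 5, 1,  'Zeta'    ],
    [ 6, 5,  'Lambda'  ],
    [ 7, 5,  'Mu'      ],
);

my %sibling_dupes = duplicate_siblings(@index_records);
for my $key (sort keys %sibling_dupes) {
    my ($parent_id, $name) = split /\t/, $key, 2;
    print "parent_id $parent_id has more than one child named '$name': ",
        "ids @{ $sibling_dupes{$key} }\n";
}
```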
You have a taxonomy file from which you have successfully created a Parse::Taxonomy::Path object. From that you know that it is valid with respect to the requirements for a taxonomy imposed by this library. But you have additional business requirements which a taxonomy must fulfill before you can use the taxonomy in production.
Suppose that you have a taxonomy file with this data:
    $> cat local_requirement.csv
    "path","is_actionable"
    "|Alpha","0"
    "|Beta","0"
    "|Alpha|Epsilon","0"
    "|Alpha|Epsilon|Kappa","1"
    "|Alpha|Zeta","0"
    "|Alpha|Zeta|Lambda","1"
    "|Alpha|Zeta|Mu","0"
    "|Beta|Eta","1"
    "|Beta|Theta","1"
    "|Beta|Iota","0"
Suppose further that you have a business requirement that all nodes which are "pure" leaf nodes -- all nodes which have no children of their own -- have a true value for is_actionable.
Use Parse::Taxonomy::Path accessor methods to get at the data in the taxonomy and write your own functions to conduct local validations.
In this case:
    $obj = Parse::Taxonomy::Path->new( {
        file => 'local_requirement.csv',
    } );
    $hashified = $obj->hashify();
    $child_counts = $obj->child_counts();
Use hashify() to turn the taxonomy into a hash. Use child_counts() to get the number of children each node has. Then iterate over the hash checking whether an element has no children and, if so, whether the node's is_actionable setting is true.
    @non_actionable_leaf_nodes = ();
    for my $node (keys %{$hashified}) {
        if (
            ($child_counts->{$node} == 0) &&
            (! $hashified->{$node}->{is_actionable})
        ) {
            push @non_actionable_leaf_nodes, $node;
        }
    }
    warn "leaf node '$_' is non-actionable"
        for @non_actionable_leaf_nodes;
Output will resemble:
    leaf node '|Alpha|Zeta|Mu' is non-actionable at ...
    leaf node '|Beta|Iota' is non-actionable at ...
You can then decide how to handle this per your business requirements.
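For intuition about what "no children" means in a taxonomy-by-path, the count of direct children can be derived from the paths alone. The following is a minimal standalone sketch in core Perl under the assumption that every parent path is itself present in the list. The helper name `count_children` is hypothetical, and this is not how the library computes its own counts.

```perl
use strict;
use warnings;

# Hypothetical helper: count the direct children of every path.
# Returns a hash ref mapping each path to its number of children.
sub count_children {
    my ($separator, @paths) = @_;
    my %count = map { $_ => 0 } @paths;
    for my $path (@paths) {
        # Everything before the last separator is the parent path.
        my $idx = rindex($path, $separator);
        next if $idx <= 0;    # top-level node: it has no parent
        my $parent = substr($path, 0, $idx);
        $count{$parent}++ if exists $count{$parent};
    }
    return \%count;
}

# Paths copied from local_requirement.csv.
my @paths = qw(
    |Alpha |Beta |Alpha|Epsilon |Alpha|Epsilon|Kappa |Alpha|Zeta
    |Alpha|Zeta|Lambda |Alpha|Zeta|Mu |Beta|Eta |Beta|Theta |Beta|Iota
);

my $counts = count_children('|', @paths);
my @leaves = sort grep { $counts->{$_} == 0 } @paths;
print "leaf node: $_\n" for @leaves;
```

Any node whose count is zero is a leaf, and those are the nodes to which the is_actionable requirement applies.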
You have a file which holds a validated taxonomy-by-path and you want to create a file which holds the equivalent taxonomy-by-index.
Suppose you have a file with this data:
    "path","is_actionable"
    "|Alpha","0"
    "|Beta","0"
    "|Alpha|Epsilon","0"
    "|Alpha|Epsilon|Kappa","1"
    "|Alpha|Zeta","0"
    "|Alpha|Zeta|Lambda","1"
    "|Alpha|Zeta|Mu","1"
    "|Beta|Eta","1"
    "|Beta|Theta","1"
    "|Beta|Iota","1"
Use the indexify() and write_indexified_to_csv() methods.
    $indexified = $obj->indexify();
    $file_taxonomy_by_index = $obj->write_indexified_to_csv( {
        indexified => $indexified,
    } );
The file whose path is stored in $file_taxonomy_by_index will look like this:
    id,parent_id,name,is_actionable
    1,,Alpha,0
    2,,Beta,0
    3,1,Epsilon,0
    4,1,Zeta,0
    5,2,Eta,1
    6,2,Theta,1
    7,2,Iota,1
    8,3,Kappa,1
    9,4,Lambda,1
    10,4,Mu,1
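To make the mapping from paths to id and parent_id concrete, the conversion can be sketched by hand. The following is a minimal standalone sketch in core Perl, not the library's implementation. The helper name `paths_to_index` is hypothetical, and the numbering rule (shallower nodes first, input order within a level) is an assumption inferred from the sample output above.

```perl
use strict;
use warnings;

# Hypothetical helper: turn materialized paths into [ id, parent_id, name ] rows.
# Assumes a validated taxonomy, i.e. every parent path is itself in the list.
sub paths_to_index {
    my ($separator, @paths) = @_;
    my $sep = quotemeta $separator;
    my (%position, %components);
    @position{@paths} = (0 .. $#paths);
    $components{$_} = [ grep { length } split /$sep/, $_ ] for @paths;

    # Shallower nodes first; ties are broken by original input position.
    my @ordered = sort {
        @{ $components{$a} } <=> @{ $components{$b} }
            or $position{$a} <=> $position{$b}
    } @paths;

    my (%id_for, @rows);
    my $next_id = 1;
    for my $path (@ordered) {
        my @parts  = @{ $components{$path} };    # copy, so pop is safe
        my $name   = pop @parts;
        my $parent = @parts ? $separator . join($separator, @parts) : '';
        $id_for{$path} = $next_id++;
        my $parent_id = $parent eq '' ? '' : $id_for{$parent};
        push @rows, [ $id_for{$path}, $parent_id, $name ];
    }
    return @rows;
}

my @source_paths = qw(
    |Alpha |Beta |Alpha|Epsilon |Alpha|Epsilon|Kappa |Alpha|Zeta
    |Alpha|Zeta|Lambda |Alpha|Zeta|Mu |Beta|Eta |Beta|Theta |Beta|Iota
);

my @index_rows = paths_to_index('|', @source_paths);
print join(',', @{$_}), "\n" for @index_rows;
```

Because parents are always shallower than their children, assigning ids level by level guarantees that a parent's id exists by the time its children are processed.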
In a relational database (RDB), you have hierarchical data stored in a flat table by way of id, parent_id and name columns. You need to communicate the current status of that taxonomy to someone who is familiar with CSV-formatted data and who would like to see the structure in that taxonomy expressed in a single column.
First you need to get the data out of the RDB and into a text file. For that you might use the command-line client appropriate to that RDB. For example, in psql, the command-line client for PostgreSQL, you would say:
    $> \copy (SELECT id, parent_id, name, is_actionable FROM my_table) TO /path/to/taxonomy.csv WITH CSV HEADER
The CSV file would then contain data like this:
    "id","parent_id","name","is_actionable"
    "1","","Alpha","0"
    "2","","Beta","0"
    "3","1","Epsilon","0"
    "4","3","Kappa","1"
    "5","1","Zeta","0"
    "6","5","Lambda","1"
    "7","5","Mu","0"
    "8","2","Eta","1"
    "9","2","Theta","1"
Next, create a Parse::Taxonomy::Index object from this source file, and then apply the pathify() method to it with the as_string option set to a true value.
    $source = "/path/to/taxonomy.csv";
    $obj = Parse::Taxonomy::Index->new( {
        file => $source,
    } );
    $rv = $obj->pathify( { as_string => 1 } );
That returns a Perl reference to an array of array references:
    [
      ["path", "is_actionable"],
      ["|Alpha", 0],
      ["|Beta", 0],
      ["|Alpha|Epsilon", 0],
      ["|Alpha|Epsilon|Kappa", 1],
      ["|Alpha|Zeta", 0],
      ["|Alpha|Zeta|Lambda", 1],
      ["|Alpha|Zeta|Mu", 0],
      ["|Beta|Eta", 1],
      ["|Beta|Theta", 1],
    ]
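To see how a single path column can be derived from id, parent_id and name, you can walk the parent_id links by hand. The following is a minimal standalone sketch in core Perl, not the library's implementation. The helper name `index_to_paths` is hypothetical, and the sketch assumes a validated taxonomy in which every parent_id refers to an existing record and there are no cycles.

```perl
use strict;
use warnings;

# Hypothetical helper: build a "|A|B|C" style path for every record.
# Each record is an array ref of [ id, parent_id, name ].
sub index_to_paths {
    my ($separator, @records) = @_;
    my %record_for = map { $_->[0] => $_ } @records;
    my %path_for;
    for my $rec (@records) {
        my @names;
        my $current = $rec;
        # Climb from the node to the root, collecting names on the way up.
        while ($current) {
            unshift @names, $current->[2];
            my $parent_id = $current->[1];
            $current = length($parent_id) ? $record_for{$parent_id} : undef;
        }
        $path_for{ $rec->[0] } = $separator . join($separator, @names);
    }
    return \%path_for;
}

# Records copied from the exported CSV (is_actionable omitted).
my @db_records = (
    [ 1, '', 'Alpha'   ],
    [ 2, '', 'Beta'    ],
    [ 3, 1,  'Epsilon' ],
    [ 4, 3,  'Kappa'   ],
    [ 5, 1,  'Zeta'    ],
    [ 6, 5,  'Lambda'  ],
    [ 7, 5,  'Mu'      ],
    [ 8, 2,  'Eta'     ],
    [ 9, 2,  'Theta'   ],
);

my $path_for = index_to_paths('|', @db_records);
print "$_ => $path_for->{$_}\n" for sort { $a <=> $b } keys %{$path_for};
```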
TK