The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Parse::Taxonomy::Cookbook - cookbook for Parse-Taxonomy

DESCRIPTION

This file is a cookbook holding usage examples -- recipes -- for various Parse::Taxonomy subclasses.

The documentation herein presumes that you have already studied the documentation in Parse::Taxonomy, Parse::Taxonomy::Path, Parse::Taxonomy::Index, etc.

RECIPES

Validate a taxonomy-by-path

Problem

You have a CSV file which you have been told is a taxonomy-by-path. You want to confirm its validity.

Let's say the file holds these records:

    $> cat ./proposed_taxonomy.csv
    "path","vertical","is_actionable"
    "|Alpha","Auto",,"0"
    "|Alpha|Epsilon|Kappa","Auto","0"
    "|Alpha|Epsilon|Kappa","Auto","1"
    "|Alpha|Zeta","Auto","0"
    "|Alpha|Zeta|Lambda","Auto","1"
    "|Alpha|Zeta|Mu","Auto","0"

Solution

Try to create a Parse::Taxonomy::Path object using the file interface.

    local $@;
    eval {
        $obj = Parse::Taxonomy::Path->new( {
            file => './proposed_taxonomy.csv',
        } );
    };
    print STDERR "$@\n";

If $obj is created successfully, the taxonomy meets the requirements described in Parse::Taxonomy. This particular file, however, will throw an exception. Examination of the content of $@ will show that two records have the same materialized path, i.e., the same value in the path column.

Validate a taxonomy-by-index

Problem

You have a CSV file which you have been told is a taxonomy-by-index. You want to confirm its validity.

Let's say the file holds these records:

    $> cat ./proposed_taxonomy_by_index.csv
    "id","parent_id","name","vertical","is_actionable"
    "1","","Alpha","Auto","0"
    "2","1","Epsilon","Auto","0"
    "3","2","Kappa","Auto","1"
    "4","2","Kappa","Auto","1"
    "5","1","Zeta","Auto","0"
    "6","5","Lambda","Auto","1"
    "7","5","Mu","Auto","1"

Solution

Try to create a Parse::Taxonomy::Index object using the file interface.

    local $@;
    eval {
        $obj = Parse::Taxonomy::Index->new( {
            file => './proposed_taxonomy_by_index.csv',
        } );
    };
    print STDERR "$@\n";

If $obj is created successfully, the taxonomy meets the requirements described in Parse::Taxonomy. This particular file, however, will throw an exception. Examination of the content of $@ will show that two records with the same parent_id have the same name.

Apply extra validations to a taxonomy

Problem

You have a taxonomy file from which you have successfully created a Parse::Taxonomy::Path object. From that you know that it is valid with respect to the requirements for a taxonomy imposed by this library. But you have additional business requirements which a taxonomy must fulfill before you can use the taxonomy in production.

Suppose that you have a taxonomy file with this data:

    $> cat local_requirement.csv
    "path","is_actionable"
    "|Alpha","0"
    "|Beta","0"
    "|Alpha|Epsilon","0"
    "|Alpha|Epsilon|Kappa","1"
    "|Alpha|Zeta","0"
    "|Alpha|Zeta|Lambda","1"
    "|Alpha|Zeta|Mu","0"
    "|Beta|Eta","1"
    "|Beta|Theta","1"
    "|Beta|Iota","0"

Suppose further that you have a business requirement that all nodes which are "pure" leaf nodes -- all nodes which have no children of their own -- have a true value for is_actionable.

Solution

Use Parse::Taxonomy:::Path accessor methods to get at the data in the taxonomy and write your own functions to conduct local validations.

In this case:

    $obj = Parse::Taxonomy::Path->new( {
        file    => 'local_requirement.csv',
    } );
    $hashified       = $obj->hashify();
    $child_counts    = $obj->child_counts();

Use hashify() to turn the taxonomy into a hash. Use child_counts() to get the number of children each node has. Then iterate over the hash checking whether an element has no children and, if so, whether the node's is_actionable setting is true.

    @non_actionable_leaf_nodes = ();
    for my $node (keys %{$hashified}) {
        if (
            ($child_counts->{$node} == 0) &&
            (! $hashified->{$node}->{is_actionable})
        ) {
            push @non_actionable_leaf_nodes, $node;
        }
    }
    warn "leaf node '$_' is non-actionable"
        for @non_actionable_leaf_nodes;

Output will resemble:

    leaf node '|Alpha|Zeta|Mu' is non-actionable at ...
    leaf node '|Beta|Iota' is non-actionable at ...

You can then decide how to handle this per your business requirements.

Convert a taxonomy-by-path to a taxonomy-by-index

Problem

You have a file which holds a validated taxonomy-by-path and you want to create a file which holds the equivalent taxonomy-by-index.

Suppose you have a file with this data:

    "path","is_actionable"
    "|Alpha","0"
    "|Beta","0"
    "|Alpha|Epsilon","0"
    "|Alpha|Epsilon|Kappa","1"
    "|Alpha|Zeta","0"
    "|Alpha|Zeta|Lambda","1"
    "|Alpha|Zeta|Mu","1"
    "|Beta|Eta","1"
    "|Beta|Theta","1"
    "|Beta|Iota","1"

Solution

Use the indexify() and write_indexified_to_csv() methods.

    $indexified = $obj->indexify();
    $file_taxonomy_by_index = $obj->write_indexified_to_csv($indexified);

The file's whose path is stored in $file_taxonomy_by_index will look like this:

    id,parent_id,name,is_actionable
    1,,Alpha,0
    2,,Beta,0
    3,1,Epsilon,0
    4,1,Zeta,0
    5,2,Eta,1
    6,2,Theta,1
    7,2,Iota,1
    8,3,Kappa,1
    9,4,Lambda,1
    10,4,Mu,1

Convert a taxonomy-by-index to a taxonomy-by-path

Problem

In a relational database (RDB), you have hierarchical data stored in a flat table by way of id, parent_id and name columns. You need to communicate the current status of that taxonomy to someone who is familiar with CSV-formatted data and who would like to see the structure in that taxonomy expressed in a single column.

Solution

First you need to get the data out of the RDB and into a text file. For that you might use a command-line language appropriate for that in RDB. For example, in psql, the command-line language associated with PostgreSQL, you would say:

    $> \copy (SELECT id, parent_id, name, is actionable FROM my_table) TO /path/to/taxonomy.csv WITH CSV HEADERS

The CSV file would then contain data like this:

    "id","parent_id","name","is_actionable"
    "1","","Alpha","0"
    "2","","Beta","0"
    "3","1","Epsilon","0"
    "4","3","Kappa","1"
    "5","1","Zeta","0"
    "6","5","Lambda","1"
    "7","5","Mu","0"
    "8","2","Eta","1"
    "9","2","Theta","1"

First, create a Parse::Taxonomy::Index object from this source file, and then apply the pathify() method to it with the as_string option set to a true value.

    $source = "/path/to/taxonomy.csv";
    $obj = Parse::Taxonomy::Index->new( {
        file    => $source,
    } );

    $rv = $obj->pathify( { as_string => 1 } );

That returns a Perl reference to an array of array references:

    [
      ["path", "is_actionable"],
      ["|Alpha", 0],
      ["|Beta", 0],
      ["|Alpha|Epsilon", 0],
      ["|Alpha|Epsilon|Kappa", 1],
      ["|Alpha|Zeta", 0],
      ["|Alpha|Zeta|Lambda", 1],
      ["|Alpha|Zeta|Mu", 0],
      ["|Beta|Eta", 1],
      ["|Beta|Theta", 1],
    ]

Confirm that two taxonomies are equivalent

Problem

TK

Solution

TK

Predict the result of a data migration between relational database tables

Problem

TK

Solution

TK