NAME

Graphics::Skullplot::ClassifyColumns - simple type inference of columns of tabular data

VERSION

Version 0.01

SYNOPSIS

use Graphics::Skullplot::ClassifyColumns;

my $cc = Graphics::Skullplot::ClassifyColumns->new( data => $data );  
my $plot_cols = 
  $cc->classify_columns_simple( { indie_count => $indie_count, } );

DESCRIPTION

Graphics::Skullplot::ClassifyColumns is a stripped-down version of an old experimental module I was developing, called Data::Classify. I expect to go back to that project and develop a more elaborate system of plug-ins to target different kinds of databases and so on, most likely named Table::TypeInference.

This particular module just needs a "classify_columns_simple" routine that works well enough to figure out how to plot some data via ggplot2 in R (i.e. the "Graphics::Skullplot" project).

new

Creates a new Graphics::Skullplot::ClassifyColumns object.

Takes a hashref as an argument, with named fields identical to the names of the object attributes. These attributes are:

data: A required field, columns of data as an array of array references, with a header in the first row.
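
As an illustration of the expected layout, here is a small sketch of such a data structure. The column names and values are made up for the example, and "sample_data" is a hypothetical helper, not part of this module.

```perl
use strict;
use warnings;

# Hypothetical sample data: the first row is the header,
# and each following row is one record.
sub sample_data {
    return [
        [ 'month',   'region', 'sales' ],
        [ '2018-01', 'east',   120 ],
        [ '2018-01', 'west',    95 ],
        [ '2018-02', 'east',   143 ],
    ];
}

# Split the header off from the data rows.
my ( $header, @rows ) = @{ sample_data() };
printf "%d columns, %d data rows\n", scalar @{ $header }, scalar @rows;
```

A structure like this is what would be passed as the "data" field.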

classify_columns_simple

Note: here "simple" might be thought of as "stub". This does the simplest possible categorization, using only a single numeric hint for the number of independent fields.

The presumption here is that the incoming data is organized like the output of a typical SQL GROUP BY select: the x-axis in the first column, a number of columns of dependent data at the end, and (possibly) a certain number of categorical variables (ones with a small number of allowed values) in between.

This returns a hash indicating how different columns should be handled in the plotting stage. The keys are:

x           (rename: indie_x)
y           (only when there is just one dependent)
gb_cats
dep_fields  (rename: dependents_y)
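
One plausible reading of the layout described above can be sketched as follows. This is not the module's code: "partition_columns" is a hypothetical helper, and it assumes that indie_count counts the x column plus the categorical columns that follow it, defaulting to 1.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's implementation: split a header
# row into plotting roles. Assumes indie_count covers the x column plus
# any categorical columns after it.
sub partition_columns {
    my ( $header, $indie_count ) = @_;
    my @cols = @{ $header };
    $indie_count = 1 unless defined $indie_count;    # assumed default

    my $x       = $cols[0];
    my @gb_cats = @cols[ 1 .. $indie_count - 1 ];       # empty when indie_count is 1
    my @deps    = @cols[ $indie_count .. $#cols ];

    my %plot = ( x => $x, gb_cats => \@gb_cats, dep_fields => \@deps );
    $plot{y} = $deps[0] if @deps == 1;    # y only when there is one dependent
    return \%plot;
}

my $plot = partition_columns( [ 'month', 'region', 'sales' ], 2 );
print "x=$plot->{x} gb_cats=@{ $plot->{gb_cats} } y=$plot->{y}\n";
```

Under these assumptions, a three-column table with an indie_count of 2 yields one x column, one categorical column, and a single dependent that is also reported as y.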

Example usage:

my $cc = Graphics::Skullplot::ClassifyColumns->new( data => $data );  
my $opt = { indie_count => 1, };
my $plot_cols_href = 
  $cc->classify_columns_simple( $opt );

column_types

Given a reference to tabular data in an array-of-arrays format (with a header expected in the first row), tries to infer the rough data type of each column.

Returns a list (or aref) of the type codes, in sequence.
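
A minimal sketch of per-column inference, assuming a majority vote over the cells of each column. Whether the module votes this way is an assumption. "rough_type" and "rough_column_types" are hypothetical names, and the stand-in classifier only separates numbers from strings.

```perl
use strict;
use warnings;
use Scalar::Util qw( looks_like_number );

# Hypothetical stand-in for the per-value classifier:
# numbers versus strings only.
sub rough_type {
    my ($value) = @_;
    return looks_like_number($value) ? ':NUMBER:' : ':STRING:';
}

# For each column, tally the type of every data cell and keep the most
# frequent one (ties broken alphabetically, just to be deterministic).
sub rough_column_types {
    my ($data) = @_;
    my ( $header, @rows ) = @{ $data };
    my @types;
    for my $i ( 0 .. $#{ $header } ) {
        my %count;
        $count{ rough_type( $_->[$i] ) }++ for @rows;
        my ($top) = sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count;
        push @types, $top;
    }
    return \@types;
}

my $types = rough_column_types( [
    [ 'region', 'sales' ],
    [ 'east',   120 ],
    [ 'west',   'n/a' ],
    [ 'east',   143 ],
] );
print join( ' ', @{ $types } ), "\n";
```

Voting per column means that a stray non-numeric cell such as 'n/a' does not turn an otherwise numeric column into a string column.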

classify

A wrapper around Scalar::Classify's "classify", which also subdivides the string category, looking for datetime types.

The type is most often (but not limited to) one of the following:

ARRAY
HASH
:NUMBER:
:STRING:

This code examines any string values to see if a date/time code is more appropriate:

:DATE: 
:DATETIME: 
:TIME:
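
The string refinement step might look something like the sketch below. The patterns are assumptions that only cover ISO-8601-like values (YYYY-MM-DD and HH:MM or HH:MM:SS); the module's own regexps may well differ. "refine_string_type" is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical refinement of a value already known to be a string.
# The date and time patterns here are assumed, not taken from the module.
sub refine_string_type {
    my ($str) = @_;
    my $date = qr/\d{4}-\d{2}-\d{2}/;
    my $time = qr/\d{2}:\d{2}(?::\d{2})?/;

    # Check the combined form first, so a datetime is not mistaken for a date.
    return ':DATETIME:' if $str =~ /\A $date (?: T | [ ] ) $time \z/x;
    return ':DATE:'     if $str =~ /\A $date \z/x;
    return ':TIME:'     if $str =~ /\A $time \z/x;
    return ':STRING:';
}

for my $value ( '2018-05-22', '2018-05-22 14:30:00', '14:30', 'east' ) {
    printf "%-20s %s\n", $value, refine_string_type($value);
}
```

Anchoring each pattern with \A and \z keeps a value that merely contains a date, such as a log message, classified as a plain string.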

most_common

Given a hash of numeric counts, returns the key of the maximum count.

In the case of a tie, the return value will be one of the tied keys; which one is undefined.
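
The behavior described can be sketched like this. "most_common_key" is a hypothetical name used for illustration and is not the module's own routine.

```perl
use strict;
use warnings;

# Return the key with the largest count, or undef for an empty hash.
sub most_common_key {
    my ($counts) = @_;
    my ( $best_key, $best_count );
    for my $key ( keys %{ $counts } ) {
        # Skip keys that do not beat the current best.
        next if defined $best_count && $counts->{$key} <= $best_count;
        ( $best_key, $best_count ) = ( $key, $counts->{$key} );
    }
    return $best_key;
}

print most_common_key( { ':NUMBER:' => 7, ':STRING:' => 2 } ), "\n";
```

Because Perl does not guarantee the order in which hash keys are visited, a loop like this naturally returns an arbitrary one of the tied keys.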

define_regxeps

Generates a hashref of locally useful regexps.

These are mostly intended to identify dates and times. TODO just look up existing solutions, e.g. Regexp::Common.

AUTHOR

Joseph Brenner, <doom@kzsu.stanford.edu>, 22 May 2018

COPYRIGHT AND LICENSE

This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.

No warranty is provided with this code.

See http://dev.perl.org/licenses/ for more information.

