NAME

graph_histogram.pl

A script to graph a histogram (bar or line) of one or more datasets.

SYNOPSIS

graph_histogram.pl --bins <integer> --size <number> <filename>

graph_histogram.pl --bins <integer> --max <number> <filename>

graph_histogram.pl --size <number> --max <number> <filename>

  Options:
  --in <filename>
  --index <column_index>
  --bins <integer>
  --size <number>
  --min <number>
  --max <number>
  --ymax <integer>
  --yticks <integer>
  --skip <integer>
  --offset <integer>
  --format <integer>
  --lines
  --out <base_filename>
  --dir <output_directory>
  --version
  --help

OPTIONS

The command line flags and descriptions:

--in <filename>

Specify an input file containing either a list of database features or genomic coordinates for which to collect data. The file should be a tab-delimited text file, one row per feature, with columns representing feature identifiers, attributes, coordinates, and/or data values. The first row should contain the column headers. Text files generated by other BioToolBox scripts are acceptable. Files may be gzip-compressed.
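As an illustration of the expected input layout, here is a minimal Python sketch that reads one 0-based column from such a file. It is not the script's own Perl code. The function name read_column, the ".gz" suffix test, and the choice to skip non-numeric cells are assumptions made for the example.

```python
import csv
import gzip


def read_column(path, index):
    """Return (header, values) for one 0-based column of a tab-delimited file."""
    # Assumption: a ".gz" suffix marks a gzip-compressed file.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", newline="") as handle:
        rows = csv.reader(handle, delimiter="\t")
        header = next(rows)  # the first row holds the column headers
        values = []
        for row in rows:
            try:
                values.append(float(row[index]))
            except (ValueError, IndexError):
                continue  # assumption: skip non-numeric or missing cells
    return header[index], values
```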

--index <column_index>

Specify the column number(s) corresponding to the dataset(s) in the file to graph. Indices are 0-based. Separate multiple datasets with commas. A range of indices may be specified with a dash between the first and last index of the inclusive range. Two datasets may also be graphed together; join their indices with an ampersand. For example, "2,4-6,5&6" will individually graph datasets 2, 4, 5, and 6, and will also produce a combined graph of datasets 5 and 6.

If no dataset indices are specified, then they may be chosen interactively from a list.
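The index syntax described above can be sketched in Python as follows. This is only an illustration of the comma, dash, and ampersand rules, not the script's own parser; the function name parse_index_spec and the tuple representation are hypothetical.

```python
def parse_index_spec(spec):
    """Expand a spec such as "2,4-6,5&6" into a list of graph requests.

    Each request is a tuple of 0-based column indices: one index for a
    single-dataset graph, two for a combined graph.
    """
    requests = []
    for item in spec.split(","):
        item = item.strip()
        if "&" in item:
            # Two datasets drawn together on one graph.
            first, second = item.split("&")
            requests.append((int(first), int(second)))
        elif "-" in item:
            # Inclusive range of indices, each graphed on its own.
            start, end = item.split("-")
            requests.extend((i,) for i in range(int(start), int(end) + 1))
        else:
            requests.append((int(item),))
    return requests
```

With the example from the text, parse_index_spec("2,4-6,5&6") yields four single-dataset requests (2, 4, 5, 6) followed by one combined request for 5 and 6.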

--bins <integer>

Specify the number of bins or partitions into which the data will be grouped. This argument is optional if --max and --size are provided.

--size <number>

Specify the size of each bin or partition. A decimal number may be provided. This argument is optional if --bins and --max are provided.

--min <number>

Optionally indicate the minimum value of the bins. When generating the list of bins, this is used as the starting value. Default is 0. A negative number may be provided using the format --min=-1.

--max <number>

Specify the maximum bin value. This argument is optional if --bins and --size are provided.
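Since any two of --bins, --size, and --max determine the third, the relationship can be sketched as below. This Python sketch assumes that the bins are contiguous and of equal width, so that max = min + bins * size; the script's exact arithmetic and rounding may differ, and resolve_bins is a hypothetical name.

```python
import math


def resolve_bins(bins=None, size=None, max_value=None, min_value=0.0):
    """Fill in whichever of bins, size, and max_value was left unset."""
    given = sum(v is not None for v in (bins, size, max_value))
    if given < 2:
        raise ValueError("provide at least two of bins, size, and max")
    if max_value is None:
        max_value = min_value + bins * size
    elif size is None:
        size = (max_value - min_value) / bins
    elif bins is None:
        # Round away floating-point noise, then round up so the last
        # bin still reaches max_value.
        bins = math.ceil(round((max_value - min_value) / size, 9))
    return bins, size, max_value
```

For example, 10 bins of size 0.5 starting at the default minimum of 0 give a maximum of 5.0, and a size of 0.5 with a maximum of 5 gives 10 bins.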

--ymax <integer>

Specify the maximum Y axis value. The default is automatically determined.

--yticks <integer>

Explicitly specify the number of major ticks for the Y axis. The default is 4.

--skip <integer>

Specify the interval at which X axis major ticks are labeled. This avoids overlapping labels. The default is 4 (every 4th tick is labeled).

--offset <integer>

Specify the number of X axis ticks to skip at the beginning before starting to label them. This may help in adjusting the look of the graph. The default is 0.
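One plausible reading of how --skip and --offset interact is sketched below in Python: labeling starts at the tick given by the offset and then continues at every skip-th tick. This is an assumption for illustration; the graphing library used by the script may handle edge cases (such as the final tick) differently. The name labeled_ticks is hypothetical.

```python
def labeled_ticks(tick_count, skip=4, offset=0):
    """Return the 0-based positions of the X axis ticks that get a label."""
    return [
        i
        for i in range(tick_count)
        # Assumption: labeling begins at `offset`, then every `skip` ticks.
        if i >= offset and (i - offset) % skip == 0
    ]
```

Under this reading, 10 ticks with the defaults label positions 0, 4, and 8, while an offset of 1 shifts the labels to positions 1, 5, and 9.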

--format <integer>

Specify the number of decimal places to which the X axis labels should be formatted. The default is the number of decimal places in the bin size parameter.
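The default can be pictured with the following Python sketch, which counts the digits after the decimal point in the bin size as it was typed on the command line. Both function names are hypothetical, and the script's actual formatting code may differ.

```python
def default_decimals(size_text):
    """Count the digits after the decimal point in the bin size as typed."""
    _, _, fraction = size_text.partition(".")
    return len(fraction)


def format_label(value, decimals):
    """Format a bin value for the X axis with a fixed number of decimals."""
    return f"{value:.{decimals}f}"
```

A bin size of "0.25" therefore gives two decimal places, so a bin value of 1.5 would be labeled "1.50".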

--lines

Optionally specify a line graph to be generated instead of the default vertical bar graph.

--out <base_filename>

Optionally specify the output filename prefix. The default value is "distribution_".

--dir <output_directory>

Optionally specify the name of the target directory to place the graphs. The default value is the basename of the input file appended with "_graphs".

--version

Print the version number.

--help

Print this help documentation.

DESCRIPTION

This program will generate PNG graphic files representing the histogram of the values in one or two datasets. Any two of the bin size, the number of bins, and the maximum bin value must be provided; the third is derived from them. By default, the resulting files are written to a subdirectory named after the input file. The files are named after the dataset name (column header) with a prefix.
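The binning step behind such a histogram can be sketched in Python as follows. The sketch assumes equal-width bins, with bin i covering values from min + i * size up to but not including min + (i + 1) * size, and it ignores values outside the bin range. How the script itself treats bin edges and out-of-range values is not stated here, so treat these as assumptions.

```python
import math


def histogram(values, bins, size, min_value=0.0):
    """Count how many values fall into each of `bins` equal-width bins."""
    counts = [0] * bins
    for value in values:
        position = math.floor((value - min_value) / size)
        # Assumption: values below min_value or beyond the last bin are ignored.
        if 0 <= position < bins:
            counts[position] += 1
    return counts
```

For instance, the values 0.1, 0.4, 0.5, 1.2, and 3.0 with three bins of size 0.5 produce counts of 2, 1, and 1, and the value 3.0 is left out.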

AUTHOR

 Timothy J. Parnell, PhD
 Howard Hughes Medical Institute
 Dept of Oncological Sciences
 Huntsman Cancer Institute
 University of Utah
 Salt Lake City, UT, 84112

This package is free software; you can redistribute it and/or modify it under the terms of the Artistic License 2.0.