NAME

data2frequency.pl

A script to convert data into a frequency distribution, useful for graphing.

SYNOPSIS

data2frequency.pl --bins <integer> --size <number> <filename>

data2frequency.pl --bins <integer> --max <number> <filename>

data2frequency.pl --size <number> --max <number> <filename>

  Options:
  --in <filename>
  --bins <integer>
  --size <number>
  --index <list|range>
  --min <number>
  --max <number>
  --out <filename>
  --version
  --help

OPTIONS

The command line flags and descriptions:

--in <filename>: Specify an input file containing either a list of database features or genomic coordinates for which to collect data. The file should be a tab-delimited text file, one row per feature, with columns representing feature identifiers, attributes, coordinates, and/or data values. The first row should be column headers. Text files generated by other BioToolBox scripts are acceptable. Files may be gzip compressed.
--bins <integer>: Specify the number of bins or partitions into which the data will be grouped. This argument is optional if --max and --size are provided.
--size <number>: Specify the size of each bin or partition. A decimal number may be provided. This argument is optional if --bins and --max are provided.
--min <number>: Optionally indicate the minimum value of the bins. When generating the list of bins, this is used as the starting value. All values less than this value will be ignored. The default is 0. A negative number may be provided using the format --min=-1.
--max <number>: Specify the maximum bin value. All values greater than this value will be ignored. This argument is optional if --bins and --size are provided.
--index <list|range>: Specify the datasets in the input data file to be converted to a distribution. The 0-based column number of the datasets should be provided. Multiple datasets may be provided as a comma-delimited list, as a consecutive list (start-stop), or a combination of both. Do not include spaces! If no datasets are provided, the program will interactively present to the user a list of possible datasets to convert.
--out <filename>: Specify the output file name. The default is to take the input file base name and append '_frequency' to it.
--version: Print the version number.
--help: Display this help.

DESCRIPTION

This program will convert one or more datasets in a data file into a frequency distribution. The distribution may then be used to conveniently plot a histogram using a program such as 'graph_profile.pl'.

Set the distribution parameters using the --bins and --size arguments, which set the number of bins and the size of each bin, respectively. The starting value and the maximum bin value may optionally be set as well with --min and --max. Any two of --bins, --size, and --max are sufficient to define the bins.
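The relationship between these arguments can be sketched in a few lines of Python. This is only an illustration, not the script's actual code: the function name resolve_bins is hypothetical, and the sketch assumes that the maximum equals the minimum plus the number of bins times the bin size, which the documentation implies but does not state.

```python
import math


def resolve_bins(bins=None, size=None, maximum=None, minimum=0):
    """Return (bins, size, maximum) given at least two of the three.

    Assumption (not confirmed by the script's documentation):
    maximum == minimum + bins * size.
    """
    given = sum(v is not None for v in (bins, size, maximum))
    if given < 2:
        raise ValueError("provide at least two of bins, size, maximum")
    if maximum is None:
        # --bins and --size given: derive the maximum bin value
        maximum = minimum + bins * size
    elif size is None:
        # --bins and --max given: derive the bin size
        size = (maximum - minimum) / bins
    elif bins is None:
        # --size and --max given: derive the number of bins, rounding up
        bins = math.ceil((maximum - minimum) / size)
    return bins, size, maximum
```

For example, under these assumptions resolve_bins(size=0.5, maximum=10) yields 20 bins, which would correspond to running the script with --size 0.5 --max 10.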

One or more datasets within the data file may be converted. These may be specified on the command line or chosen interactively from a list presented to the user.
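As an illustration of the --index syntax, the following Python sketch expands a specification such as 1,3-5,7 into a list of 0-based column numbers. The function name parse_index is hypothetical, and the sketch assumes that a start-stop range includes both endpoints; the script's own parser may behave differently.

```python
def parse_index(spec):
    """Expand a string such as '1,3-5,7' into a list of column indices.

    Assumption: ranges written as start-stop include both endpoints.
    Spaces are not permitted, matching the documented --index format.
    """
    indices = []
    for part in spec.split(","):
        if "-" in part:
            start, stop = part.split("-")
            indices.extend(range(int(start), int(stop) + 1))
        else:
            indices.append(int(part))
    return indices
```

Here parse_index("1,3-5,7") gives columns 1, 3, 4, 5, and 7.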

A tab-delimited text file will be written as output. The bin values are listed in the first column, and the number of data points within each bin is listed in a subsequent column for each dataset requested.
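The counting step can be pictured with the Python sketch below. It is a minimal illustration under stated assumptions, not the script's implementation: the function name frequency is hypothetical, each bin is labeled by its upper edge, a value equal to the maximum is counted in the last bin, and values below the minimum or above the maximum are ignored as described under --min and --max.

```python
def frequency(values, bins, size, minimum=0):
    """Count values per bin and return a list of (bin_label, count) pairs.

    Assumptions (not confirmed by the script's documentation):
      - each bin is labeled by its upper edge
      - a value equal to the maximum falls in the last bin
    """
    maximum = minimum + bins * size
    counts = [0] * bins
    for v in values:
        if v < minimum or v > maximum:
            continue  # out-of-range values are ignored
        # clamp so that v == maximum lands in the last bin
        i = min(int((v - minimum) / size), bins - 1)
        counts[i] += 1
    return [(minimum + (i + 1) * size, c) for i, c in enumerate(counts)]


# One row per bin: the bin value first, then the count for the dataset.
for label, count in frequency([0, 1, 2.5, 5, 9.9, 10, 11, -1], bins=2, size=5):
    print(f"{label}\t{count}")
```

With several datasets, the same bin labels would be shared and each dataset would contribute one additional count column.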

AUTHOR

 Timothy J. Parnell, PhD
 Howard Hughes Medical Institute
 Dept of Oncological Sciences
 Huntsman Cancer Institute
 University of Utah
 Salt Lake City, UT, 84112

This package is free software; you can redistribute it and/or modify it under the terms of the Artistic License 2.0.
