NAME

GenomeAnnotation

DESCRIPTION

METHODS

genomeTO_to_reconstructionTO

  $return = $obj->genomeTO_to_reconstructionTO($genomeTO)
Parameter and return types
$genomeTO is a genomeTO
$return is a reconstructionTO
genomeTO is a reference to a hash where the following keys are defined:
	id has a value which is a genome_id
	scientific_name has a value which is a string
	domain has a value which is a string
	genetic_code has a value which is an int
	source has a value which is a string
	source_id has a value which is a string
	close_genomes has a value which is a reference to a list where each element is a reference to a list containing 2 items:
	0: a genome_id
	1: a float

	DNA_kmer_data has a value which is a reference to a list where each element is a reference to a list containing 5 items:
	0: a string
	1: an int
	2: an int
	3: an int
	4: a string

	contigs has a value which is a reference to a list where each element is a contig
	features has a value which is a reference to a list where each element is a feature
genome_id is a string
contig is a reference to a hash where the following keys are defined:
	id has a value which is a contig_id
	dna has a value which is a string
contig_id is a string
feature is a reference to a hash where the following keys are defined:
	id has a value which is a feature_id
	location has a value which is a location
	type has a value which is a feature_type
	function has a value which is a string
	protein_translation has a value which is a string
	aliases has a value which is a reference to a list where each element is a string
	annotations has a value which is a reference to a list where each element is an annotation
feature_id is a string
location is a reference to a list where each element is a region_of_dna
region_of_dna is a reference to a list containing 4 items:
	0: a contig_id
	1: an int
	2: a string
	3: an int
feature_type is a string
annotation is a reference to a list containing 3 items:
	0: a string
	1: a string
	2: an int
reconstructionTO is a reference to a hash where the following keys are defined:
	subsystems has a value which is a variant_subsystem_pairs
	bindings has a value which is a fid_role_pairs
	assignments has a value which is a fid_function_pairs
variant_subsystem_pairs is a reference to a list where each element is a variant_of_subsystem
variant_of_subsystem is a reference to a list containing 2 items:
	0: a subsystem
	1: a variant
subsystem is a string
variant is a string
fid_role_pairs is a reference to a list where each element is a fid_role_pair
fid_role_pair is a reference to a list containing 2 items:
	0: a fid
	1: a role
fid is a string
role is a string
fid_function_pairs is a reference to a list where each element is a fid_function_pair
fid_function_pair is a reference to a list containing 2 items:
	0: a fid
	1: a function
function is a string

Description

genomeTO_to_feature_data

  $return = $obj->genomeTO_to_feature_data($genomeTO)
Parameter and return types
$genomeTO is a genomeTO
$return is a fid_data_tuples
genomeTO is a reference to a hash where the following keys are defined:
	id has a value which is a genome_id
	scientific_name has a value which is a string
	domain has a value which is a string
	genetic_code has a value which is an int
	source has a value which is a string
	source_id has a value which is a string
	close_genomes has a value which is a reference to a list where each element is a reference to a list containing 2 items:
	0: a genome_id
	1: a float

	DNA_kmer_data has a value which is a reference to a list where each element is a reference to a list containing 5 items:
	0: a string
	1: an int
	2: an int
	3: an int
	4: a string

	contigs has a value which is a reference to a list where each element is a contig
	features has a value which is a reference to a list where each element is a feature
genome_id is a string
contig is a reference to a hash where the following keys are defined:
	id has a value which is a contig_id
	dna has a value which is a string
contig_id is a string
feature is a reference to a hash where the following keys are defined:
	id has a value which is a feature_id
	location has a value which is a location
	type has a value which is a feature_type
	function has a value which is a string
	protein_translation has a value which is a string
	aliases has a value which is a reference to a list where each element is a string
	annotations has a value which is a reference to a list where each element is an annotation
feature_id is a string
location is a reference to a list where each element is a region_of_dna
region_of_dna is a reference to a list containing 4 items:
	0: a contig_id
	1: an int
	2: a string
	3: an int
feature_type is a string
annotation is a reference to a list containing 3 items:
	0: a string
	1: a string
	2: an int
fid_data_tuples is a reference to a list where each element is a fid_data_tuple
fid_data_tuple is a reference to a list containing 4 items:
	0: a fid
	1: an md5
	2: a location
	3: a function
fid is a string
md5 is a string
function is a string

Description

reconstructionTO_to_roles

  $return = $obj->reconstructionTO_to_roles($reconstructionTO)
Parameter and return types
$reconstructionTO is a reconstructionTO
$return is a reference to a list where each element is a role
reconstructionTO is a reference to a hash where the following keys are defined:
	subsystems has a value which is a variant_subsystem_pairs
	bindings has a value which is a fid_role_pairs
	assignments has a value which is a fid_function_pairs
variant_subsystem_pairs is a reference to a list where each element is a variant_of_subsystem
variant_of_subsystem is a reference to a list containing 2 items:
	0: a subsystem
	1: a variant
subsystem is a string
variant is a string
fid_role_pairs is a reference to a list where each element is a fid_role_pair
fid_role_pair is a reference to a list containing 2 items:
	0: a fid
	1: a role
fid is a string
role is a string
fid_function_pairs is a reference to a list where each element is a fid_function_pair
fid_function_pair is a reference to a list containing 2 items:
	0: a fid
	1: a function
function is a string

Description

reconstructionTO_to_subsystems

  $return = $obj->reconstructionTO_to_subsystems($reconstructionTO)
Parameter and return types
$reconstructionTO is a reconstructionTO
$return is a variant_subsystem_pairs
reconstructionTO is a reference to a hash where the following keys are defined:
	subsystems has a value which is a variant_subsystem_pairs
	bindings has a value which is a fid_role_pairs
	assignments has a value which is a fid_function_pairs
variant_subsystem_pairs is a reference to a list where each element is a variant_of_subsystem
variant_of_subsystem is a reference to a list containing 2 items:
	0: a subsystem
	1: a variant
subsystem is a string
variant is a string
fid_role_pairs is a reference to a list where each element is a fid_role_pair
fid_role_pair is a reference to a list containing 2 items:
	0: a fid
	1: a role
fid is a string
role is a string
fid_function_pairs is a reference to a list where each element is a fid_function_pair
fid_function_pair is a reference to a list containing 2 items:
	0: a fid
	1: a function
function is a string

Description

annotate_genome

  $return = $obj->annotate_genome($genomeTO)
Parameter and return types
$genomeTO is a genomeTO
$return is a genomeTO
genomeTO is a reference to a hash where the following keys are defined:
	id has a value which is a genome_id
	scientific_name has a value which is a string
	domain has a value which is a string
	genetic_code has a value which is an int
	source has a value which is a string
	source_id has a value which is a string
	close_genomes has a value which is a reference to a list where each element is a reference to a list containing 2 items:
	0: a genome_id
	1: a float

	DNA_kmer_data has a value which is a reference to a list where each element is a reference to a list containing 5 items:
	0: a string
	1: an int
	2: an int
	3: an int
	4: a string

	contigs has a value which is a reference to a list where each element is a contig
	features has a value which is a reference to a list where each element is a feature
genome_id is a string
contig is a reference to a hash where the following keys are defined:
	id has a value which is a contig_id
	dna has a value which is a string
contig_id is a string
feature is a reference to a hash where the following keys are defined:
	id has a value which is a feature_id
	location has a value which is a location
	type has a value which is a feature_type
	function has a value which is a string
	protein_translation has a value which is a string
	aliases has a value which is a reference to a list where each element is a string
	annotations has a value which is a reference to a list where each element is an annotation
feature_id is a string
location is a reference to a list where each element is a region_of_dna
region_of_dna is a reference to a list containing 4 items:
	0: a contig_id
	1: an int
	2: a string
	3: an int
feature_type is a string
annotation is a reference to a list containing 3 items:
	0: a string
	1: a string
	2: an int

Description

Given a genome object populated with contig data, perform gene calling and functional annotation and return the annotated genome. NOTE: Many of these "transformations" modify the input hash in place and return a reference to that same hash rather than a copy, so the caller's original object is changed as well.
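
The note above can be illustrated with a small sketch. The sub fake_annotate is a hypothetical stand-in for a call such as $obj->annotate_genome($genomeTO), not the real service, and the feature it adds is invented. It only mimics the behaviour described in the note, and shows one way to keep an untouched copy with dclone from the core Storable module.

```perl
use strict;
use warnings;
use Storable qw(dclone);

# Hypothetical stand-in for a call such as $obj->annotate_genome($genomeTO).
# It mimics the behaviour the note warns about: it edits the hash it is
# given and returns the same reference.
sub fake_annotate {
    my ($genomeTO) = @_;
    push @{ $genomeTO->{features} }, { id => 'demo.peg.1', type => 'CDS' };
    return $genomeTO;
}

my $genomeTO = { id => 'demo', contigs => [], features => [] };

# Keep an independent deep copy before handing the object over.
my $original  = dclone($genomeTO);
my $annotated = fake_annotate($genomeTO);

# Comparing two references numerically compares their addresses.
my $same_ref        = ($annotated == $genomeTO) ? 1 : 0;
my $input_features  = scalar @{ $genomeTO->{features} };
my $backup_features = scalar @{ $original->{features} };

print "same reference: $same_ref\n";
print "input has $input_features feature(s); copy has $backup_features\n";
```

If the pre-annotation state matters later, taking the deep copy before the call is the safer order, since a copy made afterwards would already contain the changes.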

call_selenoproteins

  $return = $obj->call_selenoproteins($genomeTO)
Parameter and return types
$genomeTO is a genomeTO
$return is a genomeTO
genomeTO is a reference to a hash where the following keys are defined:
	id has a value which is a genome_id
	scientific_name has a value which is a string
	domain has a value which is a string
	genetic_code has a value which is an int
	source has a value which is a string
	source_id has a value which is a string
	close_genomes has a value which is a reference to a list where each element is a reference to a list containing 2 items:
	0: a genome_id
	1: a float

	DNA_kmer_data has a value which is a reference to a list where each element is a reference to a list containing 5 items:
	0: a string
	1: an int
	2: an int
	3: an int
	4: a string

	contigs has a value which is a reference to a list where each element is a contig
	features has a value which is a reference to a list where each element is a feature
genome_id is a string
contig is a reference to a hash where the following keys are defined:
	id has a value which is a contig_id
	dna has a value which is a string
contig_id is a string
feature is a reference to a hash where the following keys are defined:
	id has a value which is a feature_id
	location has a value which is a location
	type has a value which is a feature_type
	function has a value which is a string
	protein_translation has a value which is a string
	aliases has a value which is a reference to a list where each element is a string
	annotations has a value which is a reference to a list where each element is an annotation
feature_id is a string
location is a reference to a list where each element is a region_of_dna
region_of_dna is a reference to a list containing 4 items:
	0: a contig_id
	1: an int
	2: a string
	3: an int
feature_type is a string
annotation is a reference to a list containing 3 items:
	0: a string
	1: a string
	2: an int

Description

call_pyrrolysoproteins

  $return = $obj->call_pyrrolysoproteins($genomeTO)
Parameter and return types
$genomeTO is a genomeTO
$return is a genomeTO
genomeTO is a reference to a hash where the following keys are defined:
	id has a value which is a genome_id
	scientific_name has a value which is a string
	domain has a value which is a string
	genetic_code has a value which is an int
	source has a value which is a string
	source_id has a value which is a string
	close_genomes has a value which is a reference to a list where each element is a reference to a list containing 2 items:
	0: a genome_id
	1: a float

	DNA_kmer_data has a value which is a reference to a list where each element is a reference to a list containing 5 items:
	0: a string
	1: an int
	2: an int
	3: an int
	4: a string

	contigs has a value which is a reference to a list where each element is a contig
	features has a value which is a reference to a list where each element is a feature
genome_id is a string
contig is a reference to a hash where the following keys are defined:
	id has a value which is a contig_id
	dna has a value which is a string
contig_id is a string
feature is a reference to a hash where the following keys are defined:
	id has a value which is a feature_id
	location has a value which is a location
	type has a value which is a feature_type
	function has a value which is a string
	protein_translation has a value which is a string
	aliases has a value which is a reference to a list where each element is a string
	annotations has a value which is a reference to a list where each element is an annotation
feature_id is a string
location is a reference to a list where each element is a region_of_dna
region_of_dna is a reference to a list containing 4 items:
	0: a contig_id
	1: an int
	2: a string
	3: an int
feature_type is a string
annotation is a reference to a list containing 3 items:
	0: a string
	1: a string
	2: an int

Description

call_RNAs

  $return = $obj->call_RNAs($genomeTO)
Parameter and return types
$genomeTO is a genomeTO
$return is a genomeTO
genomeTO is a reference to a hash where the following keys are defined:
	id has a value which is a genome_id
	scientific_name has a value which is a string
	domain has a value which is a string
	genetic_code has a value which is an int
	source has a value which is a string
	source_id has a value which is a string
	close_genomes has a value which is a reference to a list where each element is a reference to a list containing 2 items:
	0: a genome_id
	1: a float

	DNA_kmer_data has a value which is a reference to a list where each element is a reference to a list containing 5 items:
	0: a string
	1: an int
	2: an int
	3: an int
	4: a string

	contigs has a value which is a reference to a list where each element is a contig
	features has a value which is a reference to a list where each element is a feature
genome_id is a string
contig is a reference to a hash where the following keys are defined:
	id has a value which is a contig_id
	dna has a value which is a string
contig_id is a string
feature is a reference to a hash where the following keys are defined:
	id has a value which is a feature_id
	location has a value which is a location
	type has a value which is a feature_type
	function has a value which is a string
	protein_translation has a value which is a string
	aliases has a value which is a reference to a list where each element is a string
	annotations has a value which is a reference to a list where each element is an annotation
feature_id is a string
location is a reference to a list where each element is a region_of_dna
region_of_dna is a reference to a list containing 4 items:
	0: a contig_id
	1: an int
	2: a string
	3: an int
feature_type is a string
annotation is a reference to a list containing 3 items:
	0: a string
	1: a string
	2: an int

Description

call_CDSs

  $return = $obj->call_CDSs($genomeTO)
Parameter and return types
$genomeTO is a genomeTO
$return is a genomeTO
genomeTO is a reference to a hash where the following keys are defined:
	id has a value which is a genome_id
	scientific_name has a value which is a string
	domain has a value which is a string
	genetic_code has a value which is an int
	source has a value which is a string
	source_id has a value which is a string
	close_genomes has a value which is a reference to a list where each element is a reference to a list containing 2 items:
	0: a genome_id
	1: a float

	DNA_kmer_data has a value which is a reference to a list where each element is a reference to a list containing 5 items:
	0: a string
	1: an int
	2: an int
	3: an int
	4: a string

	contigs has a value which is a reference to a list where each element is a contig
	features has a value which is a reference to a list where each element is a feature
genome_id is a string
contig is a reference to a hash where the following keys are defined:
	id has a value which is a contig_id
	dna has a value which is a string
contig_id is a string
feature is a reference to a hash where the following keys are defined:
	id has a value which is a feature_id
	location has a value which is a location
	type has a value which is a feature_type
	function has a value which is a string
	protein_translation has a value which is a string
	aliases has a value which is a reference to a list where each element is a string
	annotations has a value which is a reference to a list where each element is an annotation
feature_id is a string
location is a reference to a list where each element is a region_of_dna
region_of_dna is a reference to a list containing 4 items:
	0: a contig_id
	1: an int
	2: a string
	3: an int
feature_type is a string
annotation is a reference to a list containing 3 items:
	0: a string
	1: a string
	2: an int

Description

find_close_neighbors

  $return = $obj->find_close_neighbors($genomeTO)
Parameter and return types
$genomeTO is a genomeTO
$return is a genomeTO
genomeTO is a reference to a hash where the following keys are defined:
	id has a value which is a genome_id
	scientific_name has a value which is a string
	domain has a value which is a string
	genetic_code has a value which is an int
	source has a value which is a string
	source_id has a value which is a string
	close_genomes has a value which is a reference to a list where each element is a reference to a list containing 2 items:
	0: a genome_id
	1: a float

	DNA_kmer_data has a value which is a reference to a list where each element is a reference to a list containing 5 items:
	0: a string
	1: an int
	2: an int
	3: an int
	4: a string

	contigs has a value which is a reference to a list where each element is a contig
	features has a value which is a reference to a list where each element is a feature
genome_id is a string
contig is a reference to a hash where the following keys are defined:
	id has a value which is a contig_id
	dna has a value which is a string
contig_id is a string
feature is a reference to a hash where the following keys are defined:
	id has a value which is a feature_id
	location has a value which is a location
	type has a value which is a feature_type
	function has a value which is a string
	protein_translation has a value which is a string
	aliases has a value which is a reference to a list where each element is a string
	annotations has a value which is a reference to a list where each element is an annotation
feature_id is a string
location is a reference to a list where each element is a region_of_dna
region_of_dna is a reference to a list containing 4 items:
	0: a contig_id
	1: an int
	2: a string
	3: an int
feature_type is a string
annotation is a reference to a list containing 3 items:
	0: a string
	1: a string
	2: an int

Description

assign_functions_to_CDSs

  $return = $obj->assign_functions_to_CDSs($genomeTO)
Parameter and return types
$genomeTO is a genomeTO
$return is a genomeTO
genomeTO is a reference to a hash where the following keys are defined:
	id has a value which is a genome_id
	scientific_name has a value which is a string
	domain has a value which is a string
	genetic_code has a value which is an int
	source has a value which is a string
	source_id has a value which is a string
	close_genomes has a value which is a reference to a list where each element is a reference to a list containing 2 items:
	0: a genome_id
	1: a float

	DNA_kmer_data has a value which is a reference to a list where each element is a reference to a list containing 5 items:
	0: a string
	1: an int
	2: an int
	3: an int
	4: a string

	contigs has a value which is a reference to a list where each element is a contig
	features has a value which is a reference to a list where each element is a feature
genome_id is a string
contig is a reference to a hash where the following keys are defined:
	id has a value which is a contig_id
	dna has a value which is a string
contig_id is a string
feature is a reference to a hash where the following keys are defined:
	id has a value which is a feature_id
	location has a value which is a location
	type has a value which is a feature_type
	function has a value which is a string
	protein_translation has a value which is a string
	aliases has a value which is a reference to a list where each element is a string
	annotations has a value which is a reference to a list where each element is an annotation
feature_id is a string
location is a reference to a list where each element is a region_of_dna
region_of_dna is a reference to a list containing 4 items:
	0: a contig_id
	1: an int
	2: a string
	3: an int
feature_type is a string
annotation is a reference to a list containing 3 items:
	0: a string
	1: a string
	2: an int

Description

annotate_proteins

  $return = $obj->annotate_proteins($genomeTO)
Parameter and return types
$genomeTO is a genomeTO
$return is a genomeTO
genomeTO is a reference to a hash where the following keys are defined:
	id has a value which is a genome_id
	scientific_name has a value which is a string
	domain has a value which is a string
	genetic_code has a value which is an int
	source has a value which is a string
	source_id has a value which is a string
	close_genomes has a value which is a reference to a list where each element is a reference to a list containing 2 items:
	0: a genome_id
	1: a float

	DNA_kmer_data has a value which is a reference to a list where each element is a reference to a list containing 5 items:
	0: a string
	1: an int
	2: an int
	3: an int
	4: a string

	contigs has a value which is a reference to a list where each element is a contig
	features has a value which is a reference to a list where each element is a feature
genome_id is a string
contig is a reference to a hash where the following keys are defined:
	id has a value which is a contig_id
	dna has a value which is a string
contig_id is a string
feature is a reference to a hash where the following keys are defined:
	id has a value which is a feature_id
	location has a value which is a location
	type has a value which is a feature_type
	function has a value which is a string
	protein_translation has a value which is a string
	aliases has a value which is a reference to a list where each element is a string
	annotations has a value which is a reference to a list where each element is an annotation
feature_id is a string
location is a reference to a list where each element is a region_of_dna
region_of_dna is a reference to a list containing 4 items:
	0: a contig_id
	1: an int
	2: a string
	3: an int
feature_type is a string
annotation is a reference to a list containing 3 items:
	0: a string
	1: a string
	2: an int

Description

Given a genome object populated with feature data, reannotate the features that have protein translations. Return the updated genome object.
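
As a rough sketch of what "features that have protein translations" could mean on the client side, the following picks out features whose protein_translation key is defined and non-empty. This test is an assumption; the documentation does not say how the service itself decides, and all ids and sequences below are invented.

```perl
use strict;
use warnings;

# Invented feature data; only the keys used here are filled in.
my $genomeTO = {
    features => [
        { id => 'demo.peg.1', type => 'CDS', protein_translation => 'MKL' },
        { id => 'demo.rna.1', type => 'rna' },
        { id => 'demo.peg.2', type => 'CDS', protein_translation => '' },
    ],
};

# Assumption: a feature "has" a translation when the value is defined
# and not the empty string.
my @with_protein = grep {
    defined $_->{protein_translation} && length $_->{protein_translation}
} @{ $genomeTO->{features} };

my @ids = map { $_->{id} } @with_protein;
print "features with a translation: @ids\n";
```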

call_CDSs_by_projection

  $return = $obj->call_CDSs_by_projection($genomeTO)
Parameter and return types
$genomeTO is a genomeTO
$return is a genomeTO
genomeTO is a reference to a hash where the following keys are defined:
	id has a value which is a genome_id
	scientific_name has a value which is a string
	domain has a value which is a string
	genetic_code has a value which is an int
	source has a value which is a string
	source_id has a value which is a string
	close_genomes has a value which is a reference to a list where each element is a reference to a list containing 2 items:
	0: a genome_id
	1: a float

	DNA_kmer_data has a value which is a reference to a list where each element is a reference to a list containing 5 items:
	0: a string
	1: an int
	2: an int
	3: an int
	4: a string

	contigs has a value which is a reference to a list where each element is a contig
	features has a value which is a reference to a list where each element is a feature
genome_id is a string
contig is a reference to a hash where the following keys are defined:
	id has a value which is a contig_id
	dna has a value which is a string
contig_id is a string
feature is a reference to a hash where the following keys are defined:
	id has a value which is a feature_id
	location has a value which is a location
	type has a value which is a feature_type
	function has a value which is a string
	protein_translation has a value which is a string
	aliases has a value which is a reference to a list where each element is a string
	annotations has a value which is a reference to a list where each element is an annotation
feature_id is a string
location is a reference to a list where each element is a region_of_dna
region_of_dna is a reference to a list containing 4 items:
	0: a contig_id
	1: an int
	2: a string
	3: an int
feature_type is a string
annotation is a reference to a list containing 3 items:
	0: a string
	1: a string
	2: an int

Description

TYPES

md5

Definition
a string

md5s

Definition
a reference to a list where each element is a md5


genome_id

Definition
a string

feature_id

Definition
a string

contig_id

Definition
a string

feature_type

Definition
a string

region_of_dna

Description

A region of DNA is maintained as a tuple of four components:

  the contig
  the beginning position (from 1)
  the strand
  the length

We often speak of "a region". By "location", we mean a sequence of regions from the same genome (perhaps from distinct contigs).

Definition
a reference to a list containing 4 items:
0: a contig_id
1: an int
2: a string
3: an int
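
A short sketch of how such tuples might be handled in Perl: it unpacks each region of a two-region location and sums the lengths. The contig names and coordinates are invented, and the strand values '+' and '-' are an assumption, since the definition only says the strand is a string.

```perl
use strict;
use warnings;

# A location is a list of regions; each region is
# [contig_id, begin, strand, length]. Values are invented.
my $location = [
    [ 'contig1', 100, '+', 30 ],
    [ 'contig2',   5, '-', 12 ],
];

my $total = 0;
for my $region (@$location) {
    my ($contig_id, $begin, $strand, $length) = @$region;
    print "$contig_id begin=$begin strand=$strand length=$length\n";
    $total += $length;    # the length is the fourth tuple component
}
print "total length: $total\n";
```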

location

Description

A "location" is a sequence of regions.

Definition
a reference to a list where each element is a region_of_dna

annotation

Definition
a reference to a list containing 3 items:
0: a string
1: a string
2: an int

feature

Definition
a reference to a hash where the following keys are defined:
id has a value which is a feature_id
location has a value which is a location
type has a value which is a feature_type
function has a value which is a string
protein_translation has a value which is a string
aliases has a value which is a reference to a list where each element is a string
annotations has a value which is a reference to a list where each element is an annotation

contig

Definition
a reference to a hash where the following keys are defined:
id has a value which is a contig_id
dna has a value which is a string

genomeTO

Definition
a reference to a hash where the following keys are defined:
id has a value which is a genome_id
scientific_name has a value which is a string
domain has a value which is a string
genetic_code has a value which is an int
source has a value which is a string
source_id has a value which is a string
close_genomes has a value which is a reference to a list where each element is a reference to a list containing 2 items:
0: a genome_id
1: a float

DNA_kmer_data has a value which is a reference to a list where each element is a reference to a list containing 5 items:
0: a string
1: an int
2: an int
3: an int
4: a string

contigs has a value which is a reference to a list where each element is a contig
features has a value which is a reference to a list where each element is a feature
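
For orientation, a minimal genomeTO might be built in Perl roughly as follows. Every value is a made-up placeholder, and the key check at the end is only a convenience for this sketch, not something the service is known to perform.

```perl
use strict;
use warnings;

# All values below are placeholders for illustration only.
my $genomeTO = {
    id              => 'demo.1',
    scientific_name => 'Example organism',
    domain          => 'Bacteria',
    genetic_code    => 11,
    source          => 'example',
    source_id       => 'demo.1',
    close_genomes   => [ [ 'demo.2', 0.5 ] ],    # [genome_id, float]
    DNA_kmer_data   => [],                       # list of 5-item lists
    contigs         => [ { id => 'contig1', dna => 'ATGAAATAA' } ],
    features        => [],
};

# Check that every key named in the definition is present.
my @expected = qw(id scientific_name domain genetic_code source source_id
                  close_genomes DNA_kmer_data contigs features);
my @missing = grep { !exists $genomeTO->{$_} } @expected;
print @missing ? "missing: @missing\n" : "all keys present\n";
```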

subsystem

Definition
a string

variant

Definition
a string

variant_of_subsystem

Definition
a reference to a list containing 2 items:
0: a subsystem
1: a variant

variant_subsystem_pairs

Definition
a reference to a list where each element is a variant_of_subsystem

fid

Definition
a string

role

Definition
a string

function

Definition
a string

fid_role_pair

Definition
a reference to a list containing 2 items:
0: a fid
1: a role

fid_role_pairs

Definition
a reference to a list where each element is a fid_role_pair

fid_function_pair

Definition
a reference to a list containing 2 items:
0: a fid
1: a function

fid_function_pairs

Definition
a reference to a list where each element is a fid_function_pair

reconstructionTO

Definition
a reference to a hash where the following keys are defined:
subsystems has a value which is a variant_subsystem_pairs
bindings has a value which is a fid_role_pairs
assignments has a value which is a fid_function_pairs
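
The sketch below builds a small reconstructionTO from invented data and lists each role found in its bindings once. It is one plausible way to read roles out of the structure on the client side, and is not claimed to be what reconstructionTO_to_roles does internally.

```perl
use strict;
use warnings;

# Invented example data in the shape of a reconstructionTO.
my $reconstructionTO = {
    subsystems  => [ [ 'Example subsystem', '1' ] ],    # [subsystem, variant]
    bindings    => [ [ 'demo.peg.1', 'Role A' ],        # [fid, role]
                     [ 'demo.peg.2', 'Role B' ],
                     [ 'demo.peg.3', 'Role A' ] ],
    assignments => [ [ 'demo.peg.1', 'Role A' ] ],      # [fid, function]
};

# Collect each role once, preserving first-seen order.
my %seen;
my @roles = grep { !$seen{$_}++ }
            map  { $_->[1] } @{ $reconstructionTO->{bindings} };

print join(', ', @roles), "\n";
```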

fid_data_tuple

Definition
a reference to a list containing 4 items:
0: a fid
1: an md5
2: a location
3: a function
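
A list of such tuples can be unpacked and indexed by fid, as in this sketch. The fid, the function text and the all-zero md5 string are placeholders; the documentation does not state what the md5 is computed from.

```perl
use strict;
use warnings;

# One invented tuple: [fid, md5, location, function].
my $fid_data_tuples = [
    [ 'demo.peg.1', '0' x 32, [ [ 'contig1', 1, '+', 9 ] ], 'Example function' ],
];

# Index the tuples by feature id for convenient lookup.
my %by_fid;
for my $tuple (@$fid_data_tuples) {
    my ($fid, $md5, $location, $function) = @$tuple;
    $by_fid{$fid} = {
        md5      => $md5,
        location => $location,
        function => $function,
    };
}

print "$_: $by_fid{$_}{function}\n" for sort keys %by_fid;
```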

fid_data_tuples

Definition
a reference to a list where each element is a fid_data_tuple