obogaf::parser
use obogaf::parser;
my ($graph, $subonto, $stat, $res);
$graph= build_edges(obofile);
$subonto= build_subonto(edgesfile, namespace);
$stat= make_stat(edgesfile, parentIndex, childIndex);
($res, $stat)= gene2biofun(annfile, geneIndex, classIndex);
($res, $stat)= map_OBOterm_between_release(obofile, annfile, classIndex);
obogaf::parser is a Perl 5 module designed to handle OBO and gene association files.
my $graph= build_edges(obofile);
obofile: any OBO file listed in the OBO Foundry. The file extension must be ".obo".
output: the graph is returned as a list of edges, one tuple per edge, in the form subdomain <tab> source <tab> destination <tab> relationship. Each edge is a pair of vertices, source <tab> destination. For each pair of nodes, the subdomain (if any) and the relationship over which it is safe to group annotations (i.e. is_a or part_of) are returned as well. The graph is stored as an anonymous scalar.
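The four-column layout described above can be read back with a plain split on tabs. The following is a minimal sketch that does not call the module: the sample text and the EX: identifiers are hypothetical, and in real use the text would presumably come from dereferencing the scalar returned by build_edges.

```perl
use strict;
use warnings;

# Hypothetical sample mimicking the documented layout:
# subdomain <tab> source <tab> destination <tab> relationship
my $edges_text = join "\n",
    "biological_process\tEX:0000001\tEX:0000002\tis_a",
    "biological_process\tEX:0000002\tEX:0000003\tpart_of",
    "molecular_function\tEX:0000010\tEX:0000011\tis_a";

# Turn each line into a hash with one key per column.
my @edges;
for my $line (split /\n/, $edges_text) {
    my ($subdomain, $source, $destination, $relationship) = split /\t/, $line;
    push @edges, {
        subdomain    => $subdomain,
        source       => $source,
        destination  => $destination,
        relationship => $relationship,
    };
}

# Print one edge per line: source -> destination (relationship, subdomain).
printf "%s -> %s (%s, %s)\n",
    @{$_}{qw(source destination relationship subdomain)} for @edges;
```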
my $subonto= build_subonto(edgesfile, namespace);
edgesfile: a graph in the form: subdomain <tab> source <tab> destination <tab> relationship. This file can be obtained by calling the subroutine build_edges.
namespace: name of the subontology for which the edges must be extracted.
output: the graph is returned as a list of edges, one tuple per edge, in the form source <tab> destination <tab> relationship. Each edge is a pair of vertices, source <tab> destination, followed by its relationship (is_a or part_of). The graph is stored as an anonymous scalar.
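To make the relation between the two formats concrete, the sketch below keeps only the edges of one namespace and drops the subdomain column. It illustrates the documented input and output layouts only and is not the module's own implementation; the sample lines and EX: identifiers are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical four-column edges: subdomain, source, destination, relationship.
my @edge_lines = (
    "biological_process\tEX:0000001\tEX:0000002\tis_a",
    "molecular_function\tEX:0000010\tEX:0000011\tis_a",
    "biological_process\tEX:0000002\tEX:0000003\tpart_of",
);

my $namespace = 'biological_process';

# Keep the edges of the requested namespace and drop the first column.
my @subonto;
for my $line (@edge_lines) {
    my ($subdomain, @rest) = split /\t/, $line;
    push @subonto, join("\t", @rest) if $subdomain eq $namespace;
}

print "$_\n" for @subonto;
```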
my $stat= make_stat(edgesfile, parentIndex, childIndex);
edgesfile: a graph represented as a list of edges, where each edge is stored as a tab-separated pair of vertices. This file can be obtained by calling the subroutine build_edges.
parentIndex: index of the column containing the parent (source) vertices in the edgesfile.
childIndex: index of the column containing the child (destination) vertices in the edgesfile.
output: statistics about the graph are printed to the shell. For each vertex of the graph, the degree, in-degree and out-degree are printed. The vertices are sorted by degree in decreasing order. The following statistics are returned as well: 1) number of nodes and edges of the graph; 2) maximum and minimum degree; 3) average and median degree; 4) density of the graph.
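The statistics listed above can be computed from a list of parent/child pairs as in the following sketch. It is an independent illustration, not the code used by make_stat. The toy graph is hypothetical, and the density formula for a directed graph, E / (V * (V - 1)), is an assumption, since the documentation does not state which definition the module uses.

```perl
use strict;
use warnings;
use List::Util qw(sum max min);

# Hypothetical toy graph: each edge is [parent, child].
my @edges = (['A', 'B'], ['A', 'C'], ['A', 'D'], ['B', 'D'], ['C', 'D']);

# Count outgoing edges per parent and incoming edges per child.
my (%in, %out, %nodes);
for my $edge (@edges) {
    my ($parent, $child) = @$edge;
    $out{$parent}++;
    $in{$child}++;
    $nodes{$_} = 1 for $parent, $child;
}

# Degree = in-degree + out-degree; sort by decreasing degree, ties by name.
my %degree = map { $_ => ($in{$_} // 0) + ($out{$_} // 0) } keys %nodes;
my @sorted = sort { $degree{$b} <=> $degree{$a} || $a cmp $b } keys %degree;
printf "%s\t%d\t%d\t%d\n", $_, $degree{$_}, $in{$_} // 0, $out{$_} // 0 for @sorted;

# Summary statistics over the degree distribution.
my @deg     = sort { $a <=> $b } values %degree;
my $n       = @deg;      # number of nodes
my $m       = @edges;    # number of edges
my $average = sum(@deg) / $n;
my $median  = $n % 2 ? $deg[($n - 1) / 2] : ($deg[$n / 2 - 1] + $deg[$n / 2]) / 2;
my $density = $m / ($n * ($n - 1));    # assumed directed-graph definition

printf "nodes: %d\nedges: %d\nmax degree: %d\nmin degree: %d\n", $n, $m, max(@deg), min(@deg);
printf "average degree: %.4f\nmedian degree: %.4f\ndensity: %.4f\n", $average, $median, $density;
```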
my ($res, $stat)= gene2biofun(annfile, geneIndex, classIndex);
annfile: an annotation file. The file can be either plain text (".txt") or compressed (".gz"). An example of the format can be taken from the GOA website (files with the ".gaf.gz" extension) or the HPO website. More generally, any file structured like these can be used (basically a ".csv" file using <tab> as separator).
geneIndex: index referring to the column containing the samples (genes/proteins).
classIndex: index referring to the column containing the ontology terms.
output: a list of two anonymous references. The first is an anonymous hash storing, for each gene (or protein), all the associated ontology terms (pipe separated). The second is an anonymous scalar containing basic statistics, such as the total number of unique genes/proteins and of annotated ontology terms.
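The gene-to-terms structure described above can be pictured with the sketch below, which builds a hash of pipe-separated terms from tab-separated annotation lines. It is an illustration only, not the module's code. The sample lines and identifiers are hypothetical, the column indices are treated as zero-based, and lines starting with "!" are skipped as GAF comments; the last two points are assumptions of this sketch.

```perl
use strict;
use warnings;

# Hypothetical annotation lines: gene <tab> ontology term.
my @annotation_lines = (
    "!gaf-version: 2.1",
    "gene1\tEX:0000002",
    "gene1\tEX:0000003",
    "gene2\tEX:0000002",
    "gene1\tEX:0000002",    # duplicate annotation, counted once
);

my ($gene_index, $class_index) = (0, 1);    # assumed zero-based

# Collect the unique terms of each gene and the unique terms overall.
my (%terms_of, %seen_term);
for my $line (@annotation_lines) {
    next if $line =~ /^!/;    # skip GAF comment lines
    my @fields = split /\t/, $line;
    my ($gene, $term) = @fields[$gene_index, $class_index];
    $terms_of{$gene}{$term} = 1;
    $seen_term{$term}       = 1;
}

# Join the terms of each gene with a pipe, as in the documented output.
my %gene2terms = map { $_ => join('|', sort keys %{ $terms_of{$_} }) } keys %terms_of;

print "$_ $gene2terms{$_}\n" for sort keys %gene2terms;
printf "genes: %d\nontology terms: %d\n", scalar(keys %gene2terms), scalar(keys %seen_term);
```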
my ($res, $stat)= map_OBOterm_between_release(obofile, annfile, classIndex);
obofile: an obo file (a new release). This file is used to make the alt_id - id pairing, by using alt_id as key. The file extension must be ".obo".
annfile: an annotation file (an old release). The file extension can be either plain format (".txt") or compressed (".gz").
classIndex: index referring to the column of the annfile containing the ontology terms to be mapped.
output: a list of two anonymous references. The first is an anonymous scalar storing the annotation file in the same format as the input file, but with the obsolete ontology terms replaced by the updated ones. The second is an anonymous scalar containing some basic statistics, such as the total number of unique ontology terms and the total number of mapped and unmapped alt_id ontology terms. Finally, all the alt_id - id pairs found (if any) are returned.
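The alt_id - id pairing can be sketched as follows: an alt_id to id lookup is built from the OBO stanzas and then applied to the ontology-term column of the old annotations. This is an illustration of the idea under stated assumptions, not the module's implementation. The OBO fragment, the annotation lines and the EX: identifiers are hypothetical, and the column index is treated as zero-based.

```perl
use strict;
use warnings;

# Hypothetical fragment of a new OBO release.
my $obo_text = <<'OBO';
[Term]
id: EX:0000010
alt_id: EX:0000001
alt_id: EX:0000004

[Term]
id: EX:0000020
OBO

# Build the alt_id => id lookup, using alt_id as key.
my (%alt2id, $current_id);
for my $line (split /\n/, $obo_text) {
    if    ($line =~ /^id:\s*(\S+)/)     { $current_id = $1 }
    elsif ($line =~ /^alt_id:\s*(\S+)/) { $alt2id{$1} = $current_id }
}

# Hypothetical old annotations: gene <tab> ontology term.
my @old_annotations = ("gene1\tEX:0000001", "gene2\tEX:0000020", "gene3\tEX:0000004");
my $class_index = 1;    # assumed zero-based

# Replace every term that is an alt_id with its current id.
my $mapped = 0;
my @updated;
for my $line (@old_annotations) {
    my @fields = split /\t/, $line;
    if (exists $alt2id{ $fields[$class_index] }) {
        $fields[$class_index] = $alt2id{ $fields[$class_index] };
        $mapped++;
    }
    push @updated, join "\t", @fields;
}

print "$_\n" for @updated;
print "mapped terms: $mapped\n";
```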
Please report any bugs here.
Copyright (C) 2019 Marco Notaro, all rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.
This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.
Marco Notaro (https://marconotaro.github.io)
A step-by-step tutorial showing how to apply obogaf::parser to real case studies in Computational Biology and Precision Medicine is available at https://github.com/marconotaro/obogaf-parser.
To install obogaf::parser, copy and paste the appropriate command into your terminal.
cpanm
cpanm obogaf::parser
CPAN shell
perl -MCPAN -e shell
install obogaf::parser
For more information on module installation, please visit the detailed CPAN module installation guide.