NAME

GVF::Parser - A parser for Genome Variation Format files.

VERSION

Version 0.01

SYNOPSIS

use GVF::Parser;

# Add unsupported attributes to the database. Currently up to five extra tags are allowed.

# Example:
my $unsupported = {
    add_attribute1 => 'hgmd_disease',
    add_attribute2 => 'hgmd_location',
};

my $obj = GVF::Parser->new(
    file           => $gvf,          # required
    file_modifier  => $unsupported,  # pass the unsupported tags to GVF::Parser
);

# pragmas are stored in the object
# features are used to build the SQLite database

$obj->pragmas;
$obj->features;

#---------------------------------------------------------

# Example one
# DBIx::Class approach.

# connection to db via DBIx::Class object
my $dbi = $obj->get_dbixclass;

# a simple example using DBIx::Class.
my $features   = $dbi->resultset('Features');
my $attributes = $dbi->resultset('Attributes');

# create a hash of all the feature items wanted
# using feature table primary key
my %feats;
while (my $f = $features->next) {
    $feats{ $f->id } = {
        type  => $f->type,
        start => $f->start,
        end   => $f->end,
    };
}

# use the attribute resultset to access the desired parts of the file,
# using the attributes foreign key to maintain the relationship with features
while ( my $i = $attributes->next ) {
    if ( $feats{ $i->features_id } ) {
        my $varInfo = $obj->tidyVariantEffect( $i->varianteffect );

        if ( $varInfo->{'three_prime_UTR_variant'} ) {
            print $varInfo->{'three_prime_UTR_variant'}->{'feature_type'}, "\t";
            print $varInfo->{'three_prime_UTR_variant'}->{'feature_id'}, "\t";
            print $feats{ $i->features_id }->{'start'}, "\t";
            print $feats{ $i->features_id }->{'type'}, "\t";
            print $i->referenceseq, "\t";
            print $i->variantseq, "\n";
        }
    }
}

#------------------------------------------------------------------------------

# Example two.
# accessing data in parts

# Example of using request methods.
my @feats   = $obj->featureRequest('seqid', 'uniq');
my @atts    = $obj->attributeRequest('Variant_effect');
my $regions = $obj->sequenceRegions;

# pragmas can be requested as a list or individually.
my @wantList  = qw/ multi-individual population /;
my $foundList = $obj->pragmaRequest(\@wantList);
my $foundIndv = $obj->pragmaRequest('gvf-version');

#------------------------------------------------------------------------------

DESCRIPTION

Takes a GVF file and builds a SQLite3 database from it, accessible through DBIx::Class. Sections of the pragma and feature data can also be retrieved directly via the methods provided.

GVF::Parser partitions GVF files into pragma and feature data, and further splits the feature data into features and attributes. Pragma data is stored in the object and can be requested using the pragmaRequest method. Attribute information is saved in a SQLite data file and can be accessed using the attributeRequest method or, preferably, via a DBIx::Class resultset.
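
The partitioning described above can be sketched in plain Perl. This is only an illustration, not the module's implementation: partition_gvf is a hypothetical helper, the sample GVF lines are made up, and the sketch keeps everything in memory instead of writing to SQLite.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of GVF::Parser: split GVF text into
# pragmas ("##key value" lines) and features (nine tab-separated columns).
sub partition_gvf {
    my ($text) = @_;
    my ( %pragmas, @features );

    for my $line ( split /\n/, $text ) {
        next if $line =~ /^\s*$/;

        if ( $line =~ /^##(\S+)\s*(.*)$/ ) {
            # A pragma key may occur more than once, so keep a list per key.
            push @{ $pragmas{$1} }, $2;
            next;
        }
        next if $line =~ /^#/;    # ordinary comment line

        my ( $seqid, $source, $type, $start, $end,
             $score, $strand, $phase, $attr ) = split /\t/, $line;

        # Column nine holds "tag=value" pairs separated by semicolons.
        my %attributes;
        for my $pair ( split /;/, $attr // '' ) {
            my ( $tag, $value ) = split /=/, $pair, 2;
            $attributes{$tag} = $value if defined $value;
        }

        push @features, {
            seqid      => $seqid,
            type       => $type,
            start      => $start,
            end        => $end,
            attributes => \%attributes,
        };
    }
    return ( \%pragmas, \@features );
}

# Made-up input for illustration.
my $gvf = join "\n",
    '##gvf-version 1.06',
    '##sequence-region chr1 1 249250621',
    join( "\t", 'chr1', 'example', 'SNV', 1000, 1000, '.', '+', '.',
          'ID=var1;Variant_seq=A,G;Reference_seq=G' );

my ( $pragmas, $features ) = partition_gvf($gvf);
print "gvf-version: $pragmas->{'gvf-version'}[0]\n";
for my $f (@$features) {
    print "$f->{type} at $f->{seqid}:$f->{start} ref=$f->{attributes}{Reference_seq}\n";
}
```

Keeping attributes in a separate structure per feature mirrors the features/attributes split that the module stores as two related tables.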

SUBROUTINES/METHODS

pragmas

   Title    : pragmas
   Usage    : $obj->pragmas
   Returns  : None.

Pragma data is stored in the object and can be requested via pragmaRequest (https://metacpan.org/module/GVF::Parser#pragmaRequest) or getPragma (https://metacpan.org/module/GVF::Parser#getPragma).

features

   Title    : features
   Usage    : $obj->features
   Function : Builds a SQLite3 database of feature values.
   Returns  : None

This populates a SQLite3 database, creating a features table and an attributes table, parts of which can be accessed via featureRequest (https://metacpan.org/module/GVF::Parser#featureRequest) or attributeRequest (https://metacpan.org/module/GVF::Parser#attributeRequest).

getPragma

   Title    : getPragma
   Usage    : $obj->getPragma($pragma)
   Function : Allows you to search for a specific pragma.
   Returns  : requested pragma

Allows you to search for a single pragma key. pragmaRequest (https://metacpan.org/module/GVF::Parser#pragmaRequest) offers more functionality.

pragmaKeys

   Title    : pragmaKeys
   Usage    : $obj->pragmaKeys
   Function : Grabs a list of all pragma keys in a given file
   Returns  : Array of all pragma keys

pragmaValues

   Title    : pragmaValues
   Usage    : $obj->pragmaValues
   Function : Grabs a list of all pragma values in a given file
   Returns  : Array of all pragma values

pragmaRequest

   Title    : pragmaRequest
   Usage    : $wanted = $obj->pragmaRequest($request) or
              $wanted = $obj->pragmaRequest(\@arrayref)
   Function : Capture requested simple pragma term
   Returns  : Single request returns arrayref of value.
              Passing list returns arrayref of all values.
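
The two calling styles can be illustrated with a small stand-alone sketch. The pragma_request function and the %pragmas hash below are hypothetical stand-ins with made-up values; they only mirror the documented behaviour (a single key or an arrayref of keys in, an arrayref of values out) and are not the module's code.

```perl
use strict;
use warnings;

# Made-up pragma store for illustration: key => arrayref of values.
my %pragmas = (
    'gvf-version'      => ['1.06'],
    'population'       => ['CEU'],
    'multi-individual' => ['NA12878,NA12891'],
);

# Hypothetical stand-in for pragmaRequest: takes one key or an arrayref
# of keys and always returns an arrayref of the matching values.
sub pragma_request {
    my ( $store, $request ) = @_;
    my @keys = ref($request) eq 'ARRAY' ? @$request : ($request);

    my @found;
    for my $key (@keys) {
        push @found, @{ $store->{$key} } if exists $store->{$key};
    }
    return \@found;
}

my @wantList  = qw/ multi-individual population /;
my $foundList = pragma_request( \%pragmas, \@wantList );
my $foundIndv = pragma_request( \%pragmas, 'gvf-version' );

print "list:   @$foundList\n";
print "single: @$foundIndv\n";
```

Returning an arrayref in both cases means the caller never has to check which form of request was made.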

sequenceRegions

   Title    : sequenceRegions
   Usage    : $regions = $obj->sequenceRegions
   Function : Capture all sequence regions from a GVF file.
   Returns  : Arrayref of all sequence regions.

featureRequest

   Title    : featureRequest
   Usage    : @features = $obj->featureRequest('seqid');
              @features = $obj->featureRequest('seqid', 'uniq');
   Function : Capture requested feature types
   Returns  : Returns array of requested features or,
              with 'uniq', array of unique features of requested type
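
The effect of the optional 'uniq' argument can be shown with a minimal sketch. The feature_request function and the @rows data are hypothetical and stand in for the features table; the real method reads from the SQLite database.

```perl
use strict;
use warnings;

# Made-up feature rows standing in for the features table.
my @rows = (
    { seqid => 'chr1', type => 'SNV',       start => 1000 },
    { seqid => 'chr1', type => 'insertion', start => 2000 },
    { seqid => 'chr2', type => 'SNV',       start => 3000 },
);

# Hypothetical stand-in for featureRequest: return every value of one
# column, or only the distinct values when the mode is 'uniq'.
sub feature_request {
    my ( $rows, $column, $mode ) = @_;
    my @values = map { $_->{$column} } @$rows;
    return @values unless defined $mode && $mode eq 'uniq';

    my %seen;
    return grep { !$seen{$_}++ } @values;    # keeps first-seen order
}

my @all  = feature_request( \@rows, 'seqid' );
my @uniq = feature_request( \@rows, 'seqid', 'uniq' );

print "all:  @all\n";
print "uniq: @uniq\n";
```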

attributeRequest

   Title    : attributeRequest
   Usage    : @attributes = $obj->attributeRequest('reference_seq');
              @attributes = $obj->attributeRequest('reference_seq', 'uniq');
   Function : Capture requested attribute type.
   Returns  : Returns array of requested attribute types or,
              with 'uniq', array of unique attributes of requested type

tidyVariantEffect

   Title    : tidyVariantEffect
   Usage    : $wanted = $obj->tidyVariantEffect( "variant_effect line" );
   Function : Takes an individual Variant_effect line and returns a
              hashref with one entry per variant effect.
   Returns  : Hashref of Variant_effect.
   Args     : Individual Variant_effect line.

Example:
      From a DBIx::Class resultset:
      my $varInfo = $obj->tidyVariantEffect( $result->varianteffect );

Results:
$_ = {
         'coding_sequence_variant' => {
                                        'feature_type' => 'mRNA',
                                        'index' => '0',
                                        'feature_id' => 'NM_000271'
                                      },
         'frameshift_variant' => {
                                   'feature_type' => 'mRNA',
                                   'index' => '0',
                                   'feature_id' => 'NM_000271'
                                 },
         'gene_variant' => {
                             'feature_type' => 'gene',
                             'index' => '0',
                             'feature_id' => 'NPC1'
                           }
       };
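
A structure like the one above could be produced by a parser along these lines. This is a hedged sketch, not the module's code: tidy_variant_effect is a hypothetical function, and it assumes each effect is written as "term index feature_type feature_id" with effects separated by commas.

```perl
use strict;
use warnings;

# Hypothetical re-implementation for illustration only. Assumes effects
# are comma-separated and each has four space-separated parts.
sub tidy_variant_effect {
    my ($line) = @_;
    my %effects;

    for my $effect ( split /,/, $line ) {
        my ( $term, $index, $feature_type, $feature_id ) =
            split ' ', $effect, 4;
        next unless defined $feature_id;    # skip malformed entries

        $effects{$term} = {
            index        => $index,
            feature_type => $feature_type,
            feature_id   => $feature_id,
        };
    }
    return \%effects;
}

# Made-up Variant_effect value matching the result shown above.
my $line = 'coding_sequence_variant 0 mRNA NM_000271,'
         . 'frameshift_variant 0 mRNA NM_000271,'
         . 'gene_variant 0 gene NPC1';

my $info = tidy_variant_effect($line);
for my $term ( sort keys %$info ) {
    print "$term: $info->{$term}{feature_type} $info->{$term}{feature_id}\n";
}
```

Keying the hashref by effect term is what lets callers test for a specific effect directly, as the SYNOPSIS does with three_prime_UTR_variant.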

get_dbixclass

   Title    : get_dbixclass
   Usage    : $obj->get_dbixclass
   Function : Handle, used to connect to DBIx::Class
   Returns  : DBIx::Class object

AUTHOR

Shawn Rynearson, <shawn.rynerson at gmail.com>

BUGS

Please report any bugs or feature requests to bug-gvf-parser at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=GVF-Parser. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

perldoc GVF::Parser

You can also look for information at:

ACKNOWLEDGEMENTS

LICENSE AND COPYRIGHT

Copyright 2012 Shawn Rynearson.

This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.

See http://dev.perl.org/licenses/ for more information.