NAME

GVF::Parser - A parser for Genome Variation Format files.

VERSION

Version 0.01

SYNOPSIS

        use GVF::Parser;

        # Add unsupported attributes to the database. Currently five extra tags are allowed

        # Example:
        my $unsupported = {
            add_attribute1 => 'hgmd_disease',
            add_attribute2 => 'hgmd_location',
        };

        my $obj = GVF::Parser->new(
            file           => $gvf,          # required
            file_modifier  => $unsupported,  # pass the unsupported tags to GVF::Parser
        );

        # pragmas are stored in the object
        # features are used to build the SQLite database

        $obj->pragmas;
        $obj->features;

        #---------------------------------------------------------

        # Example one
        # DBIx::Class approach.

        # connection to db via DBIx::Class object
        my $dbi = $obj->get_dbixclass;

        # a simple example using DBIx::Class.
        my $features   = $dbi->resultset('Features');
        my $attributes = $dbi->resultset('Attributes');

        # create a hash of all the feature items wanted
        # using feature table primary key
        my %feats;
        while (my $f = $features->next) {
            $feats{ $f->id } = {
                type  => $f->type,
                start => $f->start,
                end   => $f->end,
            };
        }

        # use the attribute resultset to access the desired parts of the file,
        # using the attributes foreign key to maintain the relationship with features
        while (my $i = $attributes->next ){
            if ( $feats{ $i->features_id } ){
                my $varInfo = $obj->tidyVariantEffect( $i->varianteffect );

                if ( $varInfo->{'three_prime_UTR_variant'} ) {
                    print $varInfo->{'three_prime_UTR_variant'}->{'feature_type'}, "\t";
                    print $varInfo->{'three_prime_UTR_variant'}->{'feature_id'}, "\t";
                    print $feats{ $i->features_id }->{'start'}, "\t";
                    print $feats{ $i->features_id }->{'type'}, "\t";
                    print $i->referenceseq, "\t";
                    print $i->variantseq, "\n";
                }
            }
        }

        #------------------------------------------------------------------------------

        # Example two.
        # accessing data in parts

        # Example of using request methods.
        my @feats   = $obj->featureRequest('seqid', 'uniq');
        my @atts    = $obj->attributeRequest('Variant_effect');
        my $regions = $obj->sequenceRegions;

        # pragma can be requested with list or individually.
        my @wantList  = qw/ multi-individual population /;
        my $foundList = $obj->pragmaRequest(\@wantList);
        my $foundIndv = $obj->pragmaRequest('gvf-version');

        #------------------------------------------------------------------------------

DESCRIPTION

Takes a GVF file and builds a SQLite3 database from it, accessible through DBIx::Class. Sections of the pragma and feature data can also be retrieved directly via the methods provided.

GVF::Parser partitions GVF files into pragma and feature data; the feature data is further split into features and attributes. Pragma data is stored in the object and can be requested using the pragmaRequest method. Attribute information is saved in a SQLite data file and can be accessed using the attributeRequest method or, preferably, via a DBIx::Class resultset.
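
The partitioning described above can be pictured with a small, self-contained sketch. It does not use GVF::Parser: the sub partition_gvf and the sample lines are hypothetical. It assumes the usual GVF layout, where pragma lines start with "##", feature lines have nine tab-separated columns, and the ninth column holds semicolon-separated key=value attributes.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of GVF::Parser: split GVF lines into
# pragmas, feature columns and per-feature attributes.
sub partition_gvf {
    my @lines = @_;
    my ( %pragmas, @features );

    for my $line (@lines) {
        chomp $line;
        next if $line =~ /^\s*$/;

        # Pragma lines look like "##key value"
        if ( $line =~ /^##(\S+)\s*(.*)$/ ) {
            push @{ $pragmas{$1} }, $2;
            next;
        }
        next if $line =~ /^#/;    # ordinary comment line

        my @cols = split /\t/, $line;
        next unless @cols == 9;   # skip anything that is not a feature line

        my %feature;
        @feature{qw(seqid source type start end score strand phase)}
            = @cols[ 0 .. 7 ];

        # Ninth column: "key=value;key=value"
        my %attributes;
        for my $pair ( split /;/, $cols[8] ) {
            my ( $key, $value ) = split /=/, $pair, 2;
            next unless defined $value;
            $attributes{$key} = $value;
        }
        push @features, { %feature, attributes => \%attributes };
    }
    return ( \%pragmas, \@features );
}

# Made-up sample data for illustration.
my @sample = (
    "##gvf-version 1.06",
    join( "\t", 'chr16', 'example', 'SNV', 49291141, 49291141, '.', '+', '.',
        'ID=ID_1;Variant_seq=A;Reference_seq=G' ),
);
my ( $pragmas, $features ) = partition_gvf(@sample);
print "$features->[0]{type} at $features->[0]{seqid}:$features->[0]{start}\n";
```

In the module itself the feature and attribute parts end up in separate database tables rather than in nested hashes; the sketch only shows where the split falls.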

SUBROUTINES/METHODS

pragmas

    Title    : pragmas
    Usage    : $obj->pragmas
    Returns  : None.

Pragma data is stored in the object and requested via pragmaRequest or getPragma (both documented below).

features

    Title    : features
    Usage    : $obj->features
    Function : Builds a SQLite3 database of feature values.
    Returns  : None

This populates a SQLite3 database, creating a features table and an attributes table, parts of which can be accessed via featureRequest or attributeRequest (both documented below).

getPragma

    Title    : getPragma
    Usage    : $obj->getPragma($pragma)
    Function : Allows you to search for a specific pragma.
    Returns  : requested pragma

Allows you to search for a single pragma key. pragmaRequest (documented below) offers more functionality.

pragmaKeys

    Title    : pragmaKeys
    Usage    : $obj->pragmaKeys
    Function : Grabs a list of all pragma keys in a given file
    Returns  : Array of all pragma keys

pragmaValues

    Title    : pragmaValues
    Usage    : $obj->pragmaValues
    Function : Grabs a list of all pragma values in a given file
    Returns  : Array of all pragma values

pragmaRequest

    Title    : pragmaRequest
    Usage    : $wanted = $obj->pragmaRequest($request) or
               $wanted = $obj->pragmaRequest(\@arrayref)
    Function : Capture requested simple pragma term
    Returns  : Single request returns arrayref of value.
               Passing list returns arrayref of all values.
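
To illustrate the two calling styles, the sketch below looks up pragma values in a plain hash. The sub pragma_request and the %pragmas data are hypothetical stand-ins, not the module's implementation, and the sketch assumes each pragma key maps to an arrayref of values.

```perl
use strict;
use warnings;

# Hypothetical pragma store: key => arrayref of values (made-up data).
my %pragmas = (
    'gvf-version'      => ['1.06'],
    'multi-individual' => ['NA19240,NA18507'],
    'population'       => ['YRI'],
);

# Accept either a single key or an arrayref of keys and
# return an arrayref holding every matching value.
sub pragma_request {
    my ($request) = @_;
    my @keys = ref($request) eq 'ARRAY' ? @$request : ($request);
    return [ map { @{ $pragmas{$_} || [] } } @keys ];
}

my $single = pragma_request('gvf-version');
my $listed = pragma_request( [qw/ multi-individual population /] );
print "version: @$single\n";
print "listed:  @$listed\n";
```

Returning an arrayref in both cases means the caller can treat single and list requests the same way.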

sequenceRegions

    Title    : sequenceRegions
    Usage    : $regions = $obj->sequenceRegions
    Function : Capture all sequence regions from a GVF file.
    Returns  : Arrayref of all sequence regions.

featureRequest

    Title    : featureRequest
    Usage    : @features = $obj->featureRequest('seqid');
               @features = $obj->featureRequest('seqid', 'uniq');
    Function : Capture requested feature types
    Returns  : Returns array of requested features or,
               returns array of uniq features of requested type
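
The difference between the plain and 'uniq' forms can be sketched without a database. Everything here is hypothetical: @rows stands in for the features table and feature_request for the real method, whose internals may differ.

```perl
use strict;
use warnings;

# Hypothetical rows standing in for the features table.
my @rows = (
    { seqid => 'chr1', type => 'SNV' },
    { seqid => 'chr1', type => 'deletion' },
    { seqid => 'chr2', type => 'SNV' },
);

# Return every value of one column, or only the distinct values
# (in first-seen order) when the second argument is 'uniq'.
sub feature_request {
    my ( $column, $mode ) = @_;
    my @values = map { $_->{$column} } @rows;
    return @values unless defined $mode && $mode eq 'uniq';
    my %seen;
    return grep { !$seen{$_}++ } @values;
}

my @all  = feature_request('seqid');
my @uniq = feature_request( 'seqid', 'uniq' );
print scalar(@all), " values, ", scalar(@uniq), " distinct\n";
```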

attributeRequest

    Title    : attributeRequest
    Usage    : @attributes = $obj->attributeRequest('reference_seq');
               @attributes = $obj->attributeRequest('reference_seq', 'uniq');
    Function : Capture requested attribute type.
    Returns  : Returns array of requested attribute types, or
               returns array of uniq attributes of requested type
  

tidyVariantEffect

    Title    : tidyVariantEffect
    Usage    : $wanted = $obj->tidyVariantEffect( "variant_effect line" ); 
    Function : Will take individual Variant_effect line and return  
               hashref of each feature type.
    Returns  : Hashref of Variant_effect. 
    Args     : Individual Variant_effect line.

 Example  :
       From DBIx::Class resultset:
       my $varInfo = $obj->tidyVariantEffect( $result->varianteffect );
       
 Results:
 $_ = {
          'coding_sequence_variant' => {
                                         'feature_type' => 'mRNA',
                                         'index' => '0',
                                         'feature_id' => 'NM_000271'
                                       },
          'frameshift_variant' => {
                                    'feature_type' => 'mRNA',
                                    'index' => '0',
                                    'feature_id' => 'NM_000271'
                                  },
          'gene_variant' => {
                              'feature_type' => 'gene',
                              'index' => '0',
                              'feature_id' => 'NPC1'
                            }
        };
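
A rough idea of how a Variant_effect value maps onto a hashref of this shape is sketched below. This is an illustrative re-implementation, not the module's code: the sub tidy_variant_effect is hypothetical, and it assumes that effects are comma-separated and that each effect reads "sequence_variant index feature_type feature_id" with a single feature ID.

```perl
use strict;
use warnings;

# Hypothetical sketch: turn one Variant_effect value into a hashref
# keyed by the sequence variant term.
sub tidy_variant_effect {
    my ($line) = @_;
    $line =~ s/^Variant_effect=//;    # tolerate the raw attribute form

    my %effects;
    for my $effect ( split /\s*,\s*/, $line ) {
        # Fields within one effect are whitespace-separated.
        my ( $variant, $index, $type, $id ) = split ' ', $effect;
        next unless defined $id;      # skip malformed entries
        $effects{$variant} = {
            index        => $index,
            feature_type => $type,
            feature_id   => $id,
        };
    }
    return \%effects;
}

my $info = tidy_variant_effect(
    'coding_sequence_variant 0 mRNA NM_000271,gene_variant 0 gene NPC1'
);
print join( "\t", $_, @{ $info->{$_} }{qw(feature_type feature_id)} ), "\n"
    for sort keys %$info;
```

Keying the result by the variant term is what allows lookups such as $varInfo->{'gene_variant'}->{'feature_id'}.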

get_dbixclass

    Title    : get_dbixclass
    Usage    : $obj->get_dbixclass
    Function : Handle, used to connect to DBIx::Class
    Returns  : DBIx::Class object

AUTHOR

Shawn Rynearson, <shawn.rynerson at gmail.com>

BUGS

Please report any bugs or feature requests to bug-gvf-parser at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=GVF-Parser. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

    perldoc GVF::Parser

ACKNOWLEDGEMENTS

LICENSE AND COPYRIGHT

Copyright 2012 Shawn Rynearson.

This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.

See http://dev.perl.org/licenses/ for more information.