process_gadfly.pl - Massage Gadfly/FlyBase GFF files into a version suitable for the Generic Genome Browser
% process_gadfly.pl ./RELEASE2 > gadfly.gff
This script massages the FlyBase/Gadfly GFF files located at ftp://ftp.fruitfly.org/pub/genomic/gadfly/ into the "correct" version of the GFF format.
To use this script, download the Gadfly GFF distribution archive that is organized by GenBank accession unit (e.g. "RELEASE2GFF.tar.gz"). Unpacking it will yield a directory named after the release, e.g. RELEASE2.
Give that directory as the argument to this script, and capture the script's output to a file:
% process_gadfly.pl ./RELEASE2 > gadfly.gff
The gadfly.gff file can then be loaded into a Bio::DB::GFF database using the following command:
% bulk_load_gff.pl -d <databasename> gadfly.gff
The resulting database will have the following feature types (represented as "method:source"):
  Component:arm              A chromosome arm
  Component:scaffold         A chromosome scaffold (accession #)
  Component:gap              A gap in the assembly
  clone:clonelocator         A BAC clone
  gene:gadfly                A gene accession number
  transcript:gadfly          A transcript accession number
  translation:gadfly         A translation
  codon:gadfly               Significance unknown
  exon:gadfly                An exon
  symbol:gadfly              A classical gene symbol
  similarity:blastn          A BLASTN hit
  similarity:blastx          A BLASTX hit
  similarity:sim4            EST->genome using SIM4
  similarity:groupest        EST->genome using GROUPEST
  similarity:repeatmasker    A repeat
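Before loading, it can be useful to check which of these feature types actually appear in the generated file. The following is a minimal sketch, not part of the script's distribution: the helper name count_feature_types is hypothetical, and it assumes the output is tab-separated GFF with the source in column 2 and the method in column 3.

```shell
# Hypothetical helper: tally "method:source" pairs in a GFF file.
# Assumes tab-separated columns with source in column 2 and method in column 3.
count_feature_types() {
    # Skip comment lines, count each method:source pair, then sort by name.
    awk -F'\t' '!/^#/ && NF >= 3 { n[$3 ":" $2]++ }
                END { for (k in n) print k "\t" n[k] }' "$1" | LC_ALL=C sort
}
```

Running "count_feature_types gadfly.gff" prints one line per feature type followed by the number of features of that type.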
Bio::DB::GFF, bulk_load_gff.pl, load_gff.pl
Lincoln Stein <lstein@cshl.org>.
Copyright (c) 2002 Cold Spring Harbor Laboratory
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See DISCLAIMER.txt for disclaimers of warranty.