Seq::Parse - The Bioperl ReadSeq interface
Simple perl interface/wrapper to D.G. Gilbert's ReadSeq program. Used by Seq.pm when internal parsing/formatting code fails.
**NOTE** Not currently used by any of the core bioperl modules. It can be used as a standalone interface to the readseq package, but manual editing of the module is required. See the first few lines of the .pm file for details.
This package was called upon by Seq.pm when internal attempts to format or parse a sequence failed. It is currently not used by any bioperl core module. Basically we decided to deal with sequence formatting in a different way.
Parse.pm is a simple interface to D.G. Gilbert's ReadSeq program; it is not meant to be particularly elegant or efficient. The interface should be abstract enough to allow future versions to seamlessly access other sequence conversion programs besides ReadSeq.
At this time the interface methods have not been fully thought out or implemented. Suggestions are welcome.
If ReadSeq is not on the local system, or this package is not properly configured, Seq.pm will (hopefully) realize this and not attempt to use this code.
The ReadSeq executable needs to be installed on your system.
Readseq is freely distributed and is available in shell archive (.shar) form via FTP from ftp.bio.indiana.edu (129.79.224.25) in the molbio/readseq directory. (URL) ftp://ftp.bio.indiana.edu/molbio/readseq/
use Parse;
If properly configured, Seq.pm will automatically use this module when its internal parsing or formatting methods fail.
The correct path to the readseq executable is configured into this module during the 'perl Makefile.PL' phase of installation.
Manual edits needed in Parse.pm if auto-configuration does not happen:
- Change the value of $READSEQ_PATH so that it defines a path to the ReadSeq executable on your system.
- Uncomment the line(s) containing $OK = "Y"
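The two manual edits above might look roughly like the following. This is only a sketch: the path shown is a placeholder, and `readseq_usable` is a hypothetical helper that is not part of Parse.pm. Only the names `$READSEQ_PATH` and `$OK` come from the module itself.

```perl
use strict;
use warnings;

# Placeholder path: replace with the location of ReadSeq on your system.
our $READSEQ_PATH = '/usr/local/bin/readseq';
our $OK           = "Y";

# Hypothetical helper (not part of Parse.pm): true only if the module is
# marked as configured and the path names an executable file.
sub readseq_usable {
    my ($path, $ok) = @_;
    return 0 unless defined $ok && $ok eq 'Y';
    return 0 unless defined $path && -f $path && -x _;
    return 1;
}

print readseq_usable($READSEQ_PATH, $OK)
    ? "ReadSeq found at $READSEQ_PATH\n"
    : "ReadSeq not usable; check \$READSEQ_PATH and \$OK\n";
```

A check of this kind is one way a caller could decide whether to fall back to ReadSeq at all.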
Parse.pm should be usable as a standalone module. See the usage instructions.
ReadSeq has trouble with raw sequences so an explicit convert_from_raw() method has been written. The following code will return the sequence "GAATTCGTT" as a GCG formatted string.
$reply = &Parse::convert_from_raw(-sequence=>'GAATTCGTT', -fmt=>'gcg');
The "fmt" named-parameter field can be set for the following formats:
- IG (or 'Stanford')
- GenBank (or 'GB')
- NBRF
- EMBL
- GCG
- Strider
- Fitch
- Fasta
- Zuker
- Phylip3.2 (use 'Phylip3')
- Phylip
- Plain (or 'Raw')
- PIR (or 'CODATA')
- MSF
- ASN.1 (use 'ASN1')
- PAUP
- Pretty
The "options" named-parameter field can be used to pass switches directly to the ReadSeq executable. This option should only be used by people familiar with operating ReadSeq on the command-line. Use at your own risk as this has not been fully tested.
As an example, the ReadSeq switch -c will cause all of the characters in the formatted sequence to be returned in lowercase.
-c
$reply = &Parse::convert_from_raw(-sequence=>"$seq_string", -options=>'-c', -fmt=>'gcg');
The following documentation describes the various functions contained in this package. Some functions are for internal use and are not meant to be called by the user; they are preceded by an underscore ("_").
Title    : _rearrange
Usage    : n/a (internal function)
Function : Rearranges named parameters to requested order.
Example  : &_rearrange([SEQUENCE,ID,DESC],@p);
Returns  : @params - an array of parameters in the requested order.
Argument : $order : a reference to an array which describes the desired
         :          order of the named parameters.
         : @param : an array of parameters, either as a list (in which
         :          case the function simply returns the list), or as an
         :          associative array (in which case the function sorts
         :          the values according to @{$order} and returns that
         :          new array).
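The behaviour described for _rearrange can be sketched as below. This is an illustrative re-implementation, not the module's actual code: the name `rearrange_sketch` is hypothetical, and detecting named parameters by a leading "-" on the first argument is an assumption about how the two calling styles are told apart.

```perl
use strict;
use warnings;

# Illustrative re-implementation, not the module's actual code.
sub rearrange_sketch {
    my ($order, @param) = @_;

    # Assumption: a plain positional list (first item does not start
    # with "-") is returned unchanged.
    return @param unless @param && defined $param[0] && $param[0] =~ /^-/;

    # Otherwise treat the list as name => value pairs, normalising
    # "-sequence" to "SEQUENCE" so lookups ignore case.
    my %named;
    while (@param) {
        my $key   = shift @param;
        my $value = shift @param;
        $key =~ s/^-//;
        $named{ uc $key } = $value;
    }

    # Emit the values in the requested order; missing names yield undef.
    return map { $named{ uc $_ } } @$order;
}

my @p = rearrange_sketch(
    [qw(SEQUENCE ID DESC)],
    -id => 'seq1', -sequence => 'GAATTCGTT',
);
print join(',', map { defined $_ ? $_ : 'undef' } @p), "\n";
# prints "GAATTCGTT,seq1,undef"
```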
Title    : _write_tmp_file
Usage    : n/a (internal function)
Function : Writes a temporary file to disk. Uses
         : the POSIX tmpnam() call to get path &
         : filename. Should be more portable than
         : just writing to /tmp.
Example  : &_write_tmp_file("$formatted_sequence");
Returns  : string containing the temp file path
Argument : string that is to be written to disk

Title    : version
Usage    : &Parse::version;
Function : Prints current package version
Example  : &Parse::version;
Returns  : none
Argument : none
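For readers adapting this today, a function with the same contract (string in, temp file path out) could be sketched as follows. Note the swap: this uses the core File::Temp module instead of the POSIX tmpnam() call the module documents, because POSIX::tmpnam() was removed in Perl 5.26. The name `write_tmp_file_sketch` and the `parseXXXXXX` template are illustrative assumptions.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Illustrative stand-in for _write_tmp_file, using File::Temp rather than
# the POSIX tmpnam() call named in the module documentation.
sub write_tmp_file_sketch {
    my ($content) = @_;

    # tempfile() creates and opens the file in the system temporary
    # directory; UNLINK => 0 leaves cleanup to the caller.
    my ($fh, $path) = tempfile('parseXXXXXX', TMPDIR => 1, UNLINK => 0);
    print {$fh} $content;
    close $fh or die "Cannot close $path: $!";

    return $path;
}

my $path = write_tmp_file_sketch(">seq1\nGAATTCGTT\n");
print "Wrote temp file: $path\n";
unlink $path;    # caller removes the file when done
```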
Title    : convert_from_raw()
Usage    : print &Parse::convert_from_raw(-sequence=>$raw_seq,
         :                                -fmt=>'asn1');
         :
         : $reply = &Parse::convert_from_raw(-sequence=>'GAATTCGTT',
         :                                   -options=>'-c',
         :                                   -fmt=>'gcg');
Function : ReadSeq does not function well when called upon
         : to read or convert "raw" or unformatted sequence
         : strings or files. This code will take a given
         : raw sequence and manipulate it into FASTA
         : format before invoking ReadSeq.
         :
         : The following named parameters may be used as
         : arguments:
         :
         :   -sequence=> Sequence string.
         :   -fmt=>      Format sequence will be converted to.
         :   -options=>  String containing command-line
         :               switches for ReadSeq. Passed
         :               directly.
Example  : see usage
Returns  : Formatted sequence string
Argument : named parameters, see function

Title    : convert
Usage    : print &Parse::convert(-sequence=>$raw_seq,
         :                       -fmt=>'asn1');
         :
         : $reply = &Parse::convert(-sequence=>'GAATTCGTT',
         :                          -options=>'-c',
         :                          -fmt=>'gcg');
         :
         : $reply = &Parse::convert(-location=>'/tmp/a.seq',
         :                          -fmt=>'raw');
Note     : ReadSeq does not function well when called upon
         : to read or convert "raw" or unformatted sequence
         : strings or files. User beware.
Function : Will read/parse a given sequence string *OR* a given
         : sequence file.
         :
         : If a sequence string AND a sequence file path are
         : both passed in, the file path will be used with no
         : complaint.
         :
         : The following named parameters may be used as
         : arguments:
         :
         :   -sequence=> Sequence string.
         :   -location=> Sequence file path.
         :   -fmt=>      Format sequence will be converted to.
         :   -options=>  String containing command-line
         :               switches for ReadSeq. Passed
         :               directly.
Example  : see usage
Returns  : Formatted sequence string
Argument : named parameters, see function
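The raw-to-FASTA step that convert_from_raw() performs before calling ReadSeq could look something like this. It is a minimal sketch under stated assumptions, not the module's code: the function name `raw_to_fasta_sketch`, the default identifier `raw_sequence`, the stripping of whitespace and digits, and the 60-column line width are all illustrative choices.

```perl
use strict;
use warnings;

# Illustrative only: wrap a raw sequence string in minimal FASTA format.
sub raw_to_fasta_sketch {
    my ($raw, $id) = @_;
    $id = 'raw_sequence' unless defined $id && length $id;

    # Assumption: strip whitespace and digits that often accompany
    # pasted sequences.
    (my $seq = $raw) =~ s/[\s\d]+//g;

    # FASTA: a ">" header line followed by sequence lines
    # (60 columns per line here).
    my @lines = $seq =~ /(.{1,60})/g;
    return join("\n", ">$id", @lines) . "\n";
}

print raw_to_fasta_sketch('GAATTCGTT', 'seq1');
# prints:
# >seq1
# GAATTCGTT
```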
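The precedence rule for convert, where a file path silently wins over a sequence string, can be sketched as below. `choose_input_sketch` is a hypothetical helper, and dying when neither argument is supplied is an assumption; the documentation does not say what the module does in that case.

```perl
use strict;
use warnings;

# Illustrative only: decide which input convert() would read from.
sub choose_input_sketch {
    my (%arg) = @_;

    # The file path takes priority when both are given, with no warning.
    return ('file',   $arg{-location}) if defined $arg{-location};
    return ('string', $arg{-sequence}) if defined $arg{-sequence};

    # Assumption: treat a call with neither argument as an error.
    die "Need -sequence or -location\n";
}

my ($kind, $value) = choose_input_sketch(
    -sequence => 'GAATTCGTT',
    -location => '/tmp/a.seq',
);
print "$kind: $value\n";    # prints "file: /tmp/a.seq"
```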
Core bioperl modules
Bioperl Project http://bio.perl.org
Copyright (c) 1997-1998 Chris Dagdigian, Georg Fuellen, Steven E. Brenner and others. All Rights Reserved. This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.