NAME

Bio::BPWrapper::SeqManipulations - Functions for bioseq

SYNOPSIS

    use Bio::BPWrapper::SeqManipulations;
    # Set options hash ...
    initialize(\%opts);
    write_out(\%opts);

SUBROUTINES

initialize()

Sets up most of the actions to be performed on an alignment.

Call this right after setting up an options hash.

Sets package variables: $in, $in_format, $filename, $out_format, and $out.

write_out()

Writes out the sequence file.

Call this after calling initialize(\%opts) and processing those options.

reading_frame_ops()

Translate in 1, 3, or 6 frames, depending on the option value set via initialize(\%opts). Wraps Bio::Seq->translate(), Bio::SeqUtils->translate_3frames(), and Bio::SeqUtils->translate_6frames().

restrict_digest()

Predicts the fragments produced by digestion with the restriction enzyme specified in $opts{restrict}, set via initialize(\%opts).

An input file with a single sequence is expected. Wraps Bio::Restriction::Analysis->cut().

anonymize()

Replace sequence IDs with serial IDs n characters long, as specified in $opts{'anonymize'} set via initialize(\%opts). The length n includes a leading 'S'. For example, if $opts{'anonymize'} is 5, the first ID will be S0001.

A sed script file is produced with a .sed suffix that may be used with sed's '-f' argument. If the filename is '-', the sed file is named STDOUT.sed instead. A message containing the sed filename is written to STDERR.
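As a rough illustration of the serial-ID scheme described above, the following pure-Perl sketch builds IDs of a given total length. The helper name serial_id() is hypothetical and is not part of this module, and the assumption that the leading 'S' counts toward n is inferred from the S0001 example.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::BPWrapper::SeqManipulations):
# build a serial ID of total length $n, assuming the leading 'S'
# counts toward $n and the remainder is a zero-padded counter.
sub serial_id {
    my ($index, $n) = @_;
    my $digits = $n - 1;    # characters left after the leading 'S'
    return sprintf("S%0${digits}d", $index);
}

my @ids = map { serial_id($_, 5) } 1 .. 3;
print join(",", @ids), "\n";    # prints S0001,S0002,S0003
```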

shred_seq()

Break the input into individual sequences, writing one FASTA file per sequence.

count_codons()

Count codons for coding sequences (e.g., a genome file consisting of CDS sequences). Wraps Bio::Tools::SeqStats->count_codons().

Print gene sequences in FASTA from a GenBank file of a bacterial genome. This won't work for a eukaryotic GenBank file.

count_leading_gaps()

Count and print the number of leading gaps in each sequence.
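A minimal sketch of what counting leading gaps might look like on a plain sequence string. The helper leading_gaps() is hypothetical, and treating '-' as the only gap character is an assumption; the module itself may handle other gap symbols.

```perl
use strict;
use warnings;

# Hypothetical helper: count gap characters at the start of a sequence
# string. Assumes '-' is the gap character.
sub leading_gaps {
    my ($seq) = @_;
    my ($gaps) = $seq =~ /^(-*)/;    # always matches, possibly an empty string
    return length $gaps;
}

print leading_gaps("---ACG-T"), "\n";    # prints 3
```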

hydroB()

Return the mean Kyte-Doolittle hydropathicity for protein sequences. Wraps Bio::Tools::SeqStats->hydropathicity().
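To make "mean Kyte-Doolittle hydropathicity" concrete, here is a pure-Perl sketch that averages the standard Kyte-Doolittle values over the residues of a protein string. It does not call Bio::Tools::SeqStats; the helper name mean_hydropathy() is hypothetical, and skipping unrecognized residues is an assumption made for this example.

```perl
use strict;
use warnings;

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
my %KD = (
    A =>  1.8, R => -4.5, N => -3.5, D => -3.5, C =>  2.5,
    Q => -3.5, E => -3.5, G => -0.4, H => -3.2, I =>  4.5,
    L =>  3.8, K => -3.9, M =>  1.9, F =>  2.8, P => -1.6,
    S => -0.8, T => -0.7, W => -0.9, Y => -1.3, V =>  4.2,
);

# Hypothetical helper: mean hydropathy of a protein string.
# Residues without a table entry are skipped (an assumption).
sub mean_hydropathy {
    my ($protein) = @_;
    my @residues = grep { exists $KD{$_} } split //, uc $protein;
    return unless @residues;    # nothing to average
    my $sum = 0;
    $sum += $KD{$_} for @residues;
    return $sum / @residues;
}

printf "%.2f\n", mean_hydropathy("IV");    # prints 4.35
```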

linearize()

Linearize FASTA: print each sequence on a single line.
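The idea of linearizing can be sketched without BioPerl by joining the wrapped sequence lines of each record. The helper linearize_fasta() is hypothetical, and keeping the header on its own line is an assumption; the module's actual output layout may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: join the wrapped sequence lines of each FASTA
# record into one line. Headers (lines starting with '>') are kept on
# their own line, which is an assumption about the output layout.
sub linearize_fasta {
    my ($text) = @_;
    my @out;
    my $seq = '';
    for my $line (split /\n/, $text) {
        if ($line =~ /^>/) {
            push @out, $seq if length $seq;    # flush the previous record
            push @out, $line;
            $seq = '';
        }
        else {
            $line =~ s/\s+//g;                 # drop stray whitespace
            $seq .= $line;
        }
    }
    push @out, $seq if length $seq;            # flush the last record
    return join("\n", @out) . "\n";
}

print linearize_fasta(">a\nAC\nGT\n>b\nTT\n");
```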

reloop_at()

Re-circularize a bacterial genome by starting at the position given in $opts{"reloop"}, set via initialize(\%opts).

For example, for the sequence "ABCDE", bioseq -R'2' .. would generate "BCDEA".
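The rotation in this example can be sketched on a plain string as follows. The helper reloop() is hypothetical and only illustrates the 1-based rotation implied by the "ABCDE" example; it is not the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical helper: rotate a circular sequence so that it starts at
# the given 1-based position.
sub reloop {
    my ($seq, $pos) = @_;
    die "position out of range\n" if $pos < 1 || $pos > length($seq);
    return substr($seq, $pos - 1) . substr($seq, 0, $pos - 1);
}

print reloop("ABCDE", 2), "\n";    # prints BCDEA
```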

remove_stop()

Remove stop codons.