Ewan Birney


Bio::Tools::RestrictionEnzyme - Bioperl object for a restriction endonuclease (cuts DNA at specific locations)


Object Creation

    require Bio::Tools::RestrictionEnzyme;

    ## Create a new object by name.

    $re1 = Bio::Tools::RestrictionEnzyme->new(-NAME =>'EcoRI');

    ## Create a new object using special syntax
    ## which specifies the enzyme name, recognition site, and cut position.
    ## Used for enzymes not known to this module.

    $re2 = Bio::Tools::RestrictionEnzyme->new(-NAME =>'EcoRV--GAT^ATC',
                                              -MAKE =>'custom');

    ## Get a list of names of all available restriction enzymes 
    ## known to this module.

    @all = $re1->available_list();

    ## Get the names of restriction enzymes that have 6 bp 
    ## recognition sequences

    @sixcutters = $re1->available_list(6);
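The custom syntax packs the enzyme name, the recognition site, and the cut position into a single string. As a rough illustration of how such a string can be taken apart, here is a minimal sketch in core Perl. It is not the module's own parsing code: parse_custom_name() and the hash keys it returns are hypothetical names chosen for this example, and treating a site without '^' as having an undefined cut position is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Tools::RestrictionEnzyme):
# split a 'NAME--SITE' spec into name, site, bare string, and cut position.
sub parse_custom_name {
    my ($spec) = @_;
    my ($name, $site) = split /--/, $spec, 2;
    die "Expected 'NAME--SITE', got '$spec'\n"
        unless defined $site && length $site;

    # Number of bases 5' of the '^' marker; undef if no marker is present
    # (an assumption made for this sketch).
    my $caret      = index($site, '^');
    my $cuts_after = $caret >= 0 ? $caret : undef;

    # Recognition sequence with the '^' marker removed.
    (my $string = $site) =~ tr/^//d;

    return {
        name       => $name,
        site       => $site,
        string     => $string,
        cuts_after => $cuts_after,
    };
}

my $parsed = parse_custom_name('EcoRV--GAT^ATC');
printf "%s recognizes %s and cuts after base %d\n",
    @{$parsed}{qw(name string cuts_after)};
# prints: EcoRV recognizes GATATC and cuts after base 3
```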


This module is included with the central Bioperl distribution.


Follow the installation instructions included in the README file.


The Bio::Tools::RestrictionEnzyme.pm module encapsulates generic data and methods for using restriction endonucleases for in silico restriction analysis of DNA sequences.


This module is a precursor for a more full-featured version that may do such things as download data from online databases such as REBASE http://www.neb.com/rebase/. Thus, there is currently no functionality for obtaining data about the commercial availability of a restriction enzyme.

At some point in the future, it may make sense to derive RestrictionEnzyme from a class such as Bio::Enzyme or Bio::Prot::Protein (neither of which exists yet) so that more data about the enzyme and related information can be easily obtained.

This module is currently in use at


Digesting on Runs of N

To digest a sequence at runs of N's, create a custom enzyme whose recognition site is the run itself:

    $re_n  = Bio::Tools::RestrictionEnzyme->new(-name=>'N--NNNNN', -make=>'custom');

Specify the number of N's you want to match in the -name parameter. The above example will recognize and cut at runs of 5 N's. To cut at runs of 10 N's, you would use

     -name => 'N--NNNNNNNNNN'

Note that you must use a specific number of N's. You cannot use a regexp to digest at N+, for example, because the actual number of N's at each site is not recorded when the sequence is analyzed, so cut_locations() would not be correct.


See the script examples/restriction.pl in the Bioperl distribution.


Bio::Tools::RestrictionEnzyme.pm is a concrete class that inherits from Bio::Root::Root and uses by delegation Bio::PrimarySeq.


Mailing Lists

User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to one of the Bioperl mailing lists. Your participation is much appreciated.

   bioperl-l@bioperl.org             - General discussion
   http://bioperl.org/MailList.shtml - About the mailing lists

Reporting Bugs

Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via email or the web.



Steve Chervitz, <sac@bioperl.org>


Copyright (c) 1997-2002 Steve A. Chervitz. All Rights Reserved. This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.


  Bio::Root::Root    - Base class.
  Bio::PrimarySeq    - Lightweight sequence object.

  http://bio.perl.org/  - Bioperl Project Homepage


Methods beginning with a leading underscore are considered private and are intended for internal use by this module. They are not considered part of the public interface and are described here for documentation purposes only.


 Title     : new
 Purpose   : Initializes the RestrictionEnzyme object and calls
           : superclass constructor last (Bio::Root::Root).
 Returns   : A new Bio::Tools::RestrictionEnzyme object
 Argument  : Parameters passed to new()
 Comments  : A RestrictionEnzyme object manages its recognition sequence
           : as a Bio::PrimarySeq object.

See Also : _make_custom(), _make_standard(), Bio::PrimarySeq.pm::_initialize()


 Title     : cuts_after
 Usage     : $num = $re->cuts_after();
 Purpose   : Sets/Gets an integer indicating the position of cleavage 
           : relative to the 5' end of the recognition sequence.
 Returns   : Integer
 Argument  : Integer (optional)
 Throws    : Exception if argument is non-numeric.
 Access    : Public
 Comments  : This method is only needed to change the cut position.
           : The position is set automatically during
           : construction.

See Also : _make_standard(), _make_custom()


 Title     : site
 Usage     : $re->site();
 Purpose   : Gets the recognition sequence for the enzyme. 
 Example   : $seq_string = $re->site();
 Returns   : String containing recognition sequence indicating 
           : cleavage site as in  'G^AATTC'.
 Argument  : n/a
 Throws    : n/a
 Comments  : If you want a simple string representing the site without 
           : any '^', use the string() method.

See Also : string()


 Title     : seq
 Usage     : $re->seq();
 Purpose   : Get the Bio::PrimarySeq.pm-derived object representing 
           : the recognition sequence
 Returns   : Bio::PrimarySeq-derived object
 Argument  : n/a
 Throws    : n/a

See Also : string(), revcom()


 Title     : string
 Usage     : $re->string();
 Purpose   : Get a string representing the recognition sequence.
 Returns   : String. Does NOT contain a '^' representing the cut location
           : as returned by the site() method.
 Argument  : n/a
 Throws    : n/a
 Comments  : Delegates to the Bio::PrimarySeq-derived object.

See Also : seq(), site(), revcom()


 Title     : revcom
 Usage     : $re->revcom();
 Purpose   : Get a string representing the reverse complement of
           : the recognition sequence.
 Returns   : String
 Argument  : n/a
 Throws    : n/a
 Comments  : Delegates to the Bio::PrimarySeq.pm-derived object, then
           : extracts the string from the result, because
           : Bio::PrimarySeq->revcom now returns a Bio::PrimarySeq object.

See Also : seq(), string()


 Title     : cut_seq
 Usage     : $re->cut_seq(<sequence object>);
 Purpose   : Conceptually cut or "digest" a DNA sequence with the given enzyme.
 Example   : @fragments = $re->cut_seq(<sequence object>);
 Returns   : List of strings containing the resulting fragments.
 Argument  : Reference to a Bio::PrimarySeq.pm-derived object.
 Throws    : Exception if argument is not an object.
           : (Does not yet verify that it is derived from Bio::PrimarySeq.pm.)
 Comments  : Strategy relies on Perl's built-in split() function.
           : Since split removes the recognition pattern, the resulting
           : fragments must be repaired after split()-ing.
           : There is currently no support for partial digestions.
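The split-and-repair strategy described in the comments can be pictured with a small sketch in core Perl. This is an illustration under stated assumptions, not the module's implementation: digest() is a hypothetical helper, it works on plain strings rather than sequence objects, and it matches the site literally (no ambiguity codes, no overlapping sites).

```perl
use strict;
use warnings;

# Hypothetical helper: digest a plain sequence string with a literal site.
# $cuts_after is the number of site bases that stay on the 5' fragment.
sub digest {
    my ($seq, $site, $cuts_after) = @_;
    my $left  = substr($site, 0, $cuts_after);   # stays with the upstream fragment
    my $right = substr($site, $cuts_after);      # starts the downstream fragment

    # split() drops the recognition site; limit -1 keeps trailing empty fields.
    my @parts = split /\Q$site\E/, $seq, -1;

    # Repair: give each fragment back its share of the removed site.
    for my $i (0 .. $#parts) {
        $parts[$i] = $right . $parts[$i] if $i > 0;
        $parts[$i] .= $left              if $i < $#parts;
    }
    return grep { length } @parts;
}

my @fragments = digest('AAGAATTCCCGAATTCTT', 'GAATTC', 1);
print join(' ', @fragments), "\n";
# prints: AAG AATTCCCG AATTCTT
```

Joining the fragments reproduces the input sequence, which is a convenient sanity check for any repair step of this kind.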


 Title     : cut_locations
 Usage     : my $locations = $re->cut_locations(<sequence_object>);
 Purpose   : Report the location of the recognition site(s) within
           : an input sequence. 
 Example   : my $locations = $re->cut_locations($seqObj);
 Returns   : Arrayref of starting locations where enzyme would cut 
 Argument  : Reference to a Bio::PrimarySeqI-derived sequence object.
 Throws    : n/a
 Comments  : 
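One way to picture how cut positions can be derived from a fixed-length recognition site is sketched below in core Perl. The helper find_cut_positions() is hypothetical, and the coordinate convention (each value is the number of bases lying 5' of the cut) is an assumption of this sketch; the values returned by cut_locations() itself may follow a different convention.

```perl
use strict;
use warnings;

# Hypothetical helper: report where a literal site would be cut in a
# plain sequence string. Each value is the count of bases 5' of the cut.
sub find_cut_positions {
    my ($seq, $site, $cuts_after) = @_;
    my @positions;
    while ($seq =~ /\Q$site\E/g) {
        # $-[0] is the 0-based offset at which the last match started.
        push @positions, $-[0] + $cuts_after;
    }
    return \@positions;
}

my $locations = find_cut_positions('AAGAATTCCCGAATTCTT', 'GAATTC', 1);
print "@$locations\n";
# prints: 3 11
```

Because the offset added to each match start is a constant, this only works when every site has the same length, which is the reason a pattern such as N+ cannot be handled this way.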


 Title     : annotate_seq
 Usage     : $re->annotate_seq(<sequence_object>);
 Purpose   : Identify the location of the recognition site(s) within
           : an input sequence. Uses HTML.
 Example   : $annot_seq = $re->annotate_seq($seqObj);
 Returns   : String containing the annotated sequence.
 Argument  : Reference to a Bio::PrimarySeq.pm-derived sequence object.
 Throws    : n/a
 Comments  : The annotated sequence must be viewed with a web
           : browser to see the location(s) of the recognition site(s).


 Title     : palindromic
 Usage     : $re->palindromic();
 Purpose   : Determines if the recognition sequence is palindromic
           : for the current restriction enzyme.
 Returns   : Boolean
 Argument  : n/a
 Throws    : n/a
 Access    : Public 
 Comments  : A palindromic site (EcoRI): 5'-GAATTC-3'
           :                             3'-CTTAAG-5'
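A site is palindromic in this sense when it reads the same as its own reverse complement. The check can be sketched in a few lines of core Perl; revcom_string() and is_palindromic() are hypothetical names for this example, and only the unambiguous bases A, C, G and T are handled.

```perl
use strict;
use warnings;

# Hypothetical helper: reverse complement of a plain DNA string (ACGT only).
sub revcom_string {
    my ($seq) = @_;
    my $rc = reverse $seq;               # scalar context reverses the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;        # complement each base
    return $rc;
}

# Hypothetical helper: true if the site equals its reverse complement.
sub is_palindromic {
    my ($site) = @_;
    return uc($site) eq uc(revcom_string($site)) ? 1 : 0;
}

for my $site (qw(GAATTC GATATC GGTCTC)) {
    printf "%s %s\n", $site, is_palindromic($site) ? 'palindromic' : 'not palindromic';
}
# prints:
# GAATTC palindromic
# GATATC palindromic
# GGTCTC not palindromic
```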


 Title     : is_available
 Usage     : $re->is_available(<string containing name of enzyme>);
 Purpose   : Determine if an enzyme is available (to this module).
           : (see the package lexical %RE).
 Example   : $re->is_available('EcoRI');
           : &Bio::Tools::RestrictionEnzyme::is_available($object,'EcoRI');
 Returns   : Boolean
 Argument  : String
 Throws    : n/a
 Comments  : This method does NOT give information about
           : commercial availability (yet). 
           : Enzyme names are CASE SENSITIVE.

See Also : available_list()


 Title   : name
 Usage   : $obj->name($newval)
 Example : 
 Returns : value of name
 Args    : newvalue (optional)


 Title     : available_list
 Usage     : $re->available_list([<integer>]);
 Purpose   : Retrieve a list of currently available enzymes.
 Example   : @all = $re->available_list();  ## All enzymes
           : @six_cutters = $re->available_list(6);  ## All 6-cutters
 Returns   : List of strings
 Argument  : Integer (optional)
 Throws    : n/a
 Comments  : This method may be more appropriate for a REData.pm class.

See Also : is_available()
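
The behavior of is_available() and available_list() can be pictured as lookups in a name-to-site table. The sketch below uses a small hand-written table of six well-known enzymes in core Perl; it is not the module's %RE data, and the names %SITES, known_enzyme() and names_by_site_length() are hypothetical.

```perl
use strict;
use warnings;

# Illustrative table only (not the module's %RE): name => site with cut marker.
my %SITES = (
    AluI    => 'AG^CT',
    BamHI   => 'G^GATCC',
    EcoRI   => 'G^AATTC',
    EcoRV   => 'GAT^ATC',
    HindIII => 'A^AGCTT',
    NotI    => 'GC^GGCCGC',
);

# Hypothetical helper: case-sensitive lookup by enzyme name.
sub known_enzyme {
    my ($name) = @_;
    return exists $SITES{$name} ? 1 : 0;
}

# Hypothetical helper: all names, or only those whose site has $len bases.
sub names_by_site_length {
    my ($len) = @_;
    my @names;
    for my $name (sort keys %SITES) {
        (my $bases = $SITES{$name}) =~ tr/^//d;   # drop the cut marker
        push @names, $name if !defined $len || length($bases) == $len;
    }
    return @names;
}

print join(' ', names_by_site_length(6)), "\n";
# prints: BamHI EcoRI EcoRV HindIII
print known_enzyme('EcoRI'), ' ', known_enzyme('ecori'), "\n";
# prints: 1 0
```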