NAME
Bio::Trace::ABIF - Perl extension for reading and parsing ABIF (Applied Biosystems, Inc. Format) files
VERSION
Version 1.05
SYNOPSIS
The ABIF file format is a binary format for storing data (especially data produced by sequencers), developed by Applied Biosystems, Inc. Typical file suffixes for such files are .ab1 and .fsa.
The data inside ABIF files is organized in records, referred to below as either directory entries or data items. Each data item is uniquely identified by a pair made of a four-character string and a number: we call such a pair a tag, and its components the tag name and the tag number, respectively. Tags are defined in the official documentation for ABIF files (see the "SEE ALSO" Section at the end of this document).
This module provides methods for accessing any data item contained in an ABIF file (with or without knowledge of the corresponding tag) and methods for assessing the quality of the data (e.g., for computing LOR scores, clear ranges, and so on). The module also supports ABIF file modification: any directory entry can be overwritten. It is not possible, however, to add new directory entries corresponding to tags not already present in the file.
use Bio::Trace::ABIF;
my $abif = Bio::Trace::ABIF->new();
$abif->open_abif('/Path/to/my/file.ab1');
print $abif->sample_name(), "\n";
my @quality_values = $abif->quality_values();
my $sequence = $abif->sequence();
# etc...
$abif->close_abif();
module_version()
Usage : $version = Bio::Trace::ABIF->module_version();
Returns : This module's version number.
CONSTRUCTOR
Creates a new ABIF object.
new()
Usage : my $abif = Bio::Trace::ABIF->new();
Returns : An instance of ABIF.
Creates an ABIF object.
OPENING AND CLOSING ABIF FILES
The methods in this section allow you to open an ABIF file (either read-only or for modification), to close it or to verify the ABIF format version number.
open_abif()
Usage : $abif->open_abif($pathname);
$abif->open_abif($pathname, 1); # Read/Write mode
Returns : 1 if the file is opened;
0 otherwise.
Opens the specified file in binary mode and checks whether it is in ABIF format. If the optional second argument is true, the file is opened in read/write mode (by default, the file is opened in read-only mode). Opening in read/write mode is necessary only if you want to use write_tag() (see below).
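Since open_abif() reports failure through its return value rather than by raising an exception, callers may want to check it explicitly. The following is a minimal sketch, not part of the module: open_abif_or_die() is a hypothetical helper name, and $abif is assumed to be any object that provides open_abif() as documented above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Trace::ABIF): opens $path on the
# given object and dies with a message if open_abif() reports failure.
sub open_abif_or_die {
    my ($abif, $path, $read_write) = @_;
    my $ok = $read_write
        ? $abif->open_abif($path, 1)    # read/write mode
        : $abif->open_abif($path);      # read-only mode (the default)
    die "Cannot open '$path' as an ABIF file\n" unless $ok;
    return $abif;
}
```

With the module installed, a call such as open_abif_or_die(Bio::Trace::ABIF->new(), '/Path/to/my/file.ab1') would then either return an opened object or stop the program.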
close_abif()
Usage : $abif->close_abif();
Returns : Nothing.
Closes the currently opened file.
is_abif_open()
Usage : if ($abif->is_abif_open()) { # ...
Returns : 1 if an ABIF file is open;
0 otherwise.
is_abif_format()
Usage : if ($abif->is_abif_format()) { # ...
Returns : 1 if the file is in ABIF format;
0 otherwise.
Checks that the file is in ABIF format.
abif_version()
Usage : $v = $abif->abif_version();
Returns : The ABIF file version number (e.g., '1.01').
Used to determine the ABIF file version number.
GENERAL METHODS
The "low-level" methods of this section allow you to access any directory entry in a file. It is up to the caller to correctly interpret the values returned by these methods, so they should be used only by callers who know the layout of the data they are reading. In most cases, the accessor methods defined later in this document will do just fine, and using them is strongly recommended.
num_dir_entries()
Usage : $n = $abif->num_dir_entries();
Returns : The number of data items in the file.
Used to determine the number of directory entries in the ABIF file.
data_offset()
Usage : $n = $abif->data_offset();
Returns : The offset of the first data item, in bytes.
Used to determine the offset of the first directory entry from the beginning of the file.
tags()
Usage : @tags = $abif->tags();
Returns : A list of the tags in the file.
get_directory()
Usage : %D = $abif->get_directory($tagname, $tagnum);
Returns : A hash of the content of the given data item;
() if the given tag is not found.
Retrieves the directory entry identified by the pair ($tagname, $tagnum). The $tagname must be a four-character ASCII code and $tagnum must be an integer (typically, 1 <= $tagnum <= 1000). The returned hash has the following keys:
TAG_NAME: the tag name;
TAG_NUMBER: the tag number;
ELEMENT_TYPE: a string denoting the type of the data item
('char', 'byte', 'float', etc...);
ELEMENT_SIZE: the size, in bytes, of one element;
NUM_ELEMENTS: the number of elements in the data item;
DATA_SIZE: the size, in bytes, of the data item;
DATA_ITEM: the raw sequence of bytes of the data item.
Nota Bene: it is up to the caller to interpret the data item field correctly (typically, by unpack()ing the item).
Refer to the "SEE ALSO" Section for further information.
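As an illustration of how the keys listed above might be used, the sketch below builds a one-line summary of a directory entry. describe_tag() is a hypothetical helper, not a method of this module; it relies only on get_directory() and the documented hash keys.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Trace::ABIF): returns a one-line
# summary of the directory entry for ($tagname, $tagnum), or undef if the
# tag is not in the file. $abif is any object providing get_directory().
sub describe_tag {
    my ($abif, $tagname, $tagnum) = @_;
    my %dir = $abif->get_directory($tagname, $tagnum);
    return unless %dir;    # empty hash: the tag was not found
    return sprintf(
        '%s%d: %d element(s) of type %s, %d byte(s) each, %d byte(s) in total',
        @dir{qw(TAG_NAME TAG_NUMBER NUM_ELEMENTS ELEMENT_TYPE ELEMENT_SIZE DATA_SIZE)},
    );
}
```

The DATA_ITEM key is deliberately left out of the summary, since its raw bytes still need to be unpacked by the caller.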
get_data_item()
Usage : @data = $abif->get_data_item($tagname,
$tagnum,
$template
);
Returns : A list of elements unpacked according to $template;
(), if the tag is not found.
Retrieves the data item specified by the pair ($tagname, $tagnum) and unpacks it according to $template. The $tagname is a four-character ASCII code and $tagnum is an integer (typically, 1 <= $tagnum <= 1000). The $template has the same format as in the pack() function.
Refer to the "SEE ALSO" Section for further information.
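To make the role of $template more concrete, the sketch below applies a few templates to raw bytes using Perl's built-in unpack(). The helper names are hypothetical, and the sketch assumes that multi-byte integers in ABIF files are stored big-endian; check the official ABIF documentation before relying on any of these templates for a given tag.

```perl
use strict;
use warnings;

# Hypothetical helpers (not part of Bio::Trace::ABIF) showing what some
# common templates do to the raw bytes of a data item.
# Assumption: multi-byte integers are big-endian, hence the '>' modifier.

# 'short array': a list of signed 16-bit integers
sub decode_short_array {
    my ($raw) = @_;
    return unpack 's>*', $raw;
}

# 'char array' holding text: one string, trailing NULs and spaces stripped
sub decode_text {
    my ($raw) = @_;
    my ($text) = unpack 'A*', $raw;
    return $text;
}

# 'pString': a length byte followed by that many characters
sub decode_pstring {
    my ($raw) = @_;
    my ($text) = unpack 'C/a', $raw;
    return $text;
}
```

The same template strings could be passed as the third argument of get_data_item(), for example 's>*' for a tag documented as a short array, provided the assumption about byte order holds for the file at hand.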
SEARCHING AND OVERWRITING DATA
The methods in this section allow you to search for a specific tag and to overwrite existing data corresponding to a given tag.
search_tag()
Usage : $abif->search_tag($tagname, $tagnum)
Returns : 1 if the tag is found;
0, otherwise
Searches for the specified data tag. If the tag is found, then the file handle is positioned just after the tag number (ready to read the element type).
write_tag()
Usage : $abif->write_tag($tagname, $tagnum, $data);
$abif->write_tag($tagname, $tagnum, \@data);
$abif->write_tag($tagname, $tagnum, \$data_str);
Returns : 1 if the data item is overwritten;
0, otherwise.
Overwrites an existing tag with the given data. You may find the tag name and the tag number of each piece of data in an ABIF file in the documentation of the corresponding method (see below). You must open the file in read/write mode if you want to overwrite it (see open_abif()).
REMEMBER TO BACKUP YOUR FILE BEFORE OVERWRITING IT!
You must be careful when you overwrite data: the type of the new data must match the type of the old one. There is no restriction on the length of the data, e.g. you may overwrite the basecalled sequence with a longer or shorter one. Examples of how to use this method follow.
To overwrite the basecalled sequence:
my $new_sequence = 'GATGCATCT...';
$abif->write_tag('PBAS', 1, \$new_sequence);
# ($new_sequence can be passed also by value)
print 'New sequence is: ', $abif->edited_sequence();
To overwrite the quality values:
my @qv = (10, 20, 30, ...); # All values must be < 128
$abif->write_tag('PCON', 1, \@qv); # Pass by reference!
print "New qv's: ", $abif->edited_quality_values();
To overwrite a date:
# Date format: yyyy-mm-dd
$abif->write_tag('RUND', 3, '2007-01-22');
print 'New date: ', $abif->data_collection_start_date();
To overwrite a time stamp:
# Time format: hh:mm:ss.nn
$abif->write_tag('RUNT', 4, '16:01:30.45');
print 'New time: ', $abif->data_collection_stop_time();
To overwrite a comment:
$abif->write_tag('CMNT', 1, 'New comment');
print 'New comment: ', $abif->comment();
To overwrite noise values:
my @noise = (3.14, 2.71, ...);
$abif->write_tag('NOIS', 1, \@noise);
print 'Noise values: ', $abif->noise();
To overwrite the capillary number:
$abif->write_tag('LANE', 1, 95);
print 'Capillary number: ', $abif->capillary_number();
and so on.
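Because overwriting is destructive, it can help to automate the backup step and to validate data before calling write_tag(). The following sketch is one possible way to do so: backup_abif() and overwrite_quality_values() are hypothetical helpers, the ".bak" suffix is an arbitrary choice, and the range check simply encodes the "< 128" restriction mentioned in the quality values example above.

```perl
use strict;
use warnings;
use File::Copy qw(copy);

# Hypothetical helper: copies $path to "$path.bak" before the file is
# opened in read/write mode. Dies if a backup already exists or if the
# copy fails, so that an earlier backup is never silently replaced.
sub backup_abif {
    my ($path) = @_;
    my $backup = "$path.bak";
    die "Backup '$backup' already exists\n" if -e $backup;
    copy($path, $backup) or die "Cannot back up '$path': $!\n";
    return $backup;
}

# Hypothetical helper: checks that every quality value is a non-negative
# integer below 128, then overwrites tag PCON1 through write_tag().
# $abif is any object providing write_tag() as documented above.
sub overwrite_quality_values {
    my ($abif, @qv) = @_;
    for my $q (@qv) {
        die "Invalid quality value '$q'\n"
            unless $q =~ /\A\d+\z/ && $q < 128;
    }
    return $abif->write_tag('PCON', 1, \@qv);    # pass by reference
}
```

A typical sequence would be to call backup_abif($path) first, then open the file with open_abif($path, 1), and only then call overwrite_quality_values().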
ACCESSOR METHODS
The methods in this section can be used to retrieve specific information from a file without having to specify a tag. It is strongly recommended that you read data from a file by using one or more of these methods.
analyzed_data_for_channel()
Usage : @data = $abif->analyzed_data_for_channel($ch_num);
Returns : The channel analyzed data;
() if the channel number is out of range
or the data item is not in the file.
ABIF Tag : DATA9, DATA10, DATA11, DATA12, DATA205
ABIF Type : short array
File Type : ab1
There are four channels in an ABIF file, numbered from 1 to 4. An optional channel number 5 exists in some files. The channel number is the argument of the method.
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
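Since channel 5 is optional and a missing channel yields the empty list, a caller that wants all available channels can simply skip the empty results. The sketch below shows one way to do this; analyzed_channels() is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Trace::ABIF): returns a hash that
# maps each available channel number (1 to 5) to a reference to the list
# of its analyzed data points. Channels that are not in the file are left out.
sub analyzed_channels {
    my ($abif) = @_;
    my %data;
    for my $ch (1 .. 5) {
        my @points = $abif->analyzed_data_for_channel($ch);
        $data{$ch} = \@points if @points;    # () means "not in the file"
    }
    return %data;
}
```

Checking exists $data{5} on the result then tells whether the optional fifth channel was present.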
analysis_protocol_settings_name()
Usage : $s = $abif->analysis_protocol_settings_name();
Returns : The Analysis Protocol settings name;
undef if the data item is not in the file.
ABIF Tag : APrN1
ABIF Type : cString
File Type : ab1
analysis_protocol_settings_version()
Usage : $s = $abif->analysis_protocol_settings_version();
Returns : The Analysis Protocol settings version;
undef if the data item is not in the file.
ABIF Tag : APrV1
ABIF Type : cString
File Type : ab1
analysis_protocol_xml()
Usage : $xml = $abif->analysis_protocol_xml();
Returns : The Analysis Protocol XML string;
undef if the data item is not in the file.
ABIF Tag : APrX1
ABIF Type : char array
File Type : ab1
analysis_protocol_xml_schema_version()
Usage : $s = $abif->analysis_protocol_xml_schema_version();
Returns : The Analysis Protocol XML schema version;
undef if the data item is not in the file.
ABIF Tag : APXV1
ABIF Type : cString
File Type : ab1
analysis_return_code()
Usage : $rc = $abif->analysis_return_code();
Returns : The analysis return code;
undef if the data item is not in the file.
ABIF Tag : ARTN1
ABIF Type : long
File Type : ab1
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
avg_peak_spacing()
Usage : $aps = $abif->avg_peak_spacing();
Returns : The average peak spacing used in last analysis;
undef if the data item is not in the file.
ABIF Tag : SPAC1
ABIF Type : float
File Type : ab1
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
basecaller_apsf()
Usage : $n = $abif->basecaller_apsf();
Returns : The basecaller adaptive processing success flag;
undef if the data item is not in the file.
ABIF Tag : ASPF1
ABIF Type : short
File Type : ab1
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
basecaller_bcp_dll()
Usage : $v = $abif->basecaller_bcp_dll();
Returns : A string with the basecaller BCP/DLL;
undef if the data item is not in the file.
ABIF Tag : SPAC2
ABIF Type : pString
File Type : ab1
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
basecaller_version()
Usage : $v = $abif->basecaller_version();
Returns : The basecaller version (e.g., 'KB 1.3.0');
undef if the data item is not in the file.
ABIF Tag : SVER2
ABIF Type : pString
File Type : ab1
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
basecalling_analysis_timestamp()
Usage : $s = $abif->basecalling_analysis_timestamp();
Returns : A time stamp;
undef if the data item is not in the file.
ABIF Tag : BCTS1
ABIF Type : pString
File Type : ab1
Returns the time stamp for last successful basecalling analysis.
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
base_locations()
Usage : @bl = $abif->base_locations();
Returns : The list of base locations;
() if the data item is not in the file.
ABIF Tag : PLOC2
ABIF Type : short array
File Type : ab1
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
base_locations_edited()
Usage : @bl = $abif->base_locations_edited();
Returns : The list of base locations (edited);
() if the data item is not in the file.
ABIF Tag : PLOC1
ABIF Type : short array
File Type : ab1
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
base_order()
Usage : @bo = $abif->base_order();
Returns : An array of characters sorted by channel number;
() if the data item is not in the file.
ABIF Tag : FWO_1
ABIF Type : char array
File Type : ab1
Returns an array of characters sorted by increasing channel number. For example, if the list is ('G', 'A', 'T', 'C') then G is channel 1, A is channel 2, and so on. If you want to do the opposite, that is, mapping bases to their channels, use order_base() instead. See also the channel() method.
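The relationship between the channel-ordered list and the base-to-channel mapping can be sketched in a few lines. channels_by_base() is a hypothetical function written for illustration; it is not claimed to be how order_base() is implemented.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Trace::ABIF): given the list of
# bases in channel order, as returned by base_order(), builds a hash that
# maps each base (upper-cased) to its channel number, starting from 1.
sub channels_by_base {
    my (@order) = @_;
    my %channel;
    @channel{ map { uc $_ } @order } = 1 .. @order;
    return %channel;
}
```

For the list ('G', 'A', 'T', 'C') the resulting hash maps G to 1, A to 2, T to 3 and C to 4.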
base_spacing()
Usage : $spacing = $abif->base_spacing();
Returns : The spacing;
undef if the data item is not in the file.
ABIF Tag : SPAC3
ABIF Type : float
File Type : ab1
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
buffer_tray_temperature()
Usage : @T = $abif->buffer_tray_temperature();
Returns : The buffer tray heater temperature in °C;
() if the data item is not in the file.
ABIF Tag : BufT1
ABIF Type : short array
File Type : ab1
capillary_number()
Usage : $cap_n = $abif->capillary_number();
Returns : The LANE/Capillary number;
undef if the data item is not in the file.
ABIF Tag : LANE1
ABIF Type : short
File Type : ab1, fsa
channel()
Usage : $n = $abif->channel($base);
Returns : The channel number corresponding to a given base.
undef if the data item is not in the file.
Returns the channel number corresponding to the given base.
The possible values for $base are 'A', 'C', 'G' and 'T' (case insensitive).
chem()
Usage : $s = $abif->chem();
Returns : The primer or terminator chemistry;
undef if the data item is not in the file.
ABIF Tag : phCH1
ABIF Type : pString
File Type : ab1
Returns the primer or terminator chemistry (equivalent to CHEM in phd1 file).
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
comment()
Usage : $comment = $abif->comment();
$comment = $abif->comment($n);
Returns : The comment about the sample;
undef if the data item is not in the file.
ABIF Tag : CMNT1 ... CMNT 'N'
ABIF Type : pString
File Type : ab1, fsa
This is an optional data item. In some files there is more than one comment: the optional argument is used to specify the number of the comment.
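When a file holds several comments, they can be collected by increasing the comment number until the method returns undef. The sketch below assumes that comments are numbered consecutively starting from 1, which this document does not guarantee; all_comments() and its $max safety limit are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Trace::ABIF): collects comments
# number 1, 2, 3, ... until comment($n) returns undef or $max is reached.
# Assumption: comment numbers are consecutive and start at 1.
sub all_comments {
    my ($abif, $max) = @_;
    $max = 1000 unless defined $max;
    my @comments;
    for my $n (1 .. $max) {
        my $text = $abif->comment($n);
        last unless defined $text;
        push @comments, $text;
    }
    return @comments;
}
```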
comment_title()
Usage : $comment_title = $abif->comment_title();
Returns : The comment title;
undef if the data item is not in the file.
ABIF Tag : CTTL1
ABIF Type : pString
File Type : ab1, fsa
container_identifier()
Usage : $id = $abif->container_identifier();
Returns : The container identifier, a.k.a. plate barcode;
undef if the data item is not in the file.
ABIF Tag : CTID1
ABIF Type : cString
File Type : ab1, fsa
container_name()
Usage : $name = $abif->container_name();
Returns : The container name;
undef if the data item is not in the file.
ABIF Tag : CTNM1
ABIF Type : cString
File Type : ab1, fsa
Usually, this is identical to the container identifier.
container_owner()
Usage : $owner = $abif->container_owner();
Returns : The container's owner;
undef if the data item is not in the file.
ABIF Tag : CTow1
ABIF Type : cString
File Type : ab1
current()
Usage : @c = $abif->current();
Returns : Current, measured in milliamps;
() if the data item is not in the file.
ABIF Tag : DATA6
ABIF Type : short array
File Type : ab1, fsa
data_collection_module_file()
Usage : $s = $abif->data_collection_module_file();
Returns : The data collection module file;
undef if the data item is not in the file.
ABIF Tag : MODF1
ABIF Type : pString
File Type : ab1, fsa
data_collection_software_version()
Usage : $v = $abif->data_collection_software_version();
Returns : The data collection software version.
undef if the data item is not in the file.
ABIF Tag : SVER1
ABIF Type : pString
File Type : ab1, fsa
data_collection_firmware_version()
Usage : $v = $abif->data_collection_firmware_version();
Returns : The data collection firmware version;
undef if the data item is not in the file.
ABIF Tag : SVER3
ABIF Type : pString
File Type : ab1, fsa
data_collection_start_date()
Usage : $date = $abif->data_collection_start_date();
Returns : The Data Collection start date (yyyy-mm-dd);
undef if the data item is not in the file.
ABIF Tag : RUND3
ABIF Type : date
File Type : ab1, fsa
data_collection_start_time()
Usage : $time = $abif->data_collection_start_time();
Returns : The Data Collection start time (hh:mm:ss.nn);
undef if the data item is not in the file.
ABIF Tag : RUNT3
ABIF Type : time
File Type : ab1, fsa
data_collection_stop_date()
Usage : $date = $abif->data_collection_stop_date();
Returns : The Data Collection stop date (yyyy-mm-dd);
undef if the data item is not in the file.
ABIF Tag : RUND4
ABIF Type : date
File Type : ab1, fsa
data_collection_stop_time()
Usage : $time = $abif->data_collection_stop_time();
Returns : The Data Collection stop time (hh:mm:ss.nn);
undef if the data item is not in the file.
ABIF Tag : RUNT4
ABIF Type : time
File Type : ab1, fsa
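The four data collection time stamps can be combined to estimate how long data collection lasted. The following is a sketch under stated assumptions: the strings follow the yyyy-mm-dd and hh:mm:ss.nn formats given above, the ".nn" part counts hundredths of a second, and both time stamps belong to the same time zone. The helper names are hypothetical.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical helper: converts a date ('yyyy-mm-dd') and a time
# ('hh:mm:ss.nn') into hundredths of a second since the epoch.
# Assumption: '.nn' counts hundredths; the stamp is treated as UTC.
sub to_hundredths {
    my ($date, $time) = @_;
    my ($y, $mon, $d) = $date =~ /\A(\d{4})-(\d{1,2})-(\d{1,2})\z/
        or die "Unexpected date format: '$date'\n";
    my ($h, $min, $s, $hs) = $time =~ /\A(\d{1,2}):(\d{1,2}):(\d{1,2})\.(\d{1,2})\z/
        or die "Unexpected time format: '$time'\n";
    return timegm($s, $min, $h, $d, $mon - 1, $y) * 100 + $hs;
}

# Hypothetical helper: elapsed data collection time in seconds, or undef
# if any of the four data items is missing from the file.
sub data_collection_seconds {
    my ($abif) = @_;
    my $start_date = $abif->data_collection_start_date();
    my $start_time = $abif->data_collection_start_time();
    my $stop_date  = $abif->data_collection_stop_date();
    my $stop_time  = $abif->data_collection_stop_time();
    return if grep { !defined $_ } $start_date, $start_time, $stop_date, $stop_time;
    return (to_hundredths($stop_date, $stop_time)
          - to_hundredths($start_date, $start_time)) / 100;
}
```

Working in integer hundredths until the final division keeps the subtraction exact, and using the dates as well as the times handles runs that cross midnight.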
detector_heater_temperature()
Usage : $dt = $abif->detector_heater_temperature();
Returns : The detector cell heater temperature in °C;
undef if the data item is not in the file.
ABIF Tag : DCHT1
ABIF Type : short
File Type : ab1
downsampling_factor()
Usage : $df = $abif->downsampling_factor();
Returns : The downsampling factor;
undef if the data item is not in the file.
ABIF Tag : DSam1
ABIF Type : short
File Type : ab1, fsa
dye_name()
Usage : $n = $abif->dye_name($n);
Returns : The name of dye number $n;
undef if the data item is not in the file;
undef if $n is not in the range [1..5].
ABIF Tag : DyeN1, DyeN2, DyeN3, DyeN4, DyeN5
ABIF Type : pString
File Type : ab1, fsa
Dye 5 name is an optional tag.
dye_set_name()
Usage : $dsn = $abif->dye_set_name();
Returns : The dye set name;
undef if the data item is not in the file.
ABIF Tag : DySN1
ABIF Type : pString
File Type : ab1, fsa
dye_significance()
Usage : $dsn = $abif->dye_significance($n);
Returns : The $n-th dye significance;
undef if the data item is not in the file
ABIF Tag : DyeB1, DyeB2, DyeB3, DyeB4, DyeB5
ABIF Type : char
File Type : fsa
The argument must be an integer from 1 to 5. Dye significance 5 is optional. The returned value is 'S' for standard, ' ' for sample.
dye_type()
Usage : $dsn = $abif->dye_type();
Returns : The dye type;
undef if the data item is not in the file.
ABIF Tag : phDY1
ABIF Type : pString
File Type : ab1
The dye type is equivalent to DYE in phd1 files.
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
dye_wavelength()
Usage : $n = $abif->dye_wavelength($n);
Returns : The wavelength of dye number $n;
undef if the data item is not in the file;
undef if $n is not in the range [1..5].
ABIF Tag : DyeW1, DyeW2, DyeW3, DyeW4, DyeW5
ABIF Type : short
File Type : ab1, fsa
Dye 5 wavelength is an optional data item.
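The per-dye accessors can be combined with num_dyes() (documented later in this section) to list every dye in a file. The sketch below is illustrative only; dye_table() is a hypothetical helper and the keys of the returned hashes are arbitrary names chosen for this example.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Trace::ABIF): returns one hash
# reference per dye, with the dye number, name and wavelength. Returns the
# empty list if the number of dyes is not in the file.
sub dye_table {
    my ($abif) = @_;
    my $count = $abif->num_dyes();
    return () unless defined $count;
    my @rows;
    for my $n (1 .. $count) {
        my $name       = $abif->dye_name($n);          # may be undef
        my $wavelength = $abif->dye_wavelength($n);    # may be undef
        push @rows, { number => $n, name => $name, wavelength => $wavelength };
    }
    return @rows;
}
```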
edited_quality_values()
Usage : @qv = $abif->edited_quality_values();
Returns : The list of edited quality values;
() if the data item is not in the file.
ABIF Tag : PCON1
ABIF Type : char array
File Type : ab1
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
edited_quality_values_ref()
Usage : $ref_to_qv = $abif->edited_quality_values_ref();
Returns : A reference to the list of edited quality values;
a reference to the empty list if the data item
is not in the file.
ABIF Tag : PCON1
File Type : ab1
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
edited_sequence()
Usage : $sequence = $abif->edited_sequence();
Returns : The string of the edited basecalled sequence;
undef if the data item is not in the file.
ABIF Tag : PBAS1
ABIF Type : char array
File Type : ab1
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
edited_sequence_length()
Usage : $l = $abif->edited_sequence_length();
Returns : The length of the basecalled sequence;
0 if the sequence is not in the file.
File Type : ab1
electrophoresis_voltage()
Usage : $v = $abif->electrophoresis_voltage();
Returns : The electrophoresis voltage setting in volts;
undef if the data item is not found.
ABIF Tag : EPVt1
ABIF Type : long
File Type : ab1, fsa
gel_type()
Usage : $s = $abif->gel_type();
Returns : The gel type description;
undef if the data item is not in the file.
ABIF Tag : GTyp1
ABIF Type : pString
File Type : ab1, fsa
gene_mapper_analysis_method()
Usage : $s = $abif->gene_mapper_analysis_method();
Returns : The GeneMapper(R) software analysis method name;
undef if the data item is not in the file.
ABIF Tag : ANME1
ABIF Type : cString
File Type : fsa
gene_mapper_panel_name()
Usage : $s = $abif->gene_mapper_panel_name();
Returns : The GeneMapper(R) software panel name;
undef if the data item is not in the file.
ABIF Tag : PANL1
ABIF Type : cString
File Type : fsa
gene_mapper_sample_type()
Usage : $s = $abif->gene_mapper_sample_type();
Returns : The GeneMapper(R) software Sample Type;
undef if the data item is not in the file.
ABIF Tag : STYP1
ABIF Type : cString
File Type : fsa
gene_scan_sample_name()
Usage : $s = $abif->gene_scan_sample_name();
Returns : The sample name for GeneScan(R) sample files;
undef if the data item is not in the file.
ABIF Tag : SpNm1
ABIF Type : pString
File Type : fsa
injection_time()
Usage : $t = $abif->injection_time();
Returns : The injection time in seconds;
undef if the data item is not in the file.
ABIF Tag : InSc1
ABIF Type : long
File Type : ab1, fsa
injection_voltage()
Usage : $t = $abif->injection_voltage();
Returns : The injection voltage in volts;
undef if the data item is not in the file
ABIF Tag : InVt1
ABIF Type : long
File Type : ab1, fsa
instrument_class()
Usage : $class = $abif->instrument_class();
Returns : The instrument class;
undef if the data item is not in the file.
ABIF Tag : HCFG1
ABIF Type : cString
File Type : ab1
instrument_family()
Usage : $class = $abif->instrument_family();
Returns : The instrument family;
undef if the data item is not in the file.
ABIF Tag : HCFG2
ABIF Type : cString
File Type : ab1
instrument_name_and_serial_number()
Usage : $sn = $abif->instrument_name_and_serial_number();
Returns : The instrument name and the serial number;
undef if the data item is not in the file.
ABIF Tag : MCHN1
ABIF Type : pString
File Type : ab1, fsa
instrument_param()
Usage : $param = $abif->instrument_param();
Returns : The instrument parameters;
undef if the data item is not in the file.
ABIF Tag : HCFG4
ABIF Type : cString
File Type : ab1
is_capillary_machine()
Usage : $bool = $abif->is_capillary_machine();
Returns : A value > 0 if the data item is true;
0 if the data item is false;
undef if the data item is not in the file.
ABIF Tag : CpEP1
ABIF Type : byte
File Type : ab1, fsa
laser_power()
Usage : $n = $abif->laser_power();
Returns : The laser power setting in microwatts;
undef if the data item is not in the file.
ABIF Tag : LsrP1
ABIF Type : long
File Type : ab1, fsa
length_to_detector()
Usage : $n = $abif->length_to_detector();
Returns : The length to detector in cm;
undef if the data item is not in the file.
ABIF Tag : LNTD1
ABIF Type : short
File Type : ab1, fsa
mobility_file()
Usage : $mb = $abif->mobility_file()
Returns : The mobility file;
undef if the data item is not in the file.
ABIF Tag : PDMF2
ABIF Type : pString
File Type : ab1
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
mobility_file_orig()
Usage : $mb = $abif->mobility_file_orig()
Returns : The mobility file (orig);
undef if the data item is not in the file.
ABIF Tag : PDMF1
ABIF Type : pString
File Type : ab1
model_number()
Usage : $mn = $abif->model_number();
Returns : The model number;
undef if the data item is not in the file.
ABIF Tag : MODL1
ABIF Type : char[4]
File Type : ab1, fsa
noise()
Usage : %noise = $abif->noise();
Returns : The estimated noise for each dye;
() if the data item is not in the file.
ABIF Tag : NOIS1
ABIF Type : float array
File Type : ab1
The keys of the returned hash are the values retrieved with base_order(). This is an optional data item. This method works only with files containing data processed by the KB(tm) Basecaller.
num_capillaries()
Usage : $nc = $abif->num_capillaries();
Returns : The number of capillaries;
undef if the data item is not in the file.
ABIF Tag : NLNE1
ABIF Type : short
File Type : ab1, fsa
num_dyes()
Usage : $n = $abif->num_dyes();
Returns : The number of dyes;
undef if the data item is not in the file.
ABIF Tag : Dye#1
ABIF Type : short
File Type : ab1, fsa
num_scans()
Usage : $n = $abif->num_scans();
Returns : The number of scans;
undef if the data item is not in the file.
ABIF Tag : SCAN1
ABIF Type : long
File Type : ab1, fsa
official_instrument_name()
Usage : $name = $abif->official_instrument_name();
Returns : The official instrument name;
undef if the data item is not in the file.
ABIF Tag : HCFG3
ABIF Type : cString
File Type : ab1
offscale_peaks()
Usage : @bytes = $abif->offscale_peaks($n);
Returns : The range of offscale peaks.
() if the data item is not in the file.
ABIF Tag : OffS1 ... OffS 'N'
ABIF Type : user
File Type : fsa
This data item's type is a user defined data structure. As such, it is returned as a list of bytes that must be interpreted by the caller. This is an optional data item.
offscale_scans()
Usage : @p = $abif->offscale_scans();
Returns : A list of scans.
() if the data item is not in the file.
ABIF Tag : OfSc1
ABIF Type : long array
File Type : ab1, fsa
Returns the list of scans that are marked off scale in Collection. This is an optional data item.
order_base()
Usage : %bases = $abif->order_base();
Returns : A mapping of the four bases to their channel numbers;
() if the base order is not in the file.
File Type : ab1
Returns the channel numbers corresponding to the bases. This method does the opposite of base_order(). See also the channel() method.
peak1_location()
Usage : $pl = $abif->peak1_location();
Returns : The peak 1 location;
undef if the data item is not in the file.
ABIF Tag : B1Pt2
ABIF Type : short
File Type : ab1
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
peak1_location_orig()
Usage : $pl = $abif->peak1_location_orig();
Returns : The peak 1 location (orig);
undef if the data item is not in the file.
ABIF Tag : B1Pt1
ABIF Type : short
File Type : ab1
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
peak_area_ratio()
Usage : $par = $abif->peak_area_ratio();
Returns : The peak area ratio;
undef if the data item is not in the file.
ABIF Tag : phAR1
ABIF Type : float
File Type : ab1
Returns the peak area ratio (equivalent to TRACE_PEAK_AREA_RATIO in phd1 file).
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
peaks()
Usage : @pks = $abif->peaks(1);
Returns : An array of peak hashes. Each peak hash contains the following attributes:
'position', 'height', 'beginPos', 'endPos', 'beginHI', 'endHI',
'area', 'volume', 'fragSize', 'isEdited', 'label';
() if the data item is not in the file.
ABIF Tag : PEAK
ABIF Type : user-defined structure
File Type : fsa
Returns the data associated with PEAK data structures.
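Once a list of peaks has been retrieved, ordinary list operations can be used to filter it. The sketch below assumes that each element returned by peaks() is a hash reference carrying the attributes listed above; the helper names and the threshold are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helpers (not part of Bio::Trace::ABIF).
# Assumption: each peak is a hash reference with a numeric 'height' entry.

# Returns the peaks whose height is at least $min_height.
sub peaks_above {
    my ($min_height, @peaks) = @_;
    return grep { $_->{height} >= $min_height } @peaks;
}

# Returns the peak with the greatest height, or undef for an empty list.
sub tallest_peak {
    my (@peaks) = @_;
    my $best;
    for my $peak (@peaks) {
        $best = $peak if !defined $best || $peak->{height} > $best->{height};
    }
    return $best;
}
```

These helpers take the peak list as an argument, so they can be applied to the result of a call such as $abif->peaks(1) without depending on what the argument of peaks() selects.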
pixel_bin_size()
Usage : $n = $abif->pixel_bin_size();
Returns : The pixel bin size;
undef if the data item is not in the file.
ABIF Tag : PXLB1
ABIF Type : long
File Type : ab1, fsa
pixels_lane()
Usage : $n = $abif->pixels_lane();
Returns : The pixels averaged per lane;
undef if the data item is not in the file.
ABIF Tag : NAVG1
ABIF Type : short
File Type : ab1, fsa
plate_type()
Usage : $s = $abif->plate_type();
Returns : The plate type;
undef if the data item is not in the file.
ABIF Tag : PTYP1
ABIF Type : cString
File Type : ab1, fsa
Returns the plate type. Allowed values are 96-Well and 384-Well.
plate_size()
Usage : $n = $abif->plate_size();
Returns : The plate size.
undef if the data item is not in the file.
ABIF Tag : PSZE1
ABIF Type : long
File Type : ab1, fsa
Returns the number of sample positions in the container (allowed values are 96 and 384).
polymer_expiration_date()
Usage : $s = $abif->polymer_expiration_date()
Returns : The polymer lot expiration date;
undef if the data item is not in the file.
ABIF Tag : SMED1
ABIF Type : pString
File Type : ab1, fsa
The format of the date is implementation dependent.
polymer_lot_number()
Usage : $s = $abif->polymer_lot_number();
Returns : A string containing the polymer lot number;
undef if the data item is not in the file.
ABIF Tag : SMLt1
ABIF Type : pString
File Type : ab1, fsa
The format of the lot number is implementation dependent.
power()
Usage : @p = $abif->power();
Returns : The power, measured in milliwatts;
() if the data item is not in the file.
ABIF Tag : DATA7
ABIF Type : short array
File Type : ab1, fsa
quality_levels()
Usage : $n = $abif->quality_levels();
Returns : The maximum quality value;
undef if the data item is not in the file.
ABIF Tag : phQL1
ABIF Type : short
File Type : ab1
Returns the maximum quality value (equivalent to QUALITY_LEVELS in phd1 file).
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
quality_values()
Usage : @qv = $abif->quality_values();
Returns : The list of quality values;
() if the data item is not in the file.
ABIF Tag : PCON2
ABIF Type : char array
File Type : ab1
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
quality_values_ref()
Usage : $qvref = $abif->quality_values_ref();
Returns : A reference to the list of quality values;
a reference to the empty list if
the data item is not in the file.
ABIF Tag : PCON2
ABIF Type : char array
File Type : ab1
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
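The reference-returning variant is convenient when the quality values are only read, since the list does not need to be copied. As an illustration, the sketch below computes a few summary figures from such a reference; quality_summary() and the keys of its result are hypothetical, and the threshold is whatever cut-off the caller considers acceptable.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Trace::ABIF): given a reference to
# a list of quality values and a threshold, returns a hash reference with
# the number of values, their mean, and how many reach the threshold.
sub quality_summary {
    my ($qv_ref, $threshold) = @_;
    my $count = @{$qv_ref};
    return +{ count => 0, mean => undef, at_least => 0 } unless $count;
    my ($sum, $good) = (0, 0);
    for my $q (@{$qv_ref}) {
        $sum += $q;
        $good++ if $q >= $threshold;
    }
    return +{ count => $count, mean => $sum / $count, at_least => $good };
}
```

It could be called as quality_summary($abif->quality_values_ref(), 20); an empty list, as returned when the data item is missing, yields a count of zero and an undefined mean.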
raw_data_for_channel()
Usage : @data = $abif->raw_data_for_channel($channel_number);
Returns : The channel $channel_number raw data;
() if the data item is not in the file.
ABIF Tag : DATA1, DATA2, DATA3, DATA4, DATA105
ABIF Type : short array
File Type : ab1, fsa
There are four channels in an ABIF file, numbered from 1 to 4. An optional channel number 5 exists in some files.
raw_trace()
Usage : @trace = $abif->raw_trace($base);
Returns : The raw trace corresponding to $base;
() if the data item is not in the file.
File Type : ab1
The possible values for $base are 'A', 'C', 'G' and 'T' (case insensitive).
rescaling()
Usage : $name = $abif->rescaling();
Returns : The rescaling divisor for color data;
undef if the data item is not in the file.
ABIF Tag : Scal1
ABIF Type : float
File Type : ab1, fsa
results_group()
Usage : $name = $abif->results_group();
Returns : The results group name;
undef if the data item is not in the file.
ABIF Tag : RGNm1
ABIF Type : cString
File Type : ab1, fsa
results_group_comment()
Usage : $s = $abif->results_group_comment();
Returns : The results group comment;
undef if the data item is not in the file.
ABIF Tag : RGCm1
ABIF Type : cString
File Type : ab1, fsa
This is an optional data item.
results_group_owner()
Usage : $s = $abif->results_group_owner();
Returns : The results group owner;
undef if the data item is not in the file.
ABIF Tag : RGOw1
ABIF Type : cString
File Type : ab1
Returns the name entered as the owner of the results group, in the Results Group editor. This is an optional data item.
reverse_complement_flag()
Usage : $n = $abif->reverse_complement_flag();
Returns : The reverse complement flag;
undef if the data item is not in the file.
ABIF Tag : RevC1
ABIF Type : short
File Type : ab1
This data item is from Sequencing Analysis v5.2 Software.
run_module_name()
Usage : $name = $abif->run_module_name();
Returns : The run module name;
undef if the data item is not in the file.
ABIF Tag : RMdN1
ABIF Type : cString
File Type : ab1, fsa
This should be the same as the value returned by data_collection_module_file().
run_module_version()
Usage : $name = $abif->run_module_version();
Returns : The run module version;
undef if the data item is not in the file.
ABIF Tag : RMdV1
ABIF Type : cString
File Type : ab1, fsa
run_module_xml_schema_version()
Usage : $vers = $abif->run_module_xml_schema_version();
Returns : The run module XML schema version;
undef if the data item is not in the file.
ABIF Tag : RMXV1
ABIF Type : cString
File Type : ab1, fsa
run_module_xml_string()
Usage : $xml = $abif->run_module_xml_string();
Returns : The run module XML string;
undef if the data item is not in the file.
ABIF Tag : RMdX1
ABIF Type : char array
File Type : ab1, fsa
run_name()
Usage : $name = $abif->run_name();
Returns : The run name;
undef if the data item is not in the file.
ABIF Tag : RunN1
ABIF Type : cString
File Type : ab1, fsa
run_protocol_name()
Usage : $xml = $abif->run_protocol_name();
Returns : The run protocol name;
undef if the data item is not in the file.
ABIF Tag : RPrN1
ABIF Type : cString
File Type : ab1, fsa
run_protocol_version()
Usage : $vers = $abif->run_protocol_version();
Returns : The run protocol version;
undef if the data item is not in the file.
ABIF Tag : RPrV1
ABIF Type : cString
File Type : ab1, fsa
run_start_date()
Usage : $date = $abif->run_start_date();
Returns : The run start date (yyyy-mm-dd);
undef if the data item is not in the file.
ABIF Tag : RUND1
ABIF Type : date
File Type : ab1, fsa
run_start_time()
Usage : $time = $abif->run_start_time();
Returns : The run start time (hh:mm:ss.nn);
undef if the data item is not in the file.
ABIF Tag : RUNT1
ABIF Type : time
File Type : ab1, fsa
run_stop_date()
Usage : $date = $abif->run_stop_date();
Returns : The run stop date (yyyy-mm-dd);
undef if the data item is not in the file.
ABIF Tag : RUND2
ABIF Type : date
File Type : ab1, fsa
run_stop_time()
Usage : $time = $abif->run_stop_time();
Returns : The run stop time (hh:mm:ss.nn);
undef if the data item is not in the file.
ABIF Tag : RUNT2
ABIF Type : time
File Type : ab1, fsa
run_temperature()
Usage : $temp = $abif->run_temperature();
Returns : The run temperature setting in °C;
undef if the data item is not in the file.
ABIF Tag : Tmpr1
ABIF Type : long
File Type : ab1, fsa
sample_file_format_version()
Usage : $v = $abif->sample_file_format_version();
Returns : The Sample File Format Version;
undef if the data item is not in the file.
ABIF Tag : SVER4
ABIF Type : pString
File Type : fsa
The Sample File Format Version contains the version of the sample file format used to write the file.
sample_name()
Usage : $name = $abif->sample_name();
Returns : The sample name;
undef if the data item is not in the file.
ABIF Tag : SMPL1
ABIF Type : pString
File Type : ab1
sample_tracking_id()
Usage : $sample_id = $abif->sample_tracking_id();
Returns : The sample tracking ID;
undef if the data item is not in the file.
ABIF Tag : LIMS1
ABIF Type : pString
File Type : ab1, fsa
scanning_rate()
Usage : @bytes = $abif->scanning_rate();
Returns : The scanning rate;
() if the data item is not in the file.
ABIF Tag : Rate1
ABIF Type : user
File Type : ab1, fsa
This data item's type is a user defined data structure. As such, it is returned as a list of bytes that must be interpreted by the caller.
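As a rough illustration of interpreting such a byte list, the sketch below re-packs the bytes into a binary string and decodes it with unpack(). The layout used here (two 32-bit big-endian unsigned integers) and the sample bytes are purely hypothetical; consult the ABIF specification for the actual structure of this data item.

```perl
use strict;
use warnings;

# Stand-in for the list returned by $abif->scanning_rate().
# HYPOTHETICAL layout: two 32-bit big-endian unsigned integers.
my @bytes = (0x00, 0x00, 0x01, 0x2C, 0x00, 0x00, 0x00, 0x0A);

# Re-pack the byte list into a binary string, then decode it.
my $raw = pack('C*', @bytes);
my ($first, $second) = unpack('N N', $raw);

print "first=$first second=$second\n";
```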
scan_color_data_values()
Usage : @C = $abif->scan_color_data_values($n);
Returns : A list of color data values;
() if the data item is not in the file.
ABIF Tag : OvrV1 ... OvrV 'N'
ABIF Type : long array
File Type : ab1, fsa
Returns the list of color data values for the locations listed by scan_number_indices(). This is an optional data item.
scan_numbers()
Usage : @N = $abif->scan_numbers();
Returns : The scan numbers of data points;
() if the data item is not in the file.
ABIF Tag : Satd1
ABIF Type : long array
File Type : ab1, fsa
Returns an array of integers representing the scan numbers of the data points that are flagged as saturated by data collection. This is an optional data item.
scan_number_indices()
Usage : @I = $abif->scan_number_indices($n);
Returns : A list of scan number indices;
() if the data item is not in the file.
ABIF Tag : OvrI1 ... OvrI 'N'
ABIF Type : long array
File Type : ab1, fsa
Returns the list of scan number indices for scans with color data value greater than 32767.
This is an optional data item.
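A minimal sketch of how the two lists might be combined, assuming that scan_number_indices($n) and scan_color_data_values($n) return lists of equal length in the same order. The numbers below are made up and stand in for the values read from a file.

```perl
use strict;
use warnings;

# Stand-ins for @I = $abif->scan_number_indices($n) and
# @C = $abif->scan_color_data_values($n); the numbers are invented.
my @indices = (1204, 1205, 3310);
my @values  = (40112, 39876, 33001);

# Map each scan number index to its color data value (hash slice).
my %value_at;
@value_at{@indices} = @values;

for my $i (sort { $a <=> $b } keys %value_at) {
    print "scan $i: $value_at{$i}\n";
}
```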
seqscape_project_name()
Usage : $name = $abif->seqscape_project_name();
Returns : SeqScape(R) project name;
undef if the data item is not in the file.
ABIF Tag : PROJ4
ABIF Type : cString
File Type : ab1
This data item is in SeqScape(R) software sample files only. This is an optional data item.
seqscape_project_template()
Usage : $name = $abif->seqscape_project_template();
Returns : SeqScape(R) project template name;
undef if the data item is not in the file.
ABIF Tag : PRJT1
ABIF Type : cString
File Type : ab1
This data item is in SeqScape(R) software sample files only. This is an optional data item.
seqscape_specimen_name()
Usage : $name = $abif->seqscape_specimen_name();
Returns : SeqScape(R) specimen name;
undef if the data item is not in the file.
ABIF Tag : SPEC1
ABIF Type : cString
File Type : ab1
This data item is in SeqScape(R) software sample files only. This is an optional data item.
sequence()
Usage : $sequence = $abif->sequence();
Returns : The basecalled sequence;
undef if the data item is not in the file.
ABIF Tag : PBAS2
ABIF Type : char array
File Type : ab1
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
sequence_length()
Usage : $l = $abif->sequence_length();
Returns : The length of the base called sequence;
0 if the sequence is not in the file.
File Type : ab1
sequencing_analysis_param_filename()
Usage : $f = $abif->sequencing_analysis_param_filename();
Returns : The Sequencing Analysis parameters filename;
undef if the data item is not in the file.
ABIF Tag : APFN2
ABIF Type : pString
File Type : ab1
signal_level()
Usage : %signal_level = $abif->signal_level();
Returns : The signal level for each dye;
() if the data item is not in the file.
ABIF Tag : S/N%1
ABIF Type : short array
File Type : ab1
The keys of the returned hash are the values retrieved with base_order().
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
size_standard_filename()
Usage : $s = $abif->size_standard_filename();
Returns : The Size Standard file name;
undef if the data item is not in the file.
ABIF Tag : StdF1
ABIF Type : pString
File Type : fsa
snp_set_name()
Usage : $s = $abif->snp_set_name();
Returns : SNP set name;
undef if the data item is not in the file.
ABIF Tag : SnpS1
ABIF Type : pString
File Type : fsa
This is an optional data item.
start_collection_event()
Usage : $s = $abif->start_collection_event();
Returns : The start collection event;
undef if the data item is not in the file.
ABIF Tag : EVNT3
ABIF Type : pString
File Type : ab1, fsa
start_point()
Usage : $n = $abif->start_point();
Returns : The start point;
undef if the data item is not in the file.
ABIF Tag : ASPt2
ABIF Type : short
File Type : ab1
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
start_point_orig()
Usage : $n = $abif->start_point_orig();
Returns : The start point (orig);
undef if the data item is not in the file.
ABIF Tag : ASPt1
ABIF Type : short
File Type : ab1
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
start_run_event()
Usage : $s = $abif->start_run_event();
Returns : The start run event;
undef if the data item is not in the file.
ABIF Tag : EVNT1
ABIF Type : pString
File Type : ab1, fsa
stop_collection_event()
Usage : $s = $abif->stop_collection_event();
Returns : The stop collection event;
undef if the data item is not in the file.
ABIF Tag : EVNT4
ABIF Type : pString
File Type : ab1, fsa
stop_point()
Usage : $n = $abif->stop_point();
Returns : The stop point;
undef if the data item is not in the file.
ABIF Tag : AEPt2
ABIF Type : short
File Type : ab1
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
stop_point_orig()
Usage : $n = $abif->stop_point_orig();
Returns : The stop point (orig);
undef if the data item is not in the file.
ABIF Tag : AEPt1
ABIF Type : short
File Type : ab1
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
stop_run_event()
Usage : $s = $abif->stop_run_event();
Returns : The stop run event;
undef if the data item is not in the file.
ABIF Tag : EVNT2
ABIF Type : pString
File Type : ab1, fsa
temperature()
Usage : @t = $abif->temperature();
Returns : The temperature, measured in °C;
() if the data item is not in the file.
ABIF Tag : DATA8
ABIF Type : short array
File Type : ab1, fsa
trace()
Usage : @trace = $abif->trace($base);
Returns : The (analyzed) trace corresponding to $base;
() if the data item is not in the file.
File Type : ab1
The possible values for $base are 'A', 'C', 'G' and 'T'.
trim_probability_threshold()
Usage : $pr = $abif->trim_probability_threshold();
Returns : The trim probability threshold used;
undef if the data item is not in the file.
ABIF Tag : phTR2
ABIF Type : float
File Type : ab1
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
trim_region()
Usage : $n = $abif->trim_region();
Returns : The read positions;
undef if the data item is not in the file.
ABIF Tag : phTR1
ABIF Type : short
File Type : ab1
Returns the read positions of the first and last bases in the trim region; together with trim_probability_threshold(), this is equivalent to TRIM in a phd1 file.
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
voltage()
Usage : @v = $abif->voltage();
Returns : The voltage, measured in decavolts;
() if the data item is not in the file.
ABIF Tag : DATA5
ABIF Type : short array
File Type : ab1, fsa
user()
Usage : $user = $abif->user();
Returns : The name of the user who created the plate;
undef if the data item is not in the file.
ABIF Tag : User1
ABIF Type : pString
File Type : ab1, fsa
This is an optional data item.
well_id()
Usage : $well_id = $abif->well_id();
Returns : The well ID;
undef if the data item is not in the file.
ABIF Tag : TUBE1
ABIF Type : pString
File Type : ab1, fsa
METHODS FOR ASSESSING QUALITY
The following methods compute values that help assess the quality of the data.
avg_signal_to_noise_ratio()
Usage : $sn_ratio = $abif->avg_signal_to_noise_ratio();
Returns : The average signal to noise ratio;
0 on error.
This method works only with files containing data processed by the KB(tm) Basecaller. If the information needed to compute such value is missing, it returns 0.
clear_range()
Usage : ($b, $e) = $abif->clear_range();
($b, $e) = $abif->clear_range(
$window_width,
$bad_bases_threshold,
$quality_threshold
);
Returns : The clear range of the sequence;
(-1, -1) if there is no clear range.
The Sequencing Analysis program determines the clear range of the sequence by trimming bases from the 5' and 3' ends until fewer than 4 bases out of 20 have a quality value less than 20. You can change these parameters by explicitly passing arguments to this method (the default values are $window_width = 20, $bad_bases_threshold = 4, $quality_threshold = 20). Note that Sequencing Analysis counts bases starting from one, so you have to add one to the return values to get consistent results.
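The trimming rule described above can be sketched in plain Perl as follows. This is only an illustration written from the description, not the module's implementation; the helper names bad_in_window() and clear_range_sketch(), the sample quality values, and the small window parameters are all assumptions made for the example.

```perl
use strict;
use warnings;

# Number of bases with quality below $q in the $w-base window at $from.
sub bad_in_window {
    my ($qv, $from, $w, $q) = @_;
    my $count = grep { $_ < $q } @{$qv}[$from .. $from + $w - 1];
    return $count;
}

# Returns zero-based (start, stop), or (-1, -1) if no window passes.
sub clear_range_sketch {
    my ($qv, $w, $bad, $q) = @_;
    my $last = @$qv - $w;               # last possible window start
    return (-1, -1) if $last < 0;

    # Trim from the 5' end until a window has fewer than $bad bad bases.
    my $first = 0;
    $first++ while $first <= $last
        && bad_in_window($qv, $first, $w, $q) >= $bad;
    return (-1, -1) if $first > $last;

    # Trim from the 3' end in the same way.
    my $tail = $last;
    $tail-- while bad_in_window($qv, $tail, $w, $q) >= $bad;

    return ($first, $tail + $w - 1);
}

# Made-up quality values; window of 4, at most 1 base below QV 20.
my @qv = (5, 8, 25, 30, 40, 45, 50, 42, 38, 12, 9, 7);
my ($cr_start, $cr_stop) = clear_range_sketch(\@qv, 4, 2, 20);
print "zero-based: $cr_start..$cr_stop\n";
printf "one-based:  %d..%d\n", $cr_start + 1, $cr_stop + 1;
```

The second print adds one to each position, which is the adjustment needed to compare against positions reported by Sequencing Analysis.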
clear_range_start()
Usage : $b = $abif->clear_range_start();
$b = $abif->clear_range_start(
$window_width,
$bad_bases_threshold,
$quality_threshold
);
Returns : The clear range start position;
-1 if no clear range exists.
See clear_range().
clear_range_stop()
Usage : $e = $abif->clear_range_stop();
$e = $abif->clear_range_stop(
$window_width,
$bad_bases_threshold,
$quality_threshold
);
Returns : The clear range stop position;
-1 if no clear range exists.
See clear_range().
contiguous_read_length()
Usage : ($b, $e) = $abif->contiguous_read_length(
$window_width,
$quality_threshold
);
($b, $e) = $abif->contiguous_read_length(
$window_width,
$quality_threshold,
$trim_ends
);
Returns : The start and stop position of the CRL;
(-1, -1) if there is no CRL.
The CRL is (the length of) the longest uninterrupted stretch in a read such that the average quality of any interval of $window_width bases inside that stretch never goes below $quality_threshold. The threshold must be at least 10. The positions are counted from zero. If $trim_ends is true, the ends of the CRL are trimmed until there are no bases with quality values less than 10 within the first five and the last five bases. Trimming is not applied by default. If there is more than one CRL, the position of the first one is reported.
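As a hedged illustration of the definition above (without end trimming), the following sketch scans every window of $window_width bases and keeps the longest run of consecutive windows whose average quality meets the threshold. The function name crl_sketch() and the sample data are assumptions; the module's own code may differ.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Returns zero-based (start, stop) of the longest stretch, or (-1, -1).
sub crl_sketch {
    my ($qv, $w, $threshold) = @_;
    my ($best_b, $best_e) = (-1, -1);
    my $run_b;                          # start of the current run of good windows
    for my $i (0 .. @$qv - $w) {
        my $avg = sum(@{$qv}[$i .. $i + $w - 1]) / $w;
        if ($avg >= $threshold) {
            $run_b = $i unless defined $run_b;
            my $run_e = $i + $w - 1;
            # Strict '>' keeps the first of several equally long stretches.
            ($best_b, $best_e) = ($run_b, $run_e)
                if $best_b < 0 || $run_e - $run_b > $best_e - $best_b;
        }
        else {
            undef $run_b;
        }
    }
    return ($best_b, $best_e);
}

# Made-up quality values; window of 3 bases, threshold 20.
my @qv = (10, 30, 30, 30, 5, 5, 5, 40, 40, 40, 40, 2);
my ($crl_b, $crl_e) = crl_sketch(\@qv, 3, 20);
print "CRL: $crl_b..$crl_e (length ", $crl_e - $crl_b + 1, ")\n";
```

Note that the stretch found here still ends with a low-quality base, which is the situation the optional end trimming is meant to address.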
length_of_read()
Usage : $LOR = $abif->length_of_read(
$window_width,
$quality_threshold
);
$LOR = $abif->length_of_read(
$window_width,
$quality_threshold,
$method
);
Returns : The Length Of Read (LOR) value.
The Length Of Read (LOR) score gives an approximate measure of the usable range of high-quality or high-accuracy bases determined by quality values. Such range can be determined in several ways. Two possible procedures are currently implemented and described below.
If $method is the string 'SequencingAnalysis' then the LOR is computed as the widest range starting and ending with $window_width bases whose average quality is greater than or equal to $quality_threshold. This is the default method that is applied if this optional argument is omitted.
If $method is the string 'GoodQualityWindows' then the LOR is computed as the number of intervals of $window_width bases whose average quality is greater than or equal to $quality_threshold.
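The two procedures can be compared with a small self-contained sketch. It follows the descriptions above rather than the module's source; good_windows() and lor_sketch() are hypothetical names, the data is invented, and returning 0 when no window qualifies is an assumption of this example.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Start positions of every $w-base window with mean QV >= $threshold.
sub good_windows {
    my ($qv, $w, $threshold) = @_;
    return grep { sum(@{$qv}[$_ .. $_ + $w - 1]) / $w >= $threshold }
           0 .. @$qv - $w;
}

sub lor_sketch {
    my ($qv, $w, $threshold, $method) = @_;
    $method = 'SequencingAnalysis' unless defined $method;
    my @good = good_windows($qv, $w, $threshold);
    return 0 unless @good;              # assumption: 0 if nothing qualifies
    return scalar @good if $method eq 'GoodQualityWindows';
    # Widest range that starts and ends with a good window.
    return $good[-1] + $w - $good[0];
}

# Made-up quality values; window of 3 bases, threshold 20.
my @qv = (10, 30, 30, 30, 5, 5, 5, 40, 40, 40, 40, 2);
my $lor_sa  = lor_sketch(\@qv, 3, 20);
my $lor_gqw = lor_sketch(\@qv, 3, 20, 'GoodQualityWindows');
print "SequencingAnalysis: $lor_sa, GoodQualityWindows: $lor_gqw\n";
```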
num_low_quality_bases()
Usage : $n = $abif->num_low_quality_bases($threshold);
$n = $abif->num_low_quality_bases(
$threshold,
$start,
$stop
);
Returns : The number of low quality bases;
-1 on error.
Returns the number of bases in the range [$start,$stop], or in the whole sequence if no range is specified, with quality value less than or equal to $threshold. Returns -1 if the information needed to compute such value (i.e., the quality values) is missing from the file.
num_high_quality_bases()
Usage : $n = $abif->num_high_quality_bases($threshold);
$n = $abif->num_high_quality_bases(
$threshold,
$start,
$stop
);
Returns : The number of high quality bases;
-1 on error.
Returns the number of bases in the range [$start,$stop], or in the whole sequence if no range is specified, with quality value greater than or equal to $threshold. Returns -1 if the information needed to compute such value (i.e., the quality values) is missing from the file.
num_medium_quality_bases()
Usage : $n = $abif->num_medium_quality_bases(
$min_qv,
$max_qv
);
$n = $abif->num_medium_quality_bases(
$min_qv,
$max_qv,
$start,
$stop
);
Returns : The number of medium quality bases;
-1 on error.
Returns the number of bases in the range [$start,$stop], or in the whole sequence if no range is specified, whose quality value is in the (closed) range [$min_qv,$max_qv]. Returns -1 if the information needed to compute such value (i.e., the quality values) is missing from the file.
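The three counting methods differ only in the test applied to each quality value. The sketch below shows that idea on a plain array; count_bases() is a hypothetical helper written for this example, and the quality values are invented.

```perl
use strict;
use warnings;

# Count the values in @$qv[$start .. $stop] accepted by $test.
# Returns -1 when no quality values are available.
sub count_bases {
    my ($qv, $test, $start, $stop) = @_;
    return -1 unless @$qv;
    $start = 0       unless defined $start;
    $stop  = $#{$qv} unless defined $stop;
    my $n = grep { $test->($_) } @{$qv}[$start .. $stop];
    return $n;
}

my @qv = (5, 12, 20, 20, 35, 41, 9, 50);

# Low: QV <= 10 over the whole sequence.
my $low = count_bases(\@qv, sub { $_[0] <= 10 });
# High: QV >= 20 in positions 2..5 only.
my $high = count_bases(\@qv, sub { $_[0] >= 20 }, 2, 5);
# Medium: QV in the closed range [10, 20] over the whole sequence.
my $medium = count_bases(\@qv, sub { $_[0] >= 10 && $_[0] <= 20 });

print "low=$low high=$high medium=$medium\n";
```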
sample_score()
Usage : $ss = $abif->sample_score();
$ss = $abif->sample_score(
$window_width,
$bad_bases_threshold,
$quality_threshold
);
Returns : The sample score of the sequence.
The sample score is the average quality value of the bases in the clear range of the sequence (see clear_range()). The method returns 0 if the information needed to compute such value is missing or if the clear range is empty.
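In other words, once a clear range is known, the score is a plain arithmetic mean. A minimal sketch, assuming a zero-based inclusive clear range and invented quality values (sample_score_sketch() is a hypothetical name, not part of the module):

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Mean QV over the zero-based, inclusive clear range [$start, $stop];
# 0 if there are no quality values or the clear range is empty.
sub sample_score_sketch {
    my ($qv, $start, $stop) = @_;
    return 0 if !@$qv || $start < 0 || $stop < $start;
    return sum(@{$qv}[$start .. $stop]) / ($stop - $start + 1);
}

my @qv = (5, 8, 25, 30, 40, 45, 50, 42, 38, 12, 9, 7);

# Hypothetical clear range covering positions 2..8.
my $score = sample_score_sketch(\@qv, 2, 8);
printf "sample score: %.2f\n", $score;
```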
AUTHOR
Nicola Vitacolonna, <vitacolonna at appliedgenomics.org>
BUGS
Please report any bugs or feature requests to bug-bio-trace-abif at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Bio-Trace-ABIF. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc Bio::Trace::ABIF
You can also look for information at:
AnnoCPAN: Annotated CPAN documentation
CPAN Ratings
RT: CPAN's request tracker
Search CPAN
SEE ALSO
See http://www.appliedbiosystems.com/support/ for the ABIF format file specification sheet.
There is an ABI module on CPAN (http://search.cpan.org/~malay/).
bioperl-ext also parses ABIF files and other trace formats.
You are welcome at http://www.appliedgenomics.org!
ACKNOWLEDGEMENTS
Thanks to Simone Scalabrin for many helpful suggestions and for the first implementation of the length_of_read() method the way Sequencing Analysis does it (and for rating this module five stars)! Thanks to Fabrizio Levorin and other people reporting bugs!
Some explanation about how Sequencing Analysis computes some parameters has been found at http://keck.med.yale.edu/dnaseq/.
COPYRIGHT & LICENSE
Copyright 2006-2010 Nicola Vitacolonna, all rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
Feel free to rate this module on CPAN!
DISCLAIMER
This software is provided "as is" without warranty of any kind.