Bio::Trace::ABIF - Perl extension for reading and parsing ABIF (Applied Biosystems, Inc. Format) files
Version 1.05
The ABIF file format is a binary format for storing data (especially data produced by sequencers), developed by Applied Biosystems, Inc. Typical file suffixes for such files are .ab1 and .fsa.
The data inside ABIF files is organized in records, in the following referred to as either directory entries or data items. Each data item is uniquely identified by a pair made of a four character string and a number: we call such pair a tag and its components the tag name and the tag number, respectively. Tags are defined in the official documentation for ABIF files (see the "SEE ALSO" Section at the end of this document).
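Throughout this document, tags are written in a compact form such as PBAS2 or DATA105: the four-character tag name immediately followed by the tag number. The following is a minimal sketch of how such a label could be split into the (name, number) pair expected by the low-level methods. The helper name split_tag is hypothetical and is not part of this module.

```perl
use strict;
use warnings;

# Split a compact tag label into (tag name, tag number).
# The tag name is always the first four characters (letters, digits or
# punctuation); everything after it must be the decimal tag number.
# Returns an empty list if the label is not well formed.
sub split_tag {
    my ($label) = @_;
    my ($name, $number) = $label =~ /\A(.{4})([0-9]+)\z/s
        or return;
    return ($name, $number + 0);
}

for my $label ('PBAS2', 'DATA105', 'S/N%1', 'Dye#1') {
    my ($name, $number) = split_tag($label);
    print "$label => name '$name', number $number\n";
}
```

The pattern takes exactly four characters for the name rather than four letters, because some tag names listed in this document contain characters such as '#', '/', '%' or '_'.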
This module provides methods for accessing any data item contained in an ABIF file (with or without knowledge of the corresponding tag) and methods for assessing the quality of the data (e.g., for computing LOR scores, clear ranges, and so on). The module also supports ABIF file modification: any directory entry can be overwritten. It is not possible, however, to add new directory entries for tags not already present in the file.
    use Bio::Trace::ABIF;

    my $abif = Bio::Trace::ABIF->new();
    $abif->open_abif('/Path/to/my/file.ab1');

    print $abif->sample_name(), "\n";
    my @quality_values = $abif->quality_values();
    my $sequence = $abif->sequence();
    # etc...

    $abif->close_abif();
Usage : $version = Bio::Trace::ABIF->module_version(); Returns : This module's version number.
Creates a new ABIF object.
Usage : my $abif = Bio::Trace::ABIF->new(); Returns : An instance of ABIF.
Creates an ABIF object.
The methods in this section allow you to open an ABIF file (either read-only or for modification), to close it or to verify the ABIF format version number.
  Usage   : $abif->open_abif($pathname);
            $abif->open_abif($pathname, 1); # Read/Write mode
  Returns : 1 if the file is opened; 0 otherwise.
Opens the specified file in binary format and checks whether it is in ABIF format. If the second optional argument is not false then the file is opened in read/write mode (by default, the file is opened in read only mode). Opening in read/write mode is necessary only if you want to use write_tag() (see below).
Usage : $abif->close_abif(); Returns : Nothing.
Closes the currently opened file.
  Usage   : if ($abif->is_abif_open()) { # ...
  Returns : 1 if an ABIF file is open; 0 otherwise.
  Usage   : if ($abif->is_abif_format()) { # ...
  Returns : 1 if the file is in ABIF format; 0 otherwise.
Checks that the file is in ABIF format.
Usage : $v = $abif->abif_version(); Returns : The ABIF file version number (e.g., '1.01').
Used to determine the ABIF file version number.
The "low-level" methods of this section allow you to access any directory entry in a file. It is up to the caller to correctly interpret the values returned by these methods, so they should be used only if the caller knows what (s)he is doing. In any case, it is strongly recommended to use the accessor methods defined later in this document: in most cases, they will do just fine.
Usage : $n = $abif->num_dir_entries(); Returns : The number of data items in the file.
Used to determine the number of directory entries in the ABIF file.
Usage : $n = $abif->data_offset(); Returns : The offset of the first data item, in bytes.
Used to determine the offset of the first directory entry from the beginning of the file.
Usage : @tags = $abif->tags(); Returns : A list of the tags in the file.
Usage : %D = $abif->get_directory($tagname, $tagnum); Returns : A hash of the content of the given data item; () if the given tag is not found.
Retrieves the directory entry identified by the pair ($tagname, $tagnum). The $tagname must be a four-letter ASCII code and $tagnum must be an integer (typically, 1 <= $tagnum <= 1000). The returned hash has the following keys:
  TAG_NAME: the tag name;
  TAG_NUMBER: the tag number;
  ELEMENT_TYPE: a string denoting the type of the data item ('char', 'byte', 'float', etc...);
  ELEMENT_SIZE: the size, in bytes, of one element;
  NUM_ELEMENTS: the number of elements in the data item;
  DATA_SIZE: the size, in bytes, of the data item;
  DATA_ITEM: the raw sequence of bytes of the data item.
Nota Bene: it is up to the caller to interpret the DATA_ITEM field correctly (typically, by unpack()ing it).
Refer to the "SEE ALSO" Section for further information.
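As an illustration of interpreting the DATA_ITEM field, here is a minimal sketch that decodes an array of signed 16-bit integers. The hash below is built by hand to mimic the keys listed above, and its contents are synthetic. The ELEMENT_TYPE string 'short', the helper name decode_shorts and the assumption that multi-byte values are stored big-endian are illustrative and should be checked against the official ABIF documentation.

```perl
use strict;
use warnings;

# Decode a directory entry whose elements are signed 16-bit integers.
# 's>' means "signed short, big-endian" and needs Perl 5.10 or later.
sub decode_shorts {
    my ($entry) = @_;
    my $expected = $entry->{ELEMENT_SIZE} * $entry->{NUM_ELEMENTS};
    die "Unexpected data size\n"
        unless length($entry->{DATA_ITEM}) == $expected;
    return unpack('s>*', $entry->{DATA_ITEM});
}

# Hand-made stand-in for the hash returned by get_directory().
my %entry = (
    TAG_NAME     => 'DATA',
    TAG_NUMBER   => 9,
    ELEMENT_TYPE => 'short',                        # assumed type name
    ELEMENT_SIZE => 2,
    NUM_ELEMENTS => 3,
    DATA_SIZE    => 6,
    DATA_ITEM    => pack('s>*', 100, -5, 32767),    # synthetic bytes
);

my @values = decode_shorts(\%entry);
print "@values\n";
```

Checking the byte length against ELEMENT_SIZE and NUM_ELEMENTS before unpacking catches a wrong template early, instead of silently returning a truncated list.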
Usage : @data = $abif->get_data_item($tagname, $tagnum, $template ); Returns : A list of elements unpacked according to $template; (), if the tag is not found.
Retrieves the data item specified by the pair ($tagname, $tagnum) and unpacks it according to $template. The $tagname is a four letter ASCII code and $tagnum is an integer (typically, 1 <= $tagnum <= 1000). The $template has the same format as in the pack() function.
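Choosing $template depends on the ABIF type of the data item. The sketch below shows templates that could plausibly be used for two of the types mentioned in this document, a pString (one length byte followed by the characters) and a float array. The mapping is an assumption made for illustration, not something taken from this module, and the byte strings are synthetic.

```perl
use strict;
use warnings;

# Assumed templates for a few ABIF types (big-endian, Perl 5.10+).
my %template_for = (
    'short array' => 's>*',
    'long array'  => 'l>*',
    'float array' => 'f>*',
    'pString'     => 'C/a',    # length byte, then that many characters
);

# Decode a Pascal-style string: the first byte holds the length.
sub decode_pstring {
    my ($raw) = @_;
    my ($string) = unpack($template_for{'pString'}, $raw);
    return $string;
}

# Decode an array of 32-bit floats.
sub decode_floats {
    my ($raw) = @_;
    return unpack($template_for{'float array'}, $raw);
}

my $raw_pstring = chr(8) . 'KB 1.3.0';          # synthetic pString bytes
my $raw_floats  = pack('f>*', 0.5, 2.25);       # synthetic float bytes

print decode_pstring($raw_pstring), "\n";
print join(' ', decode_floats($raw_floats)), "\n";
```

With a real file, the same template string would be passed as the third argument of get_data_item() instead of being applied to hand-made bytes.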
The methods in this section allow you to search for a specific tag and to overwrite existing data corresponding to a given tag.
Usage : $abif->search_tag($tagname, $tagnum) Returns : 1 if the tag is found; 0, otherwise
Searches for the specified data tag. If the tag is found, then the file handle is positioned just after the tag number (ready to read the element type).
Usage : $abif->write_tag($tagname, $tagnum, $data); $abif->write_tag($tagname, $tagnum, \@data); $abif->write_tag($tagname, $tagnum, \$data_str); Returns : 1 if the data item is overwritten; 0, otherwise.
Overwrites an existing tag with the given data. You may find the tag name and the tag number of each piece of data in an ABIF file in the documentation of the corresponding method (see below). You must open the file in read/write mode if you want to overwrite it (see open_abif()).
REMEMBER TO BACKUP YOUR FILE BEFORE OVERWRITING IT!
You must be careful when you overwrite data: the type of the new data must match the type of the old one. There is no restriction on the length of the data, e.g. you may overwrite the basecalled sequence with a longer or shorter one. Examples of how to use this method follow.
To overwrite the basecalled sequence:
    my $new_sequence = 'GATGCATCT...';
    $abif->write_tag('PBAS', 1, \$new_sequence);
    # ($new_sequence can be passed also by value)
    print 'New sequence is: ', $abif->edited_sequence();
To overwrite the quality values:
    my @qv = (10, 20, 30, ...); # All values must be < 128
    $abif->write_tag('PCON', 1, \@qv); # Pass by reference!
    print "New qv's: ", $abif->edited_quality_values();
To overwrite a date:
    # Date format: yyyy-mm-dd
    $abif->write_tag('RUND', 3, '2007-01-22');
    print 'New date: ', $abif->data_collection_start_date();
To overwrite a time stamp:
    # Time format: hh:mm:ss.nn
    $abif->write_tag('RUNT', 4, '16:01:30.45');
    print 'New time: ', $abif->data_collection_stop_time();
To overwrite a comment:
    $abif->write_tag('CMNT', 1, 'New comment');
    print 'New comment: ', $abif->comment();
To overwrite noise values:
    my @noise = (3.14, 2.71, ...);
    $abif->write_tag('NOIS', 1, \@noise);
    print 'Noise values: ', $abif->noise();
To overwrite the capillary number:
    $abif->write_tag('LANE', 1, 95);
    print 'Capillary number: ', $abif->capillary_number();
and so on.
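Because the new data must match the type and format of the old data, it may help to validate values before calling write_tag(). The following is a minimal sketch of such checks for the quality value, date and time formats used in the examples above. The helper names are hypothetical and not part of this module, and the lower bound of 0 for quality values is an assumption (the text above only states that values must be < 128).

```perl
use strict;
use warnings;

# True if every quality value is a non-negative integer below 128.
sub qv_ok {
    my ($qv_ref) = @_;
    my @bad = grep { $_ !~ /\A[0-9]+\z/ || $_ > 127 } @$qv_ref;
    return @bad == 0;
}

# True if the string looks like yyyy-mm-dd with a plausible month and day.
sub date_ok {
    my ($date) = @_;
    my ($y, $m, $d) = $date =~ /\A([0-9]{4})-([0-9]{2})-([0-9]{2})\z/
        or return 0;
    return ($m >= 1 && $m <= 12 && $d >= 1 && $d <= 31) ? 1 : 0;
}

# True if the string looks like hh:mm:ss.nn with plausible fields.
sub time_ok {
    my ($time) = @_;
    my ($h, $min, $s) =
        $time =~ /\A([0-9]{2}):([0-9]{2}):([0-9]{2})\.[0-9]{2}\z/
        or return 0;
    return ($h < 24 && $min < 60 && $s < 60) ? 1 : 0;
}

print "quality values ok\n" if qv_ok([10, 20, 30]);
print "date ok\n"           if date_ok('2007-01-22');
print "time ok\n"           if time_ok('16:01:30.45');
```

These checks only cover the textual format. They do not replace the advice to back up the file before overwriting it.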
The methods in this section can be used to retrieve specific information from a file without having to specify a tag. It is strongly recommended that you read data from a file by using one or more of these methods.
Usage : @data = $abif->analyzed_data_for_channel($ch_num); Returns : The channel analyzed data; () if the channel number is out of range or the data item is not in the file. ABIF Tag : DATA9, DATA10, DATA11, DATA12, DATA205 ABIF Type : short array File Type : ab1
There are four channels in an ABIF file, numbered from 1 to 4. An optional channel number 5 exists in some files. The channel number is the argument of the method.
This data item is from SeqScape(R) v2.5 and Sequencing Analysis v5.2 Software.
Usage : $s = $abif->analysis_protocol_settings_name(); Returns : The Analysis Protocol settings name; undef if the data item is not in the file. ABIF Tag : APrN1 ABIF Type : cString File Type : ab1
Usage : $s = $abif->analysis_protocol_settings_version(); Returns : The Analysis Protocol settings version; undef if the data item is not in the file. ABIF Tag : APrV1 ABIF Type : cString File Type : ab1
Usage : $xml = $abif->analysis_protocol_xml(); Returns : The Analysis Protocol XML string; undef if the data item is not in the file. ABIF Tag : APrX1 ABIF Type : char array File Type : ab1
Usage : $s = $abif->analysis_protocol_xml_schema_version(); Returns : The Analysis Protocol XML schema version; undef if the data item is not in the file. ABIF Tag : APXV1 ABIF Type : cString File Type : ab1
Usage : $rc = $abif->analysis_return_code(); Returns : The analysis return code; undef if the data item is not in the file. ABIF Tag : ARTN1 ABIF Type : long File Type : ab1
Usage : $aps = $abif->avg_peak_spacing(); Returns : The average peak spacing used in last analysis; undef if the data item is not in the file. ABIF Tag : SPAC1 ABIF Type : float File Type : ab1
Usage : $n = $abif->basecaller_apsf(); Returns : The basecaller adaptive processing success flag; undef if the data item is not in the file. ABIF Tag : ASPF1 ABIF Type : short File Type : ab1
Usage : $v = $abif->basecaller_bcp_dll(); Returns : A string with the basecaller BCP/DLL; undef if the data item is not in the file. ABIF Tag : SPAC2 ABIF Type : pString File Type : ab1
Usage : $v = $abif->basecaller_version(); Returns : The basecaller version (e.g., 'KB 1.3.0'); undef if the data item is not in the file. ABIF Tag : SVER2 ABIF Type : pString File Type : ab1
Usage : $s = $abif->basecalling_analysis_timestamp(); Returns : A time stamp; undef if the data item is not in the file. ABIF Tag : BCTS1 ABIF Type : pString File Type : ab1
Returns the time stamp for last successful basecalling analysis.
Usage : @bl = $abif->base_locations(); Returns : The list of base locations; () if the data item is not in the file. ABIF Tag : PLOC2 ABIF Type : short array File Type : ab1
Usage : @bl = $abif->base_locations_edited(); Returns : The list of base locations (edited); () if the data item is not in the file. ABIF Tag : PLOC1 ABIF Type : short array File Type : ab1
Usage : @bo = $abif->base_order(); Returns : An array of characters sorted by channel number; () if the data item is not in the file. ABIF Tag : FWO_1 ABIF Type : char array File Type : ab1
Returns an array of characters sorted by increasing channel number. For example, if the list is ('G', 'A', 'T', 'C') then G is channel 1, A is channel 2, and so on. If you want to do the opposite, that is, mapping bases to their channels, use order_base() instead. See also the channel() method.
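The relationship between the channel-ordered list and the base-to-channel mapping can be sketched in plain Perl. The example below starts from the sample list ('G', 'A', 'T', 'C') used above and builds the inverse mapping by hand. It does not call this module, and the helper name channel_for is hypothetical.

```perl
use strict;
use warnings;

# Sample base order taken from the text: G is channel 1, A is channel 2, ...
my @base_order = ('G', 'A', 'T', 'C');

# Invert the list into a base => channel number mapping with a hash slice.
my %channel_of;
@channel_of{@base_order} = (1 .. scalar @base_order);

# Look up the channel for a base, ignoring case.
# Returns undef for anything that is not in the base order.
sub channel_for {
    my ($base) = @_;
    return $channel_of{ uc $base };
}

for my $base ('a', 'C', 'g', 't') {
    print "$base is on channel ", channel_for($base), "\n";
}
```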
Usage : $spacing = $abif->base_spacing(); Returns : The spacing; undef if the data item is not in the file. ABIF Tag : SPAC3 ABIF Type : float File Type : ab1
Usage : @T = $abif->buffer_tray_temperature(); Returns : The buffer tray heater temperature in °C; () if the data item is not in the file. ABIF Tag : BufT1 ABIF Type : short array File Type : ab1
Usage : $cap_n = $abif->capillary_number(); Returns : The LANE/Capillary number; undef if the data item is not in the file. ABIF Tag : LANE1 ABIF Type : short File Type : ab1, fsa
Usage : $n = $abif->channel($base); Returns : The channel number corresponding to a given base. undef if the data item is not in the file.
Returns the channel number corresponding to the given base.
The possible values for $base are 'A', 'C', 'G' and 'T' (case insensitive).
Usage : $s = $abif->chem(); Returns : The primer or terminator chemistry; undef if the data item is not in the file. ABIF Tag : phCH1 ABIF Type : pString File Type : ab1
Returns the primer or terminator chemistry (equivalent to CHEM in phd1 file).
Usage : $comment = $abif->comment(); $comment = $abif->comment($n); Returns : The comment about the sample; undef if the data item is not in the file. ABIF Tag : CMNT1 ... CMNT 'N' ABIF Type : pString File Type : ab1, fsa
This is an optional data item. In some files there is more than one comment: the optional argument is used to specify the number of the comment.
Usage : $comment_title = $abif->comment_title(); Returns : The comment title; undef if the data item is not in the file. ABIF Tag : CTTL1 ABIF Type : pString File Type : ab1, fsa
Usage : $id = $abif->container_identifier(); Returns : The container identifier, a.k.a. plate barcode; undef if the data item is not in the file. ABIF Tag : CTID1 ABIF Type : cString File Type : ab1, fsa
Usage : $name = $abif->container_name(); Returns : The container name; undef if the data item is not in the file. ABIF Tag : CTNM1 ABIF Type : cString File Type : ab1, fsa
Usually, this is identical to the container identifier.
Usage : $owner = $abif->container_owner(); Returns : The container's owner; undef if the data item is not in the file. ABIF Tag : CTow1 ABIF Type : cString File Type : ab1
Usage : @c = $abif->current(); Returns : Current, measured in milliamps; () if the data item is not in the file. ABIF Tag : DATA6 ABIF Type : short array File Type : ab1, fsa
Usage : $s = $abif->data_collection_module_file(); Returns : The data collection module file; undef if the data item is not in the file. ABIF Tag : MODF1 ABIF Type : pString File Type : ab1, fsa
Usage : $v = $abif->data_collection_software_version(); Returns : The data collection software version. undef if the data item is not in the file. ABIF Tag : SVER1 ABIF Type : pString File Type : ab1, fsa
Usage : $v = $abif->data_collection_firmware_version(); Returns : The data collection firmware version; undef if the data item is not in the file. ABIF Tag : SVER3 ABIF Type : pString File Type : ab1, fsa
Usage : $date = $abif->data_collection_start_date(); Returns : The Data Collection start date (yyyy-mm-dd); undef if the data item is not in the file. ABIF Tag : RUND3 ABIF Type : date File Type : ab1, fsa
Usage : $time = $abif->data_collection_start_time(); Returns : The Data Collection start time (hh:mm:ss.nn); undef if the data item is not in the file. ABIF Tag : RUNT3 ABIF Type : time File Type : ab1, fsa
Usage : $date = $abif->data_collection_stop_date(); Returns : The Data Collection stop date (yyyy-mm-dd); undef if the data item is not in the file. ABIF Tag : RUND4 ABIF Type : date File Type : ab1, fsa
Usage : $time = $abif->data_collection_stop_time(); Returns : The Data Collection stop time (hh:mm:ss.nn); undef if the data item is not in the file. ABIF Tag : RUNT4 ABIF Type : time File Type : ab1, fsa
Usage : $dt = $abif->detector_heater_temperature(); Returns : The detector cell heater temperature in °C; undef if the data item is not in the file. ABIF Tag : DCHT1 ABIF Type : short File Type : ab1
Usage : $df = $abif->downsampling_factor(); Returns : The downsampling factor; undef if the data item is not in the file. ABIF Tag : DSam1 ABIF Type : short File Type : ab1, fsa
Usage : $n = $abif->dye_name($n); Returns : The name of dye number $n; undef if the data item is not in the file; undef if $n is not in the range [1..5]. ABIF Tag : DyeN1, DyeN2, DyeN3, DyeN4, DyeN5 ABIF Type : pString File Type : ab1, fsa
Dye 5 name is an optional tag.
Usage : $dsn = $abif->dye_set_name(); Returns : The dye set name; undef if the data item is not in the file. ABIF Tag : DySN1 ABIF Type : pString File Type : ab1, fsa
Usage : $dsn = $abif->dye_significance($n); Returns : The $n-th dye significance; undef if the data item is not in the file ABIF Tag : DyeB1, DyeB2, DyeB3, DyeB4, DyeB5 ABIF Type : char File Type : fsa
The argument must be an integer from 1 to 5. Dye significance 5 is optional. The returned value is 'S' for standard and ' ' (a single space) for sample.
Usage : $dsn = $abif->dye_type(); Returns : The dye type; undef if the data item is not in the file. ABIF Tag : phDY1 ABIF Type : pString File Type : ab1
The dye type is equivalent to DYE in phd1 files.
Usage : $n = $abif->dye_wavelength($n); Returns : The wavelength of dye number $n; undef if the data item is not in the file; undef if $n is not in the range [1..5]. ABIF Tag : DyeW1, DyeW2, DyeW3, DyeW4, DyeW5 ABIF Type : short File Type : ab1, fsa
Dye 5 wavelength is an optional data item.
Usage : @qv = $abif->edited_quality_values(); Returns : The list of edited quality values; () if the data item is not in the file. ABIF Tag : PCON1 ABIF Type : char array File Type : ab1
Usage : $ref_to_qv = $abif->edited_quality_values_ref(); Returns : A reference to the list of edited quality values; a reference to the empty list if the data item is not in the file. ABIF Tag : PCON1 File Type : ab1
Usage : $sequence = $abif->edited_sequence(); Returns : The string of the edited basecalled sequence; undef if the data item is not in the file. ABIF Tag : PBAS1 ABIF Type : char array File Type : ab1
Usage : $l = $abif->edited_sequence_length(); Returns : The length of the edited basecalled sequence; 0 if the sequence is not in the file. File Type : ab1
Usage : $v = $abif->electrophoresis_voltage(); Returns : The electrophoresis voltage setting in volts; undef if the data item is not found. ABIF Tag : EPVt1 ABIF Type : long File Type : ab1, fsa
Usage : $s = $abif->gel_type(); Returns : The gel type description; undef if the data item is not in the file. ABIF Tag : GTyp1 ABIF Type : pString File Type : ab1, fsa
Usage : $s = $abif->gene_mapper_analysis_method(); Returns : The GeneMapper(R) software analysis method name; undef if the data item is not in the file. ABIF Tag : ANME1 ABIF Type : cString File Type : fsa
Usage : $s = $abif->gene_mapper_panel_name(); Returns : The GeneMapper(R) software panel name; undef if the data item is not in the file. ABIF Tag : PANL1 ABIF Type : cString File Type : fsa
Usage : $s = $abif->gene_mapper_sample_type(); Returns : The GeneMapper(R) software Sample Type; undef if the data item is not in the file. ABIF Tag : STYP1 ABIF Type : cString File Type : fsa
Usage : $s = $abif->gene_scan_sample_name(); Returns : The sample name for GeneScan(R) sample files; undef if the data item is not in the file. ABIF Tag : SpNm1 ABIF Type : pString File Type : fsa
Usage : $t = $abif->injection_time(); Returns : The injection time in seconds; undef if the data item is not in the file. ABIF Tag : InSc1 ABIF Type : long File Type : ab1, fsa
Usage : $t = $abif->injection_voltage(); Returns : The injection voltage in volts; undef if the data item is not in the file ABIF Tag : InVt1 ABIF Type : long File Type : ab1, fsa
Usage : $class = $abif->instrument_class(); Returns : The instrument class; undef if the data item is not in the file. ABIF Tag : HCFG1 ABIF Type : cString File Type : ab1
Usage : $class = $abif->instrument_family(); Returns : The instrument family; undef if the data item is not in the file. ABIF Tag : HCFG2 ABIF Type : cString File Type : ab1
Usage : $sn = $abif->instrument_name_and_serial_number(); Returns : The instrument name and the serial number; undef if the data item is not in the file. ABIF Tag : MCHN1 ABIF Type : pString File Type : ab1, fsa
Usage : $param = $abif->instrument_param(); Returns : The instrument parameters; undef if the data item is not in the file. ABIF Tag : HCFG4 ABIF Type : cString File Type : ab1
Usage : $bool = $abif->is_capillary_machine(); Returns : A value > 0 if the data item is true; 0 if the data item is false; undef if the data item is not in the file. ABIF Tag : CpEP1 ABIF Type : byte File Type : ab1, fsa
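Since this method has three possible outcomes (a value greater than 0, 0, or undef), a plain boolean test cannot tell "false" from "not in the file". The following minimal sketch shows one way to handle the three cases. The helper name describe_flag is hypothetical, and the value is passed in directly instead of being read from a file.

```perl
use strict;
use warnings;

# Turn a three-state flag into a human-readable answer.
#   undef     => the data item is not in the file
#   0         => the data item is false
#   value > 0 => the data item is true
sub describe_flag {
    my ($value) = @_;
    return 'unknown' unless defined $value;
    return $value > 0 ? 'yes' : 'no';
}

for my $value (1, 0, undef) {
    my $shown = defined $value ? $value : 'undef';
    print "$shown => ", describe_flag($value), "\n";
}
```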
Usage : $n = $abif->laser_power(); Returns : The laser power setting in microwatts; undef if the data item is not in the file. ABIF Tag : LsrP1 ABIF Type : long File Type : ab1, fsa
Usage : $n = $abif->length_to_detector(); Returns : The length to the detector in cm; undef if the data item is not in the file. ABIF Tag : LNTD1 ABIF Type : short File Type : ab1, fsa
Usage : $mb = $abif->mobility_file() Returns : The mobility file; undef if the data item is not in the file. ABIF Tag : PDMF2 ABIF Type : pString File Type : ab1
Usage : $mb = $abif->mobility_file_orig() Returns : The mobility file (orig); undef if the data item is not in the file. ABIF Tag : PDMF1 ABIF Type : pString File Type : ab1
Usage : $mn = $abif->model_number(); Returns : The model number; undef if the data item is not in the file. ABIF Tag : MODL1 ABIF Type : char[4] File Type : ab1, fsa
Usage : %noise = $abif->noise(); Returns : The estimated noise for each dye; () if the data item is not in the file. ABIF Tag : NOIS1 ABIF Type : float array File Type : ab1
The keys of the returned hash are the values retrieved with base_order(). This is an optional data item. This method works only with files containing data processed by the KB(tm) Basecaller.
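Because the result is a hash keyed by base, it can be processed like any other Perl hash. The sketch below uses a hand-made hash with invented noise values in place of a real return value, and picks the base with the highest estimated noise. The helper name noisiest_base is hypothetical.

```perl
use strict;
use warnings;

# Return the base whose noise estimate is highest.
# Ties are broken alphabetically so the result is deterministic.
sub noisiest_base {
    my ($noise_ref) = @_;
    my ($worst) = sort {
        $noise_ref->{$b} <=> $noise_ref->{$a} || $a cmp $b
    } keys %$noise_ref;
    return $worst;
}

# Invented values, standing in for the hash returned by noise().
my %noise = (G => 3.5, A => 2.25, T => 4.0, C => 1.75);

for my $base (sort keys %noise) {
    printf "%s: %.2f\n", $base, $noise{$base};
}
print "Noisiest base: ", noisiest_base(\%noise), "\n";
```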
Usage : $nc = $abif->num_capillaries(); Returns : The number of capillaries; undef if the data item is not in the file. ABIF Tag : NLNE1 ABIF Type : short File Type : ab1, fsa
Usage : $n = $abif->num_dyes(); Returns : The number of dyes; undef if the data item is not in the file. ABIF Tag : Dye#1 ABIF Type : short File Type : ab1, fsa
Usage : $n = $abif->num_scans(); Returns : The number of scans; undef if the data item is not in the file. ABIF Tag : SCAN1 ABIF Type : long File Type : ab1, fsa
Usage : $name = $abif->official_instrument_name(); Returns : The official instrument name; undef if the data item is not in the file. ABIF Tag : HCFG3 ABIF Type : cString File Type : ab1
Usage : @bytes = $abif->offscale_peaks($n); Returns : The range of offscale peaks. () if the data item is not in the file. ABIF Tag : OffS1 ... OffS 'N' ABIF Type : user File Type : fsa
This data item's type is a user defined data structure. As such, it is returned as a list of bytes that must be interpreted by the caller. This is an optional data item.
Usage : @p = $abif->offscale_scans(); Returns : A list of scans. () if the data item is not in the file. ABIF Tag : OfSc1 ABIF Type : long array File Type : ab1, fsa
Returns the list of scans that are marked off scale in Collection. This is an optional data item.
Usage : %bases = $abif->order_base(); Returns : A mapping of the four bases to their channel numbers; () if the base order is not in the file. File Type : ab1
Returns the channel numbers corresponding to the bases. This method does the opposite of base_order(). See also the channel() method.
Usage : $pl = $abif->peak1_location(); Returns : The peak 1 location; undef if the data item is not in the file. ABIF Tag : B1Pt2 ABIF Type : short File Type : ab1
Usage : $pl = $abif->peak1_location_orig(); Returns : The peak 1 location (orig); undef if the data item is not in the file. ABIF Tag : B1Pt1 ABIF Type : short File Type : ab1
Usage : $par = $abif->peak_area_ratio(); Returns : The peak area ratio; undef if the data item is not in the file. ABIF Tag : phAR1 ABIF Type : float File Type : ab1
Returns the peak area ratio (equivalent to TRACE_PEAK_AREA_RATIO in phd1 file).
Usage : @pks = $abif->peaks(1); Returns : An array of peak hashes. Each peak hash contains the following attributes: 'position', 'height', 'beginPos', 'endPos', 'beginHI', 'endHI', 'area', 'volume', 'fragSize', 'isEdited', 'label'; () if the data item is not in the file. ABIF Tag : PEAK ABIF Type : user-defined structure File Type : fsa
Returns the data associated with PEAK data structures.
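As a rough illustration of working with an array of peak hashes, the sketch below filters peaks by height. The peaks are hand-made with invented values and only a few of the attributes listed above, so that the example runs without an input file. The helper name peaks_taller_than is hypothetical.

```perl
use strict;
use warnings;

# Keep only the peaks whose height is at least $min_height.
sub peaks_taller_than {
    my ($min_height, @peaks) = @_;
    return grep { $_->{height} >= $min_height } @peaks;
}

# Invented peaks, shaped like the hashes described above.
my @peaks = (
    { position => 1200, height => 350, fragSize => 101.2 },
    { position => 1850, height =>  90, fragSize => 152.7 },
    { position => 2400, height => 610, fragSize => 198.4 },
);

for my $peak (peaks_taller_than(300, @peaks)) {
    printf "position %d, height %d, size %.1f\n",
        $peak->{position}, $peak->{height}, $peak->{fragSize};
}
```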
Usage : $n = $abif->pixel_bin_size(); Returns : The pixel bin size; undef if the data item is not in the file. ABIF Tag : PXLB1 ABIF Type : long File Type : ab1, fsa
Usage : $n = $abif->pixels_lane(); Returns : The pixels averaged per lane; undef if the data item is not in the file. ABIF Tag : NAVG1 ABIF Type : short File Type : ab1, fsa
Usage : $s = $abif->plate_type(); Returns : The plate type; undef if the data item is not in the file. ABIF Tag : PTYP1 ABIF Type : cString File Type : ab1, fsa
Returns the plate type. Allowed values are 96-Well and 384-Well.
Usage : $n = $abif->plate_size(); Returns : The plate size. undef if the data item is not in the file. ABIF Tag : PSZE1 ABIF Type : long File Type : ab1, fsa
Returns the number of sample positions in the container (allowed values are 96 and 384).
Usage : $s = $abif->polymer_expiration_date() Returns : The polymer lot expiration date; undef if the data item is not in the file. ABIF Tag : SMED1 ABIF Type : pString File Type : ab1, fsa
The format of the date is implementation dependent.
Usage : $s = $abif->polymer_lot_number(); Returns : A string containing the polymer lot number; undef if the data item is not in the file. ABIF Tag : SMLt1 ABIF Type : pString File Type : ab1, fsa
Usage : @p = $abif->power(); Returns : The power, measured in milliwatts; () if the data item is not in the file. ABIF Tag : DATA7 ABIF Type : short array File Type : ab1, fsa
Usage : $n = $abif->quality_levels(); Returns : The maximum quality value; undef if the data item is not in the file. ABIF Tag : phQL1 ABIF Type : short File Type : ab1
Returns the maximum quality value (equivalent to QUALITY_LEVELS in phd1 file).
Usage : @qv = $abif->quality_values(); Returns : The list of quality values; () if the data item is not in the file. ABIF Tag : PCON2 ABIF Type : char array File Type : ab1
Usage : $qvref = $abif->quality_values_ref(); Returns : A reference to the list of quality values; a reference to the empty list if the data item is not in the file. ABIF Tag : PCON2 ABIF Type : char array File Type : ab1
Usage : @data = $abif->raw_data_for_channel($channel_number); Returns : The channel $channel_number raw data; () if the data item is not in the file. ABIF Tag : DATA1, DATA2, DATA3, DATA4, DATA105 ABIF Type : short array File Type : ab1, fsa
There are four channels in an ABIF file, numbered from 1 to 4. An optional channel number 5 exists in some files.
Usage : @trace = $abif->raw_trace($base); Returns : The raw trace corresponding to $base; () if the data item is not in the file. File Type : ab1
Usage : $name = $abif->rescaling(); Returns : The rescaling divisor for color data; undef if the data item is not in the file. ABIF Tag : Scal1 ABIF Type : float File Type : ab1, fsa
Usage : $name = $abif->results_group(); Returns : The results group name; undef if the data item is not in the file. ABIF Tag : RGNm1 ABIF Type : cString File Type : ab1, fsa
Usage : $s = $abif->results_group_comment(); Returns : The results group comment; undef if the data item is not in the file. ABIF Tag : RGCm1 ABIF Type : cString File Type : ab1, fsa
This is an optional data item.
Usage : $s = $abif->results_group_owner(); Returns : The results group owner; undef if the data item is not in the file. ABIF Tag : RGOw1 ABIF Type : cString File Type : ab1
Returns the name entered as the owner of the results group, in the Results Group editor. This is an optional data item.
Usage : $n = $abif->reverse_complement_flag(); Returns : The reverse complement flag; undef if the data item is not in the file. ABIF Tag : RevC1 ABIF Type : short File Type : ab1
This data item is from Sequencing Analysis v5.2 Software.
Usage : $name = $abif->run_module_name(); Returns : The run module name; undef if the data item is not in the file. ABIF Tag : RMdN1 ABIF Type : cString File Type : ab1, fsa
This should be the same as the value returned by data_collection_module_file().
Usage : $name = $abif->run_module_version(); Returns : The run module version; undef if the data item is not in the file. ABIF Tag : RMdV1 ABIF Type : cString File Type : ab1, fsa
Usage : $vers = $abif->run_module_xml_schema_version(); Returns : The run module XML schema version; undef if the data item is not in the file. ABIF Tag : RMXV1 ABIF Type : cString File Type : ab1, fsa
Usage : $xml = $abif->run_module_xml_string(); Returns : The run module XML string; undef if the data item is not in the file. ABIF Tag : RMdX1 ABIF Type : char array File Type : ab1, fsa
Usage : $name = $abif->run_name(); Returns : The run name; undef if the data item is not in the file. ABIF Tag : RunN1 ABIF Type : cString File Type : ab1, fsa
Usage : $xml = $abif->run_protocol_name(); Returns : The run protocol name; undef if the data item is not in the file. ABIF Tag : RPrN1 ABIF Type : cString File Type : ab1, fsa
Usage : $vers = $abif->run_protocol_version(); Returns : The run protocol version; undef if the data item is not in the file. ABIF Tag : RPrV1 ABIF Type : cString File Type : ab1, fsa
Usage : $date = $abif->run_start_date(); Returns : The run start date (yyyy-mm-dd); undef if the data item is not in the file. ABIF Tag : RUND1 ABIF Type : date File Type : ab1, fsa
Usage : $time = $abif->run_start_time(); Returns : The run start time (hh:mm:ss.nn); undef if the data item is not in the file. ABIF Tag : RUNT1 ABIF Type : time File Type : ab1, fsa
Usage : $date = $abif->run_stop_date(); Returns : The run stop date (yyyy-mm-dd); undef if the data item is not in the file. ABIF Tag : RUND2 ABIF Type : date File Type : ab1, fsa
Usage : $time = $abif->run_stop_time(); Returns : The run stop time (hh:mm:ss.nn); undef if the data item is not in the file. ABIF Tag : RUNT2 ABIF Type : time File Type : ab1, fsa
Usage : $temp = $abif->run_temperature(); Returns : The run temperature setting in °C; undef if the data item is not in the file. ABIF Tag : Tmpr1 ABIF Type : long File Type : ab1, fsa
Usage : $v = $abif->sample_file_format_version(); Returns : The Sample File Format Version; undef if the data item is not in the file. ABIF Tag : SVER4 ABIF Type : pString File Type : fsa
The Sample File Format Version contains the version of the sample file format used to write the file.
Usage : $name = $abif->sample_name(); Returns : The sample name; undef if the data item is not in the file. ABIF Tag : SMPL1 ABIF Type : pString File Type : ab1
Usage : $sample_id = $abif->sample_tracking_id(); Returns : The sample tracking ID; undef if the data item is not in the file. ABIF Tag : LIMS1 ABIF Type : pString File Type : ab1, fsa
Usage : @bytes = $abif->scanning_rate(); Returns : The scanning rate; () if the data item is not in the file. ABIF Tag : Rate1 ABIF Type : user File Type : ab1, fsa
This data item's type is a user defined data structure. As such, it is returned as a list of bytes that must be interpreted by the caller.
Usage : @C = $abif->scan_color_data_values($n); Returns : A list of color data values; () if the data item is not in the file. ABIF Tag : OvrV1 ... OvrV 'N' ABIF Type : long array File Type : ab1, fsa
Returns the list of color data values for the locations listed by scan_number_indices(). This is an optional data item.
Usage : @N = $abif->scan_numbers(); Returns : The scan numbers of data points; () if the data item is not in the file. ABIF Tag : Satd1 ABIF Type : long array File Type : ab1, fsa
Returns an array of integers representing the scan numbers of data points that are flagged as saturated by data collection.
Usage : @I = $abif->scan_number_indices($n); Returns : A list of scan number indices; () if the data item is not in the file. ABIF Tag : OvrI1 ... OvrI 'N' ABIF Type : long array File Type : ab1, fsa
Returns the list of scan number indices for scans with color data value greater than 32767.
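The limit of 32767 is the largest value a signed 16-bit short can hold, which is why larger color data values are listed separately. One plausible use of the two lists is to patch a channel's short array with the full values, as in the minimal sketch below. This reading is an assumption: in particular, treating the indices as 0-based positions into the channel array is not confirmed by this document, the helper name apply_overrides is hypothetical, and the data are invented.

```perl
use strict;
use warnings;

# Return a copy of @$data_ref in which each position listed in
# @$index_ref is replaced by the matching value from @$value_ref.
# Indices outside the array are skipped.
sub apply_overrides {
    my ($data_ref, $index_ref, $value_ref) = @_;
    my @fixed = @$data_ref;
    for my $i (0 .. $#$index_ref) {
        my $pos = $index_ref->[$i];
        next if $pos < 0 || $pos > $#fixed;
        $fixed[$pos] = $value_ref->[$i];
    }
    return @fixed;
}

my @channel = (100, 32767, 200, 32767);    # invented, clipped at 32767
my @indices = (1, 3);                      # assumed 0-based positions
my @values  = (40000, 51234);              # invented full values

print join(' ', apply_overrides(\@channel, \@indices, \@values)), "\n";
```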
Usage : $name = $abif->seqscape_project_name(); Returns : SeqScape(R) project name; undef if the data item is not in the file. ABIF Tag : PROJ4 ABIF Type : cString File Type : ab1
This data item is in SeqScape(R) software sample files only. This is an optional data item.
Usage : $name = $abif->seqscape_project_template(); Returns : SeqScape(R) project template name; undef if the data item is not in the file. ABIF Tag : PRJT1 ABIF Type : cString File Type : ab1
Usage : $name = $abif->seqscape_specimen_name(); Returns : SeqScape(R) specimen name; undef if the data item is not in the file. ABIF Tag : SPEC1 ABIF Type : cString File Type : ab1
Usage : $sequence = $abif->sequence(); Returns : The basecalled sequence; undef if the data item is not in the file. ABIF Tag : PBAS2 ABIF Type : char array File Type : ab1
Usage : $l = $abif->sequence_length(); Returns : The length of the basecalled sequence; 0 if the sequence is not in the file. File Type : ab1
Usage : $f = $abif->sequencing_analysis_param_filename(); Returns : The Sequencing Analysis parameters filename; undef if the data item is not in the file. ABIF Tag : APFN2 ABIF Type : pString File Type : ab1
Usage : %signal_level = $abif->signal_level(); Returns : The signal level for each dye; () if the data item is not in the file. ABIF Tag : S/N%1 ABIF Type : short array File Type : ab1
The keys of the returned hash are the values retrieved with base_order().
Usage : $s = $abif->size_standard_filename(); Returns : The Size Standard file name; undef if the data item is not in the file. ABIF Tag : StdF1 ABIF Type : pString File Type : fsa
Usage : $s = $abif->snp_set_name(); Returns : SNP set name; undef if the data item is not in the file. ABIF Tag : SnpS1 ABIF Type : pString File Type : fsa
Usage : $s = $abif->start_collection_event(); Returns : The start collection event; undef if the data item is not in the file. ABIF Tag : EVNT3 ABIF Type : pString File Type : ab1, fsa
Usage : $n = $abif->start_point(); Returns : The start point; undef if the data item is not in the file. ABIF Tag : ASPt2 ABIF Type : short File Type : ab1
Usage : $n = $abif->start_point_orig(); Returns : The start point (orig); undef if the data item is not in the file. ABIF Tag : ASPt1 ABIF Type : short File Type : ab1
Usage : $s = $abif->start_run_event(); Returns : The start run event; undef if the data item is not in the file. ABIF Tag : EVNT1 ABIF Type : pString File Type : ab1, fsa
Usage : $s = $abif->stop_collection_event(); Returns : The stop collection event; undef if the data item is not in the file. ABIF Tag : EVNT4 ABIF Type : pString File Type : ab1, fsa
Usage : $n = $abif->stop_point(); Returns : The stop point; undef if the data item is not in the file. ABIF Tag : AEPt2 ABIF Type : short File Type : ab1
Usage : $n = $abif->stop_point_orig(); Returns : The stop point (orig); undef if the data item is not in the file. ABIF Tag : AEPt1 ABIF Type : short File Type : ab1
Usage : $s = $abif->stop_run_event(); Returns : The stop run event; undef if the data item is not in the file. ABIF Tag : EVNT2 ABIF Type : pString File Type : ab1, fsa
Usage : @t = $abif->temperature(); Returns : The temperature, measured in °C; () if the data item is not in the file. ABIF Tag : DATA8 ABIF Type : short array File Type : ab1, fsa
Usage : @trace = $abif->trace($base); Returns : The (analyzed) trace corresponding to $base; () if the data item is not in the file. File Type : ab1
The possible values for $base are 'A', 'C', 'G' and 'T'.
Usage : $pr = $abif->trim_probability_threshold(); Returns : The trim probability threshold used; undef if the data item is not in the file. ABIF Tag : phTR2 ABIF Type : float File Type : ab1
Usage : $n = $abif->trim_region(); Returns : The read positions; undef if the data item is not in the file. ABIF Tag : phTR1 ABIF Type : short File Type : ab1
Returns the read positions of the first and last bases in the trim region. Together with trim_probability_threshold(), this is equivalent to TRIM in a phd1 file.
Usage : @v = $abif->voltage(); Returns : The voltage, measured in decavolts; () if the data item is not in the file. ABIF Tag : DATA5 ABIF Type : short array File Type : ab1, fsa
Usage : $user = $abif->user(); Returns : The name of the user who created the plate; undef if the data item is not in the file. ABIF Tag : User1 ABIF Type : pString File Type : ab1, fsa
Usage : $well_id = $abif->well_id(); Returns : The well ID; undef if the data item is not in the file. ABIF Tag : TUBE1 ABIF Type : pString File Type : ab1, fsa
The following methods compute values that help assess the quality of the data.
Usage : $sn_ratio = $abif->avg_signal_to_noise_ratio(); Returns : The average signal to noise ratio; 0 on error.
This method works only with files containing data processed by the KB(tm) Basecaller. If the information needed to compute such value is missing, it returns 0.
Usage : ($b, $e) = $abif->clear_range(); ($b, $e) = $abif->clear_range( $window_width, $bad_bases_threshold, $quality_threshold ); Returns : The clear range of the sequence; (-1, -1) if there is no clear range.
The Sequencing Analysis program determines the clear range of the sequence by trimming bases from the 5' to 3' ends until fewer than 4 bases out of 20 have a quality value less than 20. You can change these parameters by explicitly passing arguments to this method (the default values are $window_width = 20, $bad_bases_threshold = 4, $quality_threshold = 20). Note that Sequencing Analysis counts the bases starting from one, so you have to add one to the return values to get consistent results.
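The trimming rule described above can be sketched in plain Perl as follows. This is only an illustration that works on a list of quality values. It assumes that a window is acceptable when it contains fewer than $bad_bases_threshold bases with quality below $quality_threshold, and that trimming stops at the first acceptable window from each end. clear_range_sketch() is a hypothetical helper, not a method of this module, and the module's own implementation may differ in details.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Trace::ABIF.
# Returns zero-based (start, stop), or (-1, -1) if no window is acceptable.
sub clear_range_sketch {
    my ($qv, $window_width, $bad_bases_threshold, $quality_threshold) = @_;
    $window_width        //= 20;
    $bad_bases_threshold //= 4;
    $quality_threshold   //= 20;

    my $n = scalar @$qv;
    return (-1, -1) if $window_width < 1 || $n < $window_width;

    # $bad[$i] = number of low-quality bases in the window starting at $i.
    my @bad;
    my $count = 0;
    for my $i (0 .. $n - 1) {
        $count++ if $qv->[$i] < $quality_threshold;
        $count-- if $i >= $window_width
                 && $qv->[$i - $window_width] < $quality_threshold;
        $bad[$i - $window_width + 1] = $count if $i >= $window_width - 1;
    }

    my ($start, $stop) = (-1, -1);
    # Trim from the 5' end: the first acceptable window gives the start.
    for my $i (0 .. $#bad) {
        if ($bad[$i] < $bad_bases_threshold) { $start = $i; last; }
    }
    return (-1, -1) if $start < 0;
    # Trim from the 3' end: the last acceptable window gives the stop.
    for my $i (reverse 0 .. $#bad) {
        if ($bad[$i] < $bad_bases_threshold) {
            $stop = $i + $window_width - 1;
            last;
        }
    }
    return ($start, $stop);
}

# Made-up quality values: poor ends, good middle.
my @qv = ((5) x 10, (30) x 40, (5) x 10);
my ($begin, $end) = clear_range_sketch(\@qv);
# Add one to get the one-based positions used by Sequencing Analysis.
printf "0-based: %d..%d, 1-based: %d..%d\n",
    $begin, $end, $begin + 1, $end + 1;
```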
Usage : $b = $abif->clear_range_start(); $b = $abif->clear_range_start( $window_width, $bad_bases_threshold, $quality_threshold ); Returns : The clear range start position; -1 if no clear range exists.
See clear_range().
Usage : $e = $abif->clear_range_stop(); $e = $abif->clear_range_stop( $window_width, $bad_bases_threshold, $quality_threshold ); Returns : The clear range stop position; -1 if no clear range exists.
Usage : ($b, $e) = $abif->contiguous_read_length( $window_width, $quality_threshold ); ($b, $e) = $abif->contiguous_read_length( $window_width, $quality_threshold, $trim_ends ); Returns : The start and stop position of the CRL; (-1, -1) if there is no CRL.
The CRL is (the length of) the longest uninterrupted stretch in a read such that the average quality of any interval of $window_width bases inside that stretch never goes below $quality_threshold. The quality threshold must be at least 10. The positions are counted from zero. If $trim_ends is true, the ends of the CRL are trimmed until there are no bases with quality values less than 10 within the first five and the last five bases. Trimming is not applied by default. If there is more than one CRL, the position of the first one is reported.
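The definition of the CRL can be illustrated with a short sliding-window sketch. It works on a plain list of quality values, reports the first stretch in case of ties, and does not implement the optional trimming of the ends. crl_sketch() is a hypothetical helper, not a method of this module, and it is one possible reading of the definition rather than the module's actual code.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Trace::ABIF.
# Returns the zero-based (start, stop) of the longest stretch in which every
# window of $w consecutive bases has average quality >= $threshold,
# or (-1, -1) if there is no such stretch.
sub crl_sketch {
    my ($qv, $w, $threshold) = @_;
    my $n = scalar @$qv;
    return (-1, -1) if $w < 1 || $n < $w;

    my ($best_start, $best_stop) = (-1, -1);
    my $run_start = -1;    # start of the current run of good windows
    my $sum       = 0;     # sum of the window ending at position $i

    for my $i (0 .. $n - 1) {
        $sum += $qv->[$i];
        $sum -= $qv->[$i - $w] if $i >= $w;
        next if $i < $w - 1;            # first window not complete yet

        if ($sum >= $threshold * $w) {  # average >= threshold
            $run_start = $i - $w + 1 if $run_start < 0;
            # Strict comparison keeps the first stretch in case of ties.
            if ($best_start < 0
                || $i - $run_start > $best_stop - $best_start) {
                ($best_start, $best_stop) = ($run_start, $i);
            }
        }
        else {
            $run_start = -1;            # the stretch is interrupted
        }
    }
    return ($best_start, $best_stop);
}

# Made-up quality values with a dip in the middle.
my @qv = ((30) x 5, (5) x 3, (30) x 8);
my ($begin, $end) = crl_sketch(\@qv, 3, 20);
print "CRL: $begin..$end (length ", $end - $begin + 1, ")\n";
```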
Usage : $LOR = $abif->length_of_read( $window_width, $quality_threshold ); $LOR = $abif->length_of_read( $window_width, $quality_threshold, $method ); Returns : The Length Of Read (LOR) value.
The Length Of Read (LOR) score gives an approximate measure of the usable range of high-quality or high-accuracy bases determined by quality values. Such range can be determined in several ways. Two possible procedures are currently implemented and described below.
If $method is the string 'SequencingAnalysis' then the LOR is computed as the widest range starting and ending with $window_width bases whose average quality is greater than or equal to $quality_threshold. This is the default method that is applied if this optional argument is omitted.
If $method is the string 'GoodQualityWindows' then the LOR is computed as the number of intervals of $window_width bases whose average quality is greater than or equal to $quality_threshold.
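The two procedures can be compared side by side with the sketch below. It makes two assumptions that the text above does not settle: the windows counted by 'GoodQualityWindows' are overlapping windows that slide one base at a time, and the result is 0 when no window reaches the threshold. lor_sketch() is a hypothetical helper, not a method of this module.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Trace::ABIF.
sub lor_sketch {
    my ($qv, $w, $threshold, $method) = @_;
    $method //= 'SequencingAnalysis';
    my $n = scalar @$qv;
    return 0 if $w < 1 || $n < $w;

    # Collect the start positions of all windows whose average quality
    # is greater than or equal to the threshold.
    my @good;
    my $sum = 0;
    for my $i (0 .. $n - 1) {
        $sum += $qv->[$i];
        $sum -= $qv->[$i - $w] if $i >= $w;
        push @good, $i - $w + 1
            if $i >= $w - 1 && $sum >= $threshold * $w;
    }
    return 0 unless @good;    # assumption: no good window means LOR = 0

    # Assumption: overlapping windows are counted.
    return scalar @good if $method eq 'GoodQualityWindows';

    # Widest range starting and ending with a good window.
    return $good[-1] + $w - $good[0];
}

# Made-up quality values with a dip in the middle.
my @qv = ((30) x 5, (5) x 3, (30) x 8);
print "SequencingAnalysis: ", lor_sketch(\@qv, 3, 20), "\n";
print "GoodQualityWindows: ",
    lor_sketch(\@qv, 3, 20, 'GoodQualityWindows'), "\n";
```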
Usage : $n = $abif->num_low_quality_bases($threshold); $n = $abif->num_low_quality_bases( $threshold, $start, $stop ); Returns : The number of low quality bases; -1 on error.
Returns the number of bases in the range [$start,$stop], or in the whole sequence if no range is specified, with quality value less than or equal to $threshold. Returns -1 if the information needed to compute this value (i.e., the quality values) is missing from the file.
Usage : $n = $abif->num_high_quality_bases($threshold); $n = $abif->num_high_quality_bases( $threshold, $start, $stop ); Returns : The number of high quality bases; -1 on error.
Returns the number of bases in the range [$start,$stop], or in the whole sequence if no range is specified, with quality value greater than or equal to $threshold. Returns -1 if the information needed to compute this value (i.e., the quality values) is missing from the file.
Usage : $n = $abif->num_medium_quality_bases( $min_qv, $max_qv ); $n = $abif->num_medium_quality_bases( $min_qv, $max_qv, $start, $stop ); Returns : The number of medium quality bases; -1 on error.
Returns the number of bases in the range [$start,$stop], or in the whole sequence if no range is specified, whose quality value is in the (closed) range [$min_qv,$max_qv]. Returns -1 if the information needed to compute this value (i.e., the quality values) is missing from the file.
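The three counting methods differ only in the bounds they apply to the quality values, as the following sketch on a plain list of quality values shows. count_bases_sketch() is a hypothetical helper, not a method of this module. It treats an undefined bound as "no bound" and returns -1 for an empty list, mirroring the error value described above.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Trace::ABIF.
# Counts the bases in the closed range [$start, $stop] whose quality value
# lies in the closed range [$min_qv, $max_qv]. An undefined bound is ignored.
sub count_bases_sketch {
    my ($qv, $min_qv, $max_qv, $start, $stop) = @_;
    return -1 unless @$qv;    # no quality values available
    $start //= 0;
    $stop  //= $#$qv;
    return scalar grep {
        (!defined $min_qv || $_ >= $min_qv)
            && (!defined $max_qv || $_ <= $max_qv)
    } @{$qv}[$start .. $stop];
}

# Made-up quality values.
my @qv = (5, 10, 20, 30, 40, 50);

my $low    = count_bases_sketch(\@qv, undef, 20);        # QV <= 20
my $high   = count_bases_sketch(\@qv, 20,    undef);     # QV >= 20
my $medium = count_bases_sketch(\@qv, 10,    30);        # 10 <= QV <= 30
my $low_13 = count_bases_sketch(\@qv, undef, 20, 1, 3);  # positions 1..3

print "low=$low high=$high medium=$medium low[1,3]=$low_13\n";
```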
Usage : $ss = $abif->sample_score(); $ss = $abif->sample_score( $window_width, $bad_bases_threshold, $quality_threshold ); Returns : The sample score of the sequence.
The sample score is the average quality value of the bases in the clear range of the sequence (see clear_range()). The method returns 0 if the information needed to compute such value is missing or if the clear range is empty.
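In other words, the sample score is a plain arithmetic mean restricted to the clear range. A minimal sketch, assuming the clear range is already known as a pair of zero-based positions (sample_score_sketch() is a hypothetical helper, not a method of this module):

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Trace::ABIF.
# Averages the quality values in the closed range [$begin, $end];
# returns 0 if there are no quality values or the range is empty.
sub sample_score_sketch {
    my ($qv, $begin, $end) = @_;
    return 0 if !@$qv || $begin < 0 || $end < $begin || $end > $#$qv;
    my $sum = 0;
    $sum += $_ for @{$qv}[$begin .. $end];
    return $sum / ($end - $begin + 1);
}

# Made-up quality values and clear range.
my @qv = (10, 20, 30, 40);
print "sample score: ", sample_score_sketch(\@qv, 1, 2), "\n";
```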
Nicola Vitacolonna, <vitacolonna at appliedgenomics.org>
Please report any bugs or feature requests to bug-bio-trace-abif at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Bio-Trace-ABIF. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
You can find documentation for this module with the perldoc command.
perldoc Bio::Trace::ABIF
You can also look for information at:
AnnoCPAN: Annotated CPAN documentation
http://annocpan.org/dist/Bio-Trace-ABIF
CPAN Ratings
http://cpanratings.perl.org/d/Bio-Trace-ABIF
RT: CPAN's request tracker
http://rt.cpan.org/NoAuth/Bugs.html?Dist=Bio-Trace-ABIF
Search CPAN
http://search.cpan.org/dist/Bio-Trace-ABIF
See http://www.appliedbiosystems.com/support/ for the ABIF format file specification sheet.
There is an ABI module on CPAN (http://search.cpan.org/~malay/).
bioperl-ext also parses ABIF files and other trace formats.
You are welcome at http://www.appliedgenomics.org!
Thanks to Simone Scalabrin for many helpful suggestions and for the first implementation of the length_of_read() method the way Sequencing Analysis does it (and for rating this module five stars)! Thanks to Fabrizio Levorin and other people reporting bugs!
Some explanation about how Sequencing Analysis computes some parameters has been found at http://keck.med.yale.edu/dnaseq/.
Copyright 2006-2010 Nicola Vitacolonna, all rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
Feel free to rate this module on CPAN!
This software is provided "as is" without warranty of any kind.
To install Bio::Trace::ABIF, copy and paste the appropriate command into your terminal.
cpanm
cpanm Bio::Trace::ABIF
CPAN shell
perl -MCPAN -e shell
install Bio::Trace::ABIF
For more information on module installation, please visit the detailed CPAN module installation guide.