MS::Reader::MzML - A simple but complete mzML parser
    use MS::Reader::MzML;

    my $run = MS::Reader::MzML->new('run.mzML');

    while (my $spectrum = $run->next_spectrum) {

        # only want MS1
        next if ($spectrum->ms_level > 1);

        my $rt = $spectrum->rt;
        # see MS::Reader::MzML::Spectrum and MS::Spectrum for all available
        # methods

    }

    my $spectrum = $run->fetch_spectrum(0);   # first spectrum
    my $idx      = $run->find_by_time(1500);  # index of spectrum at RT >= 1500 seconds
MS::Reader::MzML is a parser for the HUPO PSI standard mzML format for raw mass spectrometry data. It aims to provide complete access to the data contents while not being overburdened by detailed class infrastructure. Convenience methods are provided for accessing commonly used data. Users who want to extract data not accessible through the available methods should examine the data structure of the parsed object. The dump() method of MS::Reader::XML, from which this class inherits, provides an easy method of doing so.
MS::Reader::MzML is a subclass of MS::Reader::XML, which in turn inherits from MS::Reader, and it inherits the methods of both parent classes. Please see the documentation for those classes for details of available methods not described below.
    my $run = MS::Reader::MzML->new( $fn,
        use_cache => 0,
        paranoid  => 0,
    );
Takes an input filename (required) and optional argument hash and returns an MS::Reader::MzML object. This constructor is inherited directly from MS::Reader. Available options include:
use_cache — cache fetched records in memory for repeat access (default: FALSE)
paranoid — when loading index from disk, recalculates MD5 checksum each time to make sure raw file hasn't changed. This adds (typically) a few seconds to load times. By default, only file size and mtime are checked.
    while (my $s = $run->next_spectrum) {
        # do something
    }
Returns an MS::Reader::MzML::Spectrum object representing the next spectrum in the file, or undef if the end of records has been reached. Typically used to iterate over each spectrum in the run.
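As an illustration of this iteration pattern, the following sketch collects the retention times of all MS1 spectra in a run. The helper name ms1_retention_times is hypothetical and not part of the module; it relies only on the next_spectrum, ms_level and rt methods shown in this documentation, and assumes the iterator is positioned at the first spectrum.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of MS::Reader::MzML): walk the run with
# next_spectrum() and collect the retention time of every MS1 spectrum.
sub ms1_retention_times {
    my ($run) = @_;
    my @rts;
    while (my $spectrum = $run->next_spectrum) {
        next if $spectrum->ms_level > 1;    # skip MS2 and higher
        push @rts, $spectrum->rt;
    }
    return \@rts;
}
```

Given a reader object created with new(), calling ms1_retention_times($run) would return an array reference with one retention time per MS1 spectrum.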
my $s = $run->fetch_spectrum($idx);
Takes a single argument (zero-based spectrum index) and returns an MS::Reader::MzML::Spectrum object representing the spectrum at that index. Throws an exception if the index is out of range.
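Because an out-of-range index raises an exception, callers that cannot guarantee a valid index may want to trap it. The sketch below shows one way to do so with eval; the helper name fetch_or_undef is hypothetical and not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of MS::Reader::MzML): return the spectrum
# at $idx, or undef if fetch_spectrum() throws (for example, when $idx is
# out of range). The error message is reported as a warning.
sub fetch_or_undef {
    my ($run, $idx) = @_;
    my $spectrum = eval { $run->fetch_spectrum($idx) };
    if ($@) {
        warn "fetch_spectrum($idx) failed: $@";
        return undef;
    }
    return $spectrum;
}
```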
$run->goto_spectrum($idx);
Takes a single argument (zero-based spectrum index) and sets the spectrum record iterator to that index (for subsequent calls to next_spectrum).
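A short sketch of how repositioning and iteration might be combined to read a contiguous block of spectra. The helper name spectra_from is hypothetical, and the sketch assumes that the first next_spectrum call after goto_spectrum($idx) returns the spectrum at $idx.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of MS::Reader::MzML): return up to $count
# spectra starting at the zero-based index $start, using goto_spectrum()
# to position the iterator and next_spectrum() to read from there.
sub spectra_from {
    my ($run, $start, $count) = @_;
    $run->goto_spectrum($start);
    my @spectra;
    while (@spectra < $count) {
        my $spectrum = $run->next_spectrum;
        last if !defined $spectrum;    # end of records reached
        push @spectra, $spectrum;
    }
    return @spectra;
}
```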
$run->curr_spectrum_index
Returns the 0-based index of the current spectrum pointer.
my $idx = $run->spectrum_index_by_id($id);
Takes a single argument (spectrum ID) and returns the index of the matching spectrum (generally for input into other methods).
my $idx = $run->find_by_time($rt);
Takes a single argument (retention time in SECONDS) and returns the index of the nearest spectrum with retention time equal to or greater than that given. Throws an exception if the given retention time is out of range.
NOTE: The first time this method is called, the spectral indices are sorted by retention time for subsequent access. This can be a bit slow. The retention time index is saved and subsequent calls should be relatively quick. This is done because the mzML specification doesn't guarantee that the spectra are ordered by RT (even though they invariably are).
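The lookup described above can be sketched as a sort by retention time followed by a binary search. This is an illustration under stated assumptions, not the module's actual implementation: the function name and the [index, rt] input layout are hypothetical, the sketch re-sorts on every call rather than caching the sorted index, and it treats only times beyond the last spectrum as out of range.

```perl
use strict;
use warnings;

# Illustrative only: given an array ref of [spectrum_index, rt] pairs,
# sort by retention time and return the spectrum index of the first
# entry whose rt is greater than or equal to $target.
sub first_index_at_or_after {
    my ($pairs, $target) = @_;
    my @sorted = sort { $a->[1] <=> $b->[1] } @$pairs;
    die "retention time out of range\n"
        if !@sorted || $target > $sorted[-1][1];

    # Binary search for the leftmost entry with rt >= $target
    my ($lo, $hi) = (0, $#sorted);
    while ($lo < $hi) {
        my $mid = int( ($lo + $hi) / 2 );
        if   ($sorted[$mid][1] < $target) { $lo = $mid + 1 }
        else                              { $hi = $mid }
    }
    return $sorted[$lo][0];
}
```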
my $n = $run->n_spectra;
Returns the number of spectra present in the file.
    my $tic = $run->get_tic;
    my $tic = $run->get_tic($force);
Returns an MS::Reader::MzML::Chromatogram object containing the total ion current chromatogram for the run. By default, first searches the chromatogram list to see if a TIC is already defined, and returns it if so. Otherwise, walks the MS1 spectra and calculates the TIC. Takes a single optional boolean argument which, if true, forces recalculation of the TIC even if one exists in the file.
    my $bpc = $run->get_bpc;
    my $bpc = $run->get_bpc($force);
Returns an MS::Reader::MzML::Chromatogram object containing the base peak chromatogram for the run. By default, first searches the chromatogram list to see if a BPC is already defined, and returns it if so. Otherwise, walks the MS1 spectra and calculates the BPC. Takes a single optional boolean argument which, if true, forces recalculation of the BPC even if one exists in the file.
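When no TIC or BPC is stored in the file, the values are calculated from the MS1 spectra. The sketch below illustrates the arithmetic involved: the TIC value at each time point is the sum of peak intensities in that spectrum, and the BPC value is the maximum. The function name and the hash keys (ms_level, rt, int) are hypothetical and do not reflect the module's internal data structures.

```perl
use strict;
use warnings;
use List::Util qw(sum max);

# Illustrative only: each spectrum is a hash ref with hypothetical keys
# ms_level, rt, and int (an array ref of peak intensities).
sub tic_and_bpc {
    my (@spectra) = @_;
    my (@rt, @tic, @bpc);
    for my $s (@spectra) {
        next if $s->{ms_level} != 1;        # MS1 spectra only
        my @int = @{ $s->{int} };
        push @rt,  $s->{rt};
        push @tic, sum(0, @int);            # total ion current
        push @bpc, @int ? max(@int) : 0;    # base peak intensity (0 if empty)
    }
    return { rt => \@rt, tic => \@tic, bpc => \@bpc };
}
```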
my $xic = $run->get_xic(%args);
Returns an MS::Reader::MzML::Chromatogram object containing an extracted ion chromatogram for the run. Required arguments include:
mz — The m/z value to extract (REQUIRED)
err_ppm — The allowable m/z error tolerance (in PPM)
Optional arguments include:
rt — The center of the retention time window, in seconds
rt_win — The window scanned on either side of rt, in seconds
charge — Expected charge of the target species at mz
iso_steps — The number of isotopic shifts to consider
If rt and rt_win are not given, the full range of the run will be used. If charge and iso_steps are given, peaks falling within the expected isotopic envelope (up to iso_steps shifts in either direction) are included; otherwise the isotopic envelope is not considered.
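To make the role of err_ppm, charge and iso_steps concrete, the following sketch computes the m/z windows that such an extraction might scan. It is an illustration under stated assumptions, not the module's implementation: the function name xic_windows is hypothetical, the isotopic spacing is assumed to be the 13C - 12C mass difference divided by the charge, and the PPM tolerance is applied relative to each window center.

```perl
use strict;
use warnings;

# Assumed spacing between isotopic peaks (13C - 12C mass difference, in Da);
# the module may use a different constant.
use constant ISO_SPACING => 1.0033548;

# Illustrative only: return a list of [lower, upper] m/z bounds, one for
# the target m/z and, if charge and iso_steps are both given, one for each
# isotopic shift in either direction. Windows are ordered by m/z.
sub xic_windows {
    my (%args) = @_;
    my ($mz, $ppm) = @args{qw(mz err_ppm)};
    my @centers = ($mz);
    if ($args{charge} && $args{iso_steps}) {
        for my $k (1 .. $args{iso_steps}) {
            my $shift = $k * ISO_SPACING / $args{charge};
            push @centers, $mz - $shift, $mz + $shift;
        }
    }
    return map {
        my $tol = $_ * $ppm / 1e6;    # PPM tolerance in m/z units
        [ $_ - $tol, $_ + $tol ];
    } sort { $a <=> $b } @centers;
}
```

For example, a target of m/z 500 with err_ppm of 10 gives a single window spanning roughly 499.995 to 500.005; adding charge => 2 and iso_steps => 1 adds two further windows centered about 0.5 m/z units below and above the target.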
Returns the ID of the run as specified in the <mzML> element.
The API is in alpha stage and is not guaranteed to be stable.
Please report bugs or feature requests through the issue tracker at https://github.com/jvolkening/p5-MS/issues.
InSilicoSpectro
MzML::Parser
Jeremy Volkening <jdv@base2bio.com>
Copyright 2015-2016 Jeremy Volkening
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.