Bio::Kmer
A module for helping with kmer analysis.
    use strict;
    use warnings;
    use Bio::Kmer;

    my $kmer=Bio::Kmer->new("file.fastq.gz",{kmercounter=>"jellyfish",numcpus=>4});
    my $kmerHash=$kmer->kmers();
    my $countOfCounts=$kmer->histogram();
The BioPerl way
    use strict;
    use warnings;
    use Bio::SeqIO;
    use Bio::Kmer;

    # Load up any Bio::SeqIO object. Quality values will be
    # faked internally to help with compatibility even if
    # a fastq file is given.
    my $seqin = Bio::SeqIO->new(-file=>"input.fasta");
    my $kmer=Bio::Kmer->new($seqin);
    my $kmerHash=$kmer->kmers();
    my $countOfCounts=$kmer->histogram();
A module for helping with kmer analysis. The basic methods count kmers and can produce a count of counts. Currently this module only supports fastq format. Although this module can count kmers with pure perl, it is recommended to select a different kmer counter such as Jellyfish with the kmercounter option.
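To make the idea of kmer counting concrete, here is a minimal pure-perl sketch. It is not the module's implementation: count_kmers is a hypothetical helper, it takes plain sequence strings rather than a fastq file, and it ignores quality values and reverse complements.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Kmer: count every kmer of
# length $k across a list of sequence strings using a plain hash.
sub count_kmers {
    my ($seqs, $k) = @_;
    my %count;
    for my $seq (@$seqs) {
        my $s    = uc $seq;            # work on an uppercased copy
        my $last = length($s) - $k;    # last valid start position
        for my $i (0 .. $last) {       # empty range if the read is shorter than $k
            $count{ substr($s, $i, $k) }++;
        }
    }
    return \%count;
}

my $kmers = count_kmers(["ACGTACGT", "ACGTTT"], 4);
printf "%s\t%d\n", $_, $kmers->{$_} for sort keys %$kmers;
```

The result has the same shape as the hash described for the kmers method: each key is a kmer and each value is its count.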
Create a new instance of the kmer counter. One object per file.
Applicable arguments:

    Argument     Default  Description
    kmercounter  perl     What kmer counter software to use.
                          Choices: Perl, Jellyfish.
    kmerlength   21       Kmer length
    numcpus      1        This module uses perl multithreading with
                          pure perl or can supply this option to
                          other software like jellyfish.
    gt           1        If the count of kmers is fewer than this,
                          ignore the kmer. This might help speed
                          analysis if you do not care about
                          low-count kmers.
    sample       1        Retain only a percentage of kmers.
                          1 is 100%; 0 is 0%.
                          Only works with the perl kmer counter.

Examples:

    my $kmer=Bio::Kmer->new("file.fastq.gz",{kmercounter=>"jellyfish",numcpus=>4});
Query the set of kmers with your own query
Arguments: query (string)
Returns: Count of kmers. 0 indicates that the kmer was not found. -1 indicates an invalid kmer (e.g., invalid length)
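The three return values can be illustrated with a small sketch. query_kmer is a hypothetical stand-in written for this example, not the module's code, and it assumes that a length mismatch is the only way a query can be invalid.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a kmer lookup.
# $kmers maps kmer => count; $k is the kmer length used for counting.
sub query_kmer {
    my ($kmers, $k, $query) = @_;
    return -1 if !defined($query) || length($query) != $k;  # invalid kmer
    return $kmers->{$query} // 0;                            # 0 means not found
}

my %kmers   = (ACGT => 3, CGTA => 1);
my $found   = query_kmer(\%kmers, 4, "ACGT");   # kmer present
my $missing = query_kmer(\%kmers, 4, "TTTT");   # valid length, never seen
my $invalid = query_kmer(\%kmers, 4, "ACG");    # wrong length
print join(",", $found, $missing, $invalid), "\n";
```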
Count the frequency of kmers.
Arguments: none
Returns: Reference to an array of counts. The index of the array is the frequency.
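As a sketch of what a count of counts looks like, the following builds one from a kmer hash. count_of_counts is a hypothetical helper, and filling unused indexes with 0 is an assumption made for readability rather than documented behavior.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a kmer => count hash into a histogram
# where index N holds the number of distinct kmers seen N times.
sub count_of_counts {
    my ($kmers) = @_;
    my @hist;
    $hist[$_]++ for values %$kmers;
    @hist = map { $_ // 0 } @hist;   # replace unused slots with 0
    return \@hist;
}

my $hist = count_of_counts({ ACGT => 3, CGTA => 1, GTAC => 1, TACG => 1 });
print "$_\t$hist->[$_]\n" for 0 .. $#$hist;
```

Here three kmers occur once and one kmer occurs three times, so index 1 holds 3 and index 3 holds 1.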
Return actual kmers
Arguments: None
Returns: Reference to a hash of kmers and their counts
Finds the union between two sets of kmers
Arguments: Another Bio::Kmer object
Returns: List of kmers
Finds the intersection between two sets of kmers
Finds the set of kmers unique to this Bio::Kmer object.
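The three set operations above can be sketched on plain kmer hashes. The helper names below are hypothetical and operate on hash references rather than Bio::Kmer objects. Returning sorted array references is an assumption made so the output is predictable.

```perl
use strict;
use warnings;

# Hypothetical helpers operating on kmer => count hash references.

# Kmers present in either set.
sub kmer_union {
    my ($x, $y) = @_;
    my %seen = map { $_ => 1 } (keys %$x, keys %$y);
    return [ sort keys %seen ];
}

# Kmers present in both sets.
sub kmer_intersection {
    my ($x, $y) = @_;
    return [ sort grep { exists $y->{$_} } keys %$x ];
}

# Kmers present in the first set but not in the second.
sub kmer_subtract {
    my ($x, $y) = @_;
    return [ sort grep { !exists $y->{$_} } keys %$x ];
}

my %first  = (ACGT => 3, CGTA => 1, GTAC => 1);
my %second = (ACGT => 2, TTTT => 5);

my $union        = kmer_union(\%first, \%second);
my $intersection = kmer_intersection(\%first, \%second);
my $unique       = kmer_subtract(\%first, \%second);
print "union: @$union\nintersection: @$intersection\nunique: @$unique\n";
```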
Cleans the temporary directory and removes this object from RAM. Good for when you might be counting kmers for many things but want to keep your overhead low.
Arguments: None
Returns: 1
MIT license. Go nuts.
Author: Lee Katz <lkatz@cdc.gov>
For additional help, go to https://github.com/lskatz/Bio--Kmer
CPAN module at http://search.cpan.org/~lskatz/Bio-Kmer/