NAME

fastQ_brew - a module for preprocessing of fastQ formatted files

SYNOPSIS

  use fastQ_brew;
  use List::Util qw(min max sum);
  use fastQ_brew_Utilities;
  use Cwd;
  
  my $lib       = "sanger";
  my $file_path = cwd();
  my $in_file   = "sample_sanger.fastq";

  my $tmp = fastQ_brew->new();

  $tmp->load_fastQ_brew(
                    library_type  => $lib || "illumina",
                    file_path     => $file_path,
                    in_file       => $in_file,
                    summary       => "Y",
                    de_duplex     => "Y",
                    qual_filter   => 30,
                    length_filter => 25,
                    adapter_left  => "GTACGTGTGGTGGGGAT",
                    mismatches_l  => 1,
                    adapter_right => "TAGCGCGCGATGATT",
                    mismatches_r  => 1,
                    left_trim     => 5,
                    right_trim    => 8,
                    fasta_convert => "Y",
                    dna_rna       => "Y",
                    rev_comp      => "Y",
                    remove_n      => "Y",
                    cleanup       => "Y"
  );

  $tmp->run_fastQ_brew();

DESCRIPTION

Returns summary statistics for all reads in fastQ formatted files and provides methods for filtering and trimming reads by length and quality.
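
As background for the options described below, here is a minimal sketch of reading four-line fastQ records in plain Perl. It is not taken from the module's source; the helper name parse_fastq is hypothetical, and the sketch assumes unwrapped records (one sequence line and one quality line per read).

```perl
use strict;
use warnings;

# Hypothetical helper, not part of fastQ_brew: split fastQ text into records.
# Assumes each record is exactly four lines: @header, sequence, +, quality.
sub parse_fastq {
    my ($text) = @_;
    my @lines = split /\n/, $text;
    my @reads;
    while ( @lines >= 4 ) {
        my ( $header, $seq, $plus, $qual ) = splice @lines, 0, 4;
        # Skip records whose marker lines or lengths are inconsistent
        next unless $header =~ /^\@/ && $plus =~ /^\+/;
        next unless length($seq) == length($qual);
        push @reads, { id => substr( $header, 1 ), seq => $seq, qual => $qual };
    }
    return @reads;
}

my @reads = parse_fastq("\@read1\nACGT\n+\nIIII\n\@read2\nGGCC\n+\n!!!!\n");
printf "%s\t%s\t%s\n", @{$_}{qw(id seq qual)} for @reads;
```

Each record is held as a hash reference with id, seq and qual keys, a layout chosen only for illustration.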

FEEDBACK

damienoh@gwu.edu

Mailing Lists

User feedback is an integral part of the evolution of this module. Send your comments and suggestions preferably to one of the mailing lists. Your participation is much appreciated.

Support

Please direct usage questions or support issues to <damienoh@gwu.edu>. Please include a thorough description of the problem, with code and data examples if at all possible.

Reporting Bugs

Report bugs to the GitHub bug tracking system to help keep track of the bugs and their resolution. Bug reports can be submitted via the GitHub page:

 https://github.com/dohalloran/fastQ_brew/issues

AUTHORS - Damien O'Halloran

Email: damienoh@gwu.edu

APPENDIX

The rest of the documentation details each of the object methods.

new()

 Title   : new()
 Usage   : my $tmp = fastQ_brew->new();
 Function: constructor routine
 Returns : a blessed object
 Args    : none

load_fastQ_brew()

 Title   : load_fastQ_brew()
 Usage   : $tmp->load_fastQ_brew(
                    library_type  => $lib || "illumina",
                    file_path     => $file_path,
                    in_file       => $in_file,
                    summary       => "Y",
                    de_duplex     => "Y",
                    qual_filter   => 30,
                    length_filter => 25,
                    adapter_left  => "GTACGTGTGGTGGGGAT",
                    mismatches_l  => 1,
                    adapter_right => "TAGCGCGCGATGATT",
                    mismatches_r  => 1,
                    left_trim     => 5,
                    right_trim    => 8,
                    fasta_convert => "Y",
                    dna_rna       => "Y",
                    rev_comp      => "Y",
                    remove_n      => "Y",
                    cleanup       => "Y"
              );
 Function: Populates the user data into $self hash
 Returns : nothing returned
 Args    :
 -library_type, either sanger or illumina
 -file_path, path to sequences
 -in_file, the name of the file containing the fastQ reads
 -summary, return summary statistics for the unfiltered and filtered fastq data
 -de_duplex, remove duplicate entries
 -qual_filter, filter reads by Q score: N=no, 200=remove reads with Quality (Q) scores below 200
 -adapter_left, remove adapter from left side
 -mismatches_l, remove adapter from left side that include a number of mismatches
 -adapter_right, remove adapter from right side
 -mismatches_r, remove adapter from right side that include a number of mismatches
 -left_trim, remove x number of bases from left end
 -right_trim, remove x number of bases from right end
 -length_filter, filter reads by length: N=no, 40=remove reads shorter than 40 bases
 -fasta_convert, option to convert to fastA file: Y=yes, N=no
 -dna_rna, transcribe reads in fastQ file: N=no, Y=yes
 -rev_comp, reverse complement reads in fastQ file: N=no, Y=yes
 -remove_n, remove reads with non-designated bases (i.e. N's) in fastQ file: N=no, Y=yes
 -cleanup, option to delete tmp file: Y=yes, N=no

run_fastQ_brew()

 Title   : run_fastQ_brew()
 Usage   : $self->run_fastQ_brew(%arg)
 Function: processes the input file and starts the cycle
 Returns : tmp file with only phred score and sequence for each read
 Args    : fastQ file

_summary_stats()

 Title   : _summary_stats()
 Usage   : _summary_stats();
 Function: runs the summary stats
 Returns : the stats
 Args    : $self, %arg
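
The kind of statistics reported can be sketched with List::Util, which the SYNOPSIS already imports. This is only an illustration under assumptions: the helper name length_stats and the particular fields it returns are hypothetical, and the module's own report may differ.

```perl
use strict;
use warnings;
use List::Util qw(min max sum);

# Hypothetical helper: summarise read lengths and GC content.
sub length_stats {
    my (@seqs) = @_;
    return unless @seqs;
    my @lengths = map { length $_ } @seqs;
    my $total   = sum(@lengths);
    my $gc      = 0;
    # tr/// with an empty replacement list counts without modifying
    $gc += ( $_ =~ tr/GCgc// ) for @seqs;
    return {
        reads    => scalar @seqs,
        min_len  => min(@lengths),
        max_len  => max(@lengths),
        mean_len => $total / @lengths,
        gc_pct   => $total ? 100 * $gc / $total : 0,
    };
}

my $stats = length_stats(qw(ACGT GG ATATAT));
printf "%s: %s\n", $_, $stats->{$_} for sort keys %$stats;
```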

_de_duplex()

 Title   : _de_duplex
 Usage   : _de_duplex();
 Function: remove duplicate reads
 Returns : fastQ file with only singletons
 Args    : Y=yes, N=no
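
Duplicate removal can be sketched with a hash of sequences already seen. The sketch assumes that reads count as duplicates when their sequences are identical and that the first copy is kept; whether the module uses exactly this rule is not stated here, and dedupe_reads is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: keep the first read for each distinct sequence.
sub dedupe_reads {
    my (@reads) = @_;
    my %seen;
    return grep { !$seen{ $_->{seq} }++ } @reads;
}

my @unique = dedupe_reads(
    { id => 'r1', seq => 'ACGT' },
    { id => 'r2', seq => 'ACGT' },
    { id => 'r3', seq => 'GGCC' },
);
print join( ',', map { $_->{id} } @unique ), "\n";
```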

_remove_n()

 Title   : _remove_n()
 Usage   : $self->_remove_n(%arg)
 Function: option to remove reads with N's
 Returns : fastQ file
 Args    : Y=yes, N=no
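
As an illustration of this filter, the sketch below drops any read whose sequence contains an N. The case-insensitive match and the name remove_n_reads are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: discard reads containing ambiguous bases (N or n).
sub remove_n_reads {
    my (@reads) = @_;
    return grep { $_->{seq} !~ /N/i } @reads;
}

my @clean = remove_n_reads(
    { id => 'r1', seq => 'ACGT' },
    { id => 'r2', seq => 'ACNT' },
);
print scalar(@clean), " read(s) kept\n";
```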

_remove_adapter_left()

 Title   : _remove_adapter_left
 Usage   : _remove_adapter_left();
 Function: option to remove specific adapters from left side 
 Returns : fastQ file
 Args    : string="GTCGAGT" and mismatches=integer
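
One way to picture left-adapter removal with a mismatch allowance is to compare the start of the read with the adapter position by position. The sketch assumes that mismatches are plain substitutions (no insertions or deletions) and that the adapter sits at the very start of the read; the names mismatches and trim_left_adapter are hypothetical.

```perl
use strict;
use warnings;

# Count positions at which two equal-length strings differ.
sub mismatches {
    my ( $x, $y ) = @_;
    my $n = 0;
    for my $i ( 0 .. length($x) - 1 ) {
        $n++ if substr( $x, $i, 1 ) ne substr( $y, $i, 1 );
    }
    return $n;
}

# Hypothetical helper: strip the adapter from the 5' end when the read
# prefix matches it with at most $max_mm substitutions.
sub trim_left_adapter {
    my ( $seq, $qual, $adapter, $max_mm ) = @_;
    my $len = length $adapter;
    return ( $seq, $qual ) if $len == 0 || length($seq) < $len;
    my $prefix = substr( $seq, 0, $len );
    return ( $seq, $qual )
      if mismatches( uc($prefix), uc($adapter) ) > $max_mm;
    # Trim sequence and quality string together so they stay aligned
    return ( substr( $seq, $len ), substr( $qual, $len ) );
}

my ( $s, $q ) = trim_left_adapter( 'GTTCGAAAA', 'IIIIIIIII', 'GTACG', 1 );
print "$s\t$q\n";
```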

_remove_adapter_right()

 Title   : _remove_adapter_right
 Usage   : _remove_adapter_right();
 Function: option to remove specific adapters from right side 
 Returns : fastQ file
 Args    : string="GTCGAGT" and mismatches=integer
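
The right-side case can be sketched the same way, comparing the end of the read with the adapter. As before, this assumes substitution-only mismatches and an adapter that ends exactly at the 3' end of the read; count_mismatches and trim_right_adapter are hypothetical names.

```perl
use strict;
use warnings;

# Count positions at which two equal-length strings differ.
sub count_mismatches {
    my ( $x, $y ) = @_;
    my $n = 0;
    for my $i ( 0 .. length($x) - 1 ) {
        $n++ if substr( $x, $i, 1 ) ne substr( $y, $i, 1 );
    }
    return $n;
}

# Hypothetical helper: strip the adapter from the 3' end when the read
# suffix matches it with at most $max_mm substitutions.
sub trim_right_adapter {
    my ( $seq, $qual, $adapter, $max_mm ) = @_;
    my $len = length $adapter;
    return ( $seq, $qual ) if $len == 0 || length($seq) < $len;
    my $keep   = length($seq) - $len;
    my $suffix = substr( $seq, $keep );
    return ( $seq, $qual )
      if count_mismatches( uc($suffix), uc($adapter) ) > $max_mm;
    return ( substr( $seq, 0, $keep ), substr( $qual, 0, $keep ) );
}

my ( $s, $q ) = trim_right_adapter( 'AAAATAGCC', 'IIIIIIIII', 'TAGCG', 1 );
print "$s\t$q\n";
```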

_right_trim()

 Title   : _right_trim()
 Usage   : $self->_right_trim(%arg)
 Function: option to remove right side bases from reads
 Returns : right trimmed fastQ file
 Args    : integer=yes, N=no

_left_trim()

 Title   : _left_trim()
 Usage   : $self->_left_trim(%arg)
 Function: option to remove left side bases from reads
 Returns : left trimmed fastQ file
 Args    : integer=yes, N=no
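
Fixed-length trimming from either end, as described for the two trimming options above, can be sketched with substr. The example trims the sequence and its quality string together so they stay the same length; trim_ends is a hypothetical name, and returning empty strings when nothing remains is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: remove $left bases from the 5' end and $right bases
# from the 3' end of a read, applying the same cut to the quality string.
sub trim_ends {
    my ( $seq, $qual, $left, $right ) = @_;
    my $keep = length($seq) - $left - $right;
    return ( '', '' ) if $keep <= 0;
    return ( substr( $seq, $left, $keep ), substr( $qual, $left, $keep ) );
}

my ( $s, $q ) = trim_ends( 'AACCGGTT', 'ABCDEFGH', 2, 3 );
print "$s\t$q\n";
```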

_prune_fastq()

 Title   : _prune_fastq()
 Usage   : _prune_fastq();
 Function: option to remove reads below phred score
 Returns : pruned fastQ file
 Args    : integer=yes, N=no
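
Quality filtering depends on how quality characters map to Phred scores. The sketch below assumes that "sanger" means an ASCII offset of 33 and that "illumina" means the older offset of 64 (Illumina 1.8 and later use 33, like Sanger). It also assumes that a read is judged by its mean score. Both points are assumptions for illustration, and mean_quality and passes_quality are hypothetical names.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Assumed ASCII offsets for the two library types
my %offset = ( sanger => 33, illumina => 64 );

# Hypothetical helper: mean Phred score of a quality string.
sub mean_quality {
    my ( $qual, $lib ) = @_;
    my $off = $offset{ lc $lib };
    die "unknown library type: $lib\n" unless defined $off;
    my @scores = map { ord($_) - $off } split //, $qual;
    return @scores ? sum(@scores) / @scores : 0;
}

# Hypothetical helper: true when the mean score reaches the threshold.
sub passes_quality {
    my ( $qual, $lib, $min_q ) = @_;
    return mean_quality( $qual, $lib ) >= $min_q ? 1 : 0;
}

printf "mean Q = %.1f\n", mean_quality( 'IIII5555', 'sanger' );
```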

_trim_length()

 Title   : _trim_length()
 Usage   : $self->_trim_length(%arg)
 Function: option to remove reads below specified length
 Returns : trimmed fastQ file
 Args    : integer=yes, N=no
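
A length filter of this kind reduces to a single grep. The sketch assumes reads are kept when they are at least as long as the threshold; filter_by_length is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: keep reads whose sequence has at least $min_len bases.
sub filter_by_length {
    my ( $min_len, @reads ) = @_;
    return grep { length( $_->{seq} ) >= $min_len } @reads;
}

my @kept = filter_by_length( 4, { seq => 'ACGTAC' }, { seq => 'ACG' } );
print scalar(@kept), " read(s) kept\n";
```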

_convert_fasta()

 Title   : _convert_fasta()
 Usage   : _convert_fasta();
 Function: option to convert fastQ file to fastA
 Returns : fastA file
 Args    : Y=yes, N=no
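
Converting fastQ to fastA amounts to keeping the header and sequence lines and swapping the leading @ for >. The sketch assumes four-line records; fastq_to_fasta is a hypothetical name and works on strings rather than files.

```perl
use strict;
use warnings;

# Hypothetical helper: convert four-line fastQ text to fastA text.
sub fastq_to_fasta {
    my ($fastq) = @_;
    my @lines = split /\n/, $fastq;
    my $fasta = '';
    while ( @lines >= 4 ) {
        # Take one record; the '+' and quality lines are discarded
        my ( $header, $seq ) = splice @lines, 0, 4;
        $header =~ s/^\@/>/;
        $fasta .= "$header\n$seq\n";
    }
    return $fasta;
}

print fastq_to_fasta("\@read1\nACGT\n+\nIIII\n");
```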

_reverse_comp()

 Title   : _reverse_comp()
 Usage   : $self->_reverse_comp(%arg)
 Function: option to rev comp fastQ reads
 Returns : reverse complemented fastQ file
 Args    : Y=yes, N=no
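
Reverse complementing a read can be sketched with reverse and tr///. The example also reverses the quality string so that each score stays attached to its base, which is an assumption about the desired output; reverse_complement is a hypothetical name and handles only A, C, G and T.

```perl
use strict;
use warnings;

# Hypothetical helper: reverse complement a sequence and reverse its
# quality string to match.
sub reverse_complement {
    my ( $seq, $qual ) = @_;
    ( my $rc = reverse $seq ) =~ tr/ACGTacgt/TGCAtgca/;
    return ( $rc, scalar reverse $qual );
}

my ( $s, $q ) = reverse_complement( 'AACG', 'ABCD' );
print "$s\t$q\n";
```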

_dna_rna()

 Title   : _dna_rna()
 Usage   : $self->_dna_rna(%arg)
 Function: option to convert dna to rna for fastQ reads
 Returns : RNA fastQ file
 Args    : Y=yes, N=no
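
The DNA to RNA conversion can be pictured as replacing every T with U. The sketch below does this with tr/// on a copy of the sequence; dna_to_rna is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: replace thymine with uracil, preserving case.
sub dna_to_rna {
    my ($seq) = @_;
    ( my $rna = $seq ) =~ tr/Tt/Uu/;
    return $rna;
}

print dna_to_rna('ATGCTT'), "\n";
```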

_post_fastQ_brew()

 Title   : _post_fastQ_brew
 Usage   : _post_fastQ_brew();
 Function: runs the summary stats after filtering
 Returns : the stats
 Args    : $self, %arg

_cleanup()

 Title   : _cleanup()
 Usage   : _cleanup();
 Function: option to delete tmp files
 Returns : nothing
 Args    : Y=yes, N=no

DESTROY()

 Title   : DESTROY
 Usage   : DESTROY();
 Function: garbage collection
 Returns : nothing
 Args    : automatically called

get_lib_type()

 Title   : get_lib_type()
 Usage   : my $get_lib_type= $tmp->get_lib_type();
 Function: Retrieves the library type used
 Returns : A string of the type e.g. Sanger
 Args    : none

set_lib_type()

 Title   : set_lib_type()
 Usage   : my $set_lib_type = $tmp->set_lib_type("sanger");
 Function: Populates the $self->{lib_type} property
 Returns : $self->{lib_type}
 Args    : the lib as a string
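
The get/set pairs documented here and below follow a common accessor pattern on a blessed hash. A minimal sketch of that pattern follows, using a hypothetical package named My::Settings rather than the module's own code.

```perl
use strict;
use warnings;

package My::Settings;    # hypothetical class, for illustration only

sub new {
    my ($class) = @_;
    return bless {}, $class;
}

# Setter: store the value and return what was stored
sub set_lib_type {
    my ( $self, $lib ) = @_;
    $self->{lib_type} = $lib;
    return $self->{lib_type};
}

# Getter: return the stored value (undef if never set)
sub get_lib_type {
    my ($self) = @_;
    return $self->{lib_type};
}

package main;

my $obj = My::Settings->new();
$obj->set_lib_type('sanger');
print $obj->get_lib_type(), "\n";
```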

get_in_file()

 Title   : get_in_file()
 Usage   : my $get_in_file = $tmp->get_in_file();
 Function: Retrieves the input filename
 Returns : A string containing filename
 Args    : none

set_in_file()

 Title   : set_in_file()
 Usage   : my $set_in_file= $tmp->set_in_file("myOutPutFile.txt");
 Function: Populates the $self->{in_file} property
 Returns : $self->{in_file}
 Args    : name of the user provided input file

get_de_duplex()

 Title   : get_de_duplex()
 Usage   : my $get_de_duplex= $tmp->get_de_duplex();
 Function: Retrieves the de_duplex choice 
 Returns : Y or N
 Args    : none

set_de_duplex()

 Title   : set_de_duplex()
 Usage   : my $set_de_duplex= $tmp->set_de_duplex("Y");
 Function: Sets the de_duplex choice 
 Returns : Populates the $self->{de_duplex} property
 Args    : Y or N

get_qual_filter()

 Title   : get_qual_filter()
 Usage   : my $get_qual_filter= $tmp->get_qual_filter();
 Function: Retrieves the qual filter used
 Returns : integer
 Args    : none

set_qual_filter()

 Title   : set_qual_filter()
 Usage   : my $set_qual_filter= $tmp->set_qual_filter(30);
 Function: Sets the qual filter used
 Returns : Populates the $self->{qual_filter} property
 Args    : integer

get_len_filter()

 Title   : get_len_filter()
 Usage   : my $get_len_filter= $tmp->get_len_filter();
 Function: Retrieves the length filter
 Returns : integer
 Args    : none

set_len_filter()

 Title   : set_len_filter()
 Usage   : my $set_len_filter= $tmp->set_len_filter(25);
 Function: Sets the len filter used
 Returns : Populates the $self->{length_filter} property
 Args    : integer

get_adapter_l()

 Title   : get_adapter_l()
 Usage   : my $get_adapter_l= $tmp->get_adapter_l();
 Function: Retrieves the left adapter specified 
 Returns : A string of the left adapter
 Args    : none

set_adapter_l()

 Title   : set_adapter_l()
 Usage   : my $set_adapter_l= $tmp->set_adapter_l("GTACGTGTGGTGGGGAT");
 Function: Sets the $self->{adapter_left} property
 Returns : Populates the $self->{adapter_left} property
 Args    : string

get_adapter_r()

 Title   : get_adapter_r()
 Usage   : my $get_adapter_r= $tmp->get_adapter_r();
 Function: Retrieves the right adapter specified 
 Returns : A string of the right adapter
 Args    : none

set_adapter_r()

 Title   : set_adapter_r()
 Usage   : my $set_adapter_r= $tmp->set_adapter_r("TAGCGCGCGATGATT");
 Function: Sets the $self->{adapter_right} property
 Returns : Populates the $self->{adapter_right} property
 Args    : string

get_left_trim()

 Title   : get_left_trim()
 Usage   : my $get_left_trim= $tmp->get_left_trim();
 Function: Retrieves the left trim number
 Returns : integer
 Args    : none

set_left_trim()

 Title   : set_left_trim()
 Usage   : my $set_left_trim = $tmp->set_left_trim(5);
 Function: Populates the $self->{left_trim} property
 Returns : $self->{left_trim}
 Args    : integer

get_right_trim()

 Title   : get_right_trim()
 Usage   : my $get_right_trim= $tmp->get_right_trim();
 Function: Retrieves the right trim number
 Returns : integer
 Args    : none

set_right_trim()

 Title   : set_right_trim()
 Usage   : my $set_right_trim = $tmp->set_right_trim(8);
 Function: Populates the $self->{right_trim} property
 Returns : $self->{right_trim}
 Args    : integer

get_fasta()

 Title   : get_fasta()
 Usage   : my $get_fasta= $tmp->get_fasta();
 Function: Retrieves the fasta_convert option
 Returns : Y or N
 Args    : none

set_fasta()

 Title   : set_fasta()
 Usage   : my $set_fasta = $tmp->set_fasta("Y");
 Function: Populates the $self->{fasta_convert} property
 Returns : $self->{fasta_convert}
 Args    : a command to execute fastA convert or not: Y=yes, N=no

get_rev_com()

 Title   : get_rev_com()
 Usage   : my $get_rev_com= $tmp->get_rev_com();
 Function: Retrieves the rev_comp option
 Returns : Y or N
 Args    : none

set_rev_com()

 Title   : set_rev_com()
 Usage   : my $set_rev_com = $tmp->set_rev_com("Y");
 Function: Populates the $self->{rev_comp} property
 Returns : $self->{rev_comp}
 Args    : a command to execute rev_comp or not: Y=yes, N=no

get_remove_n()

 Title   : get_remove_n()
 Usage   : my $get_remove_n= $tmp->get_remove_n();
 Function: Retrieves the option for removing reads that contain N's
 Returns : Y or N
 Args    : none

set_remove_n()

 Title   : set_remove_n()
 Usage   : my $set_remove_n = $tmp->set_remove_n("Y");
 Function: Populates the $self->{remove_n} property
 Returns : $self->{remove_n}
 Args    : a command to remove reads with N or not: Y=yes, N=no

get_cleanup()

 Title   : get_cleanup()
 Usage   : my $get_cleanup = $tmp->get_cleanup();
 Function: Retrieves the cleanup option
 Returns : Y or N
 Args    : none

set_cleanup()

 Title   : set_cleanup()
 Usage   : my $set_cleanup = $tmp->set_cleanup("Y");
 Function: Populates the $self->{cleanup} property
 Returns : $self->{cleanup}
 Args    : a command to execute cleanup or not: Y=yes, N=no

LICENSE AND COPYRIGHT

 Copyright (C) 2017 Damien M. O'Halloran
 GNU GENERAL PUBLIC LICENSE