NAME

App::SimulateReads::Command::Digest - digest command class. Simulate single-end and paired-end reads.

VERSION

version 0.10

SYNOPSIS

 simulate_reads digest [options] <fasta-file>

 Arguments:
  a fasta-file 

 Options:
  -h, --help               brief help message
  -M, --man                full documentation
  -v, --verbose            print log messages
  -p, --prefix             prefix output [default:"out"]
  -o, --output-dir         output directory [default:"."]
  -j, --jobs               number of jobs [default:"1"; Integer]
  -z, --gzip               compress output file
  -c, --coverage           fastq-file coverage [default:"1"; Number]
  -n, --number-of-reads    directly set the number of reads
                           [default:"1"; Integer]
  -t, --sequencing-type    single-end or paired-end reads
                           [default:"paired-end"]
  -q, --quality-profile    illumina sequencing system profiles
                           [default:"hiseq"]
  -e, --sequencing-error   sequencing error rate
                           [default:"0.005"; Number]
  -r, --read-size          the read size [default:"101"; Integer]
  -m, --fragment-mean      the fragment mean size for paired-end reads
                           [default:"300"; Integer]
  -d, --fragment-stdd      the fragment standard deviation size for
                           paired-end reads [default:"50"; Integer]
  -b, --strand-bias        which strand to be used: plus, minus and random
                           [default:"random"]
  -w, --seqid-weight       seqid raffle type: length, same, file
                           [default:"length"]
  -f, --weight-file        weight file when seqid-weight=file

DESCRIPTION

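simulate_reads digest reads the given FASTA file and simulates single-end or paired-end sequencing reads from its sequences. The options below control the number of reads, the read and fragment sizes, the quality profile, the error rate, and how sequences and strands are chosen.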

OPTIONS

--help

Prints a brief help message and exits.

--man

Prints the manual page and exits.

--verbose

Prints log information to standard error.

--prefix

Concatenates the prefix to the output-file name.

--output-dir

Creates the output file inside output-dir. If output-dir does not exist, it is created recursively.

--jobs

Sets the number of child jobs to be created.

--gzip

Compresses the output file with gzip. Pass --no-gzip to get an uncompressed output file.

--read-size

Sets the read size. For now, the only valid value is 101.

--coverage

Calculates the number of reads based on the sequence coverage: number_of_reads = (sequence_size * coverage) / read_size
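
The formula above can be sketched in a few lines of Python. This is only an illustration, not the module's own code: the function name number_of_reads is hypothetical, and rounding the result down to a whole read is an assumption, since the documentation does not say how fractional results are handled.

```python
import math


def number_of_reads(sequence_size: int, coverage: float, read_size: int = 101) -> int:
    """Apply number_of_reads = (sequence_size * coverage) / read_size.

    Rounding down is an assumption made for this sketch.
    """
    if read_size <= 0:
        raise ValueError("read_size must be positive")
    if sequence_size < 0 or coverage < 0:
        raise ValueError("sequence_size and coverage must not be negative")
    return math.floor(sequence_size * coverage / read_size)


if __name__ == "__main__":
    # A 1,000,000-base sequence at 10x coverage with 101-base reads.
    print(number_of_reads(1_000_000, 10))
```

Under these assumptions, doubling the coverage roughly doubles the number of reads, while a longer read size lowers it.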

--number-of-reads

Directly sets the number of reads. If both this option and --coverage are given, this option takes precedence.

--sequencing-type

Sets the sequencing type to single-end or paired-end.

--fragment-mean

Sets the mean fragment size. Only used when sequencing-type is paired-end.

--fragment-stdd

Sets the standard deviation of the fragment size. Only used when sequencing-type is paired-end.
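
To show how a mean and a standard deviation can together determine fragment sizes, here is a minimal Python sketch. It assumes that sizes are drawn from a normal distribution and that fragments shorter than the read size are redrawn. Neither assumption is stated in this documentation, and the function name sample_fragment_size is hypothetical.

```python
import random


def sample_fragment_size(mean=300, stdd=50, read_size=101, rng=None, max_tries=1000):
    """Draw one fragment size from a normal distribution (an assumption).

    Draws shorter than read_size are rejected and redrawn, which is
    also an assumption made for this sketch.
    """
    rng = rng or random.Random()
    for _ in range(max_tries):
        size = round(rng.gauss(mean, stdd))
        if size >= read_size:
            return size
    raise RuntimeError("could not draw a fragment at least as long as the read")


if __name__ == "__main__":
    rng = random.Random(42)
    print([sample_fragment_size(rng=rng) for _ in range(5)])
```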

--sequencing-error

Sets the sequencing error rate. Valid values are between zero and one.

--quality-profile

Sets the Illumina sequencing system profile for quality. For now, the only valid values are hiseq and poisson.

--strand-bias

Sets which strand to use to make a read. Valid options are plus, minus and random. With random, the strand is chosen at random for each read.

--seqid-weight

Sets the seqid (e.g. chromosome, Ensembl id) raffle behavior. Valid options are length, same and file. If it is set to 'same', all seqids receive the same weight when raffling. If it is set to 'length', each seqid's weight is calculated from the length of its sequence. Finally, if it is set to 'file', the user must also set the option --weight-file. For details, see --weight-file.
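
The three raffle behaviors can be illustrated with a weighted random draw. The Python sketch below is not the module's implementation. The function names seqid_weights and raffle_seqids are hypothetical, and the use of random.choices as the sampling method is an assumption.

```python
import random


def seqid_weights(lengths, mode="length", file_weights=None):
    """Return a seqid -> weight mapping for the given raffle mode.

    lengths maps each seqid to its sequence length; file_weights holds
    the weights read from a weight file (only used when mode is 'file').
    """
    if mode == "length":
        return dict(lengths)
    if mode == "same":
        return {seqid: 1 for seqid in lengths}
    if mode == "file":
        if file_weights is None:
            raise ValueError("mode 'file' requires weights from a weight file")
        return dict(file_weights)
    raise ValueError(f"unknown mode: {mode!r}")


def raffle_seqids(weights, n, rng=None):
    """Draw n seqids, each with probability proportional to its weight."""
    rng = rng or random.Random()
    seqids = list(weights)
    return rng.choices(seqids, weights=[weights[s] for s in seqids], k=n)


if __name__ == "__main__":
    lengths = {"chr1": 3000, "chr2": 1000}
    print(raffle_seqids(seqid_weights(lengths, "length"), 10))
```

With 'length', chr1 in this example would be drawn about three times as often as chr2. With 'same', both would be drawn equally often.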

--weight-file

If --seqid-weight is set to file, then this option becomes mandatory. A valid weight file is a tab-separated values file with 2 columns. The first column holds the seqid and the second column holds the desired weight. Valid weights are integers.
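
As an illustration of the weight file format, the following Python sketch parses a two-column, tab-separated file into a seqid-to-weight mapping. The function name parse_weight_file is hypothetical, and skipping blank lines is an assumption. The module's own parser may behave differently.

```python
import io


def parse_weight_file(handle):
    """Parse 'seqid<TAB>weight' lines into a dict of integer weights."""
    weights = {}
    for line_number, line in enumerate(handle, start=1):
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skipping blank lines is an assumption of this sketch
        fields = line.split("\t")
        if len(fields) != 2:
            raise ValueError(f"line {line_number}: expected 2 tab-separated columns")
        seqid, weight = fields
        weights[seqid] = int(weight)  # raises ValueError for non-integer weights
    return weights


if __name__ == "__main__":
    example = io.StringIO("chr1\t3\nchr2\t1\n")
    print(parse_weight_file(example))
```

A file like the example would then be passed on the command line together with --seqid-weight file.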

AUTHOR

Thiago L. A. Miller <tmiller@mochsl.org.br>

COPYRIGHT AND LICENSE

This software is Copyright (c) 2017 by Teaching and Research Institute from Sírio-Libanês Hospital.

This is free software, licensed under:

  The GNU General Public License, Version 3, June 2007