App::SimulateReads::Command::Simulate::Genome - simulate subcommand class. Simulate genome sequencing
version 0.13
simulate_reads simulate genome [options] <fasta-file>

Arguments:
 a fasta-file

Options:
 -h, --help               brief help message
 -M, --man                full documentation
 -v, --verbose            print log messages
 -p, --prefix             prefix output [default:"out"]
 -o, --output-dir         output directory [default:"."]
 -i, --append-id          append to the defined template id [Format]
 -I, --id                 overlap the default template id [Format]
 -j, --jobs               number of jobs [default:"1"; Integer]
 -z, --gzip               compress output file
 -c, --coverage           fastq-file coverage [default:"8", Number]
 -t, --sequencing-type    single-end or paired-end reads [default:"paired-end"]
 -q, --quality-profile    illumina sequencing system profiles [default:"hiseq"]
 -e, --sequencing-error   sequencing error rate [default:"0.005"; Number]
 -r, --read-size          the read size [default:"101"; Integer]
 -m, --fragment-mean      the fragment mean size for paired-end reads [default:"300"; Integer]
 -d, --fragment-stdd      the fragment standard deviation size for paired-end reads [default:"50"; Integer]
Simulate genome sequencing.
--help
Prints a brief help message and exits.
--man
Prints the manual page and exits.
--verbose
Prints log information to standard error.
--prefix
Concatenates the prefix to the output-file name.
--output-dir
Creates the output-file inside output-dir. If output-dir does not exist, it is created recursively.
--append-id
Appends a string template to the defined template id. See Format.
--id
Overrides the default template id: single-end %i.%U_%c_%s_%t_%n and paired-end %i.%U_%c_%s_%S_%E, e.g. SR123.1_chr1_P_1001_1101. See Format.
A string Format is a combination of literal and escape characters, similar to the way printf works. This gives the user the freedom to customize the fastq sequence identifier to fit their needs. Valid escape characters are:
Common escape characters
Escape  Meaning
------  ------------------------------------------
%i      instrument id composed by SR + PID
%I      job slot number
%q      quality profile
%e      sequencing error
%R      read 1, or 2 if it is the paired-end mate
%U      read number
%r      read size
%c      sequence id as chromosome, ref
%s      read or fragment strand
%t      read start position
%n      read end position
Paired-end specific escape characters
Escape  Meaning
------  ------------------------------------------
%T      mate read start position
%N      mate read end position
%D      distance between the paired-reads
%m      fragment mean
%d      fragment standard deviation
%f      fragment size
%S      fragment start position
%E      fragment end position
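To make the escape mechanism concrete, here is a minimal Python sketch of how such a Format string could be expanded. It is not the module's own implementation: the function expand_format and the values dictionary are hypothetical names introduced for illustration, and leaving unknown escapes untouched is an assumption. The template is the default paired-end id quoted above.

```python
import re

def expand_format(template, values):
    """Replace each %X escape with values['X'].

    Escapes with no entry in `values` are left as-is (an assumption,
    not documented behaviour of simulate_reads).
    """
    def substitute(match):
        key = match.group(1)
        return str(values[key]) if key in values else match.group(0)
    return re.sub(r"%(.)", substitute, template)

# Hypothetical values for a single paired-end fragment
values = {"i": "SR123", "U": 1, "c": "chr1", "s": "P", "S": 1001, "E": 1101}

read_id = expand_format("%i.%U_%c_%s_%S_%E", values)
print(read_id)  # SR123.1_chr1_P_1001_1101
```

The result matches the example identifier given for the default paired-end template.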
--jobs
Sets the number of child jobs to be created.
--gzip
Compresses the output-file with the gzip algorithm. Pass --no-gzip to get an uncompressed output-file.
--read-size
Sets the read size. For now, the only valid value is 101.
--coverage
Calculates the number of reads based on the sequence coverage: number_of_reads = (sequence_size * coverage) / read_size. This is the default option for genome sequencing simulation.
--sequencing-type
Sets the sequencing type to single-end or paired-end.
--fragment-mean
If the sequencing-type is set to paired-end, sets the fragment mean size.
--fragment-stdd
If the sequencing-type is set to paired-end, sets the fragment size standard deviation.
--sequencing-error
Sets the sequencing error rate. Valid values are between zero and one.
--quality-profile
Sets the illumina sequencing system profile for quality. For now, the only valid values are hiseq and poisson.
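As a worked example of the coverage formula, number_of_reads = (sequence_size * coverage) / read_size, the following Python sketch computes the read count for a hypothetical 1,000,000-base sequence at the default coverage of 8 and read size of 101. The function name and the decision to round down to a whole number are assumptions; the documentation does not say how the tool rounds.

```python
def number_of_reads(sequence_size, coverage, read_size=101):
    """Apply number_of_reads = (sequence_size * coverage) / read_size.

    Rounding down to an integer is an assumption made for this sketch.
    """
    if read_size <= 0:
        raise ValueError("read_size must be positive")
    return int(sequence_size * coverage / read_size)

# Hypothetical genome of 1,000,000 bases with the default options
reads = number_of_reads(sequence_size=1_000_000, coverage=8)
print(reads)  # 79207
```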
Thiago L. A. Miller <tmiller@mochsl.org.br>
This software is Copyright (c) 2018 by Teaching and Research Institute from Sírio-Libanês Hospital.
This is free software, licensed under:
The GNU General Public License, Version 3, June 2007
To install App::SimulateReads, copy and paste the appropriate command into your terminal.
cpanm
cpanm App::SimulateReads
CPAN shell
perl -MCPAN -e shell
install App::SimulateReads
For more information on module installation, please visit the detailed CPAN module installation guide.