BioUtil::Util - Utilities for operating on data and files
Some great modules like BioPerl provide many robust solutions. However, they are not easy to install on some platforms, and for simple scripts a lightweight module may be a better choice. So I reinvented some wheels and added some useful utilities to this module, hoping they will be helpful.
Version 2015.0228
file_list_from_argv get_file_list delete_string_elements_by_indexes delete_array_elements_by_indexes extract_parameters_from_string get_parameters_from_file get_list_from_file get_column_data read_json_file write_json_file run run_time readable_second check_positive_integer mean_and_stdev filename_prefix check_all_files_exist check_in_out_dir rm_and_mkdir get_paired_fq_gz_file_from_dir get_paired_fa_gz_file_from_dir
use BioUtil::Util;
getopt FOR ME
Example -a b -c t tt -d bb -dbtype asdfafd -test
    -a: b
    -c: ARRAY(0xee25e8)
    -d: bb
    -dbtype: asdfafd
    -infmt: fasta
    -test: 1
Get the file list from @ARGV. Use this only after parsing options!
When no arguments are given, 'STDIN' is added to the list, which can then be consumed by, e.g., FastaReader.
Find files/directories with a custom filter; the maximum search depth can be specified.
Example (searching perl scripts)
    my $dir   = "~";
    my $depth = 2;

    my $list = get_file_list(
        $dir,
        sub {
            if ( -d or /^\./i ) {    # ignore configuration file and folders
                return 0;
            }
            if ( /\.pm/i or /\.pl/i ) {
                return 1;
            }
            return 0;
        },
        $depth
    );
    print "$_\n" for @$list;
Delete string elements by indexes. It uses delete_array_elements_by_indexes internally.
Delete array elements by given indexes.
Example:
    @list  = qw(a b c d e f);
    @idx   = (1, 2, 4);
    $list2 = delete_array_elements_by_indexes(\@list, \@idx);
    print "@$list2\n";    # result: a d f
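To make the index semantics concrete, here is a minimal stand-alone sketch in core Perl. It is not the module's own code: `drop_indexes` is a hypothetical name, and the sketch assumes 0-based indexes, which is what the example above implies.

```perl
use strict;
use warnings;

# Hypothetical stand-in, not part of BioUtil::Util: keep every element whose
# position is not listed in $idx, and return the result as a new array ref.
sub drop_indexes {
    my ( $list, $idx ) = @_;
    my %drop = map { $_ => 1 } @$idx;
    my @keep = grep { !$drop{$_} } 0 .. $#{$list};
    return [ @{$list}[@keep] ];
}

my @list = qw(a b c d e f);
my $kept = drop_indexes( \@list, [ 1, 2, 4 ] );
print "@$kept\n";    # a d f
```

Building a lookup hash first keeps the original array untouched and avoids the index shifting that repeated `splice` calls would cause.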
Extract parameters from string.
The regular expression is
/([\w\d\_\-\.]+)\s*=\s*([^\=;]*)[\s;]*/
    # bad format, but could also be parsed
    # my $s = " s = b; a=test; b_c=12 3; a.b =; b
    # = asdf
    # sd; ads-f = 12313";

    # recommended
    my $s  = "key1=abcde; key2=123; conf.a=file; conf.b=12; ";
    my $pa = extract_parameters_from_string($s);
    print "=$_:$$pa{$_}=\n" for sort keys %$pa;
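To see how the regular expression above splits a string into key/value pairs, the following sketch applies it in a global match loop. It is an illustration only: `parse_params` is a hypothetical helper, and the module may post-process the captured values differently.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioUtil::Util: apply the documented
# regular expression repeatedly and collect the captured key/value pairs.
sub parse_params {
    my ($s) = @_;
    my %param;
    while ( $s =~ /([\w\d\_\-\.]+)\s*=\s*([^\=;]*)[\s;]*/g ) {
        $param{$1} = $2;
    }
    return \%param;
}

my $pa = parse_params("key1=abcde; key2=123; conf.a=file; conf.b=12; ");
print "$_ => $$pa{$_}\n" for sort keys %$pa;
```

Because the value class `[^\=;]*` stops at `;` or `=`, separating pairs with semicolons, as in the recommended format, is what keeps each value from running into the next key.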
Get parameters from a file. Comments starting with # are allowed in the file.

    my $pa = get_parameters_from_file("d.txt");
    print "$_: $$pa{$_}\n" for sort keys %$pa;
For a file with content:
    # cell phone
    apple = 1 # note
    nokia = 2 #
output is:
    apple: 1
    nokia: 2
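The comment handling can be sketched as follows. This is an assumption about the behaviour, not the module's implementation: `read_params` is a hypothetical name, and an in-memory filehandle stands in for d.txt so the sketch runs on its own.

```perl
use strict;
use warnings;

# Hypothetical reader, not part of BioUtil::Util: strip '#' comments,
# skip lines without a key, and split the rest on the first '='.
sub read_params {
    my ($fh) = @_;
    my %param;
    while ( my $line = <$fh> ) {
        $line =~ s/#.*//;    # drop the comment part of the line
        next unless $line =~ /^\s*(\S.*?)\s*=\s*(.*?)\s*$/;
        $param{$1} = $2;
    }
    return \%param;
}

my $content = "# cell phone\napple = 1 # note\nnokia = 2 #\n";
open my $fh, '<', \$content or die "cannot open in-memory file: $!";
my $pa = read_params($fh);
close $fh;

print "$_: $$pa{$_}\n" for sort keys %$pa;
```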
Get a list from a file. Comments starting with # are allowed in the file.

    my $list = get_list_from_file("d.txt");
    print "$_\n" for @$list;
For a file with content:

    # cell phone
    apple # note
    nokia

output is:

    apple
    nokia
Get one column of a file.
    my $list = get_column_data("d.txt", 2);
    print "$_\n" for @$list;

Read a JSON file and decode it into a hash ref.
my $hashref = read_json_file($file);
Write a hash ref to a JSON file.

    my $hashref = { "a" => 1, "b" => 2 };
    write_json_file($hashref, $file);
Run a command
    my $fail = run($cmd);
    die "failed to run:$cmd\n" if $fail;

Run a subroutine with the given arguments N times, and return the mean and standard deviation of the run time.

    my $read_by_record = sub {
        my ($file) = @_;
        my $next_seq = FastaReader($file);
        while ( my $fa = &$next_seq() ) {
            my ( $header, $seq ) = @$fa;
            # print ">$header\n$seq\n";
        }
    };

    my ($mean, $stdev) = run_time( 8, $read_by_record, $file );
    printf STDERR "\n## Compute time: %0.03f ± %0.03f s\n\n", $mean, $stdev;
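The idea behind timing a subroutine over N runs can be sketched with the core Time::HiRes module. This is not how run_time is necessarily implemented: `time_sub` is a hypothetical name, and the choice of the sample standard deviation (n - 1 in the denominator) is an assumption.

```perl
use strict;
use warnings;
use Time::HiRes ();

# Hypothetical stand-in for the idea behind run_time, not its actual code:
# call $code with @args $n times and summarise the wall-clock durations.
sub time_sub {
    my ( $n, $code, @args ) = @_;
    my @elapsed;
    for ( 1 .. $n ) {
        my $start = Time::HiRes::time();
        $code->(@args);
        push @elapsed, Time::HiRes::time() - $start;
    }

    my $mean = 0;
    $mean += $_ / $n for @elapsed;

    my $sum_sq = 0;
    $sum_sq += ( $_ - $mean )**2 for @elapsed;

    # Assumption: sample standard deviation; a single run reports 0.
    my $stdev = $n > 1 ? sqrt( $sum_sq / ( $n - 1 ) ) : 0;
    return ( $mean, $stdev );
}

my $work = sub { my $x = 0; $x += $_ for 1 .. $_[0]; return $x };
my ( $mean, $stdev ) = time_sub( 5, $work, 10_000 );
printf "%0.06f +/- %0.06f s\n", $mean, $stdev;
```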
readable_second
print readable_second(11312314),"\n"; # 130 day 22 hour 18 min 34 sec
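The conversion in the example is plain integer arithmetic, as this sketch shows. `seconds_to_text` is a hypothetical helper rather than the module's code, and it always prints all four units, which may differ from what readable_second does for small values.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the arithmetic, not part of BioUtil::Util.
sub seconds_to_text {
    my ($sec) = @_;
    my $day = int( $sec / 86400 );    # 86400 seconds per day
    $sec -= $day * 86400;
    my $hour = int( $sec / 3600 );
    $sec -= $hour * 3600;
    my $min = int( $sec / 60 );
    $sec -= $min * 60;
    return "$day day $hour hour $min min $sec sec";
}

print seconds_to_text(11312314), "\n";    # 130 day 22 hour 18 min 34 sec
```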
Check Positive Integer
check_positive_integer(1);
Return the mean and stdev of a list.

Example:

    my @list = qw/1 2 3/;
    mean_and_stdev(\@list);
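For reference, the two statistics can be computed as in the sketch below. Whether the module uses the sample or the population standard deviation is not stated here, so the sample form (n - 1 in the denominator) is an assumption, and `mean_stdev_sketch` is a hypothetical name.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical re-implementation for illustration, not the module's code.
# Assumption: sample standard deviation, with 0 for lists of one element.
sub mean_stdev_sketch {
    my ($list) = @_;
    my $n = scalar @$list;
    return ( 0, 0 ) if $n == 0;

    my $mean   = sum(@$list) / $n;
    my $sum_sq = sum( map { ( $_ - $mean )**2 } @$list );
    my $stdev  = $n > 1 ? sqrt( $sum_sq / ( $n - 1 ) ) : 0;
    return ( $mean, $stdev );
}

my ( $mean, $stdev ) = mean_stdev_sketch( [ 1, 2, 3 ] );
printf "mean=%.2f stdev=%.2f\n", $mean, $stdev;    # mean=2.00 stdev=1.00
```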
Get filename prefix
    filename_prefix("test.fa");    # "test"
    filename_prefix("tmp");        # "tmp"

Check whether all files exist.
Check the input and output directories.
check_in_out_dir("~/dir", "~/dir.out");
Make a directory, removing it first if it already exists.
rm_and_mkdir("out")
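The remove-then-create behaviour can be approximated with the core File::Path module. This is a sketch of the idea, not the module's implementation: `recreate_dir` is a hypothetical name, and a temporary directory is used so the example does not touch real data.

```perl
use strict;
use warnings;
use File::Path qw(make_path remove_tree);
use File::Temp qw(tempdir);

# Hypothetical equivalent built on core File::Path, not part of BioUtil::Util.
sub recreate_dir {
    my ($dir) = @_;
    remove_tree($dir) if -e $dir;    # deletes the directory and its content
    make_path($dir);
    return -d $dir;
}

# Prepare a directory that already holds a stale file.
my $base = tempdir( CLEANUP => 1 );
my $out  = "$base/out";
make_path($out);
open my $fh, '>', "$out/stale.txt" or die "cannot write: $!";
close $fh;

recreate_dir($out) or die "failed to recreate $out";
my $state = -e "$out/stale.txt" ? "stale file kept" : "directory is empty again";
print "$state\n";
```

Since everything inside the directory is deleted, a function like this is best pointed only at output directories that can be regenerated.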
    # .
    # ├── test_1.fq.gz
    # └── test_2.fq.gz
    for my $pe ( get_paired_fq_gz_file_from_dir($indir) ) {
        # test_1.fq.gz, test_2.fq.gz, test
        my ( $fqfile1, $fqfile2, $id ) = @$pe;
    }
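One way to picture the pairing is to group file names by the part before `_1`/`_2`. The sketch below works on a plain list of names; `pair_fq_names` is a hypothetical helper and its matching rule is an assumption, so the module may accept other naming patterns.

```perl
use strict;
use warnings;

# Hypothetical pairing logic, not part of BioUtil::Util.
# Assumption: mates are named <id>_1.fq.gz and <id>_2.fq.gz.
sub pair_fq_names {
    my @names = @_;
    my %mates;
    for my $name (@names) {
        next unless $name =~ /^(.+)_([12])\.fq\.gz$/;
        $mates{$1}[ $2 - 1 ] = $name;
    }

    my @pairs;
    for my $id ( sort keys %mates ) {
        my ( $f1, $f2 ) = @{ $mates{$id} }[ 0, 1 ];
        # Keep only ids for which both mates are present.
        push @pairs, [ $f1, $f2, $id ] if defined $f1 and defined $f2;
    }
    return @pairs;
}

my @pairs = pair_fq_names(qw(test_1.fq.gz test_2.fq.gz lonely_1.fq.gz readme.txt));
for my $pe (@pairs) {
    my ( $fqfile1, $fqfile2, $id ) = @$pe;
    print "$id: $fqfile1 $fqfile2\n";    # test: test_1.fq.gz test_2.fq.gz
}
```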
To install BioUtil, copy and paste the appropriate command into your terminal.
cpanm
cpanm BioUtil
CPAN shell
    perl -MCPAN -e shell
    install BioUtil
For more information on module installation, please visit the detailed CPAN module installation guide.