NAME

libraries_overlap_stats.pl

SYNOPSIS

libraries_overlap_stats.pl [options/parameters]

Measure the number of records of a library A that overlap those of one or more reference libraries B1, B2, B3, ...

    Input options for primary library.
        -p_type <Str>          input type (eg. DBIC, BED).
        -p_file <Str>          input file. Only works if p_type specifies a file type.
        -p_driver <Str>        driver for database connection (eg. mysql, SQLite). Only works if 
                               p_type is DBIC.
        -p_database <Str>      database name or path to database file for file based databases
                               (eg. SQLite). Only works if p_type is DBIC.
        -p_table <Str>         database table. Only works if p_type is DBIC.
        -p_host <Str>          hostname for database connection. Only works if p_type is DBIC.
        -p_user <Str>          username for database connection. Only works if p_type is DBIC.
        -p_password <Str>      password for database connection. Only works if p_type is DBIC.
        -p_records_class <Str> type of records stored in database (Default:
                               GenOO::Data::DB::DBIC::Species::Schema::SampleResultBase::v3).

    Input options for reference library.
        -r_type <Str>          input type (eg. DBIC, BED).
        -r_file <Str>          input file. Only works if r_type specifies a file type. If used more
                               than once, reference libraries are merged.
        -r_driver <Str>        driver for database connection (eg. mysql, SQLite). Only works if 
                               r_type is DBIC.
        -r_database <Str>      database name or path to database file for file based databases
                               (eg. SQLite). Only works if r_type is DBIC.
        -r_table <Str>         database table. Only works if r_type is DBIC. If used more
                               than once, reference libraries are merged.
        -r_host <Str>          hostname for database connection. Only works if r_type is DBIC.
        -r_user <Str>          username for database connection. Only works if r_type is DBIC.
        -r_password <Str>      password for database connection. Only works if r_type is DBIC.
        -r_records_class <Str> type of records stored in database (Default:
                               GenOO::Data::DB::DBIC::Species::Schema::SampleResultBase::v3).

    Other input.
        -rname_sizes <Str>     file with sizes for reference alignment sequences (rnames). Must be tab
                               delimited (chromosome\tsize) with one line per rname.

    Output.
        -o_file <Str>          filename for output file. If the path does not exist, it will be created.
    
    Input Filters (only for DBIC input type).
        -p_filter <Filter>     filter primary collection. Option can be given multiple times.
        -r_filter <Filter>     filter reference collection. Option can be given multiple times.
                               Syntax: column_name="pattern"
                                 e.g. -p_filter deletion="def" -p_filter rmsk="undef"
                                      keeps only reads with deletions that are not
                                      repeat masked.
                                 e.g. -r_filter query_length=">31" -r_filter query_length="<=50"
                                      keeps reads longer than 31 and shorter than or
                                      equal to 50.
                               Supported operators: ">", ">=", "<", "<=", "=", "!=", "def", "undef"

    Other options.
        -v                     verbosity. If used, progress lines are printed.
        -h                     print help message
        -man                   show man page
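
The -rname_sizes file described above is a plain tab-delimited table with one rname per line. The following is a minimal sketch of how such a file might be created and sanity-checked before use. The sequence names and sizes are made-up placeholders, and the awk check is only an illustration, not something the script itself is known to run.

```shell
#!/bin/sh
# Write a toy rname sizes file: name<TAB>size, one line per rname.
# chr1/chr2 and their sizes are placeholder values, not real genome sizes.
sizes_file=$(mktemp)
printf 'chr1\t1000\nchr2\t800\n' > "$sizes_file"

# Count lines that do not have exactly two tab-separated fields
# or whose second field is not a non-negative integer.
bad_lines=$(awk -F'\t' 'NF != 2 || $2 !~ /^[0-9]+$/ { bad++ } END { print bad + 0 }' "$sizes_file")
echo "malformed lines: $bad_lines"
```

A non-zero count usually points to spaces used in place of tabs or to a stray header line.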

DESCRIPTION

Measure the number of records of a library A that overlap those of one or more reference libraries B1, B2, B3, ... If more than one reference library is given, the reference libraries are first merged into a single library and the overlap is then calculated against the merged library.
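
The merge-then-overlap idea can be illustrated with a small conceptual sketch on toy BED files. This is not the script's implementation. It assumes half-open BED intervals, ignores strand, treats merging as plain concatenation, and uses made-up coordinates and file names.

```shell
#!/bin/sh
# Conceptual sketch only: toy data, strand ignored, quadratic scan.
tmp=$(mktemp -d)
printf 'chr1\t100\t200\nchr1\t500\t600\nchr2\t10\t50\n' > "$tmp/primary.bed"
printf 'chr1\t150\t250\n' > "$tmp/ref1.bed"
printf 'chr2\t40\t60\n' > "$tmp/ref2.bed"

# Step 1: merge the reference libraries into a single one.
cat "$tmp/ref1.bed" "$tmp/ref2.bed" > "$tmp/merged.bed"

# Step 2: count primary records overlapping at least one merged record.
# Two half-open intervals on the same chromosome overlap when each one
# starts before the other ends.
overlapping=$(awk -F'\t' '
    NR == FNR { n++; chr[n] = $1; start[n] = $2 + 0; stop[n] = $3 + 0; next }
    {
        for (i = 1; i <= n; i++) {
            if ($1 == chr[i] && ($2 + 0) < stop[i] && start[i] < ($3 + 0)) {
                count++
                break
            }
        }
    }
    END { print count + 0 }
' "$tmp/merged.bed" "$tmp/primary.bed")

total=$(wc -l < "$tmp/primary.bed" | tr -d ' ')
echo "$overlapping of $total primary records overlap the merged reference"
rm -r "$tmp"
```

With this toy data the first and third primary records each overlap a reference record, so the sketch reports 2 of 3. Each primary record is counted once, no matter how many reference records it overlaps.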