A role containing attributes and methods common to the group, subgroup, and stage classes.


The attributes for this class are pulled in from HPCI::Super and HPCI::Sub. Super contains all the characteristics common to container classes (Group and Subgroup), while Sub contains all the characteristics common to contained classes (Subgroup and Stage).

_full_name (composed internally)

The name of a group/subgroup/stage is expanded into a directory-like name that is composed of the parent subgroups' and the final object's names all joined together with double underscores. E.g. 'subgroupA__subgroupB__stage1'
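
The composition rule can be sketched in a few lines. This is only an illustration: compose_full_name is a hypothetical helper, not part of HPCI, and it assumes the caller supplies the names in order from the outermost subgroup to the object itself.

```perl
use strict;
use warnings;

# Hypothetical helper (not an HPCI function): join the names of the
# enclosing subgroups and the object's own name with double underscores.
sub compose_full_name {
    my @names = @_;    # outermost subgroup first, the object's own name last
    return join '__', @names;
}

print compose_full_name(qw(subgroupA subgroupB stage1)), "\n";
# prints: subgroupA__subgroupB__stage1
```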


        dep      => 'a_dep',                  ## one of these two
        deps     => ['dep1', 'dep2', ...],
        pre_req  => 'a_pre_req',              ## and one of these two
        pre_reqs => ['pre_req1', 'pre_req2', ...],

    # A scalar value, either provided alone or in a list, can be
    # any of:
    #    - an existing stage object
    #    - a string - the exact value of some existing stage's name
    #    - a regexp - all of the stages whose name matches the regexp

The add_deps method marks the pre_req (or all of the pre_reqs) as prerequisites of the dep (or all of the deps). When the group is executed, stages may run in parallel, but a dependent stage is not permitted to start until all of its prerequisite stages have completed successfully.

It is permitted to list the same dependency multiple times. This can be convenient in that you do not need to be careful about providing non-overlapping groups when you specify sets of prerequisites.
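
One way to see why repeats are harmless is to picture the dependencies stored as a set. The sketch below is an assumption about how such bookkeeping could look, not a description of HPCI's internals; record_dep and prereqs_for are hypothetical names.

```perl
use strict;
use warnings;

# Hypothetical bookkeeping (not HPCI's internals): maps a dependent stage
# name to the set of stage names it must wait for.
my %prereqs_of;

sub record_dep {
    my ( $pre_req, $dep ) = @_;
    $prereqs_of{$dep}{$pre_req} = 1;    # repeating a pair changes nothing
    return;
}

sub prereqs_for {
    my ($dep) = @_;
    return sort keys %{ $prereqs_of{$dep} || {} };
}

record_dep( 'stage1', 'stage2' );
record_dep( 'stage1', 'stage2' );       # listed twice, stored once
record_dep( 'stage2', 'stage3' );

print "$_ waits for: ", join( ', ', prereqs_for($_) ), "\n"
    for qw(stage2 stage3);
```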

So, you could write:

    $group->add_deps( pre_req=>'stage1', deps=>[qw(stage2 stage3)] );
    $group->add_deps( pre_reqs=>[qw(stage1 stage2)], dep=>'stage3' );

instead of:

    $group->add_deps( pre_req=>'stage1', deps=>qr(^stage[23]$) );
    $group->add_deps( pre_req=>'stage2', dep=>'stage3' );

or:

    $group->add_deps( pre_req=>'stage1', dep=>'stage2' );
    $group->add_deps( pre_req=>'stage2', dep=>'stage3' );

All three forms produce the same ordering. The last is clearest for this simple sequence, but when many stages have dependencies it may be easier to specify collections of dependencies at once.

However, you must be careful to avoid dependency loops. A loop is a chain of dependent stages that includes the same stage more than once (stage1 -> stage2 -> stage1). Since a dependency indicates that the prerequisite stage must finish executing before the dependent stage can start, this loop would mean that stage1 cannot start until stage2 has completed, and also that stage2 cannot start until stage1 has completed. So neither one can ever start, and neither will ever complete.

Such a loop will eventually be detected, when the group reaches a point where no stages are running and no stages can be started. By then, however, a lot of time may have been wasted executing stages that were not part of the loop before the problem is noticed and the run is aborted.
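
If that wasted time matters, a loop can be looked for before anything is run. The following is a minimal sketch of such a pre-flight check using Kahn's algorithm. It is not an HPCI method; find_stuck_stages and the shape of its input (a hashref from dependent stage name to an arrayref of prerequisite names) are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical pre-flight check (not an HPCI method). $prereqs_of maps each
# dependent stage name to an arrayref of its prerequisite stage names.
# Returns the names of stages that could never start (empty list if no loop).
sub find_stuck_stages {
    my ($prereqs_of) = @_;

    # Count the distinct prerequisites each stage is still waiting on.
    my ( %waiting_on, %dependents_of );
    for my $dep ( keys %$prereqs_of ) {
        $waiting_on{$dep} //= 0;
        my %seen;
        for my $pre ( grep { !$seen{$_}++ } @{ $prereqs_of->{$dep} } ) {
            $waiting_on{$pre} //= 0;
            $waiting_on{$dep}++;
            push @{ $dependents_of{$pre} }, $dep;
        }
    }

    # Kahn's algorithm: repeatedly "run" any stage with nothing left to wait for.
    my @ready = grep { $waiting_on{$_} == 0 } keys %waiting_on;
    while ( defined( my $stage = shift @ready ) ) {
        delete $waiting_on{$stage};
        for my $dep ( @{ $dependents_of{$stage} || [] } ) {
            push @ready, $dep if --$waiting_on{$dep} == 0;
        }
    }
    return sort keys %waiting_on;    # what remains is in, or behind, a loop
}

my @stuck = find_stuck_stages( {
    stage2 => ['stage1'],
    stage1 => ['stage2'],    # loop: stage1 -> stage2 -> stage1
    stage3 => ['stage0'],    # unrelated, would run normally
} );
print "stuck: @stuck\n";
# prints: stuck: stage1 stage2
```

Note that the result also includes stages that merely depend on a loop, since they can never start either.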

Each stage argument can be a reference to a stage object, the name of a stage, or a regexp that selects all of the stages whose names match it. (If no stage name matches a regexp, no stages are selected. This allows a regexp to match against an optional stage without checking whether that optional stage was actually used in this run. The downside is that a mistyped regexp gives no complaint when it matches nothing. However, it is not possible to complain when a mistyped regexp matches more stages than the user intended, so checking the regexp carefully is necessary in any case.)
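
The three accepted forms can be illustrated with a small resolver. This is a sketch under stated assumptions, not HPCI's code: resolve_stage_spec and the My::Stage class are hypothetical, stages are assumed to be kept in a hash keyed by name, and dying on an unknown name is a choice made for the example (the text above does not say what HPCI does in that case).

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical resolver (not HPCI's code). $stages is a hashref mapping
# stage name => stage object; $spec is a stage object, a name, or a regexp.
sub resolve_stage_spec {
    my ( $stages, $spec ) = @_;

    # Test for a regexp first: qr// values are themselves blessed references.
    if ( ref $spec eq 'Regexp' ) {
        # Every stage whose name matches; an empty list if none do.
        return map { $stages->{$_} } sort grep { $_ =~ $spec } keys %$stages;
    }
    return $spec if blessed $spec;    # already a stage object

    # Assumption for this sketch: an unknown name is an error.
    exists $stages->{$spec} or die "no stage named '$spec'\n";
    return $stages->{$spec};
}

# My::Stage is a stand-in class used only for this example.
my %stages = map { ( $_ => bless( { name => $_ }, 'My::Stage' ) ) }
    qw(stage1 stage2 stage3);

my @hits = resolve_stage_spec( \%stages, qr/^stage[23]$/ );
my @none = resolve_stage_spec( \%stages, qr/^optional_/ );    # silently empty
print scalar(@hits), " matched, ", scalar(@none), " matched\n";
# prints: 2 matched, 0 matched
```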

group_dir (optional)

The directory that will contain all output pertaining to the entire group. By default, this is a new directory under the parent group's group_dir, named for this subgroup. For the top-level group, it is a new directory under base_dir whose name combines the name of the group and the timestamp of when the group was created (e.g. EXAMPLEGROUP-YYMMDD-HHMMSS).
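
The default naming can be sketched as follows. Both helpers are hypothetical and are not HPCI functions; in particular, the use of local time rather than UTC for the timestamp is an assumption, since the text above only gives the YYMMDD-HHMMSS layout.

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use File::Spec;

# Hypothetical helper (not HPCI's code): NAME-YYMMDD-HHMMSS under base_dir.
# Assumption: the timestamp is rendered in local time.
sub default_top_group_dir {
    my ( $base_dir, $name, $created ) = @_;    # $created: epoch seconds
    my $stamp = strftime( '%y%m%d-%H%M%S', localtime $created );
    return File::Spec->catdir( $base_dir, "$name-$stamp" );
}

# Hypothetical helper: a subgroup's default directory sits under its
# parent's group_dir and is named for the subgroup.
sub default_subgroup_dir {
    my ( $parent_group_dir, $name ) = @_;
    return File::Spec->catdir( $parent_group_dir, $name );
}

my $top = default_top_group_dir( 'scratch', 'EXAMPLEGROUP', time );
print "$top\n";
print default_subgroup_dir( $top, 'subgroupA' ), "\n";
```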


A hash containing the param list for files that need params is kept in the internal attribute _file_info. That attribute is initialized to the value held by the parent (sub)group (or to an empty hash for the top-level group).

The internal method _add_file_params is used to add any values passed in this file_params attribute into the _file_info hash.

Providing file info this way means that it does not have to be written out in full every time the file is used throughout the program.

Defaults to an empty hash.
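
The inheritance described above can be pictured as layering one hash over another. The sketch below is an assumption about the general shape, not HPCI's implementation: build_file_info is a hypothetical name, the parent's hash is copied rather than shared, and an entry in file_params is assumed to replace an inherited entry for the same file. The param values are placeholders.

```perl
use strict;
use warnings;

# Hypothetical sketch (not HPCI's code): start from a shallow copy of the
# parent's file info, then layer this object's file_params on top.
sub build_file_info {
    my ( $parent_info, $file_params ) = @_;
    my %info = ( %{ $parent_info || {} }, %{ $file_params || {} } );
    return \%info;
}

# Placeholder param values; the real param format is not shown here.
my $group_info    = build_file_info( undef, { 'reference.dat' => 'params-A' } );
my $subgroup_info = build_file_info( $group_info, { 'sample.dat' => 'params-B' } );

print join( ', ', sort keys %$subgroup_info ), "\n";
# prints: reference.dat, sample.dat
```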


Augment the file_params list with additional files. Provide either a hashref or a list of key/value pairs. In either case, each pair has a filename as the key and its params as the value.
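
Accepting both calling styles usually comes down to one check on the argument list. The following is a minimal sketch of that idea; normalize_file_params is a hypothetical helper, not the HPCI method itself, and rejecting an odd-length list is a choice made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper (not the HPCI method): accept either a single hashref
# or a flat list of filename => params pairs, and return a hashref.
sub normalize_file_params {
    my @args = @_;
    return { %{ $args[0] } } if @args == 1 && ref $args[0] eq 'HASH';
    die "expected a hashref or filename => params pairs\n" if @args % 2;
    return {@args};
}

# Both calls describe the same two files; the param values are placeholders.
my $from_ref  = normalize_file_params( { 'a.dat' => 'params-A', 'b.dat' => 'params-B' } );
my $from_list = normalize_file_params( 'a.dat' => 'params-A', 'b.dat' => 'params-B' );

print join( ', ', sort keys %$from_list ), "\n";
# prints: a.dat, b.dat
```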


John Macdonald - Boutros Lab


Paul Boutros, PhD, PI - Boutros

The Ontario Institute for Cancer Research