NAME

OpenMP::Environment - Perl extension managing OpenMP variables in %ENV within a script.

SYNOPSIS

This module is most effective when used along with OpenMP::Simple:

  use strict;
  use warnings;

  use OpenMP::Simple;
  use OpenMP::Environment;

  use Inline (
      C    => 'DATA',
      with => qw/OpenMP::Simple/,
  );

  my $env = OpenMP::Environment->new;

  for my $want_num_threads ( 1 .. 8 ) {
      $env->omp_num_threads($want_num_threads);

      $env->assert_omp_environment; # (optional) validates %ENV

      # call parallelized C function
      my $got_num_threads = _check_num_threads();

      printf "%0d threads spawned in ".
              "the OpenMP runtime, expecting %0d\n", $got_num_threads, $want_num_threads;
  }

  __DATA__
  __C__

  /* C function parallelized with OpenMP */
  int _check_num_threads() {
    int ret = 0;

    PerlOMP_UPDATE_WITH_ENV__NUM_THREADS /* <~ MACRO x OpenMP::Simple */

    #pragma omp parallel
    {
      #pragma omp single
      ret = omp_get_num_threads();
    }

    return ret;
  }

It can also be used on its own to manage the environment for OpenMP-parallelized executables that are called as external processes:

  use strict;
  use warnings;
 
  use OpenMP::Environment;
  my $env = OpenMP::Environment->new;

  foreach my $i (qw/1 2 4 8 16 32 64 128 256/) {
    $env->omp_num_threads($i); # Note: validated
    foreach my $sched (qw/static dynamic auto/) {
      # compute chunk size (get_baby_ruth is not defined here; supply your own)
      my $chunk = get_baby_ruth($i);

      # set schedule using prescribed format
      # Note: format is OMP_SCHED_T[,CHUNK] where OMP_SCHED_T is
      # 'static', 'dynamic', 'guided', or 'auto'; CHUNK is an integer >0
      $env->omp_schedule(qq{$sched,$chunk});
      
      my $exit_code = system(qw{/path/to/my_prog_r --opt1 x --opt2 y});
       
      if ($exit_code == 0) {
        # ... do some post processing
      }
      else {
        # ... handle failed execution
      }
    }
  }

DESCRIPTION

OpenMP::Environment provides accessors for the OpenMP/GOMP environmental variables that affect some aspects of OpenMP programs and shared libraries at library load and run times. There are setters, getters, and unsetters for all published OpenMP (and GOMP) environmental variables, in addition to some utility methods.

The environment variables beginning with OMP_ are defined by section 4 of the OpenMP specification, version 4.5, while those beginning with GOMP_ are GNU extensions.

The author of this module is also the author of OpenMP::Simple, and it is recommended that the two modules be used together for maximum ease in creating Perl programs that contain C code parallelized using OpenMP. For the curious, "Example 4" illustrates how to use Alien::OpenMP and query %ENV directly in a way that mimics the OpenMP runtime's expected behavior of explicitly querying the environment for important information, such as OMP_NUM_THREADS, at the start of execution.

However, the recommended approach is illustrated in "Example 5", which uses both OpenMP::Simple and OpenMP::Environment to incorporate an %ENV-aware OpenMP into a Perl program as seamlessly as possible.

ABOUT THIS DOCUMENT

Most provided methods are meant to manipulate a particular OpenMP environmental variable. Each variable has a setter, a getter, and an unsetter (i.e., a method that deletes the variable from %ENV directly).

Each method is documented, and it is noted whether the setter validates the provided value. Validation occurs whenever the set of valid values is a simple numerical value or a limited set of specific strings. It is clearly noted below when a setter does not validate. The same applies to assert_omp_environment, which validates whichever of these variables are already set in %ENV.

https://gcc.gnu.org/onlinedocs/libgomp/Environment-Variables.html

Help with improving this document, and with advocating for the use of OpenMP in the Perl world, is very much appreciated.

USES AND USE CASES

Intended Audience

OpenMP::Environment used alone is for individuals wishing to write code for managing complex workflows involving standalone executables parallelized using OpenMP, where managing the OpenMP run time parameters via the environment is important. It helps to be familiar with Perl; however, this is already a highly technical endeavor, and the introduction of Perl for this is a good decision. This module makes it a better one.

Coupled with OpenMP::Simple, the targeted audience broadens to include OpenMP C coders who may not have considered Perl as a host environment for their parallel programs. It also provides a path for experienced Perl programmers to consider using OpenMP in programs that would benefit from the targeted, real parallelization of subroutines that might already be included via Inline::C.

Benchmarks

This module is ideal for supporting benchmarks and test suites that are implemented using OpenMP. As a small example, the script benchmarks/demo-dc-NASA.pl shows the building and execution of the DC benchmark. Distributed with this source are the C and Fortran portions of NASA's NPB (version 3.4.1) benchmarking suite for OpenMP. It's okay: technically I, along with all US citizens, own this code, since we paid for it :). The benchmark suite is available at https://www.nas.nasa.gov/publications/npb.html, but it is only one of many such OpenMP benchmark and validation suites.

Supporting XS Modules Using OpenMP

When C code is included in a Perl program directly as a shared library (via XS, or indirectly via Inline::C based methods, which is what both Alien::OpenMP and OpenMP::Simple use), it can be surprising that the OpenMP runtime queries the user's environment (%ENV) for important variables only once, at start up.

While surprising, this is consistent with how an OpenMP program behaves when run directly as a binary executable. Since we usually want to deal differently with shared libraries included in a Perl program (i.e., we wish to call the functions more than once), we need to explicitly re-read %ENV and apply any updates to the variables we care about in the C code using OpenMP's runtime functions. That's where OpenMP::Simple helps, by providing C macros that do this for all OpenMP runtime functions that are available at start up.

Please jump to "Example 5" to see exactly what the recommended way to solve this looks like. The alternative is to do something messy and non-intuitive, as in "Example 4".

EXAMPLES

There is a set of example scripts in the distribution's examples/ directory.

The number and breadth of tests is also growing; please see those for more examples of how to use this module and of its flexibility.

Lastly, the Section "SUPPORTED OpenMP ENVIRONMENTAL VARIABLES" provides the full description of each environmental variable available in the OpenMP and GOMP documentation. It also describes the range of values that are deemed valid for each variable.

Example 1

Ensure an OpenMP environment is set up properly already (externally)

  use OpenMP::Environment;
  my $env = OpenMP::Environment->new;
  $env->assert_omp_environment;

Example 2

Managing a range of thread scales (useful for benchmarking, testing, etc)

    use OpenMP::Environment;
    my $env = OpenMP::Environment->new;
  
    foreach my $i (qw/1 2 4 8 16 32 64 128 256/) {
      $env->omp_num_threads($i); # Note: validated
      my $exit_code = system(qw{/path/to/my_prog_r --opt1 x --opt2 y});
       
      if ($exit_code == 0) {
        # ... do some post processing
      }
      else {
        # ... handle failed execution
      }
    }

Example 3

Extended benchmarking, affecting OMP_SCHEDULE in addition to OMP_NUM_THREADS.

    use OpenMP::Environment;
    my $env = OpenMP::Environment->new;
  
    foreach my $i (qw/1 2 4 8 16 32 64 128 256/) {
      $env->omp_num_threads($i); # Note: validated
      foreach my $sched (qw/static dynamic auto/) {
        # compute chunk size (get_baby_ruth is not defined here; supply your own)
        my $chunk = get_baby_ruth($i);

        # set schedule using prescribed format
        # Note: format is OMP_SCHED_T[,CHUNK] where OMP_SCHED_T is
        # 'static', 'dynamic', 'guided', or 'auto'; CHUNK is an integer >0
        $env->omp_schedule(qq{$sched,$chunk});
        
        my $exit_code = system(qw{/path/to/my_prog_r --opt1 x --opt2 y});
         
        if ($exit_code == 0) {
          # ... do some post processing
        }
        else {
          # ... handle failed execution
        }
      }
    }

Note: While it has not been tested, theoretically any Perl module that utilizes compiled libraries (via Inline::C, XS, FFIs, etc.) that are OpenMP aware should also be at home within the context of this module.
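
The loops in Examples 2 and 3 rely on the fact that a child process started with system inherits the current %ENV. The following is a minimal, self-contained sketch of that mechanism. It sets $ENV{OMP_NUM_THREADS} directly rather than through OpenMP::Environment, and it uses the running perl ($^X) as a stand-in for an OpenMP executable. The helper name child_sees_threads is made up for this illustration.

```perl
use strict;
use warnings;

# Run a child Perl process that exits 0 only if it sees the expected
# value of OMP_NUM_THREADS in its own environment.
sub child_sees_threads {
    my ($n) = @_;
    local $ENV{OMP_NUM_THREADS} = $n;    # restored when this sub returns
    my $status = system( $^X, '-e',
        'exit( ( $ENV{OMP_NUM_THREADS} // "" ) eq $ARGV[0] ? 0 : 1 )', $n );
    return $status == 0;
}

for my $n (qw/1 2 4/) {
    printf "OMP_NUM_THREADS=%d visible to child: %s\n", $n,
        child_sees_threads($n) ? 'yes' : 'no';
}
```

Note that system returns the wait status rather than the exit code; when a non-zero result needs to be interpreted, the exit code is $? >> 8.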

Example 4

Use with an XS module that itself is OpenMP aware:

Note: OpenMP::Environment has no effect on Perl interfaces that utilize compiled code as shared objects, that also contain OpenMP constructs.

The reason for this is that OpenMP implementations provided by compilers (gcc's gomp, at least) read the environment only once. In our use of Inline::C, this corresponds to the actual loading of the .so that is linked to the XS-based Perl interface it presents. As a result, a developer must use the OpenMP API that is exposed. In the example below, we use omp_set_num_threads rather than setting OMP_NUM_THREADS via %ENV or using OpenMP::Environment's omp_num_threads method.

This example uses OpenMP::Environment, but shows that it works with two caveats:

First, it must be called in a BEGIN block that contains the invocation of Inline::C. Second, it has only this single opportunity to affect the variables that it sets.

    use OpenMP::Environment ();
    use constant USE_DEFAULT => 0;
    
    BEGIN {
        my $oenv = OpenMP::Environment->new;
        $oenv->omp_num_threads(16);     # serve as "default" (GOMP otherwise uses one thread per CPU)
        $oenv->omp_thread_limit(32);    # demonstrate setting of the max number of threads

        use Alien::OpenMP;
        use Inline (
            C    => 'DATA',
            with => qw/Alien::OpenMP/,
        );
    
        # Note: Alien::OpenMP replaces:
        #  use Inline (
        #    C           => 'DATA',
        #    name        => q{Test},
        #    ccflagsex   => q{-fopenmp},
        #    lddlflags   => join( q{ }, $Config::Config{lddlflags}, q{-fopenmp} ),
        #    BUILD_NOISY => 1,
        #  );
    }
    
    # use default
    test(USE_DEFAULT);
    
    for my $num_threads (qw/1 2 4 8 16 32 64 128 256/) {
        test($num_threads);
    }
    
    exit;
    
    __DATA__
    
    __C__
    #include <omp.h>
    #include <stdio.h>
    void test(int num_threads) {
    
      // invoke default set at library load time if a number less than 1 is provided
      if (num_threads > 0)
        omp_set_num_threads(num_threads);
    
      #pragma omp parallel
      {
        if (0 == omp_get_thread_num())
          printf("wanted '%d', got '%d' (max number is %d)\n", num_threads, omp_get_num_threads(), omp_get_thread_limit()); 
      }
    }

"Example 5" in the following section demonstrates how to get around this restriction somewhat. The caveat is that the respective environmental variable must also come with a corresponding setter function in the OpenMP run time. OpenMP::Simple was written to do exactly that as seamlessly as possible and is currently the recommended approach.

Example 5

Writing C functions that honor the OpenMP run time settings that can be changed through the omp_set_* functions:

The following is an example of emulating the familiar behavior of compiled OpenMP programs that respect a number of environmental variables at run time. The key difference between running a compiled OpenMP program at the command line and a compiled subroutine in Perl that utilizes OpenMP is that subsequent calls to the subroutine in the Perl script do not have an opportunity to reload the binary or shared library.

The "user experience" of running an OpenMP program from the shell is that the number of threads used in the program may be set implicitly using the OMP_NUM_THREADS environmental variable. Therefore, one may run the binary in a shell loop and update OMP_NUM_THREADS environmentally. Using OpenMP::Simple (itself a wrapper around Alien::OpenMP) makes it clean and easy to begin adding OpenMP parallelized C code to Perl programs with the kind of environmental runtime controls one familiar with OpenMP has come to expect.

  use strict;
  use warnings;

  use OpenMP::Simple;
  use OpenMP::Environment;

  use Inline (
      C    => 'DATA',
      with => qw/OpenMP::Simple/,
  );

  my $env = OpenMP::Environment->new;

  for my $want_num_threads ( 1 .. 8 ) {
      $env->omp_num_threads($want_num_threads);

      $env->assert_omp_environment; # (optional) validates %ENV

      # call parallelized C function
      my $got_num_threads = _check_num_threads();

      printf "%0d threads spawned in ".
              "the OpenMP runtime, expecting %0d\n", $got_num_threads, $want_num_threads;
  }

  __DATA__
  __C__

  /* C function parallelized with OpenMP */
  int _check_num_threads() {
    int ret = 0;

    PerlOMP_UPDATE_WITH_ENV__NUM_THREADS /* <~ MACRO x OpenMP::Simple */

    #pragma omp parallel
    {
      #pragma omp single
      ret = omp_get_num_threads();
    }

    return ret;
  }

Additional Discussion

OpenMP benchmarks are often written in this fashion. It is possible to affect the number of threads in the binary, but only through the use of run time methods. In the case of OMP_NUM_THREADS, this function is omp_set_num_threads. The issue here is that using run time setters breaks the veil that is so attractive about OpenMP; the pragmas offer a way to implicitly define OpenMP threads *if* the compiler can recognize them; if it can't, the pragmas are designed to appear as normal comments.

Using run time functions is an explicit act, and therefore can't be hidden in the same manner. This requires the compiler to link against OpenMP run time libraries, even if there is no intention to run in parallel. There are 2 options here - hide the run time call from the compiler using ifdef or the like; or link the OpenMP library and just ensure OMP_NUM_THREADS is set to 1 (as in a single thread).

Using OpenMP::Environment introduces the consideration that the compiled subroutine is loaded only once when the Perl script is executed. It is true that in this situation, the environment is read in as expected - but, it is only considered once and at library load time.

To get away from this restriction and more closely emulate the command line user experience with respect to OpenMP environmental variable controls, "Example 5" shows how to re-read certain environmental variables.

Interestingly, there are only 6 run time setters that correspond to OpenMP environmental variables to work with:

omp_set_num_threads

Corresponds to OMP_NUM_THREADS.

omp_set_default_device

Corresponds to OMP_DEFAULT_DEVICE

omp_set_dynamic

Corresponds to OMP_DYNAMIC.

omp_set_max_active_levels

Corresponds to OMP_MAX_ACTIVE_LEVELS

omp_set_nested

Corresponds to OMP_NESTED

omp_set_schedule

Corresponds to OMP_SCHEDULE.
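
The correspondence above can be written down as a small lookup table, which is handy when a launcher script needs to decide whether a variable can be re-applied at run time or only before the library is loaded. This is just a sketch restating the list; the names %runtime_setter_for and can_update_at_runtime are hypothetical and are not part of this module.

```perl
use strict;
use warnings;

# Environment variable => OpenMP run time setter, as listed above.
my %runtime_setter_for = (
    OMP_NUM_THREADS       => 'omp_set_num_threads',
    OMP_DEFAULT_DEVICE    => 'omp_set_default_device',
    OMP_DYNAMIC           => 'omp_set_dynamic',
    OMP_MAX_ACTIVE_LEVELS => 'omp_set_max_active_levels',
    OMP_NESTED            => 'omp_set_nested',
    OMP_SCHEDULE          => 'omp_set_schedule',
);

# True if the variable can be re-applied at run time via a setter function.
sub can_update_at_runtime {
    my ($name) = @_;
    return exists $runtime_setter_for{$name} ? 1 : 0;
}

for my $name (qw/OMP_NUM_THREADS OMP_CANCELLATION/) {
    printf "%-18s %s\n", $name,
        can_update_at_runtime($name)
        ? "re-read via $runtime_setter_for{$name}"
        : 'load time only';
}
```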

METHODS

Note: Because functions compiled and exported (e.g., using Inline::C) read the environment at library load time, only environmental variables that come with a standard set function for changing them at run time can be made to emulate the behavior that those used to executing OpenMP binaries may find familiar. See examples 4 and 5 above for more information about what this means.

new

Constructor

assert_omp_environment

Validates OpenMP related environmental variables that might happen to be set in %ENV directly. Useful as a guard in launcher scripts to ensure the variables that are validated in this module are valid.

As is the case for all variables, an Environment completely devoid of any related variables being set is considered valid. In other words, only variables that are already set in the Environment are validated.
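
The "only validate what is set" rule can be sketched as follows. This is an illustration of the idea, not the module's implementation: the validator table covers just three variables, its patterns are assumptions based on the descriptions later in this document, and the function name invalid_omp_vars is hypothetical. It works on a hash reference so it can be tried without touching the real %ENV.

```perl
use strict;
use warnings;

# Hypothetical validators for a few variables; the module's own rules may differ.
my %validators = (
    OMP_NUM_THREADS  => sub { $_[0] =~ /\A[1-9]\d*(?:,[1-9]\d*)*\z/ },
    OMP_THREAD_LIMIT => sub { $_[0] =~ /\A[1-9]\d*\z/ },
    OMP_DYNAMIC      => sub { $_[0] =~ /\A(?:true|false)\z/i },
);

# Returns the names of variables that are set in the given hash but fail
# their check. Variables that are not set at all are skipped, so an empty
# environment is considered valid.
sub invalid_omp_vars {
    my ($env) = @_;
    my @bad;
    for my $name ( sort keys %validators ) {
        next unless exists $env->{$name};
        push @bad, $name unless $validators{$name}->( $env->{$name} );
    }
    return @bad;
}

my @bad = invalid_omp_vars( { OMP_NUM_THREADS => 'four', OMP_DYNAMIC => 'true' } );
print "invalid: @bad\n";    # prints "invalid: OMP_NUM_THREADS"
```

Passing \%ENV instead of a literal hash reference would check the live environment in the same way.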

vars

Returns a list of all supported OMP_* and GOMP_* environmental variables.

vars_unset

Returns a list of all unset supported variables.

vars_set

Returns a list of hash references of all set variables, of the form,

    (
       VARIABLE1 => value1,
       VARIABLE2 => value2,
       ...
    )

Prints summary of all unset variables.

Uses internal method, _omp_summary_unset to get string to print.

Prints summary of all set variables, including values.

Uses internal method, _omp_summary_set to get string to print.

Prints summary of all set and unset variables; including values where applicable.

Uses internal method, _omp_summary to get string to print.

omp_cancellation

Setter/getter for OMP_CANCELLATION.

Validated.

Note: it appears that the OpenMP Specification (any version) does not define a runtime method to set this. When used with OpenMP::Simple, which makes it a little easier to deal with Inline::C'd OpenMP routines, this must be set before the shared libraries are loaded by Inline::C. The only real opportunity to do this is in a BEGIN block. However, when dealing with a standalone binary executable, this environmental variable will do what you mean when updated between calls to the external executable.

unset_omp_cancellation

Unsets OMP_CANCELLATION, deletes it from localized %ENV.

omp_display_env

Setter/getter for OMP_DISPLAY_ENV.

Validated.

unset_omp_display_env

Unsets OMP_DISPLAY_ENV, deletes it from localized %ENV.

omp_default_device

Setter/getter for OMP_DEFAULT_DEVICE.

Validated.

Note: The other environmental variables presented in this module do not have run time setters. Dealing with these dynamically presents some additional hurdles and considerations; this will be addressed outside of this example.

unset_omp_default_device

Unsets OMP_DEFAULT_DEVICE, deletes it from localized %ENV.

omp_dynamic

Setter/getter for OMP_DYNAMIC.

Validated. If set to a falsy value, the key $ENV{OMP_DYNAMIC} is deleted entirely, because this seems to be how GCC's GOMP needs it to be presented. Simply setting it to 0 or false will not work. It has to be unset. So setting it to a falsy value is the same as calling unset_omp_dynamic.

'true' | 1
'false' | 0 | unset
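
The "falsy means unset" behavior described above can be sketched like this. The function set_dynamic is a hypothetical stand-in for the setter, written against a plain hash reference so it can be tried without touching the real %ENV; how the module itself stores truthy values is an assumption here.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the OMP_DYNAMIC setter.
sub set_dynamic {
    my ( $env, $value ) = @_;
    if ( !$value || lc $value eq 'false' ) {
        delete $env->{OMP_DYNAMIC};    # falsy: remove the key entirely
        return;
    }
    die "invalid OMP_DYNAMIC value '$value'\n"
        unless $value eq '1' || lc $value eq 'true';
    return $env->{OMP_DYNAMIC} = $value;
}

my %env = ( OMP_DYNAMIC => 'true' );
set_dynamic( \%env, 0 );
print( exists $env{OMP_DYNAMIC} ? "still set\n" : "unset\n" );    # prints "unset"
```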

Note: The other environmental variables presented in this module do not have run time setters. Dealing with these dynamically presents some additional hurdles and considerations; this will be addressed outside of this example.

unset_omp_dynamic

Unsets OMP_DYNAMIC, deletes it from localized %ENV.

omp_max_active_levels

Setter/getter for OMP_MAX_ACTIVE_LEVELS.

Validated.

Note: The other environmental variables presented in this module do not have run time setters. Dealing with these dynamically presents some additional hurdles and considerations; this will be addressed outside of this example.

unset_omp_max_active_levels

Unsets OMP_MAX_ACTIVE_LEVELS, deletes it from localized %ENV.

omp_max_task_priority

Setter/getter for OMP_MAX_TASK_PRIORITY.

Validated.

unset_omp_max_task_priority

Unsets OMP_MAX_TASK_PRIORITY, deletes it from localized %ENV.

omp_nested

Setter/getter for OMP_NESTED.

Validated.

Note: The other environmental variables presented in this module do not have run time setters. Dealing with these dynamically presents some additional hurdles and considerations; this will be addressed outside of this example.

unset_omp_nested

Unsets OMP_NESTED, deletes it from localized %ENV.

omp_num_threads

Setter/getter for OMP_NUM_THREADS.

Validated.

Note: This environmental variable has a Standards defined run time function associated with it. Therefore, the approach of rereading the environment demonstrated in "Example 5" may be used with this module to affect this setting at run time.

For more information on this environmental variable, please see:

https://gcc.gnu.org/onlinedocs/libgomp/openmp-environment-variables/ompnumthreads.html

unset_omp_num_threads

Unsets OMP_NUM_THREADS, deletes it from localized %ENV.

omp_num_teams

Setter/getter for OMP_NUM_TEAMS.

Validated.

Note: This environmental variable has a Standards defined run time function associated with it. Therefore, the approach of rereading the environment demonstrated in "Example 5" may be used with this module to affect this setting at run time.

For more information on this environmental variable, please see:

https://gcc.gnu.org/onlinedocs/libgomp/openmp-environment-variables/ompnumteams.html

unset_omp_num_teams

Unsets OMP_NUM_TEAMS, deletes it from localized %ENV.

omp_proc_bind

Setter/getter for OMP_PROC_BIND.

Not validated.

unset_omp_proc_bind

Unsets OMP_PROC_BIND, deletes it from localized %ENV.

omp_places

Setter/getter for OMP_PLACES.

Not validated.

unset_omp_places

Unsets OMP_PLACES, deletes it from localized %ENV.

omp_stacksize

Setter/getter for OMP_STACKSIZE.

Not validated.

unset_omp_stacksize

Unsets OMP_STACKSIZE, deletes it from localized %ENV.

omp_schedule

Setter/getter for OMP_SCHEDULE.

Not validated.

Note: The format for the environmental variable is omp_sched_t[,chunk] where omp_sched_t is: 'static', 'dynamic', 'guided', or 'auto'; chunk is an integer >0

In contrast to the value of OMP_SCHEDULE, the runtime function used to set this in an OpenMP program, omp_set_schedule, expects constant values that are not exposed via the environmental variable OMP_SCHEDULE.

E.g.,

  #include <omp.h>
  ...
  omp_set_schedule(omp_sched_static, 10); // Note: this is the C runtime function call

For more information on this particular environmental variable please see:

https://gcc.gnu.org/onlinedocs/libgomp/openmp-environment-variables/ompschedule.html

Also, see the tests in OpenMP::Simple.

Note: The other environmental variables presented in this module do not have run time setters. Dealing with these dynamically presents some additional hurdles and considerations; this will be addressed outside of this example.

unset_omp_schedule

Unsets OMP_SCHEDULE, deletes it from localized %ENV.

omp_target_offload

Setter/getter for OMP_TARGET_OFFLOAD.

Validated.

unset_omp_target_offload

Unsets OMP_TARGET_OFFLOAD, deletes it from localized %ENV.

omp_thread_limit

Setter/getter for OMP_THREAD_LIMIT.

Validated.

unset_omp_thread_limit

Unsets OMP_THREAD_LIMIT, deletes it from localized %ENV.

omp_teams_thread_limit

Setter/getter for OMP_TEAMS_THREAD_LIMIT.

Validated.

unset_omp_teams_thread_limit

Unsets OMP_TEAMS_THREAD_LIMIT, deletes it from localized %ENV.

omp_wait_policy

Setter/getter for OMP_WAIT_POLICY.

Validated.

unset_omp_wait_policy

Unsets OMP_WAIT_POLICY, deletes it from localized %ENV.

gomp_cpu_affinity

Setter/getter for GOMP_CPU_AFFINITY.

Not validated.

unset_gomp_cpu_affinity

Unsets GOMP_CPU_AFFINITY, deletes it from localized %ENV.

gomp_debug

Setter/getter for GOMP_DEBUG.

Validated.

unset_gomp_debug

Unsets GOMP_DEBUG, deletes it from localized %ENV.

gomp_stacksize

Setter/getter for GOMP_STACKSIZE.

Not validated.

unset_gomp_stacksize

Unsets GOMP_STACKSIZE, deletes it from localized %ENV.

gomp_spincount

Setter/getter for GOMP_SPINCOUNT.

Not validated.

unset_gomp_spincount

Unsets GOMP_SPINCOUNT, deletes it from localized %ENV.

gomp_rtems_thread_pools

Setter/getter for GOMP_RTEMS_THREAD_POOLS.

Not validated.

unset_gomp_rtems_thread_pools

Unsets GOMP_RTEMS_THREAD_POOLS, deletes it from localized %ENV.

SUPPORTED OpenMP ENVIRONMENTAL VARIABLES

The following is essentially a direct copy from the libgomp documentation linked in "ABOUT THIS DOCUMENT":

OMP_CANCELLATION

If set to TRUE, the cancellation is activated. If set to FALSE or if unset, cancellation is disabled and the cancel construct is ignored.

This variable is validated via setter.

OMP_DISPLAY_ENV

If set to TRUE, the OpenMP version number and the values associated with the OpenMP environment variables are printed to stderr. If set to VERBOSE, it additionally shows the value of the environment variables which are GNU extensions. If undefined or set to FALSE, this information will not be shown.

This variable is validated via setter.

OMP_DEFAULT_DEVICE

Set to choose the device which is used in a target region, unless the value is overridden by omp_set_default_device or by a device clause. The value shall be the nonnegative device number. If no device with the given device number exists, the code is executed on the host. If unset, device number 0 will be used.

This variable is validated via setter.

OMP_DYNAMIC

Enable or disable the dynamic adjustment of the number of threads within a team. The value of this environment variable shall be TRUE or FALSE. If undefined, dynamic adjustment is disabled by default.

This variable is validated via setter.

OMP_MAX_ACTIVE_LEVELS

Specifies the initial value for the maximum number of nested parallel regions. The value of this variable shall be a positive integer. If undefined, then if OMP_NESTED is defined and set to true, or if OMP_NUM_THREADS or OMP_PROC_BIND are defined and set to a list with more than one item, the maximum number of nested parallel regions will be initialized to the largest number supported, otherwise it will be set to one.

This variable is validated via setter.

OMP_MAX_TASK_PRIORITY

Specifies the initial value for the maximum priority value that can be set for a task. The value of this variable shall be a non-negative integer, and zero is allowed. If undefined, the default priority is 0.

This variable is validated via setter.

OMP_NESTED

Enable or disable nested parallel regions, i.e., whether team members are allowed to create new teams. The value of this environment variable shall be TRUE or FALSE. If set to TRUE, the number of maximum active nested regions supported will by default be set to the maximum supported, otherwise it will be set to one. If OMP_MAX_ACTIVE_LEVELS is defined, its setting will override this setting. If both are undefined, nested parallel regions are enabled if OMP_NUM_THREADS or OMP_PROC_BIND are defined to a list with more than one item, otherwise they are disabled by default.

This variable is validated via setter.

OMP_NUM_THREADS

Specifies the default number of threads to use in parallel regions. The value of this variable shall be a comma-separated list of positive integers; the value specifies the number of threads to use for the corresponding nested level. Specifying more than one item in the list will automatically enable nesting by default. If undefined, one thread per CPU is used.

This variable is validated via setter.

OMP_PROC_BIND

Specifies whether threads may be moved between processors. If set to TRUE, OpenMP threads should not be moved; if set to FALSE they may be moved. Alternatively, a comma separated list with the values MASTER, CLOSE and SPREAD can be used to specify the thread affinity policy for the corresponding nesting level. With MASTER the worker threads are in the same place partition as the master thread. With CLOSE those are kept close to the master thread in contiguous place partitions. And with SPREAD a sparse distribution across the place partitions is used. Specifying more than one item in the list will automatically enable nesting by default.

When undefined, OMP_PROC_BIND defaults to TRUE when OMP_PLACES or GOMP_CPU_AFFINITY is set and FALSE otherwise.

This module provides access to, but does NOT validate this variable.

OMP_PLACES

The thread placement can be either specified using an abstract name or by an explicit list of the places. The abstract names threads, cores and sockets can be optionally followed by a positive number in parentheses, which denotes how many places shall be created. With threads each place corresponds to a single hardware thread; cores to a single core with the corresponding number of hardware threads; and with sockets the place corresponds to a single socket. The resulting placement can be shown by setting the OMP_DISPLAY_ENV environment variable.

Alternatively, the placement can be specified explicitly as a comma separated list of places. A place is specified by a set of nonnegative numbers in curly braces, denoting the hardware threads. The hardware threads belonging to a place can either be specified as a comma separated list of nonnegative thread numbers or using an interval. Multiple places can also be either specified by a comma separated list of places or by an interval. To specify an interval, a colon followed by the count is placed after the hardware thread number or the place. Optionally, the length can be followed by a colon and the stride number; otherwise a unit stride is assumed. For instance, the following specifies the same places list: "{0,1,2}, {3,4,6}, {7,8,9}, {10,11,12}"; "{0:3}, {3:3}, {7:3}, {10:3}"; and "{0:2}:4:3".

If OMP_PLACES and GOMP_CPU_AFFINITY are unset and OMP_PROC_BIND is either unset or false, threads may be moved between CPUs following no placement policy.

This module provides access to, but does NOT validate this variable.

OMP_STACKSIZE

Set the default thread stack size in kilobytes, unless the number is suffixed by B, K, M or G, in which case the size is, respectively, in bytes, kilobytes, megabytes or gigabytes. This is different from pthread_attr_setstacksize, which gets the number of bytes as an argument. If the stack size cannot be set due to system constraints, an error is reported and the initial stack size is left unchanged. If undefined, the stack size is system dependent.

This module provides access to, but does NOT validate this variable.
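
Since the setter does not validate OMP_STACKSIZE, a launcher script may want to sanity check the value itself. The following is a minimal sketch that converts a value in the format described above to bytes. The function name stacksize_to_bytes is hypothetical, and the assumption that K, M and G are powers of 1024 is mine rather than something stated in the text above.

```perl
use strict;
use warnings;

# Convert an OMP_STACKSIZE-style value to bytes. A bare number is kilobytes.
# Assumption: units are powers of 1024. Returns undef for malformed input.
sub stacksize_to_bytes {
    my ($value) = @_;
    return undef
        unless defined $value
        && $value =~ /\A\s*(\d+)\s*([BKMG])?\s*\z/i;
    my %shift = ( B => 0, K => 10, M => 20, G => 30 );
    return $1 * 2**$shift{ uc( $2 // 'K' ) };
}

printf "%-5s => %s bytes\n", $_, stacksize_to_bytes($_) for (qw/512 16B 4M 1G/);
```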

OMP_SCHEDULE

Allows specifying the schedule type and chunk size. The value of the variable shall have the form type[,chunk], where type is one of static, dynamic, guided or auto. The optional chunk size shall be a positive integer. If undefined, dynamic scheduling and a chunk size of 1 is used.

This module provides access to, but does NOT validate this variable.
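
Because the value is not validated by the setter, a caller may wish to check the type[,chunk] form before setting it. Below is a minimal sketch of such a check; parse_omp_schedule is a hypothetical helper, and it only covers the format described above.

```perl
use strict;
use warnings;

# Split an OMP_SCHEDULE value into (type, chunk). Returns an empty list if
# the value does not match type[,chunk]; chunk is undef when it is omitted.
sub parse_omp_schedule {
    my ($value) = @_;
    return
        unless defined $value
        && $value =~ /\A(static|dynamic|guided|auto)(?:,([1-9]\d*))?\z/i;
    return ( lc $1, $2 );
}

for my $candidate ( 'dynamic,4', 'guided', 'static,0', 'fastest' ) {
    my ( $type, $chunk ) = parse_omp_schedule($candidate);
    printf "%-10s => %s\n", $candidate,
        defined $type
        ? "type=$type chunk=" . ( $chunk // 'default' )
        : 'rejected';
}
```

In the loop above, the first two candidates are accepted, while 'static,0' is rejected because the chunk must be positive and 'fastest' is rejected because it is not a known type.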

OMP_TARGET_OFFLOAD

Specifies the behaviour with regard to offloading code to a device. This variable can be set to one of three values - MANDATORY, DISABLED or DEFAULT.

If set to MANDATORY, the program will terminate with an error if the offload device is not present or is not supported. If set to DISABLED, then offloading is disabled and all code will run on the host. If set to DEFAULT, the program will try offloading to the device first, then fall back to running code on the host if it cannot.

If undefined, then the program will behave as if DEFAULT was set.

This variable is validated via setter.

OMP_THREAD_LIMIT

Specifies the number of threads to use for the whole program. The value of this variable shall be a positive integer. If undefined, the number of threads is not limited.

This variable is validated via setter.

OMP_TEAMS_THREAD_LIMIT

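Specifies an upper bound for the number of threads to use by each contention group created by a teams construct without an explicit thread_limit clause. The value of this variable shall be a positive integer. If undefined, the value of 0 is used, which stands for an implementation-defined upper limit.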

This variable is validated via setter.

OMP_WAIT_POLICY

Specifies whether waiting threads should be active or passive. If the value is PASSIVE, waiting threads should not consume CPU power while waiting, while the value ACTIVE specifies that they should. If undefined, threads wait actively for a short time before waiting passively.

This variable is validated via setter.

GOMP_CPU_AFFINITY

Binds threads to specific CPUs. The variable should contain a space-separated or comma-separated list of CPUs. This list may contain different kinds of entries: either single CPU numbers in any order, a range of CPUs (M-N) or a range with some stride (M-N:S). CPU numbers are zero based. For example, GOMP_CPU_AFFINITY="0 3 1-2 4-15:2" will bind the initial thread to CPU 0, the second to CPU 3, the third to CPU 1, the fourth to CPU 2, the fifth to CPU 4, the sixth through tenth to CPUs 6, 8, 10, 12, and 14 respectively and then start assigning back from the beginning of the list. GOMP_CPU_AFFINITY=0 binds all threads to CPU 0.

There is no libgomp library routine to determine whether a CPU affinity specification is in effect. As a workaround, language-specific library functions, e.g., getenv in C or GET_ENVIRONMENT_VARIABLE in Fortran, may be used to query the setting of the GOMP_CPU_AFFINITY environment variable. A defined CPU affinity on startup cannot be changed or disabled during the run time of the application.

If both GOMP_CPU_AFFINITY and OMP_PROC_BIND are set, OMP_PROC_BIND takes precedence. If neither is set, or when OMP_PROC_BIND is set to FALSE, the host system will handle the assignment of threads to CPUs.

This module provides access to, but does NOT validate this variable.
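Since this variable is not validated, it can help to expand a specification into its CPU order before exporting it. The sketch below follows the entry kinds described above (single CPU, M-N, and M-N:S); the function name expand_cpu_affinity is hypothetical, and rejecting descending ranges or a zero stride is an assumption of this sketch, not documented libgomp behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper: expand a GOMP_CPU_AFFINITY string into an ordered CPU list.
sub expand_cpu_affinity {
    my ($spec) = @_;
    my @cpus;
    for my $entry ( grep { length } split /[\s,]+/, $spec ) {
        if ( $entry =~ /\A(\d+)\z/ ) {
            push @cpus, $1;    # single CPU number
        }
        elsif ( $entry =~ /\A(\d+)-(\d+)(?::(\d+))?\z/ ) {
            my ( $first, $last, $stride ) = ( $1, $2, $3 // 1 );
            # Assumption: treat descending ranges and zero strides as errors.
            die "bad range '$entry'\n" if $last < $first || $stride < 1;
            for ( my $cpu = $first ; $cpu <= $last ; $cpu += $stride ) {
                push @cpus, $cpu;
            }
        }
        else {
            die "unrecognised entry '$entry'\n";
        }
    }
    return @cpus;
}

my @cpus = expand_cpu_affinity('0 3 1-2 4-15:2');

# Threads are assigned in list order, wrapping around at the end of the list.
for my $thread ( 0 .. 11 ) {
    printf "thread %d -> CPU %d\n", $thread, $cpus[ $thread % scalar @cpus ];
}
```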

GOMP_DEBUG

Enable debugging output. The variable should be set to 0 (disabled, also the default if not set), or 1 (enabled).

If enabled, some debugging output will be printed during execution. This is currently not specified in more detail, and subject to change.

This variable is validated via setter.

GOMP_STACKSIZE

Set the default thread stack size in kilobytes. This is different from pthread_attr_setstacksize, which takes the number of bytes as an argument. If the stack size cannot be set due to system constraints, an error is reported and the initial stack size is left unchanged. If undefined, the stack size is system dependent.

This module provides access to, but does NOT validate this variable.

GOMP_SPINCOUNT

Determines how long a thread waits actively, consuming CPU power, before waiting passively without consuming CPU power. The value may be either INFINITE or INFINITY to always wait actively, or an integer which gives the number of spins of the busy-wait loop. The integer may optionally be followed by one of the following suffixes acting as multiplication factors: k (kilo, thousand), M (mega, million), G (giga, billion), or T (tera, trillion). If undefined, 0 is used when OMP_WAIT_POLICY is PASSIVE, 300,000 is used when OMP_WAIT_POLICY is undefined and 30 billion is used when OMP_WAIT_POLICY is ACTIVE. If there are more OpenMP threads than available CPUs, 1000 and 100 spins are used for OMP_WAIT_POLICY being ACTIVE or undefined, respectively; unless the GOMP_SPINCOUNT is lower or OMP_WAIT_POLICY is PASSIVE.

This module provides access to, but does NOT validate this variable.
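Because the spin count is passed through unvalidated, a script may want to check the value itself. The spin count format (INFINITE, INFINITY, or an integer with an optional k, M, G, or T suffix) can be parsed roughly as follows. The function name parse_spincount is hypothetical, and this sketch assumes the suffixes are spelled exactly as listed while the keywords are matched in any case.

```perl
use strict;
use warnings;

# Multiplication factors for the documented suffixes.
my %FACTOR = ( k => 1e3, M => 1e6, G => 1e9, T => 1e12 );

# Hypothetical helper: returns 'infinite' or the number of spins; dies on bad input.
sub parse_spincount {
    my ($value) = @_;
    return 'infinite' if $value =~ /\A(?:INFINITE|INFINITY)\z/i;

    my ( $count, $suffix ) = $value =~ /\A(\d+)([kMGT]?)\z/
        or die "invalid GOMP_SPINCOUNT value '$value'\n";

    return $count * ( $suffix ? $FACTOR{$suffix} : 1 );
}

for my $value (qw/INFINITE 100 300k 30G/) {
    print "$value => ", parse_spincount($value), "\n";
}
```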

GOMP_RTEMS_THREAD_POOLS

This environment variable is only used on the RTEMS real-time operating system. It determines the scheduler instance specific thread pools. The format for GOMP_RTEMS_THREAD_POOLS is a list of optional <thread-pool-count>[$<priority>]@<scheduler-name> configurations separated by : where:

1. thread-pool-count is the thread pool count for this scheduler instance.

2. $<priority> is an optional priority for the worker threads of a thread pool according to pthread_setschedparam. If a priority value is omitted, a worker thread will inherit the priority of the OpenMP master thread that created it. The priority of the worker thread is not changed after creation, even if a new OpenMP master thread using the worker has a different priority.

3. @<scheduler-name> is the scheduler instance name according to the RTEMS application configuration.

If no thread pool configuration is specified for a scheduler instance, each OpenMP master thread of this scheduler instance will use its own dynamically allocated thread pool. To limit the worker thread count of the thread pools, each OpenMP master thread must call omp_set_num_threads.

This module provides access to, but does NOT validate this variable.
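The configuration format above can be checked with a short parser before the string is exported. This is a sketch only: parse_rtems_thread_pools is a hypothetical helper, the scheduler names WRK0 and WRK1 are illustrative, and treating the priority as a plain non-negative integer is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: split a GOMP_RTEMS_THREAD_POOLS string into its configurations.
sub parse_rtems_thread_pools {
    my ($spec) = @_;
    my @pools;
    for my $config ( split /:/, $spec ) {
        # <thread-pool-count>[$<priority>]@<scheduler-name>
        my ( $count, $priority, $scheduler ) =
            $config =~ /\A(\d+)(?:\$(\d+))?\@(.+)\z/
            or die "invalid thread pool configuration '$config'\n";
        push @pools,
            { count => $count, priority => $priority, scheduler => $scheduler };
    }
    return @pools;
}

# Single quotes keep Perl from interpolating the $ and @ characters.
for my $pool ( parse_rtems_thread_pools('1@WRK0:3$4@WRK1') ) {
    printf "%s: %d pool(s), priority %s\n",
        $pool->{scheduler}, $pool->{count}, $pool->{priority} // 'inherited';
}
```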

SEE ALSO

OpenMP::Simple is a module that aims at making it easier to bootstrap Perl+OpenMP programs. It is designed to work together with this module.

This module heavily favors the GOMP implementation of the OpenMP specification within gcc. In fact, it has not been tested with any other implementations.

YouTube videos on using OpenMP and MO's talks at Perl Conferences about Perl+OpenMP exist; the reader is encouraged to learn and try!

https://gcc.gnu.org/onlinedocs/libgomp/index.html

Please also see the rperl project for a glimpse into the potential future of Perl+OpenMP, particularly with regard to thread-safe data structures.

https://www.rperl.org

AUTHOR

Brett Estrade <oodler@cpan.org>

ACKNOWLEDGEMENTS

So far I've received great help on the irc.perl.org channels #pdl and #native, specifically from sivoais, mohawk_pts, and plicease. They helped with the use of Inline::C above and with investigating the issues related to shared library load time versus run time, and when the environment is initialized.

COPYRIGHT AND LICENSE

Same as Perl.