NAME
IBM::LoadLeveler - Perl Access to IBM LoadLeveler API
SYNOPSIS
use IBM::LoadLeveler;
$version = ll_version();
# Workload Management API
$rc=ll_control($control_op,\@host_list,\@user_list,\@job_list,\@class_list,$priority);
$rc=llctl(LL_CONTROL_START|LL_CONTROL_STOP|LL_CONTROL_RECYCLE|LL_CONTROL_RECONFIG|LL_CONTROL_DRAIN|LL_CONTROL_DRAIN_SCHEDD|LL_CONTROL_DRAIN_STARTD|LL_CONTROL_FLUSH|LL_CONTROL_PURGE_SCHEDD|LL_CONTROL_SUSPEND|LL_CONTROL_RESUME|LL_CONTROL_RESUME_STARTD|LL_CONTROL_RESUME_SCHEDD,\@host_list,\@class_list);
$rc=llfavorjob(LL_CONTROL_FAVOR_JOB|LL_CONTROL_UNFAVOR_JOB,\@job_list);
$rc=llfavoruser(LL_CONTROL_FAVOR_USER|LL_CONTROL_UNFAVOR_USER,\@user_list);
$rc=llhold(LL_CONTROL_HOLD_USER|LL_CONTROL_HOLD_SYSTEM|LL_CONTROL_HOLD_RELEASE,\@host_list,\@user_list,\@job_list);
$rc=llprio(LL_CONTROL_PRIO_ABS|LL_CONTROL_PRIO_ADJ,\@job_list,$priority);
$rc=ll_start_job($cluster,$proc,$from_host,\@node_list);
$rc=ll_terminate_job($cluster,$proc,$from_host,$msg);
($rc,$errObj)=ll_preempt($job_step_id, PREEMPT_STEP|RESUME_STEP);
# Error API
ll_error($errObj,1 | 2 );
# Submit API function
($job_name,$owner,$groupname,$uid,$gid,$submit_host,$numsteps,$ref)=llsubmit($job_cmd_file,$monitor_program,$monitor_args);
# Data Access API functions
$query = ll_query( JOBS|MACHINES|CLUSTER|WLMSTAT|MATRIX );
$return = ll_set_request( $query,QUERY_ALL|QUERY_JOBID|QUERY_STEPID|QUERY_USER|QUERY_GROUP|QUERY_CLASS|QUERY_HOST|QUERY_STARTDATE|QUERY_ENDDATE, \@filter,ALL_DATA|Q_LINE|STATUS_LINE );
$object = ll_get_objs( $query, LL_STARTD|LL_SCHEDD|LL_CM|LL_MASTER|LL_STARTER|LL_HISTORY_FILE, $hostname, $number_of_objs, $error_code);
$return = ll_reset_request( $query );
$next_object = ll_next_obj ( $query );
$return = ll_free_objs ( $query );
$return = ll_deallocate ( $query );
$result = ll_get_data( $object, $LLAPI_Specification );
# Query API functions ( deprecated )
my ($version_num,$numnodes,$noderef)=ll_get_nodes();
my ($version_num,$numjobs,$ref)=ll_get_jobs();
DESCRIPTION
This module provides access to the APIs of the IBM LoadLeveler Workload Management System. The APIs currently implemented are:
Error Handling
This version has only been tested with LoadLeveler 3.1.0 under AIX 5.1.
This module is not for the faint-hearted. The LoadLeveler API returns a huge amount of information; the ll_get_data call alone accepts over 300 different specifications. To use this module you really need a copy of the IBM documentation on using LoadLeveler and maybe a copy of the llapi.h header file.
Data Access API
The Data Access API has the following functions:
ll_query
ll_set_request
ll_reset_request
ll_get_objs
ll_get_data
ll_next_obj
ll_free_objs
ll_deallocate
A minimal example of using the Data Access API is:
use IBM::LoadLeveler;
# Query Job information
$query = ll_query(JOBS);
# Ask for all data on all jobs
$return=ll_set_request($query,QUERY_ALL,undef,ALL_DATA);
if ($return != 0 )
{
    print STDERR "ll_set_request failed Return = $return\n";
}
# Query the scheduler for information
# $number will contain the number of objects returned
$job=ll_get_objs($query,LL_CM,NULL,$number,$err);
while ( $job)
{
    # Get the Job submit time
    $SubmitTime=ll_get_data($job,LL_JobSubmitTime);
    $job=ll_next_obj($query);
}
# Free up space allocated by LoadLeveler to hold the job objects
ll_free_objs($query);
# Free up space used by the Query Object
ll_deallocate($query);
- ll_query
-
$query = ll_query( JOBS | MACHINES | CLUSTER | WLMSTAT | MATRIX );
ll_query is the first call you make to access the API; it establishes the type of information you receive.
- ll_set_request
-
$return=ll_set_request($query,$QueryFlags,$ObjectFilter, $DataFilter);
ll_set_request is used to determine the range of data returned by ll_get_objs.
Parameters
- 1 $query
-
The return from ll_query
- 2 $QueryFlags
-
The permissible query flags depend on the type of query being made. The flags are:
QUERY_ALL : Query all jobs.
QUERY_JOBID : Query by job ID.
QUERY_STEPID : Query by step ID.
QUERY_USER : Query by user ID.
QUERY_GROUP : Query by LoadLeveler group.
QUERY_CLASS : Query by LoadLeveler class.
QUERY_HOST : Query by machine name.
QUERY_STARTDATE : Query by job start dates.
QUERY_ENDDATE : Query by job end dates.
They can be used with the following query types:
- 3 $ObjectFilter
-
Specifies the search criteria:
- 4 $DataFilter
-
Filters the amount of data you get back from the query. Permitted values are:
ALL_DATA
Q_LINE
Valid only for a JOBS query, not using the history file, returns the same data as llq -f
STATUS_LINE
Valid only for a MACHINES query, returns the same data as llstatus -f
- ll_reset_request
-
$return=ll_reset_request($query);
This resets the request (surprise!) associated with a query object. Use it if you want to call ll_set_request again with different parameters.
- ll_get_objs
-
$data=ll_get_objs($query,$query_daemon,$host,$number,$err);
Sends a query request to LoadLeveler and returns the first of the objects found.
Parameters
- 1 $query
-
Data from ll_query
- 2 $query_daemon
-
The LoadLeveler Daemon you want to query, permitted values are:
LL_STARTD
LL_SCHEDD
LL_CM (negotiator)
LL_MASTER
LL_STARTER
LL_HISTORY_FILE
- 3 $host
-
Should be NULL unless you are querying LL_STARTD or LL_SCHEDD and want to query a machine other than the localhost. If you are querying LL_HISTORY_FILE then this should be the name of the history file.
- 4 $number
-
The number of query objects returned.
- 5 $err
-
Set to an error code if the call fails. Possible values are:
- -1 query_element not valid
- -2 query_daemon not valid
- -3 Cannot resolve hostname
- -4 Request type for specified daemon not valid
- -5 System error
- -6 No valid objects meet the request
- -7 Configuration error
- -9 Connection to daemon failed
- -10 Error processing history file (LL_HISTORY_FILE query only)
- -11 History file must be specified in the hostname argument (LL_HISTORY_FILE query only)
- -12 Unable to access the history file (LL_HISTORY_FILE query only)
- -13 DCE identity of calling program can not be established
- -14 No DCE credentials
- -15 DCE credentials within 300 secs of expiration
- -16 64-bit API is not supported when DCE is enabled
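The codes above can be turned into readable messages with a small lookup table. The following is only a sketch: the hash %LL_GET_OBJS_ERROR and the helper describe_get_objs_error are illustrative names, not part of IBM::LoadLeveler, and the messages are simply copied from the list above.

```perl
use strict;
use warnings;

# Messages copied from the error list above. The hash and the helper
# below are hypothetical names, not provided by IBM::LoadLeveler.
my %LL_GET_OBJS_ERROR = (
    -1  => 'query_element not valid',
    -2  => 'query_daemon not valid',
    -3  => 'Cannot resolve hostname',
    -4  => 'Request type for specified daemon not valid',
    -5  => 'System error',
    -6  => 'No valid objects meet the request',
    -7  => 'Configuration error',
    -9  => 'Connection to daemon failed',
    -10 => 'Error processing history file',
    -11 => 'History file must be specified in the hostname argument',
    -12 => 'Unable to access the history file',
    -13 => 'DCE identity of calling program can not be established',
    -14 => 'No DCE credentials',
    -15 => 'DCE credentials within 300 secs of expiration',
    -16 => '64-bit API is not supported when DCE is enabled',
);

# Return the message for a code, or a generic string for anything
# that is not in the table.
sub describe_get_objs_error {
    my ($err) = @_;
    return 'no error code supplied' unless defined $err;
    return $LL_GET_OBJS_ERROR{$err} if exists $LL_GET_OBJS_ERROR{$err};
    return "unrecognised code $err";
}

# Typical use after a failed query (left as a comment so this sketch
# runs without LoadLeveler installed):
#   $job = ll_get_objs($query, LL_CM, NULL, $number, $err);
#   warn describe_get_objs_error($err), "\n" unless $job;
print describe_get_objs_error(-6), "\n";
```

Keeping the table in one place means a script can report the same wording as this document instead of a bare number.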
- ll_get_data
-
$data=ll_get_data($element,$specification);
This call differs from the IBM documentation: the data you request is returned as the return value and not through a third parameter, e.g.
$lav=ll_get_data($machine,LL_MachineLoadAverage); # Returns a double
@msl=ll_get_data($machine,LL_MachineStepList); # Returns an array of strings.
To know what you are getting you really need to know about the LoadLeveler Job Object Model. All of this is in the IBM LoadLeveler for AIX 5L: Using and Administering book and html. Sorry for not including it here, but there is an awful lot of it.
enum types
Returns from some query types may be, in C terms, enumerated types. In perl these all return as SCALAR Integers. Return values are shown below:
- LL_AdapterReqUsage
-
SHARED, NOT_SHARED, SLICE_NOT_SHARED
- LL_StepHoldType
-
NO_HOLD, HOLDTYPE_USER, HOLDTYPE_SYSTEM, HOLDTYPE_USERSYS
- LL_StepNodeUsage
-
SHARED, NOT_SHARED, SLICE_NOT_SHARED
- LL_StepState
-
STATE_IDLE, STATE_PENDING, STATE_STARTING, STATE_RUNNING, STATE_COMPLETE_PENDING, STATE_REJECT_PENDING, STATE_REMOVE_PENDING, STATE_VACATE_PENDING, STATE_COMPLETED, STATE_REJECTED, STATE_REMOVED, STATE_VACATED, STATE_CANCELED, STATE_NOTRUN, STATE_TERMINATED, STATE_UNEXPANDED, STATE_SUBMISSION_ERR, STATE_HOLD, STATE_DEFERRED, STATE_NOTQUEUED, STATE_PREEMPTED, STATE_PREEMPT_PENDING, STATE_RESUME_PENDING
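Because these values come back as plain integers, it can help to map them back to names when printing. The sketch below does this for LL_StepState. It assumes that the names are listed above in the same order as the enum in llapi.h and that numbering starts at 0; check both against your own header before relying on it. The helper name step_state_name is made up for this example.

```perl
use strict;
use warnings;

# Names in the order given above. ASSUMPTION: this matches the enum
# order in llapi.h and the first value is 0. Verify against your header.
my @STEP_STATE_NAMES = qw(
    STATE_IDLE STATE_PENDING STATE_STARTING STATE_RUNNING
    STATE_COMPLETE_PENDING STATE_REJECT_PENDING STATE_REMOVE_PENDING
    STATE_VACATE_PENDING STATE_COMPLETED STATE_REJECTED STATE_REMOVED
    STATE_VACATED STATE_CANCELED STATE_NOTRUN STATE_TERMINATED
    STATE_UNEXPANDED STATE_SUBMISSION_ERR STATE_HOLD STATE_DEFERRED
    STATE_NOTQUEUED STATE_PREEMPTED STATE_PREEMPT_PENDING
    STATE_RESUME_PENDING
);

# Hypothetical helper: translate an integer step state into its name.
sub step_state_name {
    my ($value) = @_;
    return 'UNKNOWN'
        unless defined $value
            && $value =~ /^\d+$/
            && $value < @STEP_STATE_NAMES;
    return $STEP_STATE_NAMES[$value];
}

# With the module loaded this would be fed from something like
#   $state = ll_get_data($step, LL_StepState);
print step_state_name(3), "\n";
```

Inside a script that has loaded the module, comparing against the imported constants directly (for example $state == STATE_RUNNING) avoids the ordering assumption altogether.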
- ll_next_obj
-
$job=ll_next_obj($query);
This returns the next object from the query object.
- ll_free_objs
-
$return=ll_free_objs($query);
This frees the space taken by the ll_get_objs routine for the LoadLeveler data. Since there is probably an awful lot of it, this is a very important call.
- ll_deallocate
-
$return=ll_deallocate($query);
Frees the query object itself. This is the last LoadLeveler action.
Submit API
The Submit API has one function:
llsubmit
On successful submission this function returns a `perlised' version of the LL_job structure. See the llsubmit example and the LoadLeveler API header file llapi.h for full information on how to use it. Currently the following LL_job_step structure members are not returned:
usage_info64
adapter_req
llfree_job_info is not implemented because it is done in the llsubmit call after the data has been transferred to Perl.
A minimal example of using the Submit API is:
use IBM::LoadLeveler;
my ($job_name,$owner,$group,$uid,$gid,$host,$steps,$job_step)=llsubmit("/home/gmhk/test_job/test_job.cmd",NULL,NULL);
print "Job Name = $job_name\n";
print "Owner = $owner\n";
print "Group = $group\n";
print "UID = $uid\n";
print "GID = $gid\n";
print "HOST = $host\n";
print "STEPS = $steps\n";
print "JOB_STEP = $job_step\n";
@steps=@{$job_step};
print "JOB_STEP = $#{$job_step}\n";
foreach $stepref (@steps)
{
    %step=%{$stepref};
    print "STEP_NAME = $step{'step_name'}\n";
    print "REQUIREMENTS = $step{'requirements'}\n";
    %usage_info = %{$step{'usage_info'}};
    print "USAGE INFO = --------------------\n";
    print " STARTER_RUSAGE = $usage_info{'starter_rusage'}\n";
    %rusage=%{$usage_info{'starter_rusage'}};
    print " RU_UTIME = $rusage{'ru_utime'}\n";
    print " RU_MAXRSS = $rusage{'ru_maxrss'}\n";
    print " STEP_RUSAGE = $usage_info{'step_rusage'}\n";
    print " MACH_USAGE ITEMS = $#{$usage_info{'mach_usage'}}\n";
}
- llsubmit
-
($job_name,$owner,$groupname,$uid,$gid,$submit_host,$numsteps,$ref)=llsubmit($job_cmd_file,$monitor_program,$monitor_args);
Parameters
$job_cmd_file
A string containing the name of the Job Command File
$monitor_program
A string containing the name of the monitor program to be invoked when the state of the job changes. Set to NULL if a monitoring program is not provided.
$monitor_args
A string which is stored in the job object and passed to the monitor program. The maximum length of the string is 1023 bytes; anything longer is truncated to 1023 bytes. Set to NULL if an argument is not provided.
Return
$job_name
$owner
$groupname
$uid
$gid
$submit_host
$numsteps
$ref
$ref is a reference to an array of job step information. Each job step is a hash whose keys are the names of the elements in the LL_job_step structure, e.g.:
@steps = @{$ref};
foreach $stepref ( @steps )
{
    %step=%{$stepref};
    print "STEP_NAME = $step{'step_name'}\n";
    print "REQUIREMENTS = $step{'requirements'}\n";
    print "PREFERENCES = $step{'preferences'}\n";
}
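The same structure can be walked without copying each hash, which also makes it easy to cope with missing keys. The following is a minimal sketch, assuming only the array-of-hashes layout described above; the helper summarise_steps and the sample data are invented for illustration and stand in for a real llsubmit return.

```perl
use strict;
use warnings;

# Hypothetical helper: build one "name: requirements" line per job step
# from the array reference returned as the last element by llsubmit.
sub summarise_steps {
    my ($ref) = @_;
    my @lines;
    foreach my $step ( @{ $ref || [] } ) {
        my $name = defined $step->{step_name}    ? $step->{step_name}    : '(unnamed)';
        my $req  = defined $step->{requirements} ? $step->{requirements} : '(none)';
        push @lines, "$name: $req";
    }
    return @lines;
}

# Hand-built stand-in for $ref so the sketch runs without LoadLeveler.
# The step names and the requirements string are made-up values.
my $ref = [
    { step_name => 'step0', requirements => '(Arch == "R6000")' },
    { step_name => 'step1' },
];
print "$_\n" for summarise_steps($ref);
```

Using $step->{key} on the reference avoids the %step=%{$stepref} copy, which matters little here but keeps the loop short when a job has many steps.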
Query API
The Query API has been deprecated by IBM.
The Query API has the following functions:
- ll_get_jobs
- ll_get_nodes
- ll_get_jobs
-
The return from ll_get_jobs is a perlized version of the LL_job structure. In Perl terms it is a horror. This is how to decode the information for one step:
# -*- Perl -*-
# Use ll_get_jobs to print out information about one job
use IBM::LoadLeveler;

my ($version_num,$numjobs,$ref)=ll_get_jobs();
print "Version : $version_num\n";
print "Number of Jobs : $numjobs\n";
print "----------------------------------------\n";
@jobs=@{$ref};
# Get the reference to one job (pop takes the last entry in the list)
$job=pop @jobs;
# Get the job information
my($job_name,$owner,$groupname,$uid,$gid,$submit_host,$job_steps,$job_step)=@{$job};
print "Job Name : $job_name\n";
print "Owner : $owner\n";
print "Group Name : $groupname\n";
print "UID : $uid\n";
print "GID : $gid\n";
print "Submit Host : $submit_host\n";
print "Number of Steps : $job_steps\n";
print "----------------------------------------\n";
# Print information about one job step (again the last entry in the list)
$ref = pop @{$job_step};
%step= %{$ref};
print "Step Name : $step{'step_name'}\n";
print "Requirements : $step{'requirements'}\n";
print "Preferences : $step{'preferences'}\n";
print "User Step Pri : $step{'prio'}\n";
print "Step Dependency : $step{'dependency'}\n";
print "Group Name : $step{'group_name'}\n";
print "Step Class : $step{'stepclass'}\n";
print "Start Date : ", scalar localtime($step{'start_date'}),"\n";
print "Step Flags : $step{'flags'}\n";
print "Minimum # Procs : $step{'min_processors'}\n";
print "Maximum # Procs : $step{'max_processors'}\n";
print "Account Number : $step{'account_no'}\n";
print "User Comment : $step{'comment'}\n";
print "Step ID : @{$step{'id'}}\n";
print "Submit Date : ", scalar localtime($step{'q_date'}),"\n";
print "Status : $step{'status'}\n";
print "Actual # Procs : $step{'num_processors'}\n";
print "Assigned Procs : @{$step{'processor_list'}}\n";
print "Command : $step{'cmd'}\n";
print "Arguments : $step{'args'}\n";
print "Environment : $step{'env'}\n";
print "stdin : $step{'in'}\n";
print "stdout : $step{'out'}\n";
print "stderr : $step{'err'}\n";
print "Initial Dir : $step{'iwd'}\n";
print "Notify User : $step{'notify_user'}\n";
print "Shell : $step{'shell'}\n";
print "Command : $step{'cmd'}\n";
print "User Tracker Exit : $step{'tracker'}\n";
print "Tracker Args : $step{'tracker_arg'}\n";
print "Notification : $step{'notification'}\n";
print "Image Size : $step{'image_size'}\n";
print "Executable Size : $step{'exec_size'}\n";
print "Step Res Limits : @{$step{'limits'}}\n";
print "NQS Info : @{$step{'nqs_info'}}\n";
print "Dispatch Date : ", scalar localtime($step{'dispatch_time'}),"\n";
print "Start Time : $step{'start_time'}\n";
print "Completion Code : $step{'completion_code'}\n";
print "Completion Date : ", scalar localtime($step{'completion_date'}),"\n";
print "Start Count : $step{'start_count'}\n";
%usage_info = %{$step{'usage_info'}};
print "Starter rusage ru_utime : @{$usage_info{'starter_rusage'}{'ru_utime'}}\n";
print "Starter rusage ru_stime : @{$usage_info{'starter_rusage'}{'ru_stime'}}\n";
print "Starter rusage ru_maxrss : $usage_info{'starter_rusage'}{'ru_maxrss'}\n";
print "Starter rusage ru_ixrss : $usage_info{'starter_rusage'}{'ru_ixrss'}\n";
print "Starter rusage ru_majflt : $usage_info{'starter_rusage'}{'ru_majflt'}\n";
print "Starter rusage ru_nswap : $usage_info{'starter_rusage'}{'ru_nswap'}\n";
print "Starter rusage ru_maxrss : $usage_info{'starter_rusage'}{'ru_maxrss'}\n";
print "Starter rusage ru_inblock : $usage_info{'starter_rusage'}{'ru_inblock'}\n";
print "Starter rusage ru_oublock : $usage_info{'starter_rusage'}{'ru_oublock'}\n";
print "Starter rusage ru_msgsnd : $usage_info{'starter_rusage'}{'ru_msgsnd'}\n";
print "Starter rusage ru_msgrcv : $usage_info{'starter_rusage'}{'ru_msgrcv'}\n";
print "Starter rusage ru_nsignals : $usage_info{'starter_rusage'}{'ru_nsignals'}\n";
print "Starter rusage ru_nvcsw : $usage_info{'starter_rusage'}{'ru_nvcsw'}\n";
print "Starter rusage ru_nivcsw : $usage_info{'starter_rusage'}{'ru_nivcsw'}\n";
print "Step rusage ru_utime : @{$usage_info{'step_rusage'}{'ru_utime'}}\n";
print "Step rusage ru_stime : @{$usage_info{'step_rusage'}{'ru_stime'}}\n";
print "Step rusage ru_maxrss : $usage_info{'step_rusage'}{'ru_maxrss'}\n";
print "Step rusage ru_ixrss : $usage_info{'step_rusage'}{'ru_ixrss'}\n";
print "Step rusage ru_majflt : $usage_info{'step_rusage'}{'ru_majflt'}\n";
print "Step rusage ru_nswap : $usage_info{'step_rusage'}{'ru_nswap'}\n";
print "Step rusage ru_maxrss : $usage_info{'step_rusage'}{'ru_maxrss'}\n";
print "Step rusage ru_inblock : $usage_info{'step_rusage'}{'ru_inblock'}\n";
print "Step rusage ru_oublock : $usage_info{'step_rusage'}{'ru_oublock'}\n";
print "Step rusage ru_msgsnd : $usage_info{'step_rusage'}{'ru_msgsnd'}\n";
print "Step rusage ru_msgrcv : $usage_info{'step_rusage'}{'ru_msgrcv'}\n";
print "Step rusage ru_nsignals : $usage_info{'step_rusage'}{'ru_nsignals'}\n";
print "Step rusage ru_nvcsw : $usage_info{'step_rusage'}{'ru_nvcsw'}\n";
print "Step rusage ru_nivcsw : $usage_info{'step_rusage'}{'ru_nivcsw'}\n";
$first_mach_usage_info = $#{$usage_info{'mach_usage'}};
print "Step machine Usage : $first_mach_usage_info\n";
print "User System Prio : $step{'user_sysprio'}\n";
print "Group System Prio : $step{'group_sysprio'}\n";
print "Class System Prio : $step{'class_sysprio'}\n";
print "User Number : $step{'number'}\n";
print "CPUS requested : $step{'cpus_requested'}\n";
print "Virtual Mem Req : $step{'virtual_memory_requested'}\n";
print "Memory Requested : $step{'memory_requested'}\n";
print "Adapter Used mem : $step{'adapter_used_memory'}\n";
print "Adapter Reg count : $step{'adapter_req_count'}\n";
print "Image Size : $step{'image_size64'}\n";
print "Executable Size : $step{'exec_size64'}\n";
print "Step Res Limits : @{$step{'limits64'}}\n";
print "Virtual Mem Req : $step{'virtual_memory_requested64'}\n";
print "Memory Requested : $step{'memory_requested64'}\n";
print "Last Checkpoint : $step{'good_ckpt_start_time'}\n";
print "Time Spent ckpting: $step{'accum_ckpt_time'}\n";
print "Checkpoint Dir : $step{'ckpt_dir'}\n";
print "Checkpoint File : $step{'ckpt_file'}\n";
print "Large Page Req : $step{'large_page'}\n";
- ll_get_nodes
-
ll_get_nodes is almost as bad as ll_get_jobs. The following is an example of decoding the data returned:
# -*- Perl -*-
use IBM::LoadLeveler;

# Use the deprecated ll_get_nodes call to find information on all nodes in the system
# similar to llstatus -l
my ($version_num,$numnodes,$ref)=ll_get_nodes();
print "LoadLeveler Version : $version_num\n";
print "Number of Nodes : $numnodes\n";
@nodes=@{$ref};
foreach $node (@nodes)
{
    my($node_name,$version,$configtimestamp,$timestamp,$vmem,$memory,$disk,$loadavg,$speed,$max_starters,$pool,$cpus,$state,$keywordidle,$totaljobs,$arch,$opsys,$adapters,$feature,$job_class,$initiators,$steplist,$vmem64,$memory64,$disk64)=@{$node};
    print "----------------------------------------\n";
    print "Node Name : $node_name\n";
    print "Proc Version : $version\n";
    print "Date of reconfig : ",scalar localtime $configtimestamp,"\n";
    print "Data timestamp : ",scalar localtime $timestamp,"\n";
    print "Virtual Memory (KB) : $vmem\n";
    print "Physical Memory (KB) : $memory\n";
    print "Avail Disk Space (KB): $disk\n";
    print "Load Average : $loadavg\n";
    print "Node Speed : $speed\n";
    print "Max Jobs allowed : $max_starters\n";
    print "Pool Number : $pool\n";
    print "Number of CPUs : $cpus\n";
    print "Startd state : $state\n";
    print "Since keyboard active: $keywordidle\n";
    print "Total jobs : $totaljobs\n";
    print "Hardware Architecture: $arch\n";
    print "Operating System : $opsys\n";
    print "Available Adapters : @{$adapters}\n";
    print "Available Features : @{$feature}\n";
    # Count how many times each class appears
    %classes=();
    foreach $class ( @{$job_class} )
    {
        $classes{$class}++;
    }
    print "Job Classes Allowed : ";
    foreach $class ( keys %classes )
    {
        print "$class\($classes{$class}\) ";
    }
    print "\n";
    # Same again for the initiators
    %classes=();
    foreach $class ( @{$initiators} )
    {
        $classes{$class}++;
    }
    print "Initiators Available : ";
    foreach $class ( keys %classes )
    {
        print "$class\($classes{$class}\) ";
    }
    print "\n";
    @steps=@{$steplist};
    print "Steps Allocated : ";
    if ( $#steps < 0 )
    {
        print "None.";
    }
    else
    {
        foreach $step ( @steps )
        {
            @id=@{$step};
            print "$id[2].$id[0].$id[1] ";
        }
    }
    print "\n";
    print "Virtual Memory (KB) : $vmem64\n";
    print "Physical Memory (KB) : $memory64\n";
    print "Avail Disk Space (KB): $disk64\n";
}
Workload Management API
The Workload Management API has the following functions:
- ll_control
- llctl
- llfavorjob
- llfavoruser
- llhold
- llprio
- ll_preempt
- ll_start_job
- ll_terminate_job
-
The llctl, llfavorjob, llfavoruser, llhold and llprio functions are all really wrappers for ll_control. The functions ll_start_job and ll_terminate_job are designed for people wanting to produce an external scheduler; they are totally untested in this module.
64 bit types and 32 bit perl
ll_get_data has a whole set of 64 bit return types, which poses a problem for perl when it is compiled in 32 bit mode. This module will return correct values if the value is less than 2^31; otherwise the value will be truncated to 2^31.
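A script can check at run time whether it is exposed to this limit. The sketch below uses the core Config module and treats an integer size of 8 bytes or more as safe. That is an assumption made for this example: how the module behaves on a 32 bit perl built with 64 bit integer support is not covered here. The helper name perl_has_64bit_ints is made up.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: true when this perl stores integers in at least
# 8 bytes. ASSUMPTION: ivsize is a reasonable proxy for whether the
# 64 bit values from ll_get_data survive intact.
sub perl_has_64bit_ints {
    return ( ( $Config{ivsize} || 0 ) >= 8 ) ? 1 : 0;
}

if ( perl_has_64bit_ints() ) {
    print "64 bit integers available\n";
}
else {
    print "32 bit integers only: large 64 bit values may be truncated\n";
}
```

Running the check once at start-up and printing a warning is usually enough; the values themselves cannot be repaired after the call has returned.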
Build/Installation
The module currently relies on the llapi.h file supplied with LoadLeveler for definitions of constants. The make file automatically processes the llapi.h file into a llapi.ph file and installs it as part of the build process.
You will probably need to edit Makefile.PL to change the value of $LoadL to point to where LoadLeveler is installed.
Standard build/installation supported by ExtUtils::MakeMaker(3)...
perl Makefile.PL
make
make test
make install
To convert the pod documentation (what there is of it) to html:
make html
Known Problems
Large History files
This module has been observed to crash when given a history file of >92MB and <132MB ( the killer value is probably 128MB ).
Workaround
The solution is to increase the bmaxdata value of the Perl executable. If you are using the installp version of perl it is recommended to copy the executable to another directory, and modify using ldedit to increase the number of data segments.
cp /usr/opt/perl/bin/perl /global/bin/llperl
/usr/bin/ldedit -o bmaxdata:0x20000000 /global/bin/llperl
Then modify any scripts that have exhibited this behaviour to use the new executable. If this fails then increase the bmaxdata value until successful.
AUTHOR
Mike Hawkins <Mike.Hawkins@awe.co.uk>
SEE ALSO
perl. IBM LoadLeveler for AIX 5L: Using and Administering