Net::Z3950::AsyncZ sets options by means of named parameters for both the parent process and each of its child processes. Options for the parent are set in Net::Z3950::AsyncZ->new. Options for the child processes are set via the options parameter of Net::Z3950::AsyncZ->new; the value of this parameter must be a reference to an array of Net::Z3950::AsyncZ::Options::_params objects.
If a _params object doesn't exist for a child process, Net::Z3950::AsyncZ->new will create one with a set of default options. There will always be a _params object for every server in the servers array, and the two arrays are cross-indexed: $_params_object[0] is used for $servers[0], and so on. If you create your own array of _params objects, you must keep this parallelism in mind.
[1] Options set in Net::Z3950::AsyncZ::new which control the parent process and selected features of the child processes for which no alternatives are present: the alternatives are set as indicated in [2] and [3].
[2] Options set in a Net::Z3950::AsyncZ::Options::_params object: this is returned by Net::Z3950::AsyncZ::asyncZOptions(). There is one _params object for each server: if you don't create one, it is created for you with the default values. If you don't create a _params object for a server, then the log and query options set in the AsyncZ constructor will be used. The rationale behind this is that you usually will be asking one question across all servers and will usually be using only one log file for debugging.
But in all other cases where an option for the child can be set in both the AsyncZ constructor and _params, the _params setting will be used. At the moment this affects the format and num_to_fetch options.
[3] Options set in the Net::Z3950::Manager by using the Z3950_options option of the _params object. These take precedence over any others and must be passed in with the first _params object, that is, $_params_object[0], because AsyncZ uses only one Net::Z3950::Manager. The Manager is created when setting up the first server passed into the constructor.
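The fallback rules in [1] and [2] can be sketched as a small model. This is an illustration only, not AsyncZ code: plain hashes stand in for the constructor's named parameters and for a _params object, the function name effective_options is hypothetical, and the query string is just a sample value.

```perl
use strict;
use warnings;

# Illustrative model only: plain hashes stand in for the constructor's
# named parameters and for a _params object. This is not AsyncZ code.
my %DEFAULTS = (num_to_fetch => 5);

# Work out the options one child process would run with.
#   $ctor   - hashref of options passed to the constructor
#   $params - hashref standing in for the server's _params object,
#             or undef if the caller did not create one
sub effective_options {
    my ($ctor, $params) = @_;

    # No _params object: the constructor's settings are used as given.
    return +{ %DEFAULTS, %$ctor } unless defined $params;

    my %opt = (%DEFAULTS, %$params);

    # query and log fall back to the constructor when _params omits them.
    for my $name (qw(query log)) {
        $opt{$name} = $ctor->{$name} unless defined $params->{$name};
    }

    # num_to_fetch does NOT fall back: it comes from _params or the default.
    return \%opt;
}

my %ctor = (query => '@attr 1=4 "perl"', log => 'parent.log', num_to_fetch => 20);

my $no_params   = effective_options(\%ctor, undef);
my $with_params = effective_options(\%ctor, { log => 'child.log' });

print "no _params:   num_to_fetch=$no_params->{num_to_fetch}\n";
print "with _params: num_to_fetch=$with_params->{num_to_fetch}, ",
      "log=$with_params->{log}, query=$with_params->{query}\n";
```

With a _params object present, the child keeps the constructor's query but reverts to the default num_to_fetch, which is the case most likely to surprise.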
Default values for options are shown to the right of the => operator:
HTML=>0
In some instances, the type of variable is shown and defaults detailed in commentary:
format=>\&format
cb=>\&cb callback function to which records will be sent as available. See Output Callback.
format=>\&format callback function to format individual lines of records. See Format Callback. If you create a _params object for a server and do not set its format option, then the default format will be used, even if you set the format option of the AsyncZ constructor to another value.
interval=>1 Event loop timer interval in seconds: This controls how frequently AsyncZ checks to see if servers have responded and if the timeout period is up.
log=>undef controls how extended error messages are handled. There are two sets of error messages: those handled through Net::Z3950::AsyncZ::ErrMsg, which are meant for the user, and those meant for debugging. The latter are generated by both AsyncZ and the Perl library and can accumulate at a rapid clip. AsyncZ writes its debugging messages to STDOUT, while those coming from library routines almost always go to STDERR. There are 3 options for log.
[1] undef, the default, in which case all debugging messages go to the terminal, and those written to STDOUT will end up in a browser if you are on the web.
[2] log=>Net::Z3950::AsyncZ::Errors::suppressErrors() (or log=>suppressErrors() if you import the function), in which case these messages will be suppressed.
[3] log=>$filespec, in which case all of these messages will go to the file specified in $filespec.
The Net::Z3950::AsyncZ::Options::_params object also has a log option, which means that you can specify a log file for each child process (i.e. for each server queried) while keeping a separate one for the parent. Or you can set up a system where parent and child_1 write to log.1, while child_2 and child_3 write to log.2, etc.
Note: All error logs are automatically opened and closed. Do NOT open or close them yourself!
maxpipes=>4 maximum number of forks to be executed at one time; the greater the number, the more memory and CPU are used.
monitor=>0 timeout in seconds for a monitoring child process, or 0, in which case a monitor is not set.
The monitor is a child process which runs a timer and kills the parent process if it exceeds the timeout period. You need to run the monitor only if your software hangs. An orderly shutdown of all running processes is put into effect, the purpose of which is to prevent the development of zombie processes and to release all shared memory.
num_to_fetch=>5 number of records to fetch; this setting will be used only if you have not created a _params object. This means that if you create a _params object for the server and do not set its num_to_fetch option, then num_to_fetch will default to 5 even if you have set another value for num_to_fetch in the AsyncZ constructor.
options=>\@options reference to an array of references to "Net::Z3950::AsyncZ::Options::_params" objects. Each reference is obtained from a call to "Net::Z3950::AsyncZ::asyncZOptions". For instance:
@options = ( asyncZOptions(option_1=>opt_1,option_2=>opt_2, . . .), undef, asyncZOptions(option_1=>opt_1,option_2=>opt_2, . . .) );
This array parallels the servers array:
@servers = ( [$host_1, $port_1, $database_1], [$host_2, $port_2, $database_2], [$host_3, $port_3, $database_3] );
$options[0] is used for $servers[0] and $options[2] for $servers[2]. If a _params object is not found or is not defined, as for $servers[1], then a default _params object is created for the server.
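The index-by-index pairing can be modelled in a few lines of plain Perl. This is a sketch under stated assumptions, not the module's implementation: the host names, ports and database names are hypothetical, and plain hashes stand in for _params objects.

```perl
use strict;
use warnings;

# Hypothetical servers; plain hashes stand in for _params objects.
my @servers = (
    ['z3950.example.org', 210,  'books'],
    ['z3950.example.net', 210,  'catalog'],
    ['z3950.example.com', 7090, 'main'],
);
my @options = ({ num_to_fetch => 3 }, undef, { num_to_fetch => 10 });

my %default = (num_to_fetch => 5);

# Walk both arrays by index; an undef or missing slot gets the defaults.
my @paired;
for my $i (0 .. $#servers) {
    my $opt = defined $options[$i] ? $options[$i] : { %default };
    push @paired, { server => $servers[$i], options => $opt };
}

for my $p (@paired) {
    my ($host, $port, $db) = @{ $p->{server} };
    print "$host:$port/$db fetches $p->{options}{num_to_fetch} record(s)\n";
}
```

The undef in the middle slot matters: leaving it out would shift the third options entry onto the second server.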
query=>undef the query string: its format depends on the Z3950 querytype, which defaults to 'prefix' (as in Net::Z3950). You can set a separate Z3950 querytype for each query, or you can change the querytype for all servers by using Z3950_options.
If you create a _params object for a server but do not set its query option, then this query will be used. This means that you can set one query for all of your servers without having to re-set it for each of the _params objects you create. But if you create a _params object with a different query, then the query set in _params will be used.
servers=>\@servers reference to an array of server references, each in the form [$host, $port, $database]
See options above and AsyncZ.pod: "The Basic Script".
swap_attempts=>5 the number of times that a swap check will be done before exiting; see swap_check for details.
swap_check=>0 the number of seconds between checks for swapping activity; used when querying a great number of servers and requesting large amounts of data. It instructs AsyncZ to sleep for swap_check seconds before processing any further connections. If you attempt to process more data than fits in your RAM, the system has to swap memory out to the swap space on your disk; too much swapping causes loss of data and disk "thrashing" (i.e. repeated disk access) and will overburden the system. When swap_check is set, AsyncZ checks for signs of swap activity; if it finds any, it sleeps for swap_check seconds and then re-checks, up to swap_attempts times. If the swap activity continues beyond this number of checks, AsyncZ dies.
For large throughput, you will probably want to set the monitor, and to set it for a long period of time, for instance 3000 seconds. This means that you can set swap_check to a period of 10, 20, or 30 seconds. The values you set on these variables will depend on your own system memory resources and the amount of data you are processing. Note: This has been tested only on Linux but should also work on Unix, at least on Solaris.
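The interplay of swap_check and swap_attempts can be pictured with the following model of the retry policy. It is only a sketch: the function name wait_for_swap_to_settle and the injected is_swapping and sleep callbacks are hypothetical, and how AsyncZ actually detects swap activity is not covered here.

```perl
use strict;
use warnings;

# Model of the retry policy only. The is_swapping and sleep callbacks
# are injected stand-ins; AsyncZ's real swap detection is not shown.
sub wait_for_swap_to_settle {
    my (%arg) = @_;
    my $swap_check    = $arg{swap_check};     # seconds to sleep; 0 disables
    my $swap_attempts = $arg{swap_attempts};  # re-checks before giving up
    my $is_swapping   = $arg{is_swapping};
    my $sleep         = $arg{sleep} || sub { sleep $_[0] };

    return 1 unless $swap_check;              # feature switched off
    return 1 unless $is_swapping->();         # no swap activity seen

    for (1 .. $swap_attempts) {
        $sleep->($swap_check);                # back off ...
        return 1 unless $is_swapping->();     # ... then re-check
    }
    return 0;   # still swapping: the point at which AsyncZ would die
}

# Simulated readings: swapping, swapping, then clear.
my @observed = (1, 1, 0);
my $slept    = 0;
my $ok = wait_for_swap_to_settle(
    swap_check    => 10,
    swap_attempts => 5,
    is_swapping   => sub { shift @observed },
    sleep         => sub { $slept += $_[0] },   # record instead of sleeping
);
print "settled=$ok after sleeping $slept seconds\n";
```

In the worst case the process waits swap_check times swap_attempts seconds, so a monitor timeout needs to be comfortably longer than that product.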
timeout=>25 total timeout in seconds for all processes to complete their work.
timeout_min=>5 minimum timeout in seconds before exiting the Event loop if all processes are finished; a security blanket to make sure all processes get a chance to report their results to the parent process before exiting the loop.
Where a _params option duplicates an AsyncZ::new option, consult the AsyncZ::new description for more details.
HTML=>0 if true use default HTML formatting for records, if false format as plain text; see "Row Formatting Priorities".
Z3950_options=>undef reference to hash of additional Z3950 options.
These options are passed to the Z3950 Manager and take precedence over _params options and options set in Net::Z3950::AsyncZ->new. Z3950_options makes it possible to implement Z3950 options which may not be specifically accounted for in any of the options to the AsyncZ module. For instance, to ask for "full" as opposed to "brief" records (which is the Z3950 default):
@options = (
    asyncZOptions(Z3950_options=>{elementSetName=>'f'}),
    # optionally followed by further _params objects:
    # asyncZOptions(. . .), . . .
);
Note: To use this option, it must appear in the first _params object of the _params array, $options[0], as in the above example. It is ignored in any subsequent uses. This means that you cannot set these options on a per-server basis; they apply across the board to all the servers you are querying. In the above example, for instance, you could not ask for brief records from some servers and full from others.
See "Types of Options"
cb=>\&cb reference to callback function to which records will be sent as available
format=>\&format reference to a callback function that formats each row of a record
interval=>5 timer interval for this forked process. See interval above under Net::Z3950::AsyncZ::new.
log=>undef controls how extended error messages are handled for this child process. A separate log file can be opened for each process.
Note: All error logs are automatically opened and closed.
See log above under Net::Z3950::AsyncZ::new.
These options are fully described and illustrated in Report.pod under the heading "MARC Bibliographic Format".
num_to_fetch=>5 number of records to fetch from this server.
pipetimeout=>20 timeout in seconds for this child process
preferredRecordSyntax=>Net::Z3950::RecordSyntax::USMARC the Z3950 preferredRecordSyntax for this child process
query=>undef the query for this process
querytype=>'prefix' Z3950 querytype for this child process; it can also be set to 'ccl' or 'ccl2rpn'.
raw=>0 (boolean) if true the raw record data for this process is returned; its format is dependent on the render option.
render=>1 (boolean) if true, the raw record data for this process is returned filtered through the Z3950 Record::render function; this is the default. If false, the raw data is returned unfiltered in its original state. The unfiltered raw data can be read using Net::Z3950::AsyncZ::prep_Raw and Net::Z3950::AsyncZ::get_ZRawRec.
startrec=>1 number of the record in the Record Set at which the returned results start.
utf8=>0 when set to true, conversions will be made to utf8/unicode characters from the character codes used in MARC records to represent non-latin1 and accented latin1 characters. When outputting utf8, you must call binmode on the output stream, for example:
binmode(STDOUT, ":utf8");
When outputting to a browser, you should also notify the browser:
print "Content-type: text/html;charset=utf-8\n\n";
print '<head><META http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body>';
See the sample script: MARC_HTML.pl.
Note: To use utf8 you must have the MARC::Charset module installed. Otherwise, the utf8 option will be ignored.
If more than one option is set that affects the formatting of a record's rows, the following priority sequence is in effect:
raw, format, HTML, plaintext (default)
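Read as a decision rule, the sequence might look like the sketch below. It is a model only: the function name row_formatting_mode is hypothetical, a plain option list stands in for a _params object, and it assumes that format counts as set only when the caller supplies a code reference.

```perl
use strict;
use warnings;

# Model of the stated priority order: raw, format, HTML, plaintext.
# Assumption: "format" counts as set only when a coderef was supplied.
sub row_formatting_mode {
    my (%opt) = @_;
    return 'raw'    if $opt{raw};
    return 'format' if ref($opt{format}) eq 'CODE';
    return 'HTML'   if $opt{HTML};
    return 'plaintext';
}

my @modes = (
    row_formatting_mode(HTML => 1, raw => 1),
    row_formatting_mode(HTML => 1, format => sub { $_[0] }),
    row_formatting_mode(HTML => 1),
    row_formatting_mode(),
);
print "$_\n" for @modes;
```

Each option wins only when everything ahead of it in the sequence is unset, so raw overrides a format callback and a format callback overrides HTML.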
Net::Z3950::AsyncZ::Options::_params provides a full range of get_option / set_option methods, enabling the dynamic setting of option values.
$_params_object->set_HTML(0); $num_to_fetch = $_params_object->get_num_to_fetch();
In addition there are functions for setting options with fixed values:
Function          Equivalent
set_marc_xtra()   set_marc_fields($Net::Z3950::AsyncZ::Report::xtra)
set_marc_all()    set_marc_fields($Net::Z3950::AsyncZ::Report::all)
set_marc_std()    set_marc_fields($Net::Z3950::AsyncZ::Report::std)
set_raw_on()      set_raw(1)
set_raw_off()     set_raw(0)
set_plaintext()   set_HTML(0)
set_HTML()        set_HTML(1)
set_prefix()      set_querytype('prefix')
set_ccl()         set_querytype('ccl')
set_GRS1()        set_preferredRecordSyntax(Net::Z3950::RecordSyntax::GRS1)
set_USMARC()      set_preferredRecordSyntax(Net::Z3950::RecordSyntax::USMARC)
The get/set methods guarantee that you have in fact set or queried the option you are interested in and, in the case of the fixed value options, that you have set it to the value required. You don't have to be concerned that a meaningless hash key will spring into existence through misspelling:
$_params_object = asyncZOptions(leg=>'Error.LOG', num_to_fish=>3);
In the case of some of the fixed value methods, one advantage is the obvious simplicity of calling set_GRS1() instead of set_preferredRecordSyntax(Net::Z3950::RecordSyntax::GRS1).
The option method works to both get and set values.
$value = $_params_obj->option('option'); $old_options_ref = $_params_obj->option(option=>value,option=>value,option=>value. . . );
params
in get mode: 'option' to be queried
in set mode: list of option=>value pairs to be set (or %hash)
returns
in get mode: $value of option being queried
in set mode: $old_options_ref, a reference to a hash of the option=>value pairs which have been replaced by the list or %hash
$bool = $_params_obj->validOption('option');
$bool = $_params_obj->invalidOption('option');
Both of the above methods enable you to determine whether an option you choose to set is a valid option. They are useful when using Net::Z3950::AsyncZ::Options::_params::option.
$option = 'num_to_fetch'; $_params_obj->validOption($option) ? $_params_obj->option($option=>3) : die "invalid option: $option";
$_params_obj->test();
Calling this function will print a listing of defined options and values for $_params_obj.
Myron Turner <turnermm@shaw.ca> or <mturner@ms.umanitoba.ca>
Copyright 2003 by Myron Turner
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.