NAME
Oracle::SQLLoader - object interface to Oracle's sqlldr
DESCRIPTION
Oracle::SQLLoader provides an object wrapper to the most commonly used functionality of Oracle's SQL*Loader bulk loader utility. It tries to dwim as best as possible by using defaults to all of the various sqlldr options.
The module currently supports the loading of a single table from a single file. The file can be either fixed length or delimited. For a delimited file load, just add the names of the destination columns in the order the fields appears in the data file and optionally supply a data type. For a fixed length load, supply the destination column name; the combination of the field starting offset and field length, or the field start and end offsets in the data file; and an optional data type.
Besides letting you skip the Oracle docs, the module provides a lot of useful stats and return codes by parsing the sqlldr output.
SYNOPSIS
### load a simple comma-delimited file to a single table
$ldr
= new Oracle::SQLLoader(
infile
=>
'/tmp/test.dat'
,
terminated_by
=>
','
,
username
=>
$user
,
password
=>
$pass
,
sid
=>
$sid
);
$ldr
->addTable(
table_name
=>
'test_table'
);
$ldr
->addColumn(
column_name
=>
'first_col'
);
$ldr
->addColumn(
column_name
=>
'second_col'
);
$ldr
->addColumn(
column_name
=>
'third_col'
);
$ldr
->executeLoader() ||
warn
"Problem executing sqlldr: $@\n"
;
# stats
$skipped
=
$ldr
->getNumberSkipped();
$read
=
$ldr
->getNumberRead();
$rejects
=
$ldr
->getNumberRejected();
$discards
=
$ldr
->getNumberDiscarded();
$loads
=
$ldr
->getNumberLoaded();
$beginTS
=
$ldr
->getLoadBegin();
$endTS
=
$ldr
->getLoadEnd();
$runtimeSecs
=
$ldr
->getElapsedSeconds();
$secsOnCpu
=
$ldr
->getCpuSeconds();
#### a fixed length example
$flldr
= new Oracle::SQLLoader(
infile
=>
'/tmp/test.fixed'
,
username
=>
$user
,
password
=>
$pass
,
);
$flldr
->addTable(
table_name
=>
'test_table'
);
$flldr
->addColumn(
column_name
=>
'first_col'
,
field_offset
=> 0,
field_length
=> 4,
column_type
=>
$INT
);
$flldr
->addColumn(
column_name
=>
'second_col'
,
field_offset
=> 4,
field_end
=> 9);
$flldr
->addColumn(
column_name
=>
'third_col'
,
field_offset
=> 9,
field_end
=> 14,
column_type
=>
$CHAR
);
$flldr
->addColumn(
column_name
=>
'timestamp'
,
field_offset
=> 9,
field_length
=> 13,
column_type
=>
$DATE
,
date_format
=>
"YYYYMMDD HH24:MI"
);
$flldr
->executeLoader() ||
warn
"Problem executing sqlldr: $@\n"
;
# stats
$skipped
=
$ldr
->getNumberSkipped();
$read
=
$ldr
->getNumberRead();
$rejects
=
$ldr
->getNumberRejected();
$discards
=
$ldr
->getNumberDiscarded();
$loads
=
$ldr
->getNumberLoaded();
$beginTS
=
$ldr
->getLoadBegin();
$endTS
=
$ldr
->getLoadEnd();
$runtimeSecs
=
$ldr
->getElapsedSeconds();
$secsOnCpu
=
$ldr
->getCpuSeconds();
AUTHOR
Ezra Pagel <ezra@cpan.org>
CONTRIBUTIONS
John Huckelba - fix for single record files scenario
Craig Pearlman <cpearlman@healthmarketscience.com> - added various fixes, most importantly initializing and generating the options clause correctly.
COPYRIGHT
The Oracle::SQLLoader module is Copyright (c) 2006 Ezra Pagel.
The Oracle::SQLLoader module is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.
Modifications to Oracle::SQLLoader are Copyright (c) 2006 Health Market Science, though this code remains free under the terms of the original license.
SQL*Loader is Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
API
IMPORTABLE VARIABLES
- Column Data Types
-
- $CHAR - data to be interpreted as character. empty data is treated as null
- $INT - data to be interpreted as integer. empty data is treated as 0
- $DECIMAL - data to be interpreted as floating point. empty data is treated as null
- $DATE - date fields. these want a valid format, but will default to DD-MON-YY
- Table Load Modes
PUBLIC METHODS
new()
create a new Oracle::SQLLoader object
- mandatory arguments
- common arguments
-
- control_file - supply an explicit file name to use when generating the control file; useful when cleanup is set to false.
- terminated_by - if you're planning on loading a delimited file, just provide the character used as a delimiter and Oracle::SQLLoader will presume that it's a delimited load from here on out. if this option is not supplied, a fixed length file format will be used.
- direct - bypass parsing sql and write directly to the tablespace on the filesystem. this is considerably faster than conventional loads, but can only be performed on a local instance. if this argument is not supplied, conventional load will be used.
- loadmode - one of the above Load Modes: $APPEND, $INSERT, $REPLACE, $TRUNCATE (default $APPEND)
- other optional arguments - you can ususally accept the defaults and pretend that these don't exist.
-
- bindsize - byte size of conventional load buffer; also sets readsize (default 256000)
- columnarrayrows - number of rows to buffer for direct loads (default 5000)
- errors - the maximum number of errors to accept before failing the load (default 50)
- load - the maximum number of records to load (default ALL)
- rows - the number of rows to load before committing (default 64)
- skip - the number of records to allow to be skipped (default 0)
- skip_index_maintenance - for direct loads, don't rebuild indexes (default false)
- skip_unusable_indexes - skips rebuilding unusable indexes (default false)
- streamsize - (direct) the byte size of the load stream (default 256000)
- badfile -
- discardfile -
- eol -
- offset_from - do your offsets start at position 0 or 1
- enclosed_by - are there some sort of enclosing characters, double-quotes perhaps?
- cleanup - want to leave the load files on disk, maybe for testing/auditing?
addTable()
identify a table to be loaded. multiple adds with the same table name clobbers any old definition.
addColumn()
add a column to be loaded
- mandatory arguments for all load types
- mandatory arguments for fixed length load types
- optional arguments for all load types
-
- table_name - the name of the table that this column belongs to. if no table name is specified, default is the last known table name. if no previous table name exists, croak.
- column_type - $CHAR, $INT, $DECIMAL, or $DATE; defaults to $CHAR
- date_format - the TO_DATE format for a $DATE column; defaults to "DD-MON-YY"
- column_length - on occassion, it's useful to specify the length of the field; for some reason, this is required when loading large strings (e.g. CHAR(3000))
executeLoader()
generate a control file and execute an sqlldr job. this is a blocking call. if you don't care about load statistics, you can always fork it off. returns 1 if sqlldr ran successfully, 0 if there were errors or warnings
checkLogfile()
parse an sqlldr logfile and be store results in object status
STATUS METHODS
getNumberSkipped()
returns the number of records skipped , or undef if no stats are known
getNumberRead()
returns the number of read from all input files, or undef if no stats are known
getNumberRejected()
returns the number of records rejected, or undef if no stats are known
getNumberDiscarded()
returns the number of records discarded, or undef if no stats are known
getNumberLoaded()
returns the number of records successfully loaded, or undef if no stats are known
getLastRejectMessage()
returns the last known rejection message, if any
getLoadBegin()
the time that the job began represented as epoch timestamp
getLoadEnd()
the time that the job finished represented as epoch timestamp
getElapsedSeconds()
returns the number if seconds elapsed during load
getCpuSeconds()
returns the number if seconds on cpu during load
getErrors()
returns a listref of any SQL*Loader-specific error codes and messages that were reported in the load logs, e.g the instance is down (SQL*Loader-128), the table does not exist (SQL*Loader-941), or the username/password was wrong (SQL*Loader-101). see the 'SQL*Loader Messages' section of your Oracle docs
getLastError()
returns the last known SQL*Loader error code and message, or an empty string
CONTENT GENERATION METHODS
generateControlfile()
based on the current configuration options, generate a control file
UTILITY METHODS
findProgram()
searches ORACLE_HOME and PATH environment variables for an executable program. returns the full path and file name of the first match, or undef if not found. can be invoked as a class or instance method.
Oracle::SQLLoader->findProgram('sqlldr') or $ldr->findProgram('sqlldr.exe')
checkEnvironment()
ensure that ORACLE_HOME is set and that the sqlldr binary is present and executable. can be invoked as a class or instance method.
Oracle::SQLLoader->findProgram('sqlldr') or $ldr->findProgram('sqlldr.exe')
PRIVATE METHODS
_generateSessionClause()
generate the session-wide information for a control file
_generateTablesClause()
generate table and column information for a control file
_generateDataClause()
generate any input data for a control file
initDefaults()
the loader defaults are almost directly from the sqlldr usage dumps