FILE VALIDATION

The validate plugin validates data files against the rules defined in a control file. The following parameters must be set in the config section:

  • controlfile_dir. The directory that contains the control file.

  • control_file. The name of the control file (see ControlFiles).
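
As a rough sketch, the two config settings might look like the fragment below. The <config> tag name, the directory path and the control file name are assumptions made for illustration only; they are not taken from this document.

    # Hypothetical values; adjust to your own installation
    <config>
        controlfile_dir = conf/control
        control_file    = customer.ctl
    </config>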

The item parameters are:

  • name. The name of this item.

  • ignore_field_count. Whether to continue if the number of fields in the file doesn't match the number of fields in the control file.

  • skip. The number of rows to skip at the start of the file. This allows header records to be ignored.

  • localize. A boolean setting that instructs the plugin to localize the end-of-line markers for the current operating system.

  • file_type. At present, the only type supported is csv.

  • csv_options. A section containing additional options for parsing the file. These correspond to the attributes accepted by the Text::CSV constructor. See http://search.cpan.org/dist/Text-CSV/lib/Text/CSV.pm#new_%28\%attr%29.

  • email_alerts. A comma-delimited list of addresses to receive validation error emails. These will typically be interested users or the suppliers of the files. The ETLp admin address will receive the validation errors regardless.

  • on_error. Overrides the job's on_error setting for this item.

Example

    <item>
        name        = validate customer file
        type        = validate
        file_type   = csv
        skip        = 1
        <csv_options>
            allow_whitespace = 1
            sep_char         = |
        </csv_options>
    </item>
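
The item below is a sketch of how the remaining item parameters could be combined. It assumes that boolean settings are written as 1 or 0, as allow_whitespace is in the example above, and the email addresses are placeholders.

    <item>
        name               = validate supplier file
        type               = validate
        file_type          = csv
        skip               = 1
        # Assumption: 1 means "continue despite a field count mismatch"
        ignore_field_count = 1
        # Assumption: 1 enables end-of-line localization
        localize           = 1
        # Placeholder addresses
        email_alerts       = data-team@example.com,supplier@example.com
    </item>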