- COMMAND LINE INTERFACE
Datahub::Factory::Command::transport - Implements the 'transport' command.
This command allows data managers to (a) fetch data from a (local) source, (b) transform the data to LIDO using a fix, and (c) upload the LIDO-transformed data to a Datahub instance.
Location of the pipeline configuration file.
Location of the general configuration file.
Location of the importer configuration file.
Location of the fixer configuration file.
Location of the exporter configuration file.
Set this flag for pretty output of the ETL processing.
The pipeline configuration file is in the INI format and its location is provided to the application using the
The file is broadly divided into two parts: the first (shortest) part configures the pipeline itself and sets the plugins to use for the import, fix and export actions. The second part sets options specific to the plugins in use.
This part has three sections: [Importer], [Fixer] and [Exporter]. Every section has just one option: plugin. Set this to the plugin you want to use for the corresponding action.
All currently supported plugins are in the Importer and Exporter folders. For the [Fixer], only the Fix plugin is supported.
Supported Importer plugins:
Supported Exporter plugins:
    [Importer]
    plugin = OAI
    id_path = 'lidoRecID.0._'

    [plugin_importer_OAI]
    endpoint = https://oai.my.museum/oai

    [Fixer]
    plugin = Fix

    [plugin_fixer_Fix]
    file_name = '/home/datahub/my.fix'

    [Exporter]
    plugin = YAML

    [plugin_exporter_YAML]
All plugins have their own configuration options in sections called [plugin_type_name], where type can be importer, exporter or fixer and name is the name of the plugin.
All plugins define their own options as parameters to the respective plugin. All possible parameters are valid items in the configuration section.
If a plugin requires no options, you still need to create the (empty) configuration section (e.g. [plugin_exporter_YAML] in the above example).
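The naming convention for plugin sections can be illustrated with a short Python sketch. This is not the Datahub::Factory implementation (which is written in Perl); it only reuses the example configuration from this section and assumes that quoted values should have their single quotes stripped. The function name plugin_sections is hypothetical.

```python
import configparser

# Example pipeline configuration, copied from the example in this section.
PIPELINE_INI = """
[Importer]
plugin = OAI
id_path = 'lidoRecID.0._'

[plugin_importer_OAI]
endpoint = https://oai.my.museum/oai

[Fixer]
plugin = Fix

[plugin_fixer_Fix]
file_name = '/home/datahub/my.fix'

[Exporter]
plugin = YAML

[plugin_exporter_YAML]
"""


def plugin_sections(ini_text):
    """Map each pipeline action to (plugin section name, options dict)."""
    config = configparser.ConfigParser(interpolation=None)
    config.read_string(ini_text)
    result = {}
    for action in ("Importer", "Fixer", "Exporter"):
        plugin = config[action]["plugin"]
        # Section names follow the pattern plugin_<type>_<name>.
        section = f"plugin_{action.lower()}_{plugin}"
        # Even a plugin without options needs its (empty) section.
        if not config.has_section(section):
            raise KeyError(f"missing configuration section [{section}]")
        # Assumption: surrounding single quotes are not part of the value.
        options = {key: value.strip("'") for key, value in config[section].items()}
        result[action] = (section, options)
    return result


sections = plugin_sections(PIPELINE_INI)
for action, (section, options) in sections.items():
    print(action, section, options)
```

In this sketch the Exporter maps to the section plugin_exporter_YAML with an empty options dictionary, which mirrors the rule that the section must exist even when the plugin takes no options.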
The id_path option contains the path (in Fix syntax) of the identifier of each record in your data after the fix has been applied, but before it is submitted to the Exporter. It is used for reporting and logging.
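To make the path notation concrete, the following Python sketch resolves a dot-separated path such as 'lidoRecID.0._' against a nested record. It is a simplified illustration, not the Fix implementation: it only handles plain keys and numeric list indices, and the record structure and identifier value are made up for the example.

```python
def resolve_path(record, path):
    """Walk a dot-separated path through nested dicts and lists."""
    current = record
    for part in path.split("."):
        if isinstance(current, list):
            # Numeric path parts index into lists.
            current = current[int(part)]
        else:
            current = current[part]
    return current


# Hypothetical record shape after the fix has been applied.
record = {"lidoRecID": [{"_": "museum:0001", "type": "local"}]}

record_id = resolve_path(record, "lidoRecID.0._")
print(f"exported record {record_id}")
```

Here the path selects the key lidoRecID, then the first list element, then the key "_", which is the kind of value that would show up in reports and logs.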
    [plugin_fixer_Fix]
    condition = record.institution_name
    fixers = FOO, BAR

    [plugin_fixer_Fix]
    file_name = /home/datahub/my.fix
[plugin_fixer_Fix] can directly load a fix file (via the option file_name), or it can be configured to conditionally load a different fix file to support multiple fix files for the same data stream (e.g. when two institutions with different data models use the same API endpoint). This is done by setting the condition and fixers options.
    [plugin_fixer_Fix]
    condition = record.institution_name
    fixers = FOO, BAR

    [plugin_fixer_FOO]
    condition = 'Museum of Foo'
    file_name = '/home/datahub/foo.fix'

    [plugin_fixer_BAR]
    condition = 'Museum of Bar'
    file_name = '/home/datahub/bar.fix'
If you want to separate the data stream into multiple (smaller) streams with a different fix file for each stream, you can do this by setting the appropriate options in the [plugin_fixer_Fix] block. Note that id_path is still mandatory.
Set condition to the Fix-compatible path in the original stream that holds the value you want to use to split the stream. Provide a comma-separated list of fixer plugins in fixers.
For every fixer plugin in fixers, create a configuration block called [plugin_fixer_name] and provide the following options:
condition: The value that the condition path of [plugin_fixer_Fix] must have for the record to belong to this block.
file_name: The location of the fix file that must be executed for every record in this block.
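The selection logic described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual Perl code: the helper names get_path and select_fix_file are hypothetical, quoted values are assumed to have their single quotes stripped, and returning None for a record that matches no block is a choice made for this sketch only (the behaviour for unmatched records is not described here).

```python
import configparser

# Conditional fixer configuration, copied from the example in this section.
FIXER_INI = """
[plugin_fixer_Fix]
condition = record.institution_name
fixers = FOO, BAR

[plugin_fixer_FOO]
condition = 'Museum of Foo'
file_name = '/home/datahub/foo.fix'

[plugin_fixer_BAR]
condition = 'Museum of Bar'
file_name = '/home/datahub/bar.fix'
"""


def get_path(data, path):
    """Resolve a dot-separated path of plain keys in nested dicts."""
    for key in path.split("."):
        data = data[key]
    return data


def select_fix_file(config, item):
    """Return the fix file of the first block whose condition matches."""
    main = config["plugin_fixer_Fix"]
    # The value found at the condition path decides which block applies.
    value = get_path(item, main["condition"])
    for name in (n.strip() for n in main["fixers"].split(",")):
        block = config[f"plugin_fixer_{name}"]
        if block["condition"].strip("'") == value:
            return block["file_name"].strip("'")
    return None  # Assumption: no matching block means no fix file.


config = configparser.ConfigParser(interpolation=None)
config.read_string(FIXER_INI)

item = {"record": {"institution_name": "Museum of Bar"}}
print(select_fix_file(config, item))
```

With the sample item, the value at record.institution_name is 'Museum of Bar', so the sketch selects the fix file configured in [plugin_fixer_BAR].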
    [Importer]
    plugin = Adlib
    id_path = 'record.id'

    [Fixer]
    plugin = Fix

    [Exporter]
    plugin = Datahub

    [plugin_importer_Adlib]
    file_name = '/tmp/adlib.xml'
    data_path = 'recordList.record.*'

    [plugin_fixer_Fix]
    file_name = '/tmp/msk.fix'

    [plugin_exporter_Datahub]
    datahub_url = https://my.thedatahub.io
    datahub_format = LIDO
    oauth_client_id = datahub
    oauth_client_secret = datahub
    oauth_username = datahub
    oauth_password = datahub
Matthias Vandermaesen <firstname.lastname@example.org>
Pieter De Praetere <email@example.com>
Copyright 2016 - PACKED vzw, Vlaamse Kunstcollectie vzw
This library is free software; you can redistribute it and/or modify it under the terms of the GPLv3.