SYNOPSIS

  Usage: pmltq COMMAND [OPTIONS]

    pmltq version
    pmltq configuration
    pmltq init schema1.xml schema2.xml
    pmltq convert
    pmltq load
    pmltq initdb
    pmltq delete
    pmltq webload
    pmltq webdelete
    pmltq webtreebank
    pmltq webverify

  Options (for all commands):
    -c, --config      Config file. By default, commands look for a
                      config file called pmltq.yml in the current
                      directory.

COMMANDS

These commands are available by default.

configuration

  $ pmltq configuration

Uses PMLTQ::Command::configuration to get the current configuration.

convert

  $ pmltq convert

Uses PMLTQ::Command::convert to convert the data in data_dir based on the layers configuration.

delete

  $ pmltq delete

Uses PMLTQ::Command::delete to delete the database for the current treebank.

init

  $ pmltq init resources/schema1.xml resources/schema2.xml

Uses PMLTQ::Command::init to generate an initial configuration file skeleton based on the given schemas. This command can help you quickly bootstrap the layers configuration.

initdb

  $ pmltq initdb

Uses PMLTQ::Command::initdb to create and initialize a new database for the given treebank.

load

  $ pmltq load

Uses PMLTQ::Command::load to load the data generated by the convert command.

query

Uses PMLTQ::Command::query to run a query on given treebank.

verify

Uses PMLTQ::Command::verify to check whether the database exists and contains some data. For now, the check is very simple.

version

Uses PMLTQ::Command::version to display current PMLTQ version

webdelete

Uses PMLTQ::Command::webdelete to delete the treebank from the web interface.

webload

Uses PMLTQ::Command::webload to load the treebank to the web interface.

webtreebank

Uses PMLTQ::Command::webtreebank to get a list of treebanks or info about a single treebank.

webverify

Uses PMLTQ::Command::webverify to verify the treebank's visibility in the web interface.

CONFIG FILE

Options

treebank_id

ID of the treebank. It may contain only the characters [a-zA-Z0-9_]. It is used as the default database name and as the default treebank name in the web interface.
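
The character restriction above can be checked before running any command. The following is a minimal Python sketch, not part of pmltq itself; the function name is hypothetical, and it assumes that an empty ID is invalid, which the documentation does not state.

```python
import re

# Assumption: the whole ID must consist of the documented characters,
# and an empty ID is rejected.
TREEBANK_ID_RE = re.compile(r"[a-zA-Z0-9_]+")


def is_valid_treebank_id(treebank_id: str) -> bool:
    """Return True if treebank_id uses only letters, digits and underscores."""
    return TREEBANK_ID_RE.fullmatch(treebank_id) is not None


if __name__ == "__main__":
    print(is_valid_treebank_id("tbid"))     # True
    print(is_valid_treebank_id("pdt-3.0"))  # False: '-' and '.' are not allowed
```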

data_dir

Directory containing the data (this is also the base directory for data layers)

Defaults: data

resources

Base directory for PML schemas

Defaults: resources

output_dir

Directory for all SQL dump files, which are generated by the convert command and used by the load command.

Defaults: sql_dump

db

name

Database name

Defaults: treebank_id if defined

host

Database server hostname or IP address

Defaults: localhost

port

Database port

Defaults: 5432

user, password

Database credentials

sys_db

Name of the 'system database' used for administration commands such as CREATE and DROP.
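
The documented defaults can be summarized in code. This is an illustrative Python sketch only: the function name is hypothetical, the configuration is assumed to be already parsed into a plain dictionary, and pmltq's own implementation may resolve defaults differently.

```python
import copy


def with_defaults(config: dict) -> dict:
    """Return a copy of config with the documented defaults filled in."""
    result = copy.deepcopy(config)
    result.setdefault("data_dir", "data")
    result.setdefault("resources", "resources")
    result.setdefault("output_dir", "sql_dump")

    db = result.setdefault("db", {})
    db.setdefault("host", "localhost")
    db.setdefault("port", 5432)
    # The database name falls back to treebank_id only if treebank_id is defined.
    if "name" not in db and "treebank_id" in result:
        db["name"] = result["treebank_id"]
    return result


if __name__ == "__main__":
    print(with_defaults({"treebank_id": "tbid"})["db"])
```

Options without a documented default (user, password, sys_db) are left untouched.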

layers

The configuration of treebank's layers and references for each layer.

name

Schema root name

data

A glob path name matching pattern relative to data_dir
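
To see which files a pattern would select, a glob can be expanded relative to data_dir. The Python sketch below uses pathlib and assumes that '**' matches any number of directory levels; whether pmltq's own glob handling behaves identically is an assumption. The function names and the temporary file layout are hypothetical.

```python
import tempfile
from pathlib import Path


def expand_layer_data(data_dir: str, pattern: str) -> list:
    """List files under data_dir matching pattern, as sorted relative paths."""
    base = Path(data_dir)
    return sorted(
        p.relative_to(base).as_posix() for p in base.glob(pattern) if p.is_file()
    )


def demo() -> list:
    """Build a throwaway data_dir and expand a pattern against it."""
    with tempfile.TemporaryDirectory() as data_dir:
        for rel in ("a/b/x.t.gz", "y.t.gz", "a/z.a.gz"):
            path = Path(data_dir, rel)
            path.parent.mkdir(parents=True, exist_ok=True)
            path.touch()
        return expand_layer_data(data_dir, "**/*.t.gz")


if __name__ == "__main__":
    print(demo())
```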

path

A path name matching pattern relative to the PML-TQ server data directory

related-schema

List of related schemas that contain node types required in this layer's reference configuration

references

This is a key-value hash where each key is the path to a member of the node structure and each value is a node type, or '-' (dash) if you intend to ignore that particular reference. If the node type is not in the current layer's schema, you have to prefix the node type with the schema name, and the appropriate schema has to be listed in the related-schema list.

Examples:

  references:
    path/attr1: '-' #--> ignore this reference
    path/attr2: ref-node #--> reference node type 'ref-node'
    path/attr3: schema:other-node #--> reference node type 'other-node' in schema 'schema'
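
The three value forms shown above can be told apart mechanically. The following Python sketch is only an illustration of the notation; the type and function names are hypothetical, and it assumes that the first ':' separates the schema name from the node type.

```python
from typing import NamedTuple, Optional


class ReferenceTarget(NamedTuple):
    ignored: bool              # True for the '-' (dash) value
    schema: Optional[str]      # schema prefix, if one was given
    node_type: Optional[str]   # referenced node type, if not ignored


def parse_reference(value: str) -> ReferenceTarget:
    """Interpret one value from the references hash."""
    if value == "-":
        return ReferenceTarget(True, None, None)
    schema, sep, node_type = value.partition(":")
    if sep:
        # 'schema:other-node' refers to a node type in another schema
        return ReferenceTarget(False, schema, node_type)
    # A plain value is a node type in the current layer's schema
    return ReferenceTarget(False, None, value)


if __name__ == "__main__":
    for value in ("-", "ref-node", "schema:other-node"):
        print(value, "->", parse_reference(value))
```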

title

Treebank title shown in the web interface.

homepage

Treebank homepage URL.

description

Treebank description shown in the web interface.

isFree

Boolean value (true, false). If true, logging in is not required to query the treebank.

Defaults: false

isAllLogged

Boolean value (true, false). If true, the treebank can be queried by all logged-in users.

Defaults: false

isPublic

Boolean value (true, false). If true, the treebank is visible to users who are not logged in.

Defaults: false

isFeatured

Boolean value (true, false). If true, the treebank is featured.

Defaults: false

web_api

dbserver

Name of the database server as set in the web interface.

url

Link to PML-TQ web service.

user, password

Web API credentials of a user with admin privileges.

manuals

List of manuals

title, url

Title and link to manual

tags

List of tags

language

Language code

test_query

result_dir

Directory where the results of test queries (SVG and text files) are saved

queries

filename

Name of the file where the results should be saved

query

PML-TQ query

Change values using CLI

You can use command line parameters to modify any configuration options.

For example you can use

  pmltq load --output_dir='/some/path' --data_dir='some/other/path' --db-name='abc'

A dash (-) in a parameter name means descending one level into the configuration hash, so --db-name='abc' sets db: name: 'abc', while --db_name='abc' would just set the top-level configuration option db_name: 'abc'.
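
The dash rule can be sketched in a few lines of Python. This is an illustration of the documented behavior, not pmltq's actual code; the function name is hypothetical, and it assumes that every intermediate key holds (or may be created as) a nested hash.

```python
def apply_override(config: dict, option: str, value) -> None:
    """Apply one --option=value override to config in place."""
    # '--db-name' -> ['db', 'name']; underscores are left alone
    keys = option.lstrip("-").split("-")
    node = config
    for key in keys[:-1]:
        # Descend into the nested hash, creating it if missing
        node = node.setdefault(key, {})
    node[keys[-1]] = value


if __name__ == "__main__":
    config = {"db": {"name": "treebank_db_name", "host": "localhost"}}
    apply_override(config, "--db-name", "abc")  # changes db: name
    apply_override(config, "--db_name", "xyz")  # sets top-level db_name
    print(config)
```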

Commands and options

The following table shows which options are used by which commands.

  * required
  + optional (defaults or nothing is used)
  - not used

  Option                 convert  delete  init  initdb  load  verify  webdelete  webload  webverify
  treebank_id            *        *       -     *       *     *       *          *        *
  data_dir               +        -       -     -       -     -       -          -        -
  resources              +        -       -     -       -     -       -          -        -
  output_dir             +        -       -     -       +     -       -          -        -
  db-name                -        +       -     +       +     +       -          +        -
  db-host                -        +       -     +       +     +       -          -        -
  db-port                -        +       -     +       +     +       -          -        -
  db-user                -        -       -     *       *     *       -          -        -
  db-password            -        -       -     *       *     *       -          -        -
  sys_db                 -        *       -     *       -     -       -          -        -
  layers                 *        -       -     -       *     -       -          *        -
  title                  -        -       -     -       -     -       -          *        -
  homepage               -        -       -     -       -     -       -          +        -
  description            -        -       -     -       -     -       -          +        -
  isFree                 -        -       -     -       -     -       -          +        -
  isPublic               -        -       -     -       -     -       -          +        -
  isFeatured             -        -       -     -       -     -       -          +        -
  web_api-dbserver       -        -       -     -       -     -       -          *        -
  web_api-url            -        -       -     -       -     -       *          *        *
  web_api-user           -        -       -     -       -     -       *          *        *
  web_api-password       -        -       -     -       -     -       *          *        *
  manuals                -        -       -     -       -     -       -          +        -
  tags                   -        -       -     -       -     -       -          +        -
  language               -        -       -     -       -     -       -          +        -
  test_query-result_dir  -        -       -     -       -     -       -          -        +
  test_query-queries     -        -       -     -       -     -       -          -        +
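
The required (*) entries of the table can be used to check a configuration before running a command. The Python sketch below transcribes only those entries; the function names are hypothetical, the configuration is assumed to be a parsed dictionary, and the check does not account for defaults or for any validation pmltq performs itself.

```python
# Required ("*") options per command, transcribed from the table above.
REQUIRED_OPTIONS = {
    "convert": ["treebank_id", "layers"],
    "delete": ["treebank_id", "sys_db"],
    "init": [],
    "initdb": ["treebank_id", "db-user", "db-password", "sys_db"],
    "load": ["treebank_id", "db-user", "db-password", "layers"],
    "verify": ["treebank_id", "db-user", "db-password"],
    "webdelete": ["treebank_id", "web_api-url", "web_api-user", "web_api-password"],
    "webload": ["treebank_id", "layers", "title", "web_api-dbserver",
                "web_api-url", "web_api-user", "web_api-password"],
    "webverify": ["treebank_id", "web_api-url", "web_api-user", "web_api-password"],
}


def lookup(config: dict, option: str):
    """Follow a dash-separated option name into the nested config."""
    node = config
    for key in option.split("-"):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def missing_required(command: str, config: dict) -> list:
    """Return the required options of command that config does not set."""
    return [opt for opt in REQUIRED_OPTIONS[command] if lookup(config, opt) is None]


if __name__ == "__main__":
    config = {
        "treebank_id": "tbid",
        "db": {"user": "pmltq", "password": "pwd"},
        "layers": [{"name": "adata"}],
    }
    print(missing_required("load", config))    # []
    print(missing_required("initdb", config))  # ['sys_db']
```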

Example:

  data_dir: /pmltq/data/dir/ # directory where the data are (this is also base directory for data layers)
  resources: /pmltq/resources/ # main directory with PML schemas

  treebank_id: tbid

  db: # typical DB auth stuff
    name: treebank_db_name
    host: localhost
    port: 5432
    user: pmltq
    password: pwd

  layers: # description of all data layers
    - name: adata
      data: ./relative/to/data_dir/**/*.a.gz
      related-schema:
        - adata_schema.xml
      references:
        t-node/val_frame.rf: '-'
        t-a/aux.rf: 'adata:a-node'
        t-node/coref_gram.rf: t-node
    - name: tdata
      data: '**/*.t.gz' # quoted because a leading * would start a YAML alias

  web_api:
    user: webuser
    password: pwd
    url: 'https://serviceurl.com/'
    dbserver: myserver

  title: 'Treebank name'
  description: 'Short treebank description'
  language: cs
  tags:
    - mytag1
    - mytag2