Treex::Manual::FAQ - Frequently asked questions about Treex
version 2.20150928
write a minimal test, write the version....
Let's say you have an English and a French corpus, sentence-aligned and stored as plain text, one sentence per line.
Read::AlignedSentences en=sample_en.txt fr=sample_fr.txt
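Before running the reader, it can help to confirm that the two files really are aligned, i.e. that they have the same number of lines. The following is a minimal plain-Perl sketch of such a check. It does not use Treex and says nothing about how Read::AlignedSentences works internally. The helper name read_aligned_pairs and the sample sentences are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper (not part of Treex): pair line N of one file
# with line N of the other, and die if the line counts differ.
sub read_aligned_pairs {
    my ($en_path, $fr_path) = @_;
    open my $en, '<:encoding(UTF-8)', $en_path or die "Cannot open $en_path: $!";
    open my $fr, '<:encoding(UTF-8)', $fr_path or die "Cannot open $fr_path: $!";
    chomp(my @en = <$en>);
    chomp(my @fr = <$fr>);
    die sprintf("Line counts differ: %d vs %d\n", scalar @en, scalar @fr)
        if @en != @fr;
    return map { [ $en[$_], $fr[$_] ] } 0 .. $#en;
}

# Tiny demo corpus written to a temporary directory (invented sample data).
my $dir = tempdir(CLEANUP => 1);
my %sample = (
    'sample_en.txt' => "Hello.\nHow are you?\n",
    'sample_fr.txt' => "Bonjour.\nComment allez-vous ?\n",
);
for my $name (keys %sample) {
    open my $out, '>:encoding(UTF-8)', "$dir/$name" or die "Cannot write $name: $!";
    print {$out} $sample{$name};
    close $out;
}

my @pairs = read_aligned_pairs("$dir/sample_en.txt", "$dir/sample_fr.txt");
print scalar(@pairs), " aligned sentence pairs\n";
```

If the check dies, the corpus needs to be re-aligned before it is loaded, because a single missing line shifts every following sentence pair.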
For the official attributes (defined in the Treex PML schema), there are accessor methods, so e.g. for lemma:
my $old_lemma = $node->lemma;
$node->set_lemma('new');
If you want to use your own new attributes, you can use so-called wild attributes:
$node->wild->{name_of_my_new_attribute} = $value;
$value = $node->wild->{name_of_my_new_attribute};
You can also use methods get_attr and set_attr to access whatever attribute you want:
$node->set_attr('my_new_attribute', 'my_value');
print $node->get_attr('my_new_attribute'); # prints "my_value"
However, when saving the document (to *.treex), only attributes described in the Treex PML schema (stored in treex/lib/Treex/Core/share/tred_extension/treex/resources) will be saved; this includes all wild attributes.
This means that a "non-schema" attribute (other than a wild attribute) set in one block is not guaranteed to be accessible in another block. (It works when both blocks run in one scenario, but not when the document is saved in one scenario and loaded in another, which is hard to debug.)
Note that for temporary information you can also use a separate hash variable as an alternative to new attributes:
my %my_new_attribute_of;
$my_new_attribute_of{$node} = 'my_value';
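One detail of this pattern is that Perl hash keys are always strings, so a node used as a key is stored by its stringified reference, not as the object itself. The sketch below shows a variant that makes this explicit by keying on refaddr from Scalar::Util. It uses blessed hashes as stand-ins for Treex nodes; the package name My::FakeNode and the helper temp_attr are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Stand-ins for Treex nodes: any blessed reference behaves the same way here.
my $node_a = bless { form => 'dog' }, 'My::FakeNode';
my $node_b = bless { form => 'cat' }, 'My::FakeNode';

# Temporary per-node storage, keyed by the node's memory address.
my %my_new_attribute_of;
$my_new_attribute_of{ refaddr($node_a) } = 'my_value';

# Hypothetical accessor: returns undef for nodes that were never annotated.
sub temp_attr {
    my ($node) = @_;
    return $my_new_attribute_of{ refaddr($node) };
}

print temp_attr($node_a) // 'undef', "\n";
print temp_attr($node_b) // 'undef', "\n";
```

Because the keys are plain strings or numbers, the node cannot be recovered from the hash, and entries are not removed when a node is destroyed. That is usually fine for information that lives only as long as a single block run.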
It is quite common for a Treex block or a tool (Treex::Tool::...) to need a pre-trained model, database, dictionary, etc. These files can be huge (several GiB), so we cannot store them in the svn repository or upload them to CPAN. We store them in the Treex shared data directory, either in the resources subdirectory (for treebanks, corpora and other officially published resources) or in the models subdirectory (for pre-trained statistical models).
These files will be automatically downloaded from the UFAL server when they are first needed. To achieve this behavior, override the get_required_share_files method in your block, so it returns a list of filenames, e.g.:
my $input_file = 'data/models/tagger/my_tagger/en_penntb.model';

sub get_required_share_files { return $input_file; }

sub BUILD {
    open my $I, '<:encoding(utf-8)', "$ENV{TMT_ROOT}/share/$input_file"
        or die "Cannot open $input_file: $!";
    # now load the model ...
}
If you want to use shared files in a tool:
package Treex::Tools;
use Treex::Core::Resource;

my $input_file = 'data/models/tagger/my_tagger/en_penntb.model';

sub BUILD {
    my ($self) = @_;
    Treex::Core::Resource::require_file_from_share($input_file, ref($self));
    # now load the model ...
}
The second parameter of require_file_from_share is the name of the tool that needs the file. It is purely informational and is printed when the file is downloaded.
Using ttred
TrEd is a tree editor developed by Petr Pajas (see http://ufal.mff.cuni.cz/tred). ttred is a Treex-modified TrEd capable of showing *.treex files. Actually, it is just a lightweight wrapper which executes tred with a path to the pre-installed Treex extension.
Press c and drag&drop the zones in a matrix. (c is a shortcut for a macro, which you can find also in the menu: Macros - all modes - treex_mode - Configuration.) You can choose both horizontal and vertical position, which is handy e.g. for word-aligned corpora, where you usually want to have the aligned trees above each other, so the alignment links are not too long.
Zdeněk Žabokrtský <zabokrtsky@ufal.mff.cuni.cz>
Martin Popel <popel@ufal.mff.cuni.cz>
David Mareček <marecek@ufal.mff.cuni.cz>
Copyright © 2011 by Institute of Formal and Applied Linguistics, Charles University in Prague
This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.