NAME

Text::Tradition::Collation - a software model for a text collation

SYNOPSIS

  use Text::Tradition;
  my $t = Text::Tradition->new( 
    'name' => 'this is a text',
    'input' => 'TEI',
    'file' => '/path/to/tei_parallel_seg_file.xml' );

  my $c = $t->collation;
  my @readings = $c->readings;
  my @paths = $c->paths;
  my @relationships = $c->relationships;
  
  my $svg_variant_graph = $t->collation->as_svg();
    

DESCRIPTION

Text::Tradition is a library for representation and analysis of collated texts, particularly medieval ones. The Collation is the central feature of a Tradition: it is where the text, its sequence of readings, and the relationships between those readings are actually kept.

CONSTRUCTOR

new

The constructor. Takes a hash or hashref of the following arguments:

  • tradition - The Text::Tradition object to which the collation belongs. Required.

  • linear - Whether the collation should be linear; that is, whether transposed readings should be treated as two linked readings rather than one, and therefore whether the collation graph is acyclic. Defaults to true.

  • baselabel - The default label for the path taken by a base text (if any). Defaults to 'base text'.

  • wit_list_separator - The string to join a list of witnesses for purposes of making labels in display graphs. Defaults to ', '.

  • ac_label - The extra label to tack onto a witness sigil when representing another layer of path for the given witness - that is, when a text has more than one possible reading due to scribal corrections or the like. Defaults to ' (a.c.)'.

  • wordsep - The string used to separate words in the original text. Defaults to ' '.
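As a rough illustration of how wit_list_separator and ac_label shape display labels, the sketch below builds one label by hand. It does not call the library: the helper edge_label and the sigils are hypothetical, and only the two default strings are taken from the argument descriptions above.

```perl
use strict;
use warnings;

# Defaults documented above for the two label-related arguments.
my $wit_list_separator = ', ';
my $ac_label           = ' (a.c.)';

# Hypothetical helper: join a list of witness sigils into one edge label.
sub edge_label {
    my ( $sep, @sigils ) = @_;
    return join( $sep, @sigils );
}

# Witness B has a pre-correction layer, so its sigil gets the extra label.
my $label = edge_label( $wit_list_separator, 'A', 'B' . $ac_label, 'C' );
print "$label\n";    # A, B (a.c.), C
```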

ACCESSORS

tradition

linear

wit_list_separator

baselabel

ac_label

wordsep

Simple accessors for collation attributes.

start

The meta-reading at the start of every witness path.

end

The meta-reading at the end of every witness path.

readings

Returns all Reading objects in the graph.

reading( $id )

Returns the Reading object corresponding to the given ID.

add_reading( $reading_args )

Adds a new reading object to the collation. See Text::Tradition::Collation::Reading for the available arguments.

del_reading( $object_or_id )

Removes the given reading from the collation, implicitly removing its paths and relationships.

merge_readings( $main, $second, $concatenate, $with_str )

Merges the $second reading into the $main one. If $concatenate is true, the merged node carries the text of both readings, joined with $with_str if it is specified. Otherwise a sensible default is used: the empty string if the appropriate 'join_*' flag is set on either reading, or else $self->wordsep.

The first two arguments may be either readings or reading IDs.
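The choice of join string described above can be sketched in isolation. This is not the library's code: merged_text is a hypothetical helper, readings are modelled as plain hashes, and the key names 'text', 'join_next' and 'join_prior' are assumptions standing in for the 'join_*' flags.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the rule described above. The hash keys
# 'text', 'join_next' and 'join_prior' are assumptions for illustration.
sub merged_text {
    my ( $main, $second, $wordsep, $with_str ) = @_;
    my $joinstr;
    if ( defined $with_str ) {
        $joinstr = $with_str;    # an explicit join string wins
    }
    elsif ( $main->{join_next} || $second->{join_prior} ) {
        $joinstr = '';           # the readings are written together
    }
    else {
        $joinstr = $wordsep;     # fall back to the collation's wordsep
    }
    return $main->{text} . $joinstr . $second->{text};
}

my $plain  = merged_text( { text => 'in' }, { text => 'principio' }, ' ' );
my $glued  = merged_text( { text => 'in', join_next => 1 }, { text => 'principio' }, ' ' );
my $custom = merged_text( { text => 'in' }, { text => 'principio' }, ' ', '-' );
print "$plain|$glued|$custom\n";    # in principio|inprincipio|in-principio
```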

has_reading( $id )

Predicate to see whether a given reading ID is in the graph.

reading_witnesses( $object_or_id )

Returns a list of sigils whose witnesses contain the reading.

paths

Returns all reading paths within the document - that is, all edges in the collation graph. Each path is an arrayref of [ $source, $target ] reading IDs.
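Because each path is a plain [ $source, $target ] pair, the list is easy to index. The sketch below uses a hand-written stand-in for the return value of paths; the reading IDs are invented for illustration.

```perl
use strict;
use warnings;

# Hand-written stand-in for the list returned by $c->paths; the reading
# IDs are made up for illustration.
my @paths = (
    [ 'start', 'r1' ], [ 'r1', 'r2' ], [ 'r1', 'r3' ],
    [ 'r2', 'end' ],   [ 'r3', 'end' ],
);

# Index the edges by source reading ID.
my %successors;
for my $path (@paths) {
    my ( $source, $target ) = @$path;
    push @{ $successors{$source} }, $target;
}

for my $source ( sort keys %successors ) {
    print "$source -> ", join( ', ', sort @{ $successors{$source} } ), "\n";
}
```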

add_path( $source, $target, $sigil )

Links the given readings in the collation in sequence, under the given witness sigil. The readings may be specified by object or ID.

del_path( $source, $target, $sigil )

Removes the link between the given readings in the collation under the given witness sigil. The readings may be specified by object or ID.

has_path( $source, $target )

Returns true if the two readings are linked in sequence in any witness. The readings may be specified by object or ID.

relationships

Returns all Relationship objects in the collation.

add_relationship( $reading, $other_reading, $options )

Adds a new relationship of the type given in $options between the two readings, which may be specified by object or ID. Returns a value of ( $status, @vectors) where $status is true on success, and @vectors is a list of relationship edges that were ultimately added. See Text::Tradition::Collation::Relationship for the available options.

clear_witness( @sigil_list )

Clear the given witnesses out of the collation entirely, removing references to them in paths, and removing readings that belong only to them. Should only be called via $tradition->del_witness.

reading_witnesses( $reading )

Return a list of sigils corresponding to the witnesses in which the reading appears.

OUTPUT METHODS

as_svg( \%options )

Returns an SVG string that represents the graph, via as_dot and GraphViz. See as_dot for a list of options. GraphViz (dot) must be installed for this method to run.

as_dot( \%options )

Returns a string that is the collation graph expressed in dot (i.e. GraphViz) format. Options include:

  • from

  • to

  • color_common

path_witnesses( $edge )

Returns the list of sigils whose witnesses are associated with the given edge. The edge can be passed as either an array or an arrayref of ( $source, $target ).

witnesses_at_rank

Returns the list of witnesses that are not lacunose at the given rank.

as_graphml

Returns a GraphML representation of the collation. The GraphML will contain two graphs. The first expresses the attributes of the readings and the witness paths that link them; the second expresses the relationships that link the readings. This is the native transfer format for a tradition.

as_csv

Returns a CSV alignment table representation of the collation graph, with one row per witness (or per uncorrected witness layer).

alignment_table( $use_refs, $include_witnesses )

Return a reference to an alignment table, in a slightly enhanced CollateX format which looks like this:

 $table = { alignment => [ { witness => "SIGIL", 
                             tokens => [ { t => "TEXT" }, ... ] },
                           { witness => "SIG2", 
                             tokens => [ { t => "TEXT" }, ... ] },
                           ... ],
            length => TEXTLEN };

If $use_refs is set to 1, each token's "t" value is the reading object itself rather than "TEXT"; otherwise it is the text of the reading.

If $include_witnesses is set to a hashref, only the witnesses whose sigil keys have a true hash value will be included.
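A table in this shape can be walked with ordinary hash and array dereferencing. The sketch below uses a hand-built table rather than a real collation. Representing a gap in a witness as undef is an assumption made for the example, and the witness texts are invented.

```perl
use strict;
use warnings;

# Hand-built table in the shape shown above; real code would obtain it
# from $c->alignment_table. A missing token is modelled as undef here,
# which is an assumption about how gaps are represented.
my $table = {
    alignment => [
        { witness => 'A',
          tokens  => [ { t => 'in' }, { t => 'principio' }, { t => 'erat' } ] },
        { witness => 'B',
          tokens  => [ { t => 'in' }, undef, { t => 'erat' } ] },
    ],
    length => 3,
};

my @lines;
for my $row ( @{ $table->{alignment} } ) {
    # One output line per witness; gaps are shown as '-'.
    my @cells = map { defined $_ ? $_->{t} : '-' } @{ $row->{tokens} };
    push @lines, join( "\t", $row->{witness}, @cells );
}
print "$_\n" for @lines;
```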

NAVIGATION METHODS

reading_sequence( $first, $last, $sigil, $backup )

Returns the ordered list of readings, starting with $first and ending with $last, for the witness given in $sigil. If a $backup sigil is specified (e.g. when walking a layered witness), it will be used wherever no $sigil path exists. If there is a base text reading, that will be used wherever no path exists for $sigil or $backup.
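The fallback order described above (the requested sigil, then the backup, then the base text) can be sketched on a toy graph. Everything here is invented for illustration: the %edges layout, the reading IDs and the walk_sequence helper are not the library's internals, and 'base text' is simply the documented default for baselabel.

```perl
use strict;
use warnings;

# Toy edge table: $edges{$source}{$label} = $target. Witness A has an
# uncorrected layer that differs only between r1 and r4.
my %edges = (
    start => { 'A' => 'r1' },
    r1    => { 'A' => 'r2', 'A (a.c.)' => 'r3' },
    r2    => { 'A' => 'r4' },
    r3    => { 'A (a.c.)' => 'r4' },
    r4    => { 'A' => 'end' },
);

# Hypothetical walker: follow $sigil where it has a path, else $backup,
# else the base text label.
sub walk_sequence {
    my ( $first, $last, $sigil, $backup ) = @_;
    my @seq     = ($first);
    my $current = $first;
    while ( $current ne $last ) {
        my $out = $edges{$current} || {};
        my ($label) = grep { defined $_ && exists $out->{$_} }
            ( $sigil, $backup, 'base text' );
        die "no path from $current\n" unless defined $label;
        $current = $out->{$label};
        push @seq, $current;
    }
    return @seq;
}

my @corrected   = walk_sequence( 'start', 'end', 'A' );
my @uncorrected = walk_sequence( 'start', 'end', 'A (a.c.)', 'A' );
print "@corrected\n";      # start r1 r2 r4 end
print "@uncorrected\n";    # start r1 r3 r4 end
```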

next_reading( $reading, $sigil )

Returns the reading that follows the given reading along the given witness path.

prior_reading( $reading, $sigil )

Returns the reading that precedes the given reading along the given witness path.

common_readings

Returns the list of common readings in the graph (i.e. those readings that are shared by all non-lacunose witnesses).

path_text( $sigil, $mainsigil [, $start, $end ] )

Returns the text of a witness (plus its backup, if we are using a layer) as stored in the collation. The text is returned as a string, where the individual readings are joined with spaces and the meta-readings (e.g. lacunae) are omitted. Optional specification of $start and $end allows the generation of a subset of the witness text.

INITIALIZATION METHODS

These are mostly for use by parsers.

make_witness_path( $witness )

Link the array of readings contained in $witness->path (and in $witness->uncorrected_path if it exists) into collation paths. Clear out the arrays when finished.

make_witness_paths

Call make_witness_path for all witnesses in the tradition.

calculate_ranks

Calculate the reading ranks (that is, their aligned positions relative to each other) for the graph. This can only be called on linear collations.
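One common way to assign aligned positions in an acyclic graph is to give each node a rank one higher than the highest-ranked node that precedes it. The sketch below shows that general technique on an invented edge list. It is not taken from the library and it ignores relationships between readings, which the real method may also take into account. It also shows why a linear (acyclic) collation is needed: the traversal relies on a topological order.

```perl
use strict;
use warnings;

# Toy acyclic graph as a list of [ $source, $target ] edges (IDs invented).
my @paths = (
    [ 'start', 'r1' ], [ 'r1', 'r2' ], [ 'r2', 'r3' ],
    [ 'r1', 'r3' ],    [ 'r3', 'end' ],
);

my ( %successors, %indegree );
for my $path (@paths) {
    my ( $source, $target ) = @$path;
    push @{ $successors{$source} }, $target;
    $indegree{$target}++;
    $indegree{$source} //= 0;
}

# Visit nodes in topological order; a node's rank is one more than the
# highest rank among its predecessors.
my %rank  = ( start => 0 );
my @queue = ('start');
while (@queue) {
    my $node = shift @queue;
    for my $next ( @{ $successors{$node} || [] } ) {
        my $candidate = $rank{$node} + 1;
        $rank{$next} = $candidate
            if !defined $rank{$next} || $candidate > $rank{$next};
        # Queue a node only once all of its predecessors have been seen.
        push @queue, $next if --$indegree{$next} == 0;
    }
}
print "$_: $rank{$_}\n" for sort { $rank{$a} <=> $rank{$b} } keys %rank;
```

Here r3 is reachable directly from r1 and also through r2, so it takes the larger rank.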

flatten_ranks

A convenience method for parsing collation data. Searches the graph for readings with the same text at the same rank, and merges any that are found.
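The search for merge candidates can be pictured as grouping readings by rank and text. The sketch below works on invented reading records and only lists the pairs that would be merged; in a real collation the merge itself would go through merge_readings.

```perl
use strict;
use warnings;

# Invented reading records with an ID, a rank and a text.
my @readings = (
    { id => 'r1', rank => 1, text => 'in' },
    { id => 'r2', rank => 2, text => 'principio' },
    { id => 'r3', rank => 2, text => 'principio' },
    { id => 'r4', rank => 2, text => 'initio' },
);

# Group reading IDs by rank and text; any group with more than one member
# is a candidate for merging.
my %groups;
push @{ $groups{ $_->{rank} }{ $_->{text} } }, $_->{id} for @readings;

my @merges;
for my $rank ( sort { $a <=> $b } keys %groups ) {
    for my $text ( sort keys %{ $groups{$rank} } ) {
        my ( $keep, @duplicates ) = @{ $groups{$rank}{$text} };
        push @merges, [ $keep, $_ ] for @duplicates;
    }
}
print "merge $_->[1] into $_->[0]\n" for @merges;    # merge r3 into r2
```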

calculate_common_readings

Goes through the graph identifying the readings that appear in every witness (apart from those with lacunae at that spot). Marks them as common and returns the list.
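The test for a common reading can be sketched as a set comparison: a reading is common if no witness that is present at its rank lacks it. The data below is invented, and both hashes are hypothetical stand-ins for what reading_witnesses and witnesses_at_rank would report.

```perl
use strict;
use warnings;

# Invented data: which witnesses carry each reading, and which witnesses
# are present (not lacunose) at each rank.
my %reading_info = (
    r1 => { rank => 1, witnesses => [qw( A B C )] },
    r2 => { rank => 2, witnesses => [qw( A B )] },
    r3 => { rank => 2, witnesses => [qw( C )] },
    r4 => { rank => 3, witnesses => [qw( A B )] },
);
my %present_at_rank = (
    1 => [qw( A B C )],
    2 => [qw( A B C )],
    3 => [qw( A B )],      # C has a lacuna here
);

my @common;
for my $id ( sort keys %reading_info ) {
    my $r       = $reading_info{$id};
    my %carried = map { $_ => 1 } @{ $r->{witnesses} };
    # Witnesses that are present at this rank but do not carry the reading.
    my @missing = grep { !$carried{$_} } @{ $present_at_rank{ $r->{rank} } };
    push @common, $id unless @missing;
}
print "common: @common\n";    # common: r1 r4
```

Reading r4 counts as common even though witness C lacks it, because C is lacunose at that rank.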

text_from_paths

Calculate the text array for all witnesses from the path, for later consistency checking. Only to be used if there is no non-graph-based way to know the original texts.

UTILITY FUNCTIONS

common_predecessor( $reading_a, $reading_b )

Find the last reading that occurs in sequence before both the given readings.

common_successor( $reading_a, $reading_b )

Find the first reading that occurs in sequence after both the given readings.

LICENSE

This package is free software and is provided "as is" without express or implied warranty. You can redistribute it and/or modify it under the same terms as Perl itself.

AUTHOR

Tara L Andrews <aurum@cpan.org>