NAME

DiaColloDB::Document::JSON - diachronic collocation db, source document, raw JSON

SYNOPSIS

 ##========================================================================
 ## PRELIMINARIES
 
 use DiaColloDB::Document::JSON;
 
 ##========================================================================
 ## Constructors etc.
 
 $doc = CLASS_OR_OBJECT->new(%args);
 
 ##========================================================================
 ## API: I/O: parse
 
 $bool = $doc->fromFile($filename_or_fh, %opts);
 

DESCRIPTION

DiaColloDB::Document::JSON provides a DiaColloDB::Document-compliant API for parsing corpus files in raw JSON format, assuming the stored data maps 1:1 onto the DiaColloDB::Document structure.

Globals & Constants

Variable: @ISA

DiaColloDB::Document::JSON inherits from DiaColloDB::Document and supports the DiaColloDB::Document API.

Constructors etc.

new
 $doc = CLASS_OR_OBJECT->new(%args);

%args, object structure:

 ##-- document data
 date   =>$date,     ##-- year
 tokens =>\@tokens,  ##-- tokens, including undef for EOS
 meta   =>\%meta,    ##-- document metadata (e.g. author, title, collection, ...)

Each token in @tokens is a HASH-ref {w=>$word,p=>$pos,l=>$lemma,...}, or undef for EOS.
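The structure above can be sketched as plain Perl data. This is a minimal illustration, not the module's own code: it uses only the core JSON::PP module, and the variable names (%doc, $json) are assumptions chosen for the example. It shows how an undef EOS entry in the token list becomes a JSON null when serialized.

```perl
use strict;
use warnings;
use JSON::PP;

# Plain Perl data mirroring the object structure described above.
my %doc = (
  date   => '2016',
  meta   => { author => 'Jurish, Bryan', title => 'test document' },
  tokens => [
    { w => 'This', p => 'DT',   l => 'this' },
    { w => 'is',   p => 'VBZ',  l => 'be'   },
    { w => '.',    p => 'SENT', l => '.'    },
    undef,   # end of sentence (EOS); serialized as JSON null
  ],
);

# canonical() sorts hash keys so the output is stable across runs.
my $json = JSON::PP->new->canonical->pretty->encode(\%doc);
print $json;
```

Because the stored data is expected to map 1:1 onto the document structure, the top-level JSON keys are simply date, meta and tokens.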

API: I/O: parse

fromFile
 $bool = $doc->fromFile($filename_or_fh, %opts);

Parses tokens from $filename_or_fh, which may be a filename or an open filehandle. Any %opts given clobber the corresponding keys of %$doc.
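A hedged usage sketch for fromFile() follows. It calls only new() and fromFile() as documented above. The helper name load_doc() and the default path test.json are hypothetical, and the code assumes that the date and tokens keys are directly accessible on the object, as the object structure suggests. The module is loaded at runtime so that a missing installation is reported instead of aborting compilation.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a parsed document object, or undef if the
# module is not installed or the file cannot be read.
sub load_doc {
  my ($file) = @_;
  my $have_module = eval { require DiaColloDB::Document::JSON; 1 };
  unless ($have_module && -r $file) {
    warn "skipping '$file': module not installed or file not readable\n";
    return undef;
  }
  my $doc = DiaColloDB::Document::JSON->new();
  $doc->fromFile($file)
    or die "fromFile() failed for '$file'";
  return $doc;
}

# Hypothetical input path; pass a real corpus file as the first argument.
my $path = shift(@ARGV) // 'test.json';
if (defined(my $doc = load_doc($path))) {
  printf "date=%s, %d token slots\n",
    $doc->{date} // '', scalar @{ $doc->{tokens} // [] };
}
```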

EXAMPLE

The following is an example file in the format accepted by this module:

 {
    "date" : "2016",
    "meta" : {
       "author" : "Jurish, Bryan",
       "collection" : "tiny",
       "date_" : "2016-02-25",
       "genre" : "dummy",
       "textClass" : "dummy:test-data",
       "title" : "test document"
    },
    "tokens" : [
       "#s",
       "#p",
       "#file",
       {
          "l" : "this",
          "p" : "DT",
          "w" : "This"
       },
       {
          "l" : "be",
          "p" : "VBZ",
          "w" : "is"
       },
       {
          "l" : "a",
          "p" : "DT",
          "w" : "a"
       },
       {
          "l" : "test",
          "p" : "NN",
          "w" : "test"
       },
       {
          "l" : ".",
          "p" : "SENT",
          "w" : "."
       },
       null,
       "#s",
       {
          "l" : "this",
          "p" : "DT",
          "w" : "This"
       },
       {
          "l" : "be",
          "p" : "VBZ",
          "w" : "is"
       },
       {
          "l" : "only",
          "p" : "RB",
          "w" : "only"
       },
       {
          "l" : "a",
          "p" : "DT",
          "w" : "a"
       },
       {
          "l" : "test",
          "p" : "NN",
          "w" : "test"
       },
       {
          "l" : ".",
          "p" : "SENT",
          "w" : "."
       },
       null,
       "#s",
       "#p",
       {
          "l" : "this",
          "p" : "DT",
          "w" : "This"
       },
       {
          "l" : "be",
          "p" : "VBZ",
          "w" : "is"
       },
       {
          "l" : "still",
          "p" : "RB",
          "w" : "still"
       },
       {
          "l" : "a",
          "p" : "DT",
          "w" : "a"
       },
       {
          "l" : "test",
          "p" : "NN",
          "w" : "test"
       },
       {
          "l" : ".",
          "p" : "SENT",
          "w" : "."
       },
       null
    ]
 }
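To illustrate how a file in this format might be consumed, the following sketch decodes a shortened version of the example with the core JSON::PP module and classifies each entry of the token list. It does not use DiaColloDB itself. The source does not explain the plain-string entries such as "#s" and "#p", so they are counted here as opaque markers, which is an assumption of this example.

```perl
use strict;
use warnings;
use JSON::PP;

# A shortened version of the example document above.
my $json = <<'EOJ';
{
  "date" : "2016",
  "tokens" : [
    "#s", "#p",
    { "l" : "this", "p" : "DT", "w" : "This" },
    { "l" : "be", "p" : "VBZ", "w" : "is" },
    null,
    "#s",
    { "l" : "test", "p" : "NN", "w" : "test" },
    null
  ]
}
EOJ

my $data = JSON::PP->new->decode($json);

my %count = (word => 0, eos => 0, marker => 0);
my @lemmas;
for my $tok (@{ $data->{tokens} }) {
  if (!defined $tok) {
    $count{eos}++;                 # JSON null: end of sentence
  } elsif (ref($tok) eq 'HASH') {
    $count{word}++;                # token with w, p, l attributes
    push @lemmas, $tok->{l};
  } else {
    $count{marker}++;              # plain string such as "#s" (assumed marker)
  }
}

printf "%d words, %d EOS, %d markers: %s\n",
  $count{word}, $count{eos}, $count{marker}, join(' ', @lemmas);
```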

AUTHOR

Bryan Jurish <moocow@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2016-2020 by Bryan Jurish

This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.14.2 or, at your option, any later version of Perl 5 you may have available.

SEE ALSO

DiaColloDB::Document(3pm), DiaColloDB(3pm), perl(1), ...