lib/CWB/Web/Cache.pm - metacpan.org


            
              1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
—
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
              package CWB::Web::Cache;
# -*-cperl-*-
##
## Short overview of module and cache architecture:
##
## The CWB::Web::Cache module uses a cache directory to keep selected named queries
## persistent between CQP sessions.  It is mainly intended for use in simple Web front-ends.
## Therefore, it is neither completely safe in heavy-duty CGI applications (where race
## conditions may occur) nor optimised for speed.  First, a CWB::Web::Cache object is created
## for an existing CQP process (using the CWB::CQP module) and must be initialised with settings
## for the cache directory path and caching strategy.  In order to make a named query result
## (of the running CQP process) persistent, it is stored in the disk cache directory, and a unique
## identifier is returned to the calling program.  This identifier can then be used to recover
## the persistent named query in a subsequent session (unless the result has already expired from
## the cache, which case must be handled by the caller).
##
## The CWB::Web::Cache module can also execute simple CQP queries and make their results
## persistent.  The query results are identified by corpus, query string, and an optional sort
## (stored as metadata) rather than a single unique identifier, and can be shared among different
## processes using the same cache directory.  When a persistent query result has expired from the
## cache, it is re-created in a way transparent to the calling program (by re-executing the query
## expression in the CQP process).
##
## The cache directory contains two subdirectories and an optional CONFIG file:
##   index/  ...  text files as 'markers' for cached queries (may contain 'metadata' about the cached query)
##   data/   ...  named query results stored in CQP's internal format
## A persistent named query is stored in a file with the name <corpus>:<query_name> in the data/
## subdirectory (e.g. DICKENS:ResultA-1121, where the numerical suffix is used to create a unique filename
## if necessary).  A text file with the same name is created in the index/ directory and may hold
## meta-information about the cached query result.  Storing a named query proceeds in the following steps:
##   1. create cache directory and subdirectories if they do not exist
##   1a recover cache settings (size, expiration time, ...) from CONFIG file unless initialised by program (OPTIONAL)
##   2. if cache directory size exceeds allowed maximum (checked with "du -k"), delete oldest query results
##      from cache until size has been reduced sufficiently (index file is always deleted first to keep other
##      processes from trying to read data while it is being erased);  if expiration time is set, all files
##      older than the specified limit are removed first
##   3. extend query name with numerical suffix to unique filename if necessary (checking files in index directory)
##   4. create empty file with unique name in index directory to "lock" data file for other processes
##   5. set CQP DataDirectory to data subdirectory, copy named query to unique name and save to disk
##   6. overwrite index file with meta-information (OPTIONAL)
## Recovering the named query result requires the following steps:
##   1. if cache directory and subdirectories do not exist, return "expired" status
##   2. check index directory for unique filename specified by caller, return "expired" if not found
##   3. touch index file so it won't be deleted by another process cleaning up the cache while we're reading it;
##      this also ensures that frequently accessed query results do not expire from the cache (or makes it
##      very unlikely, at least)
##   4. set CQP DataDirectory to data subdirectory and load cached query results from disk into CQP process
##      (force loading with "size <unique_name>;" command)
##   5. return internal name of restored query result to calling process
##
## Running a persistent query involves the following steps:
##   1. create a numeric hash key from the query expression (perhaps also the optional sort clause)
##   2. find all index files that match the given corpus and the hash key (possibly extended with numerical suffix)
##   3. check metadata stored in index files for exact query expression and sort clause
##   4. if found, recover query result with the retrieve() method (which touches the index file)
##      and return unique query name to caller (which may copy it to a simple name if desired)
##   5. otherwise, execute the specified query in the CQP process;  if an index file matching the query
##      string but not the sort clause has been found, the corresponding query result may be loaded instead,
##      which is presumably faster than re-running the query (OPTIONAL, unless sort clause included in hash key)
##   6. sort the query result according to the specified sort clause (otherwise, unsort the result)
##   7. use the store() method to create unique filename and cache query result, storing query expression
##      and sort clause as metadata in index file;  note that it is important to create an empty index file first,
##      to keep other processes (executing the same query) from loading the data file while it is being written
##   8. return name of query result to caller (need not be the unique identifier)
##
## Note that there is no support for a user-defined error handler, as all error conditions are serious
## internal faults and deserve to die (or rather croak).  A named query that has expired from the cache
## is a normal result rather than an error condition and must be expected by the caller.
use Carp;
use DirHandle;
use CWB;
use CWB::CQP;
## CWB::Web::Cache object struct:
##   'cqp'    =>  CWB::CQP object
##   'dir'    =>  cache directory
##   'index'  =>  index subdirectory
##   'data'   =>  data subdirectory
##   'size'   =>  maximum size of data directory (in MBytes, default = 5MB)
##   'expire' =>  time after which query results expire from cache (in hours, default = 24h)
## $cache = new CWB::Web::Cache -cqp => $cqp, -cachedir => $dir, [-cachesize => $cache_size,] [-cachetime => $expiration_time];
## object constructor, must be initialised with CQP object and cache directory (optional parameters have default values)
sub new {
  my $class = shift;
  my %arg = (-cqp => undef, -cachedir => undef, -cachesize => 5, -cachetime => 24);
  # default cache size is 5 MB, default expiration time is 24 hours
  croak 'Usage:  $cache = new CWB::Web::Cache -cqp => $cqp, -cachedir => $dir, [-cachesize => $cache_size,] [-cachetime => $expiration_time];'
    unless named_args(\%arg, @_);
  my $self = {};
  my $cqp = $arg{-cqp};
  croak 'Error: argument -cqp must be CWB::CQP object.'
    unless ref $cqp eq "CWB::CQP";
  $self->{'cqp'} = $cqp;
  my $dir = $arg{-cachedir};
  $self->{'dir'} = $dir;
  $self->{'index'} = "$dir/index";
  $self->{'data'} = "$dir/data";
  my $size = $arg{-cachesize};
  croak 'Error: maximal cache size must be valid number > 0.'
    unless $size + 0 > 0;
  $self->{'size'} = $size;
  my $expire = $arg{-cachetime};
  croak 'Error: expiration time must be valid number > 0.'
    unless $expire + 0 > 0;
  $self->{'expire'} = $expire;
  return bless($self, $class);
}
## use default object destructor (no cleaning up to be done)
## INTERNAL:  $ok = named_args(\%arghash, @pairs);
## process named args (-name => $value) from list @pairs; carps about wrong or missing arguments and returns false in this case;
## values are stored in %arghash which must be initialised with acceptable parameters and their default values (undef for required parameter)
sub named_args {
  my $arg = shift;
  if ((@pairs % 2) != 0) {
    carp "Odd number of elements in named argument list (@_).";
    return 0;
  }
  while (@_) {
    my $name = shift;
    my $value = shift;
    if (not exists $arg->{$name}) {
      carp "Illegal named argument: $name => '$value'.";
      return 0;
    }
    $arg->{$name} = $value;
  }
  my @missing = grep { not defined $arg->{$_} } keys %$arg;
  if (@missing) {
    carp "Required argument(s) [@missing] not specified.";
    return 0;
  }
  return 1;
}
## INTERNAL:  @files = list_directory($dir);
## get directory listing from DirHandle module (local file names only!), skipping anything that doesn't look like an identifier
sub list_directory {
  my $dir = shift;
  my $dh = new DirHandle $dir;
  return ()
    unless defined $dh;
  my @files = grep {/^[A-Z0-9_-]+:/} $dh->read;
  undef $dh;
  return @files;
}
## INTERNAL:  $kbytes = directory_size($dir);
## determine size of directory with "du -ks" (returns size in kbytes)
sub directory_size {
  my $dir = shift;
  my @output = ();
  CWB::Shell::Cmd("du -ks $dir", \@output);
  croak "Can't parse output of 'du -ks $dir' (@output)"
    unless @output == 1 and $output[0] =~ /^([0-9]+)\s+\S+/;
  my $size = $1;
  return $size;
}
## INTERNAL:  delete_files(@files);
## unlink list of files and croak() in case of an error
sub delete_files {
  foreach my $file (@_) {
    croak "Error: can't unlink file $file ($!)"
      unless 1 == unlink $file;
  }
}
## INTERNAL:  @filenames = $cache->index_files;
## INTERNAL:  @matching_files = $cache->index_files($substring);
sub index_files {
  my $self = shift;
  my @files = list_directory($self->{'index'});
  if (@_) {
    my $sub = shift;
    @files = grep { index($_, $sub) >= 0 } @files;
  }
  return @files;
}
## INTERNAL:  $cache->make_dirs;
## ensure that cache directory and subdirectories exist, otherwise create them
sub make_dirs {
  my $self = shift;
  my $dir = $self->{'dir'};
  my $index = $self->{'index'};
  my $data = $self->{'data'};
  unless (-d $dir) {
    CWB::Shell::Cmd("mkdir $dir");
    CWB::Shell::Cmd("chmod 777 $dir");
  }
  unless (-d $index) {
    CWB::Shell::Cmd("mkdir $index");
    CWB::Shell::Cmd("chmod 777 $index");
  }
  unless (-d $data) {
    CWB::Shell::Cmd("mkdir $data");
    CWB::Shell::Cmd("chmod 777 $data");
  }
}
## INTERNAL:  $cache->sweep_cache;
## check size of cache data directory and delete old files when it exceeds the limit
sub sweep_cache {
  my $self = shift;
  my $force = shift;
  $force = (defined $force and $force eq "-force") ? 1 : 0;
  my $index = $self->{'index'};
  my $data = $self->{'data'};
  my $max_size = $self->{'size'} * 1024;        # maximal allowed size in kBytes
  my $expire = $self->{'expire'} / 24;          # maximal storage time in days
  my $size = directory_size($data);
  return                                        # return if cache size is below threshold (unless called with "-force")
    unless $force or $size > $max_size;
  # remove 'dangling' data files where index is missing (if there are any)
  my @files = grep { not -f "$index/$_" } list_directory($data);
  if (@files) {
    delete_files(map { "$index/$_"} @files);
  }
  # delete all files that are older than the maximal expiration time
  @files = grep { -M "$index/$_" > $expire } list_directory($index);
  if (@files) {
    delete_files(map { "$index/$_"} @files);
    delete_files(map { "$data/$_"} @files);
  }
  $size = directory_size($data);                # return if these steps have already reduced the cache size
  return unless $size > $max_size;
  # finally, sort remaining files by age (oldest first) and find out how many must be removed to reduce cache size below limit
  @files =
    sort { $b->[1] <=> $a->[1] }
      map { [$_, (-M "$index/$_"), (-s "$data/$_" || 0) / 1024] }
        list_directory($index);
  my $req_size = $size - 0.8 * $max_size;       # reduce to 80% of maximal cache size to reduce number of relatively slow cache sweeps
  my $cum_size = 0;
  while (@files and $cum_size < $req_size) {
    my $f = shift @files;
    my $file = $f->[0];
    unless (-M "$index/$file" < $f->[1]) {      # just to be sure that index file hasn't been touched in the meantime
      delete_files("$index/$file");
      if (-f "$data/$file") {
        delete_files("$data/$file");
      }
      $cum_size += $f->[2];
    }
  }
  $size = directory_size($data);                # check that cache sweep was successful
  croak 'Error: cache sweep failed for directory '.$self->{'dir'}
    if $size > $max_size;
}
## @lines = $cache->get_metadata($unique_name);
## read metadata for unique filename (read from index file), returns () if index file does not exist
sub get_metadata {
  my $self = shift;
  my $file = shift;
  my $index = $self->{'index'};
  my @lines = ();
  if (-f "$index/$file") {
    my $fh = CWB::OpenFile "$index/$file";
    while (<$fh>) {
      chomp;
      push @lines, $_;
    }
    $fh->close;
  }
  return @lines;
}
## $unique_name = $cache->store($named_query [, @metadata]);
## make named query result $named_query (which must be fully qualified with corpus name) persistent,
## returning unique name for retrieval; optional metadata is s(tored in index file when the data file has been created
sub store {
  my $self = shift;
  my $name = shift;
  my $index = $self->{'index'};
  my $data = $self->{'data'};
  my $cqp = $self->{'cqp'};
  croak "Error: query name $name illegal or not fully specified."
    unless $name =~ /^([A-Z0-9_-]+):([A-Za-z0-9_-]+)$/;
  my $corpus = $1;
  my $localname = $2;
  # create cache directory and subdirectories if necessary
  $self->make_dirs;
  # check size of cache directory and remove old entries if necessary
  $self->sweep_cache;
  # extend $name to unique filename (in index directory)
  my %index_file = map {$_ => 1} $self->index_files("$name-");
  my $ext = 1;
  while (exists $index_file{"$name-$ext"}) {
    $ext++;
  }
  my $unique = "$name-$ext";
  my $unique_local = "$localname-$ext";
  # now create empty index file to "lock" the unique name from other processes
  my $fh = CWB::OpenFile "> $index/$unique";
  $fh->close;
  CWB::Shell::Cmd("chmod 666 $index/$unique");
  # set DataDirectory in CQP session and re-activate base corpus
  $cqp->exec("set DataDirectory '$data'");
  $cqp->exec("$corpus");
  # copy named query to unique name and store it in data directory
  $cqp->exec("$unique_local = $name");
  $cqp->exec("save $unique");
  CWB::Shell::Cmd("chmod 666 $data/$unique");
  # re-write index file if caller has passed metadata
  if (@_) {
    $fh = CWB::OpenFile "> $index/$unique";
    while (@_) {
      my $line = shift;
      chomp $line;                              # normalize whitespace in metadata (esp. linebreaks)
      $line =~ s/\s+/ /g;
      print $fh "$line\n";
    }
    $fh->close;
  }
  return $unique;
}
## $size = $cache->retrieve($unique_name);
## load persistent query result identified by $unique_name into CQP process,
## returning number of matches (>= 0) or 'undef' if the query result has expired from the cache
sub retrieve {
  my $self = shift;
  my $unique = shift;
  my $index = $self->{'index'};
  my $data = $self->{'data'};
  my $cqp = $self->{'cqp'};
  my $expire = $self->{'expire'} / 24;          # maximal storage time in days
  croak "Error: illegal unique query name $unique"
    unless $unique =~ /^([A-Z0-9_-]+):([A-Za-z0-9_-]+)$/;
  my $corpus = $1;
  my $unique_local = $2;
  return undef                                  # may have expired from cache during sweep ...
    unless -f "$index/$unique";
  if (-M "$index/$unique" > $expire) {          # ... or because it exceeded the maximal storage time
    $self->sweep_cache("-force");               # delete expired files so they won't hide exact matches
    return undef;
  }
  CWB::Shell::Cmd("touch $index/$unique");      # touch index file to mark last access time
  # set DataDirectory in CQP session and re-activate base corpus
  $cqp->exec("set DataDirectory '$data'");
  $cqp->exec("$corpus");
  # force loading of named query result in CQP session
  my ($size) = $cqp->exec("size $unique");
  return $size;
}
## ($unique_name [, $matching_lines]) = $cache->retrieve_matching($prefix, ["-partial",] @metadata);
## load persistent query result which is a unique extension of $prefix and matches the specified metadata;
## if flag -partial is specified, a query result matching only the first N lines of @metadata will also
## be accepted; N is returned as $matching_lines and must be checked by the caller
sub retrieve_matching {
  my $self = shift;
  my $prefix = shift;
  my $partial = 0;
  if (@_ and lc($_[0]) eq "-partial") {
    $partial = 1;
    shift @_;
  }
  my $index = $self->{'index'};
  my $data = $self->{'data'};
  my $cqp = $self->{'cqp'};
  my $expire = $self->{'expire'} / 24;          # maximal storage time in days
  croak "Error: query name prefix $prefix is illegal or not fully specified."
    unless $prefix =~ /^([A-Z0-9_-]+):([A-Za-z0-9_-]+)$/;
  my $corpus = $1;
  my $prefix_local = $2;
  my @unique = $self->index_files($prefix);     # find matching unique names in index directory
  # normalise specified metadata and compare to each matching index file in turn
  my @metadata = ();
  while (@_) {
    my $line = shift;
    chomp $line;                                # normalize whitespace in metadata (esp. linebreaks)
    $line =~ s/\s+/ /g;
    push @metadata, $line;
  }
  my $N = @metadata;
  my $best_N = 0;                               # find best match for metadata among matching filenames
  my $best_unique = "";
  foreach my $unique (@unique) {
    my @md_unique = $self->get_metadata($unique);
    my $i = 0;
    while ($i < $N and $i < @md_unique) {
      last unless $metadata[$i] eq $md_unique[$i];
      $i++;
    }
    # $i now holds the number of matching lines of metadata
    if ($i > $best_N) {
      $best_N = $i;
      $best_unique = $unique;
    }
  }
  if ($partial) {
    return ($best_unique, $best_N)              # retrieve query result if (partial) match has been found, return unique name and matching lines
      if $best_N > 0 and defined $self->retrieve($best_unique);
  }
  else {
    return $best_unique                         # retrieve query result if match has been found and return unique name if successful
      if $best_N == $N and defined $self->retrieve($best_unique);
  }
  return undef;                                 # return 'undef' if there was no match (or the match could not be retrieved from the cache)
}
## $query_name = $cache->query(-corpus => $corpus, -query => $query_expr, ...);
## execute CQP query (and optional subquery) on corpus $corpus and make its results persistent;
## if query has already been executed, the results are retrieved from the disk cache unless they're expired;
## results are automatically sorted with optional sort clause (e.g. 'by word %cd reverse') and stored
## in sort order; optional cut specifies maximum number of matches (so a single query cannot occupy the
## entire cache directory) and defaults to a value allowing at least 4 queries to be cached simultaneously
sub query {
  my $self = shift;
  my %arg = (-corpus => undef,
             -query => undef,
             -subquery => "",
             -sort => "", -keyword => "",
             -cut => int(($self->{'size'} * 1024 * 1024 / 4) / 16));
  # -cut defaults to average file size of at most 1/4 of cache size (in bytes, assuming approx. 16 bytes per match)
  croak 'Usage:  $cache->query(-corpus => $corpus, -query => $query, [-subquery => $subquery,] [-sort => $sort_clause,] [-keyword => $keyword_command,] [-cut => $max_matches]);'
    unless named_args(\%arg, @_);
  my $corpus = $arg{-corpus};
  my $query = $arg{-query};
  my $subquery = $arg{-subquery};
  my $sort = $arg{-sort};
  my $keyword = $arg{-keyword};
  my $cut = $arg{-cut};
  croak "Error: invalid corpus name $corpus"
    unless $corpus =~ /^[A-Z0-9_-]+$/;
  $query =~ s/(\s*\;\s*)+$//;                   # delete trailing ";" from query so we can append cut statement
  if ($subquery) {
    # don't limit first query if we want to run a subquery, because (i) we may be filtering out matches with the subquery
    # and (ii) the subquery might have more matches than the original, so a cut clause on the subquery is required
    if ($subquery =~ s/!\s*$//) {
      # if subquery is used as a filter, we need to move the "!" behind the "cut" clause
      $subquery .= "cut $cut !";
    }
    else {
      $subquery .= " cut $cut";
    }
  }
  else {
    $query .= " cut $cut";
  }
  # compute hash key from normalised query string
  my $hv = 0;
  my $normal = $query;
  $normal =~ s/\s+/ /g;
  for my $i (0 .. (length($normal) - 1)) {
    use integer;
    $hv = ($hv * 33) ^ ($hv >> 27) ^ ord(substr($normal, $i, 1));
  }
  $hv = abs($hv);
  my $name = "PQ$hv";
  # try to retrieve cached query result matching query string, sort clause, and set keyword command
  my ($unique, $matching_lines) = $self->retrieve_matching("$corpus:$name", "-partial", $query, $subquery, $keyword, $sort);
  if (defined $unique) {
    return $unique
      if $matching_lines == 4;                  # return exact match
  }
  else {
    $matching_lines = 0;
  }
  # if query result was not found in cache, run query and subquery, sort result, and execute set keyword command
  my $cqp = $self->{'cqp'};
  $cqp->exec("$corpus");
  if ($matching_lines >= 2) {                   # copy query result from partial match ...
    $cqp->exec("$name = $unique");
  }
  else {                                        # ... or execute query in query lock mode (highly recommended for CGI scripts)
    if ($subquery) {
      $cqp->exec_query("$name-TEMP = $query");       # run primary query, assign to temporary name, then run subquery
      my ($N) = $cqp->exec("size $name-TEMP");
      if ($N > 0) {
        $cqp->exec("$name-TEMP");
        $cqp->exec_query("$name = $subquery");
        $cqp->exec("$corpus");
      }
      else {                                    # don't run subquery if primary returns no matches
        $cqp->exec("$name = $name-TEMP");
      }
      $cqp->exec("discard $name-TEMP");
    }
    else {
      $cqp->exec_query("$name = $query");            # run primary query only
    }
  }
  my ($N) = $cqp->exec("size $name");
  if ($N > 0) {
    if ($matching_lines < 3) {                  # no match or partial match without appropriate keyword
      if ($keyword eq "") {
        $cqp->exec("set $name keyword NULL")    # if $keyword == "", make sure to delete keywords that may have been loaded from partial match
        if $matching_lines >= 2;
        ## but only if the result has been loaded from cache with a non-empty set keyword clause ($matching_lines >= 2)
        ## in order to avoid overwriting keyword anchors that have been set directly in the query
      }
      else {
        $cqp->exec("set $name keyword $keyword"); # otherwise execute keyword command
      }
    }
    ## sort command cannot have matched in partial match -> execute
    $cqp->exec("sort $name $sort");           # "unsorts" (i.e. sorts in cpos order) if $sort == ""
  }
  # make query result persistent and return its unique name
  return $self->store("$corpus:$name", $query, $subquery, $keyword, $sort);
}
return 1;
__END__
=head1 NAME
CWB::Web::Cache - A simple shared cache for CQP query results
=head1 SYNOPSIS
  use CWB::CQP;
  use CWB::Web::Cache;
  $cqp = new CWB::CQP;
  $cache = new CWB::Web::Cache -cqp => $cqp, -cachedir => $dir,
    [-cachesize => $cache_size,] [-cachetime => $expiration_time];
  # transparently execute and cache simple CQP queries
  $id = $cache->query(-corpus => "DICKENS", -query => '[pos="NN"] "of" "England"');
  ($size) = $cqp->exec("size $id");
  # optional features: sort clause, set keyword, subquery, and maximal number of matches
  $id = $cache->query(
    -corpus => "DICKENS", -query => $query,
    -sort => $sort_clause,
    -keyword => $set_keyword_command,
    -subquery => $subquery,
    -cut => $max_nr_of_matches  # resonable default calculated from cache size
  );
  ## The functions below are for internal use only and subject to change in future releases!
  $id = $cache->store("DICKENS:Query1");        # activates DICKENS corpus
  $id = $cache->store("DICKENS:Query1", "Metadata line #1", ...);
  $size = $cache->retrieve($id);                # (re-)activates DICKENS corpus
  die 'Sorry, named query has expired from the cache.'
    unless defined $size;
  $cqp->exec("Query1 = $id");                   # copy query result to desired name
  $id = $cache->retrieve("DICKENS:Query", "Metadata line #1", ...);
  die 'Sorry, no named query matching your metadata found in cache.'
    unless defined $id;
  $cqp->exec("Query = $id");
=head1 DESCRIPTION
The B<CWB::Web::Cache> module provides a simple shared caching meachnism
for CQP query results, making them persistent across multiple CQP sessions.
Old data files are automatically deleted when they pass the specified I<$expiration_time>, or
to keep the cache from growing beyond the specified I<$cache_size> limit.
Note that a B<CWB::Web::Cache> handle must be created with a pre-initialised CQP backend (i.e.
a B<CWB::CQP> object), which will be used to access the cache and (re-)run a query when necessary.
Most scripts will access the cache through the B<query()> method, which executes and caches CQP queries
in a fully transparent way (with optional C<sort> clause, C<set keyword> command, subquery,
and C<cut> to limit the maximal number of matches).  After successful execution, the query result is
loaded into the CQP backend, the appropriate corpus is activated, and the I<$id> of the named query is
returned.
The C<sort> clause is executed I<after> a C<set keyword> command 
so that C<keyword> anchors can be used in sorting.
Direct access to cache entries is provided by the low-level methods B<store()> and B<retrieve()>.
Note that these are intended for internal use only and may change in future releases.
=head1 METHODS
B<TODO>
=head1 COPYRIGHT
Copyright (C) 1999-2022 Stephanie Evert [http::/purl.org/stephanie.evert]
This software is provided AS IS and the author makes no warranty as to
its use and performance. You may use the software, redistribute and
modify it under the same terms as Perl itself.
=cut
	Global
`s`	Focus search bar
`?`	Bring up this help dialog
	GitHub
`g` `p`	Go to pull requests
`g` `i`	go to github issues (only if github is preferred repository)
	POD
`g` `a`	Go to author
`g` `c`	Go to changes
`g` `i`	Go to issues
`g` `d`	Go to dist
`g` `r`	Go to repository/SCM
`g` `s`	Go to source
`g` `b`	Go to file browse
	Search terms
module: (e.g. module:Plugin)
distribution: (e.g. distribution:Dancer auth)
author: (e.g. author:SONGMU Redis)
version: (e.g. version:1.00)