The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

relate_complex - relates multiple search terms, quickly returning a list of files that match all of them.

SYNOPSIS

   relate_complex [options] [term1 [ term2 [ term3 ... ]]]

   An example:

      relate_complex bozotech lib report data

   Would match:

      /usr/lib/bozotech/data/report
      /usr/lib/reports/bozotech/data
      /usr/local/lib/bozotech/report/data

   Another example:

      relate_complex bin ^ppp home -bak -old

   Would match:

     /home/doom/bin/pppon
     /home/doom/bin/pppoff

   But not:

     /home/doom/bin/pppon.bak
     /home/doom/old/bin/pppoff
     /home/doom/bin/debug_ppp

OPTIONS

-help

Print a brief help message and exits.

-man

Prints the manual page and exits.

a

Show "all" items, suppresses the default filter pattern (':skipdull'). Still uses additional filters specified with -F.

F

One or more named filters to apply to the output in addition to the default. To enter more than one, separate them with spaces and put the string in single-quotes, e.g. -F 'filter1 filter2 filter3'

L

List the names of the available filters.

r

Indicates that the first term is also a regular expression, albeit a POSIX one.

DESCRIPTION

The mnemonic is that relate_complex makes the file system a little more relational and a little less hierarchical.

Essentially this script is the equivalent of

      locate primary_term | egrep term2 | egrep term3 [....| termN]

Though by default relate_complex also automatically screens out some "uninteresting" files (emacs backups, CVS/RCS repositories, etc.).

An important performance hint: you can speed up relate_complex tremendously by using a relatively unique first term. For example, if you're on a unix box, you don't want to start with something like "lib" which is going to match a huge number of files in the locate database. You'll find that "relate_complex gdk lib" is much faster than "relate_complex lib gdk".

Further: the first term should be a simple string (unless you've used the "-r" option), but all the following terms can be perl regexps. (This inelegant state of affairs is the result of working as a wrapper around the locate system; the first term is fed to it directly, then the output is filtered using perl pattern matches.)

Note that while you can use a regexp in the first position with the -r option, but it has to be a POSIX regexp, not a perl one.

A negative match feature can be used on the secondary terms: a term with a leading '-' will filter out lines that match it. Example:

   relate_complex my_site index -htm$

will screen out files ending in "htm" (but not "html").

This script has the extremely useful feature of automatically omitting uninteresting files, but it's guaranteed that you'll be confused by this some day, so don't say I didn't warn you. As of this writing, by default files are ignored that match these patterns:

      '~$';      # emacs backups
      '/\#';     # emacs autosaves
      ',v$';     # cvs/rcs repository files
      '/CVS$'
      '/CVS/'
      '/RCS$'
      '\.elc$';  # compiled elisp

This default filter is named ":skipdull", and can be modified by editing your personal copy of it in ~/.list-filter/filters.yaml.

A few command line arguments exist to modify this filtering behavior:

     -a: returns "all" matches, overriding the default filter.

      -F <filename>: The "add_filters" option let's you supply a
         list of named filter.

      You I<can> use "-F" in conjunction with "-a": the "-a" suppresses
      the default filter and the "-F" replaces them with your own.

dwim upcarets

The use of a leading "^" achor in a pattern is allowed, but is silently transformed into a boundary match "\b". Otherwise "^" wouldn't be very useful (consider that with full paths *all* listings match "^/") So instead we DWIM and turn "^" into a beginning-of-name match. An embedded or trailing "^" is left alone. Ditto for a "^" in front of a slash: if you ask for '^/home/bin', maybe that's really what you want (as opposed to, say '/tmp/backup/home/bin').

alternate filters

You can always determine which filters are available with the "-L" option:

   relate_complex -L

By convention, the filters defined in the List::Filter::Library::* modules are named with a leading colon.

You can apply any of these filters using the -F option:

  relate_complex -F:jpeg site www images

(Note that this should match the common variant extensions, ".jpeg", ".jpg", ".JPG", and so on.)

If you'd like to try your hand at defining your own filters you should look at the file:

  ~/.list-filters/filters.yaml

Which, if you are familar with YAML, you will probably recognize as a hashref of hashrefs keyed by name, where the heart of the matter is the aref of patterns called "terms". If you're not familiar with YAML: make your edits carefully, and watch it, because it's whitespace sensitive.

When you choose a name for a new filter, remember that only the "standard" filters should be named with a leading colon.

A feature of the system is that after any filter is invoked by the "relate_complex" command, a copy of it written to your personal "filters.yaml" file. These copies then take precedence over the original definitions: this allows you to modify them for your particular needs. The default ":skipdull" filter is a likely candidate for this treatment, because different files will be considered "dull" by different people. (Consider that a ".o" file is boring to a C programmer, but might be critically important to a sysadmin.)

DISCUSSION

There's More Than One Way to Relate

I've tried a number of alternate approaches to writing relate_complex...

Notably:

(1) Automatically generating all possible permutations to build an alternation pattern that gets fed to locate with the -r option (this turns out to be grossly slow).

(2) An efficiency hack where terms are sorted by length to find a good one to use as the primary_term to hand to locate... but you need to screen out any regexp terms when you do that, and the overall behavior becomes too hard to remember (it's eaisier to just say "the first term is different, deal with that").

Relate Isn't Really

I like this slogan as a mnemonic: "relate makes the file system a little more relational", though really this is an abuse of the term "relational".

The point though, is that using "relate_complex" is something like doing a database query where you specify certain constraints in any order and get all records that match them. But if you can't relate, that's okay.

(Just be glad I don't think of it as "feeling your way around".)

Anyway, I'm of the opinion that there's more to life than hierarchies; our filesystems are ripe for revolution (insert olfactory humor), and I suggest keeping an eye on ReiserFS (try a web search on "namespace unification" and "Hans Reiser"). And by the way: (1) he has not been convicted as of yet, and (2) there are other programmer's at NameSys continuing the work.

But perhaps this is beyond the scope of this documentation...

SEE ALSO

App::Relate relate

List::Filter List::Filter::FileSystem::FileSystem List::Filter::FileSystem::FileExtensions

AUTHOR

Joseph Brenner, <doom@kzsu.stanford.edu>

COPYRIGHT AND LICENSE

Copyright (C) 2007 by Joseph Brenner

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.2 or, at your option, any later version of Perl 5 you may have available.

BUGS

None reported... yet.