The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

whouses - impact analysis in a clearmake build environment

MOTTO

"You give me a clean CR, I'll give you a clean impact analysis."

SYNOPSIS

Run this script with the -help option for usage details. Here are some additional sample usages with explanations:

  whouses foobar.h

Shows all DOs that make use of any file matching /foobar.h/.

  whouses -recurse foobar.h

Same as above but follows the chain of derived files recursively.

  whouses -exact foobar.h

Shows all DOs that make use of the specified file. The -exact flag suppresses pattern matching and shows only DOs which reference the exact file.

DESCRIPTION

Whouses provides a limited form of "impact analysis" in a clearmake build environment. This is different from traditional impact analysis (see TRUE CODE ANALYSIS COMPARED below for details). In particular, it operates at the granularity of files rather than language elements.

Whouses is best described by example. Imagine you have a VOB /vobs_sw in which you build the incredibly simple application foo from foo.c. You have a Makefile which compiles foo.c to foo.o and then links it to produce foo. And let's further assume you've just done a build using clearmake.

Thus, foo is a derived object (DO) which has a config record (CR) showing how it was made. Whouses analyzes that CR and prints the data in easy-to-read indented textual format. For instance:

        % whouses -do foo foo.c
        /vobs_sw/src/foo.c  =>
          /vobs_sw/src/foo.o

The -do foo points to the derived object from which to extract and analyze the CR; it will be implicit in the remaining examples. The output indicates that foo.o uses foo.c, or in other words that foo.c is a contributor to foo.o. If we add the -recurse flag:

        % whouses -r foo.c
        /vobs_sw/src/foo.c =>
          /vobs_sw/src/foo.o
            /vobs_sw/src/foo

We see all files to which foo.c contributes, indented according to how many generations removed they are. If we now add -terminals

        % whouses -r -t foo.c
        /vobs_sw/src/foo.c =>
          /vobs_sw/src/foo

Intermediate targets such as foo.o are suppressed so we see only the "final" targets descended from foo.c.

We can also go in the other direction using -backward:

        % whouses -b -e foo
        /vobs_sw/src/foo <=
          /vobs_sw/src/foo.o

Which shows foo.o as a progenitor of foo. Note that the arrow (<=) is turned around to indicate -backward mode. We also introduced the -exact flag here. By default, arguments to whouses are treated as patterns, not file names, and since foo has no extension it would have matched foo.c and foo.o as well. Use of -exact suppresses pattern matching.

Of course we can go backward recursively as well:

        % whouses -back -exact -recurse foo
        /vobs_sw/src/foo <=
          /vobs_sw/src/foo.o
              /vobs_sw/src/foo.c
              /vobs_sw/src/foo.h
              /vobs_sw/src/bar.h

And discover that foo.h and bar.h were also used.

When used recursively in the forward direction, this script answers the question "if I change file X, which derived files will need to be rebuilt"? This is the classic use, the one for which it was written. When used recursively in the backward direction it can depict the entire dependency tree in an easily readable format.

Because extracting a recursive CR can be quite slow for large build systems, whouses provides ways of dumping the CR data to a file representation for speed. Use of -save:

        % whouses -do foo -save ...

will write out the data to foo.crdb. Subsequently, if foo.crdb exists it will be used unless a new the -do flag is used. See also the -db and -fmt flags.

The default save format is that of Data::Dumper. It was chosen because it results in a nicely indented, human-readable text format file.

SELECTING DERIVED OBJECTS TO ANALYZE

If a -do flag is given, the CRs are taken from the specified derived object(s). Multiple DOs may be specified with multiple -do flags or as a comma-separated list. Alternatively, if the CRDB_DO environment variable exists, its value is used as if specified with -do.

If no DOs are specified directly, whouses will look for stored DO data in files specified with -db or the CRDB_DB EV. The format is the same as above.

Failing that, whouses will search for files named *.crdb along a path specified with -dir or CRDB_PATH, defaulting to the current directory.

.AUDIT FILES

As a special case, derived objects matching the Perl regular expression /\.AUDIT/i are ignored while traversing the recursive config spec. These are presumed to be meta-DOs by convention, which aren't part of the production build per se but rather pseudo-targets whose only purpose is to hold CRs referring back to all real deliverables. Thus if you construct your Makefile to create a meta-DO, you might want to name it .AUDIT or .prog.AUDIT or something.

ClearCase::CRDB

Most of the logic is actually in the ClearCase::CRDB module; the whouses program is just a wrapper which uses the module. It's done this way so ClearCase::CRDB can provide an API for other potential tools which need to do CR analysis.

TRUE CODE ANALYSIS COMPARED

Whouses is somewhat different from "real" impact analysis products. There are a number of such tools on the market, for example SNiFF+ from WindRiver. Typically these work by parsing the source code into some database representation which they can then analyze. It's a powerful technique but entails some tradeoffs:

MINUSES

  • A true code analysis tool must have knowledge of each programming language in use. I.e. to add support for Java, a Java parser must be added.

  • A corollary of the above is that it requires lot of work by expert programmers. Thus the tools tend to be large, complex and expensive. Note: there is also cscope which is free, and maybe others. But as the name implies cscope is limited to C-like languages.

  • Another corollary is that the tool must track each advance in each language, usually with significant lag time, and may not be bug-for-bug compatible with the compiler.

  • Also, since analysis basically entails compiling the code, analysis of a large code base can take a long time, potentially as long or longer than actually building it.

  • If some part of the application is written in a language the tool doesn't know (say Python or Visual Basic or Perl or an IDL), no analysis of that area can take place.

PLUSES

  • The analysis can be as granular and as language-knowledgeable as its developers can make it. If you change the signature of a C function, it can tell you how many uses of that function, in what files and on what lines, will need to change.

  • A code analysis tool may be tied to a set of languages but by the same token it's NOT tied to a particular SCM or build system.

The minuses above are not design flaws but inherent tradeoffs. For true code impact analysis you must buy one of these tools and accept the costs.

Whouses doesn't attempt code analysis per se. As noted above, true code analysis programs are tied to language but not to an SCM system. Whouses flips this around; it doesn't care about language but it only works with build systems that use clearmake within a ClearCase VOB.

Whouses takes the config records generated by clearmake, analyzes them, and tells you which files depend on which other files according to the CRs. Both techniques have fuzziness of different kinds: code analysis predicts what the real compiler will do based on what the analysis compiler found; divergence is possible. Whouses predicts what the next build will do based on what the last build did. If changes have taken place since, divergence is possible here too.

AUTHOR

David Boyce <dsbperl AT boyski.com>

COPYRIGHT

Copyright (c) 2000-2006 David Boyce. All rights reserved. This Perl program is free software; you may redistribute and/or modify it under the same terms as Perl itself.

STATUS

This is currently ALPHA code and thus I reserve the right to change the UI incompatibly. At some point I'll bump the version suitably and remove this warning, which will constitute an (almost) ironclad promise to leave the interface alone.

PORTING

I've tried to write this in a platform independent style but it hasn't been heavily tested on Windows (actually it hasn't been all that heavily tested anywhere). It does pass make test on Windows and appears to work fine in limited testing.

SEE ALSO

perl(1), ClearCase::CRDB(3), cleartool man catcr