use File::Basename;
use File::Spec 0.82;
use Getopt::Long;

use ClearCase::CRDB;

use constant MSWIN => $^O =~ /MSWin32|Windows_NT/i ? 1 : 0;

use strict;

my $prog = basename($0, qw(.pl));

sub usage {
    my $msg = shift;
    my $rc = (defined($msg) && !$msg) ? 0 : 2;
    if ($rc) {
	select STDERR;
	print "$prog: Error: $msg\n\n" if $msg;
    }
    print <<EOF;
Usage: $prog [flags] pattern ...
   -help		Print this message and exit
   -backward		Work from objects back towards sources
   -do DO,...		Comma-separated list of DOs to analyze
   -db file,...		Comma-separated list of *.crdb files to read in
   -dir	<dir>		Dir to search for *.crdb files (default: the cwd)
   -exact		Treat <pattern> as an absolute path, not a pattern
   -interactive 	Run like a shell: prompt for new filenames until EOF
   -Indent <str>	String to indent recursive output (default: 3 spaces)
   -recurse		Chase down dependencies recursively
   -save		Save DB derived from DO 'foo' as 'foo.crdb'
   -terminals		Like -recurse but print only terminal targets
   -verbose		Provide informational messages
    All flags may be abbreviated to their shortest unique name.
    Flags may be turned off by prepending 'no', e.g. -nor/ecurse.
Examples:
    $prog -do /vobs_foo/solaris/.AUDIT libxyz.a
    $prog -backward -rec foo.o
EOF
    exit $rc if $rc;
}

# Make a new CRDB object and initialize it from @ARGV.
my $crdb = ClearCase::CRDB->new(@ARGV);

my @interopts = qw(backward! exact! Indent=s number! recurse! terminals!);

my %opt;
$Getopt::Long::ignorecase = 0;  # global override for dumb default
GetOptions(\%opt, qw(dir=s help interactive verbose), @interopts);
usage(1) if $opt{help};
usage(1) if !@ARGV && !$opt{interactive};

sub note {
    print STDERR "$prog: @_\n" if $opt{verbose};
}

my $ln = 0;
my %seen;

sub whouses {
    my($obj, $level, $do) = @_;
    my $pad = $opt{Indent} x $level;
    my $query = $opt{backward} ? 'needs' : 'makes';
    for my $used ($obj->$query($do)) {
	next if $used =~ m%/\.AUDIT%i;	# See POD
	if ($opt{terminals} && grep !m%/\.AUDIT%i, $obj->$query($used)) {
	    whouses($obj, $level, $used);
	} else {
	    print ++$ln if $opt{number};
	    print "$pad$used\n" unless $opt{terminals} && $seen{$used};
	    $seen{$used} = 1;
	    whouses($obj, $level+1, $used) if $opt{recurse} || $opt{terminals};
	}
    }
}

$opt{Indent} ||= '   ';

while (1) {
    $ln = 0;
    %seen = ();
    my @matches = ();
    if (!@ARGV) {
	# Skip to next iteration of loop.
    } elsif ($opt{exact}) {
	@matches = ();
	for my $path (@ARGV) {
	    if (File::Spec->file_name_is_absolute($path)) {
		push(@matches, $path);
	    } else {
		# Consider all possible permutations of the supplied
		# basename and the known IWDs.
		push(@matches,
		    map {File::Spec->rel2abs($path, $_)} $crdb->iwd);
	    }
	}
    } else {
	@matches = $crdb->matches(@ARGV);
    }

    for (@matches) {
	print(++$ln, ' ') if $opt{number};
	print $_, $opt{backward} ? ' <=' : ' =>', "\n";
	whouses($crdb, 1, $_);
    }

} continue {
    last unless $opt{interactive};
    print STDERR "\n>> ";
    my $line = <STDIN>;
    last if !$line;
    chomp $line;
    @ARGV = split ' ', $line;
    next if !@ARGV;
    last if $ARGV[0] =~ m%^q(?:uit)?$%i;	# anchored: only 'q' or 'quit'
    if ($ARGV[0] eq '?') {
	my %filter = map { $_ => 1 }
	    map { (m%\W*(\w+)%)[0] } @interopts;
	my @current = map { "-$_='$opt{$_}'" }
	    grep { exists $filter{$_} }
		keys %opt;
	print "\nCURRENT: @current\n";
	@ARGV = ();
    } else {
	@ARGV = () if !GetOptions(\%opt, @interopts);
    }
}

exit 0;


=head1 NAME

whouses - impact analysis in a clearmake build environment

=head1 MOTTO

I<B<"You give me a clean CR, I'll give you a clean impact analysis.">>


=head1 SYNOPSIS

Run this script with the C<-help> option for usage details. Here are
some additional sample usages with explanations:

  whouses foobar.h

Shows all DOs that make use of any file matching /foobar.h/.

  whouses -recurse foobar.h

Same as above but follows the chain of derived files recursively.

  whouses -exact foobar.h

Shows all DOs that make use of the specified file. The C<-exact> flag
suppresses pattern matching and shows only DOs which reference the
exact file.


=head1 DESCRIPTION

B<Whouses> provides a limited form of "impact analysis" in a clearmake
build environment. This is different from traditional impact analysis
(see B<TRUE CODE ANALYSIS COMPARED> below for details). In particular,
it operates at the granularity of files rather than language elements.

Whouses is best described by example. Imagine you have a VOB
F</vobs_sw> in which you build the incredibly simple application C<foo>
from C<foo.c>.  You have a Makefile which compiles C<foo.c> to C<foo.o>
and then links it to produce C<foo>. And let's further assume you've
just done a build using clearmake.

Thus, C<foo> is a derived object (I<DO>) which has a config record
(I<CR>) showing how it was made. Whouses analyzes that CR and prints
the data in easy-to-read indented textual format.  For instance:

	% whouses -do foo foo.c
	/vobs_sw/src/foo.c =>
	   /vobs_sw/src/foo.o

The C<-do foo> points to the derived object from which to extract and
analyze the CR; it will be implicit in the remaining examples.  The
output indicates that C<foo.o B<uses> foo.c>, or in other words that
C<foo.c> is a I<contributor> to C<foo.o>. If we add the C<-recurse> flag:

	% whouses -r foo.c
	/vobs_sw/src/foo.c =>
	   /vobs_sw/src/foo.o
	      /vobs_sw/src/foo

We see all files to which C<foo.c> contributes, indented according to
how many generations removed they are. If we now add C<-terminals>:

	% whouses -r -t foo.c
	/vobs_sw/src/foo.c =>
	   /vobs_sw/src/foo

Intermediate targets such as C<foo.o> are suppressed so we see only the
"final" targets descended from C<foo.c>.

We can also go in the other direction using C<-backward>:

	% whouses -b -e foo
	/vobs_sw/src/foo <=
	   /vobs_sw/src/foo.o

Which shows C<foo.o> as a progenitor of C<foo>. Note that the arrow
(B<E<lt>=>) is turned around to indicate C<-backward> mode. We also
introduced the C<-exact> flag here. By default, arguments to whouses
are treated as patterns, not file names, and since C<foo> has no
extension it would have matched C<foo.c> and C<foo.o> as well. Use of
C<-exact> suppresses pattern matching.
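
The difference between the two modes can be sketched in standalone
Perl. This is only an illustration: the path list is made up, and
pattern mode is assumed here to be a plain unanchored regex match. The
real matching is done inside C<ClearCase::CRDB> and may differ in
detail.

    use strict;
    use warnings;

    # Hypothetical set of paths recorded in a CR.
    my @known = qw(/vobs_sw/src/foo.c /vobs_sw/src/foo.o /vobs_sw/src/foo);
    my $arg = 'foo';

    # Pattern mode (assumed: unanchored regex match on the path).
    my @by_pattern = grep { m%$arg% } @known;

    # Exact mode: compare against the whole path name.
    my @by_exact = grep { $_ eq "/vobs_sw/src/$arg" } @known;

    print "pattern: @by_pattern\n";
    print "exact:   @by_exact\n";

In this sketch the pattern picks up all three paths while the exact
comparison returns only F</vobs_sw/src/foo>.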

Of course we can go backward recursively as well:

	% whouses -back -exact -recurse foo
	/vobs_sw/src/foo <=
	   /vobs_sw/src/foo.o
	      /vobs_sw/src/foo.c
	      /vobs_sw/src/foo.h
	      /vobs_sw/src/bar.h

And discover that C<foo.h> and C<bar.h> were also used.

When used recursively in the forward direction, this script answers the
question "if I change file X, which derived files will need to be
rebuilt?" This is the classic use, the one for which it was written.
When used recursively in the backward direction it can depict the
entire dependency tree in an easily readable format.
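
The forward recursive walk can be sketched without ClearCase at all.
The C<%makes> hash and the C<impact> function below are hypothetical
stand-ins for the data and queries that C<ClearCase::CRDB> provides;
the sketch only shows the shape of the traversal and the indentation by
generation.

    use strict;
    use warnings;

    # Hypothetical "makes" relation for the foo example: each file
    # maps to the derived files it contributes to directly.
    my %makes = (
        '/vobs_sw/src/foo.c' => ['/vobs_sw/src/foo.o'],
        '/vobs_sw/src/foo.h' => ['/vobs_sw/src/foo.o'],
        '/vobs_sw/src/foo.o' => ['/vobs_sw/src/foo'],
    );

    # Return one indented line per file reachable from $file.
    sub impact {
        my($file, $level) = @_;
        my @lines;
        for my $used (@{ $makes{$file} || [] }) {
            push(@lines, ('   ' x $level) . $used);
            push(@lines, impact($used, $level + 1));
        }
        return @lines;
    }

    my @report = impact('/vobs_sw/src/foo.c', 1);
    print "$_\n" for '/vobs_sw/src/foo.c =>', @report;

Going backward is the same walk over the inverse relation (a "needs"
hash instead of "makes").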

Because extracting a recursive CR can be quite slow for large build
systems, whouses provides ways of dumping the CR data to a file
representation for speed. Use of C<-save>:

	% whouses -do foo -save ...

will write out the data to F<foo.crdb>. Subsequently, if F<foo.crdb>
exists it will be used unless a new C<-do> flag is given.  See also
the C<-db> and C<-fmt> flags.

The default save format is that of B<Data::Dumper>. It was chosen
because it results in a nicely indented, human-readable text format


If a C<-do> flag is given, the CRs are taken from the specified derived
object(s).  Multiple DOs may be specified with multiple C<-do> flags
or as a comma-separated list. Alternatively, if the C<CRDB_DO>
environment variable exists, its value is used as if specified with
C<-do>.

If no DOs are specified directly, C<whouses> will look for stored DO
data in files specified with C<-db> or the C<CRDB_DB> environment
variable. The format is
the same as above.

Failing that, C<whouses> will search for files named C<*.crdb> along a
path specified with C<-dir> or C<CRDB_PATH>, defaulting to the current
directory.
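
The lookup order described above can be summarized as a small sketch.
The C<pick_source> function is hypothetical, and it assumes that a flag
takes precedence over the corresponding environment variable; the real
logic lives in C<ClearCase::CRDB>.

    use strict;
    use warnings;

    # Decide where DO data comes from, given parsed flags and an
    # environment hash. Returns a (kind, value) pair.
    sub pick_source {
        my($opt, $env) = @_;
        my $do = $opt->{'do'} || $env->{CRDB_DO};
        return ('do', $do) if $do;
        my $db = $opt->{'db'} || $env->{CRDB_DB};
        return ('db', $db) if $db;
        my $dir = $opt->{'dir'} || $env->{CRDB_PATH} || '.';
        return ('dir', $dir);
    }

    # No flags given, only CRDB_DB set in the environment.
    my @src = pick_source({}, { CRDB_DB => 'foo.crdb' });
    print "@src\n";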


As a special case, derived objects matching the Perl regular expression
C</\.AUDIT/i> are ignored while traversing the recursive config record.
These are presumed to be I<meta-DOs> by convention, which aren't part
of the production build per se but rather pseudo-targets whose only
purpose is to hold CRs referring back to all real deliverables. Thus
if you construct your Makefile to create a meta-DO, you might want to
name it C<.AUDIT> or C<.prog.AUDIT> or something.
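
A minimal sketch of that filtering, using the same test that the
C<whouses> routine in this script applies to each path. The list of
DOs is made up for illustration.

    use strict;
    use warnings;

    # Hypothetical DO names; only the first is a meta-DO by the
    # convention described above.
    my @dos = qw(/vobs_foo/solaris/.AUDIT /vobs_foo/solaris/libxyz.a);

    # The script tests paths with m%/\.AUDIT%i. Note that, written
    # this way, the pattern expects ".AUDIT" right after a slash.
    my @real = grep { !m%/\.AUDIT%i } @dos;
    print "$_\n" for @real;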

=head1 ClearCase::CRDB

Most of the logic is actually in the C<ClearCase::CRDB> module; the
C<whouses> program is just a wrapper which uses the module. It's done
this way so ClearCase::CRDB can provide an API for other potential
tools which need to do CR analysis.


=head1 TRUE CODE ANALYSIS COMPARED

Whouses is somewhat different from "real" impact analysis products.
There are a number of such tools on the market, for example SNiFF+ from
WindRiver.  Typically these work by parsing the source code into some
database representation which they can then analyze. It's a powerful
technique but entails some tradeoffs:

=head2 MINUSES

=over 4

=item *

A true code analysis tool must have knowledge of each programming
language in use. I.e. to add support for Java, a Java parser must be

=item *

A corollary of the above is that it requires a lot of work by expert
programmers. Thus the tools tend to be large, complex and expensive.
Note: there is also I<cscope> which is free, and maybe others. But as
the name implies I<cscope> is limited to C-like languages.

=item *

Another corollary is that the tool must track each advance
in each language, usually with significant lag time, and
may not be bug-for-bug compatible with the compiler.

=item *

Also, since analysis basically entails compiling the code, analysis of
a large code base can take a long time, potentially as long as or longer
than actually building it.

=item *

If some part of the application is written in a language the tool
doesn't know (say Python or Visual Basic or Perl or an IDL), no
analysis of that area can take place.

=back


=head2 PLUSES

=over 4

=item *

The analysis can be as granular and as language-knowledgeable as its
developers can make it. If you change the signature of a C function, it
can tell you how many uses of that function, in what files and on what
lines, will need to change.

=item *

A code analysis tool may be tied to a set of languages but by the same
token it's NOT tied to a particular SCM or build system.

=back


The minuses above are not design flaws but inherent tradeoffs.  For
true code impact analysis you must buy one of these tools and accept
the costs.

Whouses doesn't attempt code analysis per se.  As noted above, true
code analysis programs are tied to language but not to an SCM system.
Whouses flips this around; it doesn't care about language but it only
works with build systems that use clearmake within a ClearCase VOB.

Whouses takes the I<config records> generated by clearmake, analyzes
them, and tells you which files depend on which other files according
to the CRs. Both techniques have fuzziness of different kinds: code
analysis predicts what the real compiler will do based on what the
analysis compiler found; divergence is possible. Whouses predicts what
the next build will do based on what the last build did.  If changes
have taken place since, divergence is possible here too.

=head1 AUTHOR

David Boyce <dsbperl AT boyski.com>


=head1 COPYRIGHT

Copyright (c) 2000-2006 David Boyce. All rights reserved.  This Perl
program is free software; you may redistribute and/or modify it under
the same terms as Perl itself.

=head1 STATUS

This is currently ALPHA code and thus I reserve the right to change the
UI incompatibly. At some point I'll bump the version suitably and
remove this warning, which will constitute an (almost) ironclad promise
to leave the interface alone.

=head1 PORTING

I've tried to write this in a platform independent style but it hasn't
been heavily tested on Windows (actually it hasn't been I<all> that
heavily tested anywhere). It does pass C<make test> on Windows and
appears to work fine in limited testing.

=head1 SEE ALSO

perl(1), ClearCase::CRDB(3), C<cleartool man catcr>