=head1 NAME
makepp_sandboxes -- How to partition a makepp build
=for vc $Id: makepp_sandboxes.pod,v 1.3 2007/05/29 20:21:46 pfeiffer Exp $
=head1 DESCRIPTION
There are a couple of reasons that you might want to partition the file tree
for a makepp build:
=over 4
=item 1.
If you know that the majority of the tree is not affected by any changes made
to source files since the previous build, then you can tell makepp to assume
that files in those parts of the tree are already up-to-date, which means not
even implicitly loading their makefiles, let alone computing and checking
their dependencies. (Note that explicitly loaded makefiles are still loaded,
however.)
=item 2.
If you have multiple makepp processes accessing the same tree, then you want
to raise an error if you detect that two concurrent processes are writing the
same part of the tree, or that one process is reading a part of the tree that
a concurrent process is writing. Either way, you have a race condition in
which the relative order of events in two concurrent processes (which cannot
be guaranteed) may affect the result.
=back
Makepp has sandboxing facilities that address both concerns.
=head2 Sandboxing Options
The following makepp options may be used to set the sandboxing properties
of the subtree given by I<path> and all of its files and potential files:
=over 4
=item --dont-build I<path>
=item --do-build I<path>
Set or reset the "dont-build" property. Any file with this property set is
assumed to be up-to-date already, and no build checks will be performed. The
default is reset (i.e. "do-build").
=item --sandbox I<path>
=item --out-of-sandbox I<path>
Set or reset the "in-sandbox" property. An error is raised if makepp would
otherwise write a file with this property reset. Build checks are still
performed, unless the "dont-build" property is also set. The default is set
(i.e. "in-sandbox"), unless there are any B<--sandbox> options, in which case
the default is reset (i.e. "out-of-sandbox").
=item --dont-read I<path>
=item --do-read I<path>
Set or reset the "dont-read" property. An error is raised if makepp would
otherwise read a file with this property set. The default is reset (i.e.
"do-read").
=back
Each of these 3 properties applies to the entire subtree, including to files
that do not yet exist. More specific paths override less specific paths. A
specified path may be an individual file, even if the file does not yet exist.
If a property is both set and reset on the exact same path, then the option
that appears furthest to the right on the command line takes precedence.
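
The two precedence rules above can be illustrated with a short sketch. The
directory names F<third_party> and F<third_party/patched> are hypothetical, and
the C<MAKEPP> variable is only a convention of this example (not a makepp
feature) that makes the commands print themselves unless you set
C<MAKEPP=makepp> in the environment.

```shell
# Dry run by default: print each command line instead of running makepp.
MAKEPP=${MAKEPP:-echo makepp}

# More specific path wins: everything under third_party is assumed
# up-to-date, except third_party/patched, which is still checked and built.
$MAKEPP --dont-build third_party --do-build third_party/patched

# Same path set and reset: the rightmost option wins, so third_party ends
# up with the "dont-build" property reset (i.e. "do-build").
$MAKEPP --dont-build third_party --do-build third_party
```

In the first command the order of the two options should not matter, because
the paths differ in specificity; in the second it does, because the paths are
identical.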
If the B<--sandbox-warn> option is specified, then violations of "in-sandbox"
and "dont-read" are downgraded to warnings instead of errors. This is useful
when there are hundreds of violations, so that you can collect all of them in
a single run and take appropriate corrective action. Otherwise, you see only
one violation per makepp invocation, and you don't know how many are left
until they're all fixed.
=head2 Sandboxing for Acceleration
If you want to prevent makepp from wasting time processing files that you
know are already up-to-date (in particular, files that are generated by a
build tool other than makepp), then B<--dont-build> is the option for you.
By far the most common case for such an optimization is that you know that
everything not at or below the starting directory is already up-to-date.
This can be communicated to makepp using "B<--dont-build /. --do-build .>".
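
As a minimal sketch of that common case, the invocation might look as follows
when run from the directory you are working in. The C<MAKEPP> variable is an
assumption of this example that turns the command into a dry run; it is not
part of makepp.

```shell
# Dry run by default: print the command line instead of running makepp.
MAKEPP=${MAKEPP:-echo makepp}

# Mark the whole filesystem as already up-to-date, then re-enable building
# for the current directory and everything below it. The more specific
# path (.) overrides the less specific one (/.).
$MAKEPP --dont-build /. --do-build .
```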
=head2 Sandboxing for Concurrent Processes
One technique that can reduce build latency is to have multiple makepp
processes working on the same tree. This is quite a bit more difficult to
manage than using the B<-j> option, but it can also be substantially more
effective because:
=over 2
=item *
With sandboxing, the processes may be running on multiple hosts, for example,
via a job queuing system. Increasing the B<-j> limit eventually exhausts the
CPU resources of a single host, and can even slow the build due to excessive
process forking.
=item *
B<-j> does not currently parallelize some of makepp's time-consuming tasks
such as loading makefiles, scanning, building implicit dependencies while
scanning, and checking dependencies.
=back
The biggest risk with this approach is that the build can become
nondeterministic if processes that might be concurrent interact with one
another. This leads to build systems that produce incorrect results
sporadically, and with no simple mechanism to determine why it happens.
To address this risk, it is advisable to partition the tree among concurrent
processes such that if any process accesses the filesystem improperly, then an
error is deterministically raised immediately. Normally, this is accomplished
by assigning to each concurrent process a "sandbox" in which it is allowed to
write, where the sandboxes of no two concurrent processes may overlap.
In addition, each process marks the sandboxes of any other possibly concurrent
processes as "dont-read." If a process reads a file that another concurrent
process is responsible for writing (and which therefore might not yet be
written), then an error is raised immediately.
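
One way to derive such option lists is sketched below, assuming a hypothetical
partition of the tree into the directories F<liba>, F<libb> and F<libc>, with
one makepp process per directory. The helper name C<sandbox_opts> is invented
for this example. The script only prints the command lines; the targets for
each process and the submission to a job queue are left out.

```shell
#!/bin/sh
# Hypothetical partition: one concurrent makepp process per directory.
parts="liba libb libc"

# Print the sandboxing options for the process that owns directory $1:
# it may write only inside $1 and must not read the other partitions.
sandbox_opts() {
    own=$1
    opts="--sandbox $own"
    for other in $parts; do
        [ "$other" = "$own" ] || opts="$opts --dont-read $other"
    done
    printf '%s\n' "$opts"
}

# Dry run: show the command line for each concurrent process.
for p in $parts; do
    echo "makepp $(sandbox_opts "$p")"
done
```

Because each command line contains a B<--sandbox> option, the default for the
rest of the tree becomes "out-of-sandbox", so a write outside the owned
directory is reported as a violation.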
=head2 Sandboxing for Sequential Processes
When the build is partitioned for concurrent makepp processes, there is also
usually a sequential relationship between various pairs of processes. For
example, there may be a dozen concurrent compile processes, followed by a
single link process that cannot begin until all of the compile processes have
completed. Such sequential relationships must be enforced by whatever
mechanism is orchestrating the various makepp processes (for example, the job
queuing system).
When processes have a known sequential relationship, there is normally no need
to raise an error when they access the same part of the tree, because the
result is nonetheless deterministic.
However, it is generally beneficial to specify B<--dont-build> options to the
dependent process (the link process in our example) that notify it of the
areas that have already been updated by the prerequisite processes (the
compile jobs in our example). In this manner, we avoid most of the
unnecessary work of null-building targets that were just updated.
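
For the compile-then-link example, the link step might be invoked roughly as
follows. This is only a sketch: the directories F<liba>, F<libb> and F<libc>
and the target F<bin/app> are hypothetical, and the C<MAKEPP> variable is a
dry-run convention of this example rather than a makepp feature.

```shell
# Dry run by default: print the command line instead of running makepp.
MAKEPP=${MAKEPP:-echo makepp}

# The compile processes have already finished updating these directories,
# so the link process is told to treat them as up-to-date.
$MAKEPP --dont-build liba --dont-build libb --dont-build libc bin/app
```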
=head1 AUTHOR
Anders Johnson (anders@ieee.org)