NAME

Test::Weaken - Test that freed memory objects were, indeed, freed

SYNOPSIS

    use Test::Weaken qw(leaks);
    use Data::Dumper;
    use Math::BigInt;
    use Math::BigFloat;

    my $good_test = sub {
        my $obj1 = Math::BigInt->new('42');
        my $obj2 = Math::BigFloat->new('7.11');
        return [ $obj1, $obj2 ];
    };
    if ( !leaks($good_test) ) {
        print "No leaks in test 1\n";
    } else {
        print "There were memory leaks from test 1!\n";
    }

    my $bad_test = sub {
        my $arrayref = [ 42, 711 ];
        push @{$arrayref}, $arrayref;  # circular reference
        return $arrayref;
    };
    my $bad_destructor = sub {'I am useless'};
    my $tester = Test::Weaken::leaks(
        {   constructor => $bad_test,
            destructor  => $bad_destructor,
        }
    );
    if ($tester) {
        printf "Test 2: %d of %d original references were not freed\n",
            $tester->unfreed_count(), $tester->probe_count();

        my $unfreed_proberefs = $tester->unfreed_proberefs();
        print "These are the probe references to the unfreed objects:\n";
        for my $ix ( 0 .. $#{$unfreed_proberefs} ) {
            print Data::Dumper->Dump( [ $unfreed_proberefs->[$ix] ],
                ["unfreed_$ix"] );
        }
    }

DESCRIPTION

A memory leak occurs when a Perl data structure is destroyed but some of the contents of that structure are not freed. Leaked memory is a useless overhead. Leaks can significantly impact system performance. They can also cause an application to abend due to lack of memory.

In Perl, circular references are a common cause of memory leaks. Circular references are allowed in Perl, but data structures containing circular references will leak memory unless the programmer takes specific measures to prevent leaks. Preventive measures include weakening the references and arranging to break the reference cycle just before the structure is destroyed.
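The effect of weakening can be seen without Test::Weaken at all. The following is a minimal sketch using only the core Scalar::Util module; the Node class and its destruction counter are hypothetical names introduced for illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Hypothetical class that counts how many of its instances are destroyed.
package Node;
our $destroyed = 0;
sub new     { my ($class) = @_; return bless {}, $class }
sub DESTROY { $destroyed++; return }

package main;

# Strong cycle: parent and child hold each other, so neither is freed
# when the lexicals go out of scope.
{
    my $parent = Node->new;
    my $child  = Node->new;
    $parent->{child} = $child;
    $child->{parent} = $parent;
}
my $freed_with_strong_cycle = $Node::destroyed;

# Same shape, but the back-reference is weakened, so the cycle no longer
# keeps the pair alive.
{
    my $parent = Node->new;
    my $child  = Node->new;
    $parent->{child} = $child;
    $child->{parent} = $parent;
    weaken( $child->{parent} );
}
my $freed_with_weak_backref = $Node::destroyed - $freed_with_strong_cycle;

print "freed with strong cycle:        $freed_with_strong_cycle\n";
print "freed with weak back-reference: $freed_with_weak_backref\n";
```

The first block frees nothing, while the second frees both nodes.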

When using circular references, it is easy to misdesign or misimplement a scheme for preventing memory leaks. Mistakes of this kind have been hard to detect in a test suite.

Test::Weaken allows easy detection of unfreed Perl data. Test::Weaken allows you to examine the unfreed data, even data that would usually have been made inaccessible.

Test::Weaken frees the test structure, then looks to see if any of the contents of the structure were not actually deallocated. By default, Test::Weaken determines the contents of a data structure by examining arrays and hashes, by following references, and by following tied variables to their underlying object. Test::Weaken does this recursively to unlimited depth.

Test::Weaken can deal with circular references without going into infinite loops. Test::Weaken will not visit the same Perl data object twice.

Data Objects, Blessed Objects and Structures

Object is a heavily overloaded term in the Perl world. This document will use the term Perl data object or data object to refer to any referenceable Perl datum, including scalars, arrays, hashes, references themselves, and code objects. The full list of types of referenceable Perl data objects is given in the description of the ref builtin in the Perl documentation. An object that has been blessed using the Perl bless builtin will be called a blessed object.

In this document, a Perl data structure (often just called a structure) is any group of Perl objects that are co-mortal. Co-mortal means that the maintainer expects those objects to be destroyed at the same time. For example, if a group of Perl objects is referenced, directly or indirectly, through a hash, and is referenced only through that hash, a programmer will usually expect all of those objects to be destroyed when the hash is.

Perl data structures can be any set of Perl data objects. Since the question is one of expected lifetime, whether an object is part of a data structure is, in the last analysis, subjective.

The Contents of a Data Structure

A data structure must have one object that is designated as its top object. In most data structures, it is obvious which data object should be designated as the top object. The objects in the data structure, including the top object, are the contents of that data structure.

Test::Weaken gets its test data structure, or test structure, from a closure. The closure should return a reference to the test structure. This reference is called the test structure reference.

Children and Descendants

The elements of an array are children of the array. The values of a hash are children of the hash. A referent is a child of its reference. The underlying object of a tied variable is a child of the tied variable.

The descendants of a Perl data object are itself, its children, and any children of one of its descendants. By default, Test::Weaken determines the contents of a data structure by recursing through the descendants of the top object of the test data structure.

If one data object is the descendant of a second object, then the second data object is an ancestor of the first object. A data object is considered to be a descendant of itself, and also to be one of its own ancestors.
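The definitions of children and descendants can be sketched as a small traversal. This is an illustration of the idea only, not Test::Weaken's actual implementation: the descendants function is a hypothetical helper, and it ignores tied variables.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr reftype);

# Hypothetical helper: given a probe (a reference to a data object), return
# probes to that object and all of its descendants, visiting each data
# object only once.
sub descendants {
    my ($top_probe) = @_;
    my %seen;
    my @found;
    my @pending = ($top_probe);
    while ( defined( my $probe = pop @pending ) ) {
        next if $seen{ refaddr($probe) }++;    # already visited
        push @found, $probe;
        my $type = reftype($probe);
        if ( $type eq 'ARRAY' ) {
            push @pending, map { \$_ } @{$probe};           # elements
        }
        elsif ( $type eq 'HASH' ) {
            push @pending, map { \$_ } values %{$probe};    # values
        }
        elsif ( $type eq 'REF' ) {
            push @pending, ${$probe};                       # referent
        }
    }
    return @found;
}

# A circular structure: the third element refers back to the array.
my $cycle = [ 42, 711 ];
push @{$cycle}, $cycle;

my @probes = descendants( \$cycle );
printf "%d data objects found\n", scalar @probes;
```

The %seen hash, keyed by referent address, is what keeps the traversal from looping on the circular reference.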

Test::Weaken's default assumption is that the contents of a data structure are the same as its descendants. This works for many cases, but not for all. Ways to deal with descendants that are not contents, such as globals, are dealt with in the section on persistent objects. Ways to deal with contents that are not descendants, such as inside-out objects, are dealt with in the section on nieces.

Builtin Types

This document will refer to the builtin type of objects. Perl's builtin types are the types Perl originally gives objects, as opposed to blessed types, the types assigned objects by the bless function. The builtin types are listed in the description of the ref builtin in the Perl documentation.

Perl's ref function returns the blessed type of its argument, if the argument has been blessed into a package. Otherwise the ref function returns the builtin type. The "reftype function" in Scalar::Util always returns the builtin type, even for blessed objects.
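The difference between ref and reftype is easy to check directly. A short sketch follows; My::Class is a hypothetical package name used only for illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype);

my $plain   = { answer => 42 };
my $blessed = bless { answer => 42 }, 'My::Class';    # hypothetical class

my $plain_ref       = ref($plain);          # builtin type
my $blessed_ref     = ref($blessed);        # blessed type
my $blessed_reftype = reftype($blessed);    # builtin type, despite the bless
my $ref_to_ref      = reftype( \$plain );   # a reference to a reference

print "plain:   ref=$plain_ref\n";
print "blessed: ref=$blessed_ref reftype=$blessed_reftype\n";
print "reference to a reference: reftype=$ref_to_ref\n";
```

Here ref reports HASH for the plain hash and My::Class for the blessed one, while reftype reports HASH for the blessed hash and REF for a reference to a reference.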

Persistent Objects

As a practical matter, a descendant that is not part of the contents of a test structure is only a problem if its lifetime extends beyond that of the test structure. A descendant that is expected to stay around after the test structure is destroyed is called a persistent object.

A persistent object is not a memory leak. That's the problem. Test::Weaken is trying to find memory leaks and it looks for data objects that remain after the test structure is freed. But a persistent object is not expected to disappear when the test structure goes away.

We need to separate the unfreed data objects which are memory leaks, from those which are persistent data objects. It's usually easiest to do this after the test by examining the return value of "unfreed_proberefs". The "ignore" named argument can also be used to pass Test::Weaken a closure that separates out persistent data objects "on the fly". These methods are described in detail below.
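Filtering after the test might look like the following sketch. The MyGlobal class and the $REGISTRY variable are hypothetical stand-ins for a persistent object in your own code.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);
use Test::Weaken qw(leaks);

# Hypothetical persistent object: created once, it outlives every test
# structure that refers to it.
our $REGISTRY = bless {}, 'MyGlobal';

my $tester = leaks(
    sub {
        return { registry => $REGISTRY, payload => [ 1, 2, 3 ] };
    }
);

# leaks() returns false when everything was freed, so guard the method call.
my @unfreed = $tester ? @{ $tester->unfreed_proberefs() } : ();

# Discard probes that refer to MyGlobal objects; whatever remains is suspect.
my @suspects = grep { !( blessed($_) && $_->isa('MyGlobal') ) } @unfreed;

printf "%d unfreed, %d left after removing persistent objects\n",
    scalar @unfreed, scalar @suspects;
```

In this sketch the only unfreed data object should be the persistent one, so nothing is left after filtering.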

Nieces

A niece data object (also a niece object or just a niece) is a data object that is part of the contents of a data structure, but that is not a descendant of the top object of that data structure. When the OO technique called "inside-out objects" is used, most of the attributes of the blessed object will be nieces.

In Test::Weaken, usually the easiest way to deal with non-descendant contents is to make the data structure you are trying to test the lab rat in a wrapper structure. In this scheme, your test structure constructor will return a reference to the top object of the wrapper structure, instead of to the top object of the lab rat.

The top object of the wrapper structure will be a wrapper array. The wrapper array will contain the top object of the lab rat, along with other objects. The other objects need to be chosen so that the contents of the wrapper array are exactly the wrapper array itself, plus the contents of the lab rat.

It is not always easy to find the right objects to put into the wrapper array. For example, determining the contents of the lab rat may require a recursive scan from the lab rat's top object. Depending on the logical structure of the lab rat, this may be far from trivial.

As an alternative to using a wrapper, it is possible to have Test::Weaken add contents "on the fly," while it is scanning the lab rat. This can be done using the contents named argument, which takes a closure as its value.
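The wrapper technique can be sketched as follows. The InsideOut and InsideOut::Leaky classes and the wrapped helper are hypothetical, written only to show a niece being placed in a wrapper array next to the lab rat.

```perl
use strict;
use warnings;
use Scalar::Util ();
use Test::Weaken qw(leaks);

# Hypothetical inside-out class: the attribute lives in a lexical hash keyed
# by object address, so it is a niece, not a descendant, of the object.
package InsideOut;
my %items_of;

sub new {
    my ( $class, @items ) = @_;
    my $self = bless \( my $anon ), $class;
    $items_of{ Scalar::Util::refaddr($self) } = [@items];
    return $self;
}
sub items   { return $items_of{ Scalar::Util::refaddr( $_[0] ) } }
sub DESTROY { delete $items_of{ Scalar::Util::refaddr( $_[0] ) }; return }

# Variant whose destructor forgets to release the attribute.
package InsideOut::Leaky;
our @ISA = ('InsideOut');
sub DESTROY { return }

package main;

# Build a constructor that returns a wrapper array: the lab rat itself,
# plus the niece that Test::Weaken would otherwise never reach.
sub wrapped {
    my ($class) = @_;
    return sub {
        my $lab_rat = $class->new( 1, 2, 3 );
        return [ $lab_rat, $lab_rat->items() ];
    };
}

my $good = leaks( wrapped('InsideOut') );
my $bad  = leaks( wrapped('InsideOut::Leaky') );

print 'InsideOut:        ', ( $good ? 'leaks' : 'no leaks' ), "\n";
print 'InsideOut::Leaky: ', ( $bad  ? 'leaks' : 'no leaks' ), "\n";
```

Without the wrapper, the leaked attribute array of InsideOut::Leaky would go unnoticed, because it is not a descendant of the blessed scalar.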

Why the Test Structure is Passed Via a Closure

Test::Weaken gets its test structure reference indirectly, as the return value from a test structure constructor. Why so roundabout?

Because the indirect way is the easiest. When you create the test structure in Test::Weaken's calling environment, it takes a lot of craft to avoid leaving unintended references to the test structure in that calling environment. It is easy to get this wrong. Those unintended references will create memory leaks that are artifacts of the test environment. Leaks that are artifacts of the test environment are very difficult to sort out from the real thing.

The closure-local strategy is the easiest way to avoid leaving unintended references to the contents of Perl data objects. Using the closure-local strategy means working entirely within a closure, using only data objects local to that closure. Data objects local to a closure will be destroyed when the closure returns, and any references they held will be released. The closure-local strategy makes it relatively easy to be sure that nothing is left behind that will hold an unintended reference to any of the contents of the test structure.

Nothing prevents a user from subverting the closure-local strategy. A test structure constructor can return a reference to a test structure created from Perl data objects in any scope the user desires.
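The difference between the two approaches can be seen in a short sketch. The variable names are illustrative only.

```perl
use strict;
use warnings;
use Test::Weaken qw(leaks);

# Not closure-local: $outer lives in the calling environment and keeps the
# array alive after Test::Weaken drops its own reference.
my $outer    = [ 1, 2, 3 ];
my $artifact = leaks( sub { return $outer } );

# Closure-local: the only lasting reference is the one returned.
my $clean = leaks(
    sub {
        my $local = [ 1, 2, 3 ];
        return $local;
    }
);

print 'outer-scope structure: ', ( $artifact ? 'reported as leaking' : 'ok' ), "\n";
print 'closure-local structure: ', ( $clean ? 'reported as leaking' : 'ok' ), "\n";
```

The first report is an artifact of the test environment: the array is not freed only because $outer still refers to it.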

Returns and Exceptions

The methods of Test::Weaken do not return errors. Errors are always thrown as exceptions.

PORCELAIN METHODS

leaks

    use Test::Weaken;
    use English qw( -no_match_vars );

    my $tester = Test::Weaken::leaks(
        {   constructor => sub { Buggy_Object->new() },
            destructor  => \&destroy_buggy_object,
        }
    );
    if ($tester) {
        print "There are leaks\n" or Carp::croak("Cannot print to STDOUT: $ERRNO");
    }

Returns a Perl false if no unfreed data objects were detected. If unfreed data objects were detected, returns an evaluated Test::Weaken class instance.

Instances of the Test::Weaken class are called testers. An evaluated tester is one on which the tests have been run, and for which results are available.

Users who only want to know if there were unfreed data objects can test the return value of "leaks" for Perl true or false. Arguments to the "leaks" static method are passed as a reference to a hash of named arguments. "leaks" can also be called in a special "short form", where the test structure constructor and test structure destructor are passed directly as code references.

constructor => $coderef

The constructor argument is required. Its value must be a code reference to the test structure constructor. The test structure constructor should return a reference to the test structure. It is best to follow strictly the closure-local strategy, as described above.

    leaks ({ constructor => sub {
                              return Some::Object->new(123);
                            },
          });

When "leaks" is called using the "short form", the code reference to the test structure constructor must be the first argument to "leaks".

    leaks (sub {
             return Some::Object->new(123);
          });

The constructor can also return a list of objects all of which are to be checked.

    leaks (sub {
             return (Foo->new(), Bar->new());
          });

Usually this is when two objects are somehow inter-related so they both should destroy together, or perhaps sub-parts of an object not reached by the contents tracing (though see contents below for a more general way to reach such sub-parts.)

destructor => $coderef

The destructor argument is optional. If specified, its value must be a code reference to the test structure destructor.

Some test structures require a destructor to be called when they are freed. This destructor function is called just before Test::Weaken tries to free the test structure (by setting it to undef). It is called with the object as returned by the constructor:

    leaks ({ constructor => sub { Foo->new },
             destructor  => sub {
                              my ($foo) = @_;
                              $foo->destroy;
                            },
          });

If the constructor returns multiple values then they're all passed to the destructor. The return value of the test structure destructor is ignored.
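As an illustration, here is a sketch of a destructor that breaks a reference cycle by hand. The $make_cycle and $break_cycle closures are hypothetical; a real test would call the cleanup method of the class under test.

```perl
use strict;
use warnings;
use Test::Weaken qw(leaks);

# Constructor: parent and child refer to each other.
my $make_cycle = sub {
    my $parent = {};
    my $child  = { parent => $parent };
    $parent->{child} = $child;
    return $parent;
};

# Destructor: receives what the constructor returned and breaks the cycle.
my $break_cycle = sub {
    my ($parent) = @_;
    delete $parent->{child}{parent};
    return;
};

my $without_destructor = leaks($make_cycle);
my $with_destructor    = leaks(
    {   constructor => $make_cycle,
        destructor  => $break_cycle,
    }
);

print 'without destructor: ', ( $without_destructor ? 'leaks' : 'no leaks' ), "\n";
print 'with destructor:    ', ( $with_destructor    ? 'leaks' : 'no leaks' ), "\n";
```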

When "leaks" is called using the "short form", a code reference to the test structure destructor is the optional, second argument to "leaks".

    leaks (sub { Foo->new },
           sub {
             my ($foo) = @_;
             $foo->destroy;
           });

ignore

    sub ignore_my_global {
        my ($probe) = @_;
        return ( Scalar::Util::blessed($probe) && $probe->isa('MyGlobal') );
    }

    my $tester = Test::Weaken::leaks(
        {   constructor => sub { MyObject->new() },
            ignore      => \&ignore_my_global,
        }
    );

The ignore argument is optional. It can be used to make a decision, specific to each Perl data object, on whether that object is ignored, or tracked and examined for children.

Use of the "ignore" argument should be avoided. Filtering the probe references that are returned by "unfreed_proberefs" is easier, safer and faster. But filtering after the fact is not always practical. For example, if large or complicated sub-objects need to be filtered out, it may be easiest to do so before they end up in the results.

When specified, the value of the "ignore" argument must be a reference to a callback subroutine. If the reference to the callback subroutine is $ignore, Test::Weaken's call to it will be the equivalent of $ignore->($safe_copy), where $safe_copy is a copy of a probe reference to a Perl data object.

The "ignore" callback will be made once for every Perl data object when it is about to be tracked, and once for every data object when it is about to be examined for children. The callback subroutine should return a Perl true value if the probe reference is to a data object which should be ignored. If the data object should be tracked and examined for children, the callback subroutine should return a Perl false.

For safety, Test::Weaken passes the "ignore" callback a copy of the internal probe reference. This prevents the user altering the probe reference itself. However, the data object referred to by the probe reference is not copied. Everything that is referred to, directly or indirectly, by this probe reference should be left unchanged by the "ignore" callback. The result of modifying the probe referents might be an exception, an abend, an infinite loop, or erroneous results.

The example above shows a common use of the "ignore" callback. In this example, a blessed object is ignored, but not the references to it. This is typically what is wanted. Often you know certain objects are outside the contents of your test structure, but you have references to those objects that are part of the contents of your test structure. In that case, you want to know if the references are leaking, but you do not want to see reports when the outside objects themselves are persistent. Compare this with the example for the "contents" callback below.

"ignore" callbacks are best kept simple. Defer as much of the analysis as you can until after the test is completed. "ignore" callbacks can be a significant overhead.

Test::Weaken offers some help in debugging "ignore" callback subroutines. See below.

contents

    sub MyObject::contents {
        my ($probe) = @_;
        return unless Scalar::Util::reftype($probe) eq 'REF';
        my $thing = ${$probe};
        return unless Scalar::Util::blessed($thing);
        return unless $thing->isa('MyObject');
        return ( $thing->data, $thing->moredata );
    } ## end sub MyObject::contents

    my $tester = Test::Weaken::leaks(
        {   constructor => sub { return MyObject->new },
            contents    => \&MyObject::contents
        }
    );

The contents argument is optional. It can be used to tell Test::Weaken about additional Perl data objects that need to be included, along with their children, in order to find all of the contents of the test data structure.

Use of the "contents" argument should be avoided when possible. Instead of using the "contents" argument, it is often possible to have the constructor create a reference to a "wrapper structure", as described above in the section on nieces. The "contents" argument is for situations where the "wrapper structure" technique is not practical. If, for example, creating the wrapper structure would involve a recursive descent through the lab rat object, using the "contents" argument may be easiest.

When specified, the value of the "contents" argument must be a reference to a callback subroutine. If the reference is $contents, Test::Weaken's call to it will be the equivalent of $contents->($safe_copy), where $safe_copy is a copy of the probe reference to a Perl data object. The "contents" callback is made once for every Perl data object when that Perl data object is about to be examined for children. This can impose a significant overhead.

The example of a "contents" callback above adds data objects whenever it encounters a reference to a blessed object. Compare this with the example for the "ignore" callback above. Checking for references to blessed objects will not produce the same behavior as checking for the blessed objects themselves -- there may be many references to a single object.

The callback subroutine will be evaluated in array context. It should return a list of additional Perl data objects to be tracked and examined for children. This list may be empty.

The "contents" and "ignore" callbacks can be used together. If, for an argument Perl data object, the "ignore" callback returns true, the objects returned by the "contents" callback will be used instead of the children for the argument data object. If, for an argument Perl data object, the "ignore" callback returns false, the objects returned by the "contents" callback will be used in addition to the children for the argument data object. Together, the "contents" and "ignore" callbacks can be used to completely customize the way in which Test::Weaken determines the contents of a data structure.

For safety, Test::Weaken passes the "contents" callback a copy of the internal probe reference. This prevents the user altering the probe reference itself. However, the data object referred to by the probe reference is not copied. Everything that is referred to, directly or indirectly, by this probe reference should be left unchanged by the "contents" callback. The result of modifying the probe referents might be an exception, an abend, an infinite loop, or erroneous results.

tracked_types

    my $test = Test::Weaken::leaks(
        {   constructor => sub {
                my $obj = MyObject->new;
                return $obj;
            },
            tracked_types => ['GLOB'],
        }
    );

The tracked_types argument is optional. If specified, the value of the tracked_types argument must be a reference to an array of the names of additional builtin types to track.

Objects of builtin types ARRAY, HASH, REF, SCALAR, VSTRING, and CODE are tracked by default. The builtin types that are not tracked, and which you may wish to add, are GLOB, IO, FORMAT and LVALUE. They are not tracked by default because, for reasons given below, tracking them usually causes more trouble than it saves.

unfreed_proberefs

    use Test::Weaken;
    use Data::Dumper;
    use English qw( -no_match_vars );

    my $tester = Test::Weaken::leaks( sub { Buggy_Object->new() } );
    if ($tester) {
        my $unfreed_proberefs = $tester->unfreed_proberefs();
        my $unfreed_count     = @{$unfreed_proberefs};
        printf "%d of %d references were not freed\n",
            $tester->unfreed_count(), $tester->probe_count()
            or Carp::croak("Cannot print to STDOUT: $ERRNO");
        print "These are the probe references to the unfreed objects:\n"
            or Carp::croak("Cannot print to STDOUT: $ERRNO");
        for my $ix ( 0 .. $#{$unfreed_proberefs} ) {
            print Data::Dumper->Dump( [ $unfreed_proberefs->[$ix] ],
                ["unfreed_$ix"] )
                or Carp::croak("Cannot print to STDOUT: $ERRNO");
        }
    }

Returns a reference to an array of probe references to the unfreed data objects. Throws an exception if there is a problem, for example if the tester has not yet been evaluated.

The return value can be examined to pinpoint the source of a leak. A user may also analyze the return value to produce her own statistics about unfreed data objects.

unfreed_count

    use Test::Weaken;
    use English qw( -no_match_vars );

    my $tester = Test::Weaken::leaks( sub { Buggy_Object->new() } );
    next TEST if not $tester;
    printf "%d objects were not freed\n", $tester->unfreed_count()
        or Carp::croak("Cannot print to STDOUT: $ERRNO");

Returns the count of unfreed data objects. This count will be exactly the length of the array referred to by the return value of the "unfreed_proberefs" method. Throws an exception if there is a problem, for example if the tester has not yet been evaluated.

probe_count

    use Test::Weaken;
    use English qw( -no_match_vars );

    my $tester = Test::Weaken::leaks(
        {   constructor => sub { Buggy_Object->new() },
            destructor  => \&destroy_buggy_object,
        }
    );
    next TEST if not $tester;
    printf "%d of %d objects were not freed\n",
        $tester->unfreed_count(), $tester->probe_count()
        or Carp::croak("Cannot print to STDOUT: $ERRNO");

Returns the total number of probe references in the test, including references to freed data objects. This is the count of probe references after Test::Weaken was finished finding the descendants of the test structure reference, but before Test::Weaken called the test structure destructor or reset the test structure reference to undef. Throws an exception if there is a problem, for example if the tester has not yet been evaluated.

PLUMBING METHODS

Most users can skip this section. The plumbing methods exist to satisfy object-oriented purists, and to accommodate the rare user who wants to access the probe counts even when the test did not find any unfreed data objects.

new

    use Test::Weaken;
    use English qw( -no_match_vars );

    my $tester        = Test::Weaken->new( sub { My_Object->new() } );
    my $unfreed_count = $tester->test();
    my $proberefs     = $tester->unfreed_proberefs();
    printf "%d of %d objects not freed\n",
        $unfreed_count,
        $tester->probe_count()
        or Carp::croak("Cannot print to STDOUT: $ERRNO");

The "new" method takes the same arguments as the "leaks" method, described above. Unlike the "leaks" method, it always returns an unevaluated tester. An unevaluated tester is one on which the test has not yet been run and for which results are not yet available. If there are any problems, the "new" method throws an exception.

The "test" method is the only method that can be called successfully on an unevaluated tester. Calling any other method on an unevaluated tester causes an exception to be thrown.

test

    use Test::Weaken;
    use English qw( -no_match_vars );

    my $tester = Test::Weaken->new(
        {   constructor => sub { My_Object->new() },
            destructor  => \&destroy_my_object,
        }
    );
    printf "There are %s\n", ( $tester->test() ? 'leaks' : 'no leaks' )
        or Carp::croak("Cannot print to STDOUT: $ERRNO");

Converts an unevaluated tester into an evaluated tester. It does this by performing the test specified by the arguments to the "new" constructor and recording the results. Throws an exception if there is a problem, for example if the tester had already been evaluated.

The "test" method returns the count of unfreed data objects. This will be identical to the length of the array returned by "unfreed_proberefs" and the count returned by "unfreed_count".
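The following sketch shows the case the plumbing methods exist for: reading the probe count from a test that found no leaks. With "leaks" the return value would be false and no tester would be available.

```perl
use strict;
use warnings;
use Test::Weaken;

# A structure with no cycles; it is expected to be freed completely.
my $tester  = Test::Weaken->new( sub { return [ 1, 2, 3 ] } );
my $unfreed = $tester->test();

# The tester is evaluated now, so its counts can be read even though
# nothing leaked.
my $probes = $tester->probe_count();
printf "%d probes, %d unfreed\n", $probes, $unfreed;
```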

ADVANCED TECHNIQUES

Tracing Leaks

Avoidance

Test::Weaken makes tracing leaks easier, but avoidance is still by far the best way, and Test::Weaken helps with that. You need to use test-driven development, Test::More, modular tests in a t/ subdirectory, and revision control. These are all very good ideas for many other reasons.

Make Test::Weaken part of your test suite. Test frequently, so that when a leak occurs, you'll have a good idea of what changes were made since the last successful test. Often, examining these changes is enough to tell where the leak was introduced.

Adding Tags

The "unfreed_proberefs" method returns an array containing probes to the unfreed data objects. This can be used to find the source of leaks. If circumstances allow it, you might find it useful to add "tag" elements to arrays and hashes to aid in identifying the source of a leak.
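One way to use tags is sketched below. The tag key and its values are hypothetical; any marker that identifies where a hash or array was created would do.

```perl
use strict;
use warnings;
use Scalar::Util qw(reftype);
use Test::Weaken qw(leaks);

my $tester = leaks(
    sub {
        my $parent = { tag => 'parent node' };
        my $child  = { tag => 'child node', parent => $parent };
        $parent->{child} = $child;    # cycle that is never broken
        return $parent;
    }
);

my @unfreed = $tester ? @{ $tester->unfreed_proberefs() } : ();

# Keep only probes to hashes that carry a tag, and collect the tags.
my @tags = sort map { $_->{tag} }
    grep { reftype($_) eq 'HASH' && exists $_->{tag} } @unfreed;

print "unfreed: $_\n" for @tags;
```

The tags make it clear which hashes survived, without having to read a full dump of every unfreed data object.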

Using Referent Addresses

You can quasi-uniquely identify data objects using the referent addresses of the probe references. A referent address can be determined by using "refaddr" in Scalar::Util. You can also obtain the referent address of a reference by adding 0 to the reference.

Note that in other Perl documentation, the term "reference address" is often used when a referent address is meant. Any given reference has both a reference address and a referent address. The reference address is the reference's own location in memory. The referent address is the address of the Perl data object to which the reference refers. It is the referent address that interests us here and, happily, it is the referent address that both zero addition and refaddr return.
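The distinction can be checked with a few lines of core Perl. The variable names below are illustrative.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my $data  = [ 1, 2, 3 ];
my $ref_a = $data;
my $ref_b = $data;

# Both references share one referent, so the referent addresses match.
my $same_referent = refaddr($ref_a) == refaddr($ref_b);

# Adding 0 to a reference to an unblessed object gives the same number.
my $zero_addition = ( $ref_a + 0 ) == refaddr($ref_a);

# The references themselves are different scalars at different addresses.
my $distinct_references = refaddr( \$ref_a ) != refaddr( \$ref_b );

printf "referent address: 0x%x\n", refaddr($ref_a);
```

Zero addition is used here on a plain array reference; for a blessed object whose class overloads numeric conversion, refaddr is the safer choice.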

Other Techniques

Sometimes, when you are interested in why an object is not being freed, you want to seek out the reference that keeps the object's refcount above 0. Devel::FindRef can be useful for this.

More About Quasi-Unique Addresses

I call referent addresses "quasi-unique", because they are only unique at a specific point in time. Once an object is freed, its address can be reused. Absent other evidence, a data object with a given referent address is not 100% certain to be the same data object as the object that had the same address earlier. This can bite you if you're not careful.

To be sure an earlier data object and a later object with the same address are actually the same object, you need to know that the earlier object will be persistent, or to compare the two objects. If you want to be really pedantic, even an exact match from a comparison doesn't settle the issue. It is possible that two indiscernible (that is, completely identical) objects with the same referent address are different in the following sense: the first data object might have been destroyed and a second, identical, object created at the same address. But for most practical programming purposes, two indiscernible data objects can be regarded as the same object.

Debugging Ignore Subroutines

check_ignore

    $tester = Test::Weaken::leaks(
        {   constructor => sub { MyObject->new() },
            ignore => Test::Weaken::check_ignore( \&ignore_my_global ),
        }
    );
    $tester = Test::Weaken::leaks(
        {   constructor => sub { DeepObject->new() },
            ignore      => Test::Weaken::check_ignore(
                \&cause_deep_problem, 99, 0, $reporting_depth
            ),
        }
    );

It can be hard to determine if "ignore" callback subroutines are inadvertently modifying the test structure. The Test::Weaken::check_ignore static method is provided to make this task easier. Test::Weaken::check_ignore constructs a debugging wrapper from four arguments, three of which are optional. The first argument must be the ignore callback that you are trying to debug. This callback is called the test subject, or lab rat.

The second, optional, argument is the maximum error count. Below this count, errors are reported as warnings using Carp::carp. When the maximum error count is reached, an exception is thrown using Carp::croak. The maximum error count, if defined, must be a number greater than or equal to 0. By default the maximum error count is 1, which means that the first error will be thrown as an exception.

If the maximum error count is 0, all errors will be reported as warnings and no exception will ever be thrown. Infinite loops are a common behavior of buggy lab rats, and setting the maximum error count to 0 will usually not be something you want to do.

The third, optional, argument is the compare depth. It is the depth to which the probe referents will be checked, as described below. It must be a number greater than or equal to 0. If the compare depth is 0, the probe referent is checked to unlimited depth. By default the compare depth is 0.

The fourth, optional, argument is the reporting depth. It is the depth to which the probe referents are dumped in check_ignore's error messages. It must be a number greater than or equal to -1. If the reporting depth is 0, the object is dumped to unlimited depth. If the reporting depth is -1, there is no dump in the error message. By default, the reporting depth is -1.

Test::Weaken::check_ignore returns a reference to the wrapper callback. If no problems are detected, the wrapper callback behaves exactly like the lab rat callback, except that the wrapper is slower.

To discover when and if the lab rat callback is altering its arguments, Test::Weaken::check_ignore compares the test structure before the lab rat is called, to the test structure after the lab rat returns. Test::Weaken::check_ignore compares the before and after test structures in two ways. First, it dumps the contents of each test structure using Data::Dumper. For comparison purposes, the dump using Data::Dumper is performed with Maxdepth set to the compare depth as described above. Second, if the immediate probe referent has builtin type REF, Test::Weaken::check_ignore determines whether the immediate probe referent is a weak reference or a strong one.

If either comparison shows a difference, the wrapper treats it as a problem, and produces an error message. This error message is either a Carp::carp warning or a Carp::croak exception, depending on the number of error messages already reported and the setting of the maximum error count. If the reporting depth is a non-negative number, the error message includes a dump from Data::Dumper of the test structure. Data::Dumper's Maxdepth for reporting purposes is the reporting depth as described above.

A user who wants other features, such as deep checking of the test structure for strengthened references, can easily copy Test::Weaken::check_ignore from the Test::Weaken source and hack it up. check_ignore is a static method that does not use any Test::Weaken package resources. The hacked version can reside anywhere, and does not need to be part of the Test::Weaken package.

XSUB Mortalizing

When a C language XSUB returns a newly created scalar, it should "mortalize" that scalar so that it is freed once the caller has finished with it (see "Reference Counts and Mortality" in perlguts). Failing to do so leaks memory.

    SV *ret = newSViv(123);
    sv_2mortal (ret);   /* must mortalize */
    XPUSHs (ret);

Test::Weaken can check this by taking a reference to the returned scalar,

    my $leaks = leaks (sub {
                         return \( somexsub() );
                       });
    if ($leaks) ...

Don't store the return value in a local scalar and then return a reference to that scalar. Doing so will only check the local scalar, not the one returned by somexsub().

If you want the value for further calculations, take a reference to the return value first, then dereference it to get the value.

    leaks (sub {
             my $ref = \( somexsub() );
             my $value = $$ref;
             # ... do something with $value
             return $ref;
           });

If an XSUB returns a list of values then take a reference to each as follows. This works because map and for make the loop variable (either $_ or named) an alias to each value successively.

    leaks (sub {
             return [ map {\$_} somexsub() ];
           });

    # or with a for loop
    leaks (sub {
             my @refs;
             foreach my $value (somexsub()) {
               push @refs, \$value;
             }
             return \@refs;
           });

Don't store a returned list in an array (either named or anonymous), as this copies the values into new scalars in that array, and the scalars returned from somexsub() then aren't checked.
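The difference between aliasing and copying can be seen in pure Perl. The sketch below uses an ordinary array, @originals, as a stand-in for the list an XSUB would return; that substitution is an assumption made to keep the example self-contained.

```perl
use strict;
use warnings;

my @originals = ( 10, 20, 30 );    # stand-in for the values from an XSUB

# map aliases $_ to each element, so \$_ refers to the original scalar.
my @map_refs = map { \$_ } @originals;

# A foreach loop variable is an alias in the same way.
my @for_refs;
foreach my $value (@originals) {
    push @for_refs, \$value;
}

# Assigning to an array copies each value into a new scalar.
my @copies = @originals;

# Comparing references numerically compares addresses.
my $map_is_alias  = ( $map_refs[0] == \$originals[0] ) ? 1 : 0;
my $for_is_alias  = ( $for_refs[0] == \$originals[0] ) ? 1 : 0;
my $copy_is_alias = ( \$copies[0] == \$originals[0] )  ? 1 : 0;

print "map: $map_is_alias, for: $for_is_alias, copy: $copy_is_alias\n";
```

Only the references taken through map and foreach point at the original scalars; the reference into @copies points at a new scalar.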

If you want the values from a list for extra calculations, take the references first and dereference them to get the values, as in the single-value case above. For example,

    leaks (sub {
             my @refs = map {\$_} somexsub();
             my $first_ref = $refs[0];
             my $value = $$first_ref;
             # ... do something with $value
             return \@refs;
           });

An XSUB might deliberately return the same scalar each time, perhaps a pre-calculated constant or a global variable it maintains. In that case the scalar intentionally won't weaken away and this leaks() checking is not applicable.

Returning the same scalar every time also occurs in pure Perl, from an anonymous constant sub such as those created by the constant module (see constant). This is unlikely to arise directly, but could be encountered through a scalar ref within an object, etc.

    # FOO() returns same scalar every time
    *FOO = sub () { 123 };

    # likewise from the constant module
    use constant BAR => 456;

There's no way to tell the intended lifespan of an XSUB return, but generally if the code has any sort of newSV() or sv_newmortal() etc making a new scalar every time then it ought to weaken away.

The details of an XSUB return are often hidden in a typemap file for brevity and consistency (see "The Typemap" in perlxs). The supplied types (ExtUtils/typemap) are hard to get wrong, but code with explicit PUSHs() etc is worth checking. Generally, too much mortalizing causes negative refcounts and probable segfaults, and not enough mortalizing leaks memory.

EXPORTS

By default, Test::Weaken exports nothing. Optionally, "leaks" may be exported.

IMPLEMENTATION DETAILS

Overview

Test::Weaken first recurses through the test structure. Starting from the test structure reference, it examines data objects for children recursively, until it has found the complete contents of the test structure. The test structure is explored to unlimited depth. For each tracked Perl data object, a probe reference is created. Tracked data objects are recorded. In the recursion, no object is visited twice, and infinite loops will not occur, even in the presence of cycles.

Once recursion through the test structure is complete, the probe references are weakened. This prevents the probe references from interfering with the normal deallocation of memory. Next, the test structure destructor is called, if there is one.

Finally, the test structure reference is set to undef. This should trigger the deallocation of the entire contents of the test structure. To check that this happened, Test::Weaken dereferences the probe references. If the referent of a probe reference was deallocated, the value of that probe reference will be undef. If a probe reference is still defined at this point, it refers to an unfreed Perl data object.
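The probe-and-weaken sequence can be sketched in a few lines. This is a much-reduced illustration, not the module's implementation: find_probes and count_unfreed are hypothetical names, only ARRAY, HASH and REF referents are followed, and only objects reached through references are probed.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken reftype refaddr);

# Hypothetical helper: collect one reference per object, visiting each once.
sub find_probes {
    my ($root) = @_;
    my %seen;
    my @probes;
    my @queue = ($root);
    while ( defined( my $ref = shift @queue ) ) {
        next if $seen{ refaddr($ref) }++;    # no object is visited twice
        push @probes, $ref;
        my $type = reftype($ref);
        my @children =
              $type eq 'ARRAY' ? @{$ref}
            : $type eq 'HASH'  ? values %{$ref}
            : $type eq 'REF'   ? ( ${$ref} )
            :                    ();
        push @queue, grep { ref $_ } @children;
    }
    return \@probes;
}

# Hypothetical driver: build, probe, weaken, release, then count survivors.
sub count_unfreed {
    my ($constructor) = @_;
    my $test_structure = $constructor->();
    my $probes         = find_probes($test_structure);
    weaken($_) for @{$probes};    # probes must not keep objects alive
    undef $test_structure;        # should free the whole structure
    return scalar grep { defined $_ } @{$probes};
}

my $clean_unfreed = count_unfreed( sub { return [ [1], { a => [2] } ] } );
my $cycle_unfreed = count_unfreed(
    sub {
        my $cycle = [42];
        push @{$cycle}, $cycle;    # circular reference
        return $cycle;
    }
);
print "clean: $clean_unfreed unfreed, cycle: $cycle_unfreed unfreed\n";
```

In the cyclic case the array holds a reference to itself, so its probe reference is still defined after the test structure reference is dropped.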

Tracked Objects

By default, objects of builtin types ARRAY, HASH, REF, SCALAR, VSTRING, and CODE are tracked. By default, GLOB, IO, FORMAT and LVALUE objects are not tracked.

Data::Dumper does not deal with IO and LVALUE objects gracefully, issuing a cryptic warning whenever it encounters them. Since Data::Dumper is a Perl core module in extremely wide use, this suggests that these IO and LVALUE objects are, to put it mildly, not commonly encountered as the contents of data structures.

GLOB objects usually either refer to an entry in the Perl symbol table, or are associated with a filehandle. Either way, the assumption they will share the lifetime of their parent data object is thrown into doubt. The trouble saved by ignoring GLOB objects seems to outweigh any advantage that would come from tracking them. IO objects, which are ignored because of Data::Dumper issues, are often associated with GLOB objects.

FORMAT objects are always global, and therefore can be expected to be persistent. Use of FORMAT objects is officially deprecated. Data::Dumper does not deal with FORMAT objects gracefully, issuing a cryptic warning whenever it encounters one.

This version of Test::Weaken might someday be run in a future version of Perl and encounter builtin types it does not know about. By default, those new builtin types will not be tracked. Any builtin type may be added to the list of builtin types to be tracked with the tracked_types named argument.
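For readers unfamiliar with the builtin type names, the sketch below shows how several of them can be observed with Scalar::Util::reftype. It assumes that the type names used in this section correspond to what reftype reports; the hash %builtin_type is purely illustrative.

```perl
use strict;
use warnings;
use Scalar::Util qw(reftype);

my $string = 'abc';

# One sample referent for each builtin type shown here.
my %builtin_type = (
    array   => reftype( [] ),
    hash    => reftype( {} ),
    ref     => reftype( \\1 ),
    scalar  => reftype( \1 ),
    vstring => reftype( \v1.2.3 ),
    code    => reftype( sub { } ),
    glob    => reftype( \*STDOUT ),
    io      => reftype( *STDOUT{IO} ),
    lvalue  => reftype( \substr( $string, 0, 1 ) ),
);

for my $name ( sort keys %builtin_type ) {
    print "$name: $builtin_type{$name}\n";
}
```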

Examining Objects for Children

Objects of builtin type ARRAY, HASH, REF, SCALAR, VSTRING, GLOB, and LVALUE are examined for children. Specifically, elements of ARRAY objects, values of HASH objects, and referents of REF objects are children. Underlying tied variables are also children.

Objects of type CODE are not examined for children. Not examining CODE objects for children can be seen as a limitation, because closures do hold internal references to data objects. Future versions of Test::Weaken may examine CODE objects.
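The following sketch shows why this matters. A closure keeps a captured data object alive even after the variable that named it has gone out of scope. The variable names are illustrative, and a weak reference is used only to observe the array's lifetime.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my $probe;      # weak reference used to watch the array
my $closure;
{
    my $data = [ 1, 2, 3 ];
    $probe = $data;
    weaken($probe);
    $closure = sub { return scalar @{$data} };    # captures $data
}

# $data is out of scope, but the closure still holds it.
my $alive_with_closure = defined $probe ? 1 : 0;

undef $closure;    # releasing the closure releases the captured array

my $alive_after_closure = defined $probe ? 1 : 0;

print "with closure: $alive_with_closure, after: $alive_after_closure\n";
```

Because CODE objects are not examined, a data object reachable only through a closure like this one would not be found as part of the test structure.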

A variable of builtin type GLOB may be a scalar which was assigned a GLOB value (a scalar-GLOB) or it may simply be a GLOB (a pure-GLOB). The issue that arises from Test::Weaken's standpoint is that, in the case of a scalar-GLOB, the scalar and the GLOB may be tied separately. At present, the underlying tied variable of the scalar side of a scalar-GLOB is ignored. Only the underlying tied variable of the GLOB is a child for Test::Weaken's purposes.

The default method of recursing through a test structure to find its contents can be customized. The "ignore" callback can be used to force an object not to be examined for children. The "contents" callback can be used to add user-determined contents to the test structure.

AUTHOR

Jeffrey Kegler

BUGS

Please report any bugs or feature requests to bug-test-weaken at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Test-Weaken. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc Test::Weaken

SEE ALSO

Test::Weaken at this point is robust and has seen extensive use. Its tracking of memory is careful enough that it has even stumbled upon a bug in perl itself.

Test::Weaken::Gtk2 is a CPAN module of "helper" functions for Test::Weaken. Test::Weaken::Gtk2 is specifically aimed at the needs of users of Gtk2, but can also be used as an example of how an expert user extends and adapts Test::Weaken. Kevin Ryde, the author of Test::Weaken::Gtk2, has been an important contributor to Test::Weaken.

ACKNOWLEDGEMENTS

Thanks to jettero, Juerd, morgon and perrin of Perlmonks for their advice. Thanks to Lincoln Stein (developer of Devel::Cycle) for test cases and other ideas. Kevin Ryde made many important suggestions and provided the test cases which were the impetus for versions 2.000000 and later. For version 3.000000, Kevin also provided patches.

LICENSE AND COPYRIGHT

Copyright 2012 Jeffrey Kegler, all rights reserved.

Copyright 2012 Kevin Ryde

This program is free software; you can redistribute it and/or modify it under the same terms as Perl 5.10.