The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Apocalypse_20 - Debugging [DRAFT]

VERSION

        Maintainer:             pugs committers
        Date:                   10 Apr 2005
        Last modified:  10 Apr 2005

This document proposes how debugging and AST introspection hooks might look in perl 6, in the context of MMD and macros.

PURPOSE

An attempt at drafting a callback API that should allow useful and simple implementation of:

  • The perl 6 debugger.

  • Infectious traits, enabling the implementation of taint mode and more complex variations on the theme.

EXAMPLE

With taint mode generalized into a debugging system, this perl statement will yield:

        $x = ($a ?? $b !! $c); # assuming $a is true
        ----------Xs----------
        ----------Xe---------
             --------E-------
              A-    B-    C- 

Assuming:

        Xs is the whole statement
        Xe is the single expression in Xs, it is apply(=, $x, ...);
        E is the ( ?? !! ) expression, it is apply(??!!, A, B, C);
        A, B, C are expressions for the vars $a, $b, $c ($x expr omitted for brevity)

The following callbacks:

                eval(Xs);
                        eval(Xe);
                                eval(E);
                                        apply(??!!, A, B, C);
                                                eval(A);
                                                # evaluated
                                                participated(A, $a); # $a is a value, perhaps is copy
                                                participated(E, $a);
                                                reduced(A, $a is copy); # also a value, even more so
                                                eval(B); # because A reduced to a true value
                                                participated(E, $a, $b);
                                                participated(B, $b);
                                                participated(E, $b);
                                                reduced(B, $b);
                                reduced(E, $b);
                                participated(Xe, $b); # by the reduction, we now know that $b interacts in Xe
                                participated(E, $x, $a, $b); # the order is from largest permutation to smallest
                                participated(E, $x, $a);
                                participated(E, $x, $b);
                                participated(Xe, $x, $b);
                                participated(E, $x);
                                participated(Xe, $x);
                                apply(=, $x, $b); # the same $b that E was reduced to
                        participated(Xe, $x, $b);
                        reduced(Xe, $x); # but really $b by now
                participated(Xs, $a, $b, $x); # a sort of catchall summary of the permuitations called therein

Participation is noted for every combination of values in an expression, as soon as it can be determined.

        participated(A, $a) is called once, as soon as possible. By that time we know that
        participated(E, $a) should also be called, so we do that
        participated(B, $b) there after. We now know that
        participated(E, $a, $b) should also happen, so we do that too

Note that the reduced values are also participating:

        substr($x, 3, 5);

        ...
                reduced(X, $substr); # $substr is a new value containing the retun value
                participated(X, $substr);
                participated(X, $substr, $x);
        ...

And literals are too. participated is called on 3 and 5 in that example as well.

OVERHEAD VS FEASABILITY

You use interaction between interaction whoring values and variable container values, and then apply that to expressions at the AST level, and then you can figure out from there the subset of suspected variables/values who might care about being told they participated in the expr.

This is just like pre-dispatching MMD at the compiler level.

Basically, you use the system to determine where it will be necessary. Isn't that cool?

If it's unused it should be zero overhead.

(Note: If I understand the above correctly, this hook system is exponential in the size of the expression. Yikes. -luqui)

THEORETICAL USAGE

DEBUGGING TRAPS

        multi sub trap_eval (Statement $x where { $x == $trapped }) {
                ...
        }

(i don't know how trap_eval will be named)

THE TAINTED TRAIT

A tainted trait should work by trapping

        participated(_, $x is tainted, $y isn't tainted);

and then setting tainted on $y. This requires that the MMD dispatch doesn't care if

        participated(X, $a, $b is tainted);

becomes

        participated(X, $b, $a);

Similarly, file handles and system calls are supplemented with more specific MMD dispatches, whose prototype contains 'does tainted', to disallow tainted content, or wrapped to mark taintedness.

Untainting is either explicit, by use of functions which just remove the trait, or implicitly:

an trap apply of the match operator (or perhaps 'reduced' on an expression containting the application of a rule match... We'll see how the macro AST for perl expressions will look) will untaint the captures inside the rule match object, after they were infected by being derived.

You could also define expressions which taint using macros:

        taints {
                ...
        }

the macro will take the AST for the block it encompasses, and taint everything that goes on inside the block.

DATA SEGREGATION

        my $user1_dbh does segregated :parent($user1);

        my $user2_dbh does segregated :parent($user2);

data coming from $user1_dbh will be segregated too, belonging to the object $user1.

when data from $user1 and $user2 meets, and exception is thrown.

If i'm implementing an IRC channel, for example, all input from a user's socket will be segregated.

Non commands will be unsegregated.

In this case a fatal error is thrown if due to a bug in the server the user's password as input by a command will be sent to a chatroom, which will try to make that data interact with other users sockets.

CODE COVERAGE METRICS

Just make a trap on statements, and make the callback record.

This model allows rather deep introspection, useful for detecting dead code.

PROFILING (OR NOT)

Just like you could do coverage analysis for code, you could profile it, if you can calculate the overhead of the hook MMDs out of the way. Perhaps counters on the ast also knowing time are in order. This might require parrotting though, but could definately be wrangled by these callbacks later.

We'll see.