- work on test suite.
- localize the depth to guard against non-local exits.
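
A minimal standalone C sketch of the idea behind the depth item above,
not DProf's actual code: the depth counter is snapshotted before a call
and restored if control comes back through a non-local exit. Here
setjmp/longjmp stands in for Perl's die/eval unwinding, and the names
depth, unwind_target, leaf and call_guarded are all hypothetical. In
Perl code the same effect is what `local` gives for a package variable.

```c
#include <assert.h>
#include <setjmp.h>

int depth = 0;                 /* hypothetical profiler call depth */
static jmp_buf unwind_target;  /* where a non-local exit lands */

/* Simulated profiled sub: bumps the depth, then either returns
 * normally or leaves through longjmp without undoing the bump. */
static void leaf(int do_throw)
{
    depth++;
    if (do_throw)
        longjmp(unwind_target, 1);
    depth--;
}

/* Calls leaf() with the depth "localized": whatever happens inside,
 * depth has its entry value again when this function returns.
 * Returns 1 if the call ended in a non-local exit, 0 otherwise. */
int call_guarded(int do_throw)
{
    volatile int saved = depth;   /* snapshot taken before the call */

    if (setjmp(unwind_target) != 0) {
        depth = saved;            /* non-local exit: restore snapshot */
        return 1;
    }
    depth++;
    leaf(do_throw);
    depth--;
    assert(depth == saved);       /* normal return is already balanced */
    return 0;
}
```

Without the restore in the setjmp branch, every non-local exit would
leave depth two levels too high in this sketch.
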
Current overhead (with PERLDBf_NONAME) relative to a non-debugging run (estimates):
	 8% extra call frame on DB::sub
	 7% output of subroutine data
	70% output of timing data (on OS/2, 35% with custom dprof_times())
(An additional 17% is spent writing the output, but it is counted
 and subtracted.)

With compensation for DProf overhead, all but an odd 12% is subtracted?!

- Calculate overhead/count for XS calls and Perl calls separately.
- goto &XSUB in pp_ctl.c.
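
One possible shape for the separate overhead/count bookkeeping in the
XS/Perl item above, as a standalone C sketch. The type and function
names (call_kind, overhead_stats, overhead_record, overhead_per_call)
are hypothetical and are not taken from DProf; the unit of "ticks" is
left to whatever clock the profiler uses.

```c
#include <assert.h>

/* Hypothetical call kinds; not DProf's actual data structures. */
enum call_kind { CALL_PERL = 0, CALL_XS = 1, CALL_KINDS = 2 };

struct overhead_stats {
    unsigned long calls[CALL_KINDS];  /* profiled calls seen, per kind */
    double        ticks[CALL_KINDS];  /* profiler overhead accumulated, per kind */
};

/* Add one measured overhead sample to the bucket for its call kind. */
void overhead_record(struct overhead_stats *s, enum call_kind kind,
                     double ticks)
{
    assert(kind == CALL_PERL || kind == CALL_XS);
    s->calls[kind] += 1;
    s->ticks[kind] += ticks;
}

/* Average overhead per call for one kind; 0.0 if none were recorded. */
double overhead_per_call(const struct overhead_stats *s,
                         enum call_kind kind)
{
    assert(kind == CALL_PERL || kind == CALL_XS);
    if (s->calls[kind] == 0)
        return 0.0;
    return s->ticks[kind] / (double)s->calls[kind];
}
```

Keeping two buckets lets the compensation subtract a different
per-call figure for XS subs and for Perl subs instead of one blended
average.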