# vim: ts=8 sw=2 sts=0 noexpandtab:
# $Id: HACKING 582 2008-10-31 14:06:44Z nick@ccl4.org $

HACKING Devel::NYTProf
======================

We encourage hacking Devel::NYTProf!

OBTAINING THE CURRENT RELEASE
-----------------------------
The current official release can be obtained from CPAN
http://search.cpan.org/dist/Devel-NYTProf/

OBTAINING THE LATEST DEVELOPMENT CODE
-------------------------------------
You can grab the head of the latest trunk code from the Google Code repository, see
http://code.google.com/p/perl-devel-nytprof/source/checkout

CONTRIBUTING
------------
Please work with the latest code from the repository - see above.

Small patches can be uploaded via the issue tracker at
http://code.google.com/p/perl-devel-nytprof/issues/list

For larger changes please talk to us first via the mailing list at
http://groups.google.com/group/develnytprof-dev

When developing, please ensure that no new compiler warnings are output.

TESTING
-------
You MUST write test cases for your changes. All tests that are dropped into the
"t" folder will be executed. (Remember to add them to MANIFEST.)  The testing system is
customized for this module because profilers are not that easy to test.
The system still uses Test::Harness and Test::More, so it should behave just
like any other Perl module's 'make test'.

Writing tests is easy!

1) Design a perl script that will trigger the new behavior/feature that you
   want to test. Name the file 't/test##-description.p'

2) Create an empty 'reference' file for the test.
   Name the file 't/test##-description.rdt'
   When the test is run you'll get an error and a diff and you'll
   find a t/test##-description.rdt.new file waiting for you.
   If, and only if, the contents of that file are correct, then rename
   it to t/test##-description.rdt and you're done!
   Of course working out if the contents are correct can be
   non-trivial, but at least you don't have to write the file :)

3) Create a corresponding CSV output file if appropriate.
   You can use the same trick of creating an empty file, but this
   time with a .x suffix: t/test##-description.x
   You still need to verify the .x.new file of course!

Note:  While writing a test, it is helpful to be able to run it directly,
without the test harness.  This lets you see more of the output on stdout and
stderr.  Fortunately, it's easy to do:

  perl -Mblib -MDevel::NYTProf t/test01.p

The output will be in the ./nytprof.out file.  You can then also generate the
CSV report manually:

  perl -Mblib bin/nytprofcsv

The final file will be in ./nytprof/test01.p.csv

Remember, testing is VERY VERY important!  Within a day or two of releasing
code, the CPAN testers will test the release on pretty much every major platform
you can think of.  A failed test report is much easier to fix than a runtime
error like "bash: segmentation fault: core dumped"

GENERATING DISTRIBUTIONS
------------------------
Releases are generated with 'make metafile', and then fed through tar+gz.
You shouldn't ever check in the distribution directory or any temporary files
(including Makefile.old), or change the $VERSION numbers. We'll do this for you.

RESOURCES
---------
Google Code:
http://code.google.com/p/perl-devel-nytprof/

Google Devel Group (must subscribe here):
http://groups.google.com/group/develnytprof-dev

NYTimes Open Code Blog:
http://open.nytimes.com/

TODO (unsorted)
---------------

Fix Reader
- to not document methods with a leading underscore
- to have consistent_naming_style notMixedCamelCase (fixed?)
- to be fully OO (ie not document non-OO interfaces)
- to be subclassable
- to provide a subclass to manage generating CSV
- to provide a subclass to manage generating HTML

Then rework bin/nytprof* to use the new subclasses.
Ideally end up with a single nytprof command.

The whole reporting framework needs a rewrite to use a single 'thin' command
line and classes for the Model (lines, files, subs), View (html, csv etc),
and Controller (composing views to form reports).

Add (very) basic nytprofhtml test (ie it runs and produces output)

Rework option parsing so options can be implemented in perl, accessed from
perl, and stored in data file.

Write tests for new functionality.

Add way for program being profiled to switch output to a new profile file.
Perhaps via enable_profile($optional_new_filename)
See http://search.cpan.org/dist/Devel-Profile/ for use case.

Add @INC to data file so reports can be made more readable by removing
(possibly very long) library paths where appropriate.

Add time to begin and end pid markers in data file.
Add marker with timestamp for phases BEGIN, CHECK, INIT, END
(could combine with pid marker)

Add actual size and mtime of fid to data file. (Already in data file as zero,
just needs the stat() call.) Don't alter errno.
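
A minimal sketch of the stat() call that leaves errno alone. The function name
and signature are made up for illustration and are not taken from the NYTProf
source; on failure it keeps the zero placeholders mentioned above.

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper: look up the size and mtime of a source file.
 * Returns 0 on success and -1 if stat() fails. Either way the caller's
 * errno is left exactly as it was on entry. */
int
fid_stat_quietly(const char *path, off_t *size, time_t *mtime)
{
    struct stat st;
    int saved_errno = errno;        /* remember the caller's errno */
    int rc = stat(path, &st);

    if (rc == 0) {
        *size  = st.st_size;
        *mtime = st.st_mtime;
    }
    else {
        /* keep the zero placeholders already used in the data file */
        *size  = 0;
        *mtime = 0;
    }
    errno = saved_errno;            /* undo whatever stat() did to errno */
    return rc;
}
```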

Intercept all opcodes that may fork and run perl code in the child
  ie fork, open, entersub (ie xs), others?
  and fflush before executing the op (so fpurge isn't strictly required)
  and reinit_if_forked() afterwards
  add option to force reinit_if_forked check per stmt just-in-case
Alternatively it might be better to use pthread_atfork() [if available] with a
child handler. The man page says "Remember: only async-cancel-safe functions
are allowed on the child side of fork()" so it seems that the safe thing to do
is to use a volatile flag variable, and change its value in the handler to
signal to the main code.
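
The pthread_atfork() idea could look roughly like this. It is only a sketch:
all the names are invented here, and the real code would call
reinit_if_forked() where the demo function merely reports the flag.

```c
#include <pthread.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical flag: only ever set by the child-side handler. */
static volatile sig_atomic_t forked_in_child = 0;

/* Runs in the child right after fork(). It does nothing except set the
 * flag, which keeps it within what is safe on the child side. */
static void
mark_child(void)
{
    forked_in_child = 1;
}

/* Call once at start-up. Returns 0 on success, like pthread_atfork(). */
int
install_fork_flag(void)
{
    return pthread_atfork(NULL, NULL, mark_child);
}

/* Called from the main profiling code. Returns 1 once in the child after
 * each fork (and clears the flag), otherwise 0. */
int
fork_flag_was_set(void)
{
    if (!forked_in_child)
        return 0;
    forked_in_child = 0;
    return 1;
}

/* Demo: fork once and report whether the child saw the flag.
 * Returns 1 if it did, 0 if it did not, -1 on error. */
int
demo_child_sees_fork(void)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(fork_flag_was_set() ? 0 : 1);
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

The parent never sees the flag change, so the per-statement cost in the
common (non-forking) case is a single read of a flag variable.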

Add way to merge profile data. Merging could be done in perl.

Add constants to Data.pm for the array indexes
0=time_spent, 1=exe_count, 2=eval_line_data, etc

Support profiling programs which use threads:
  - move all relevant globals into a structure
  - add lock around output to file
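
As a rough illustration of both points, assuming POSIX threads: the struct
layout and function names below are hypothetical, not the existing globals.

```c
#include <pthread.h>
#include <stdio.h>

/* Hypothetical grouping of what are currently file-scope globals. */
typedef struct {
    FILE            *out;              /* profile data file */
    pthread_mutex_t  out_lock;         /* guards out and records_written */
    unsigned long    records_written;  /* complete records written so far */
} profiler_state;

/* Returns 0 on success, like pthread_mutex_init(). */
int
profiler_state_init(profiler_state *ps, FILE *out)
{
    ps->out = out;
    ps->records_written = 0;
    return pthread_mutex_init(&ps->out_lock, NULL);
}

/* Write one record while holding the lock, so that records from
 * different threads are never interleaved. Returns the bytes written. */
size_t
profiler_write(profiler_state *ps, const void *buf, size_t len)
{
    size_t n;

    pthread_mutex_lock(&ps->out_lock);
    n = fwrite(buf, 1, len, ps->out);
    if (n == len)
        ps->records_written++;
    pthread_mutex_unlock(&ps->out_lock);
    return n;
}
```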

We now save eval strings (from @{"_<$filename"}, see perldoc perldebguts)
but it requires use_db_subs=1 due to perl internals. Currently unused.
Add source code of first string eval at each fid:line to report.
Add option to control saving of source code. Perhaps

  savesrc=0 - don't save any source
  savesrc=1 - save only first string eval src per distinct fid:line (default)
  savesrc=2 - save all string eval src
  savesrc=3 - save all source code, not just string evals

Also option to delete @{"_<$filename"} to release memory could be useful
for programs that do a lot of string evals.

Add % of total time to file table on index page.
Add % of total time to exclusive time column in subs table as a tooltip.
To do these we need accurate total time - based on sum of times between enable_profile()
and disable_profile().

Add resolution of __ANON__ sub names (eg imported 'constants') where possible.

Trim leading @INC portion from filename in __ANON__[/very/long/path/...]
in report output. (Keep full path in link/tooltip/title as it may be ambiguous when shortened).

Explain what's shown in html reports, ie say it's elapsed realtime.

Currently only the line of the last BEGIN (or 'use') in the file is recorded.
Rename Foo::BEGIN subs to Foo::BEGIN[file:line]
(which matches the style used for Foo::__ANON__[file:line])
Probably need to record or output the line range when the BEGIN 'sub' is entered.

Record $AUTOLOAD when AUTOLOAD() called
Perhaps as ...::AUTOLOAD[$AUTOLOAD]

Refactor this HACKING file!

Add file format backwards compatibility tests.

Add tests for evals in regex: s/.../ ...perl code... /e

Add tests for -block and -sub csv reports.

Add tests with various kinds of blocks (if, do, etc) and loops.

Set options via import so perl -d:NYTProf=... works. Very handy. May need
alternative option syntax. Also perl gives special meaning to 't' option
(threads) so we should reserve the same for eventual thread support.
Problem with this is that the import() call happens late so
limits the usefulness.

Add help option which would print a summary of the options and exit.
Could also print list of available clocks for the clock=N option
(using a set of #ifdef's)
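
The clock list might be produced along these lines. This is a sketch: the
function name and output format are invented, and it assumes the clock=N value
is the integer clock id from <time.h>.

```c
#include <stdio.h>
#include <time.h>

/* Print one line per clock id known at compile time, in the form the
 * clock=N option would take. Returns the number of clocks listed. */
int
print_available_clocks(FILE *out)
{
    int count = 0;

#ifdef CLOCK_REALTIME
    fprintf(out, "clock=%d\tCLOCK_REALTIME\n", (int)CLOCK_REALTIME);
    count++;
#endif
#ifdef CLOCK_MONOTONIC
    fprintf(out, "clock=%d\tCLOCK_MONOTONIC\n", (int)CLOCK_MONOTONIC);
    count++;
#endif
#ifdef CLOCK_PROCESS_CPUTIME_ID
    fprintf(out, "clock=%d\tCLOCK_PROCESS_CPUTIME_ID\n",
            (int)CLOCK_PROCESS_CPUTIME_ID);
    count++;
#endif
#ifdef CLOCK_THREAD_CPUTIME_ID
    fprintf(out, "clock=%d\tCLOCK_THREAD_CPUTIME_ID\n",
            (int)CLOCK_THREAD_CPUTIME_ID);
    count++;
#endif
    return count;
}
```

This only covers clock ids that are preprocessor macros; a platform that
defines them some other way would need a configure-time check instead.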

Add mechanism to specify options inside the .p file, such as

  # NYTPROF=...

Add mechanism to specify inside the .p file that NYTProf
should not be loaded via the command line. That's needed to test
behaviors in environments where perl is init'd first. Such as mod_perl.
Then we can test things like not having the sub line range for some subs.

Add top-n statements to file reports between sub table and line table.

Pure css tooltips, with a :before or :after with content:, may let us add help notes to the
counts column to describe what the count is actually a count of, without
bloating the html.

  http://meyerweb.com/eric/css/edge/popups/demo.html
  http://www.communitymx.com/content/article.cfm?page=4&cid=4E2C0
  http://www.kollermedia.at/archive/2008/03/24/easy-css-tooltip/

The tricky/clever/new idea is that by nesting a span inside another and using
the :before or :after on the inner one the text of the popup can reside in css
and not html. Mind you, I've not seen anyone do this so I may be crazy :)

The data file includes the information mapping a line-level line to the
corresponding block-level and sub-level lines. This should be added to the data
structure. It would enable a much richer visualization of which lines have
contributed to the 'rolled up' counts. That's especially tricky to work out
with the block level view.

Following on from that I have a totally crazy idea that the browser's CSS engine
could be used to highlight the corresponding rollup line when hovering over a
source line, and/or the opposite. Needs lots of thought, but it's an interesting idea.

Investigate and fix "Unable to determine line number" cases. Here's one:

  $ NYTPROF=begin=1:blocks=1:trace=1 perl  -d:NYTProf -Mstrict -e 1
  ...
  New fid  1 (after  0:1   ): -e /Users/timbo/perl/mods/nytprof-trunk/-e
  New fid  2 (after  1:3   ): /usr/local/perl58-i/lib/5.8.6/strict.pm 
  at 3: EVAL in different file (-e, /usr/local/perl58-i/lib/5.8.6/strict.pm) at /usr/local/perl58-i/lib/5.8.6/strict.pm line 3.
  at 5: EVAL in different file (-e, /usr/local/perl58-i/lib/5.8.6/strict.pm) at /usr/local/perl58-i/lib/5.8.6/strict.pm line 5.
  at 25: EVAL in different file (-e, /usr/local/perl58-i/lib/5.8.6/strict.pm) at /usr/local/perl58-i/lib/5.8.6/strict.pm line 25.
  at 37: EVAL in different file (-e, /usr/local/perl58-i/lib/5.8.6/strict.pm) at /usr/local/perl58-i/lib/5.8.6/strict.pm line 37.
  Unable to determine line number in -e.
  Unable to determine line number in -e.

Add a 'permalink' icon (eg infinity symbol) to the right of lines defining subs
to make it easier to email/IM links to particular places in the code.

Change from tracing via warn() to use our own function that, at least initially,
calls warn() while temporarily disabling the __WARN__ hook.

Profile and optimize report generation

Add title/tooltip to inclusive times (ie subroutine times) showing the percentage
of the total runtime it represents.

The sub_caller information is currently one level deep. It would be good to
make it two levels. Especially because it would allow you to "see through"
AUTOLOADs and other kinds of 'dispatch' subs.

Currently goto isn't explicitly noticed by the sub profiler. Need to intercept pp_goto.
But that may be non-trivial. Could make it look like the statement that called
the sub that called goto also called the sub that goto went to, or make it look
like the goto &$sub made the call (but we'd then get the wrong inclusive time,
probably).

Slow builtins, eg those that make system calls or are otherwise expensive, like crypt,
could be treated as calls to xsubs in the CORE:: namespace.

Replace DB::enable_profile() and DB::disable_profile() with $DB::profile = 1|0;
That's a more consistent API with $DB::single etc., but more importantly it lets
users leave the code in place when NYTProf is not loaded. It'll just do nothing,
whereas currently the user will get a fatal error if NYTProf isn't loaded.
It also allows smart things like use of local for temporary overrides.

Combine current profile_* globals into a single global int using bit fields.
That way assigning to $DB::profile can offer a finer degree of control.
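
One possible shape for the combined flags. The flag names below are
placeholders chosen for illustration and do not necessarily correspond to the
existing profile_* globals.

```c
/* Hypothetical flag names, one bit per kind of profiling. */
enum {
    PROFILE_STMTS  = 1 << 0,   /* statement-level profiling */
    PROFILE_BLOCKS = 1 << 1,   /* block-level rollup */
    PROFILE_SUBS   = 1 << 2,   /* subroutine profiler */
    PROFILE_ALL    = PROFILE_STMTS | PROFILE_BLOCKS | PROFILE_SUBS
};

/* The single global that would replace the separate ints. */
static unsigned int profile_flags = PROFILE_ALL;

/* What an assignment to $DB::profile might map to: install the new
 * flags (unknown bits are ignored) and return the previous value so
 * that it can be restored later. */
unsigned int
set_profile_flags(unsigned int new_flags)
{
    unsigned int old = profile_flags;
    profile_flags = new_flags & PROFILE_ALL;
    return old;
}

/* True if any of the given kinds of profiling is currently enabled. */
int
profiling(unsigned int which)
{
    return (profile_flags & which) != 0;
}
```

Returning the old value from the setter is what would make a save-and-restore
pattern (like local in perl) cheap to implement.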

Add mechanism to enable control of profiling on a per-sub-name and/or
per-package-name basis. For example, specify a regex and whenever a sub is
entered (for the first time, to make it cheap) check if the sub name matches
the regex. If it does then save the current $DB::profile value and set a new one.
When the sub exits restore the previous $DB::profile value.

Could optionally track resource usage per sub. Data sources could be perl sv
arenas (clone visit() function from sv.c) to measure number of SVs & total SV
memory, plus getrusage(). Abstract those into a structure with functions to
subtract the difference. Then use the same logic to get inclusive and exclusive
values as we use for inclusive and exclusive subroutine times.
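
A sketch of that structure, assuming getrusage() is available. The type and
function names are invented, and the SV arena walk is left as a placeholder.

```c
#include <sys/resource.h>
#include <sys/time.h>

/* Hypothetical snapshot of the resources tracked per sub. */
typedef struct {
    long long cpu_usec;   /* user + system CPU time from getrusage() */
    long      sv_count;   /* number of live SVs (arena walk not shown) */
} res_usage;

/* Take a snapshot. Returns 0 on success, -1 if getrusage() fails. */
int
res_usage_now(res_usage *u)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    u->cpu_usec = (ru.ru_utime.tv_sec + ru.ru_stime.tv_sec) * 1000000LL
                + ru.ru_utime.tv_usec + ru.ru_stime.tv_usec;
    u->sv_count = 0;      /* placeholder: would come from the sv arenas */
    return 0;
}

/* Field-by-field difference: later minus earlier. */
res_usage
res_usage_diff(const res_usage *later, const res_usage *earlier)
{
    res_usage d;

    d.cpu_usec = later->cpu_usec - earlier->cpu_usec;
    d.sv_count = later->sv_count - earlier->sv_count;
    return d;
}
```

With this, inclusive usage is the diff of the snapshots at sub exit and sub
entry, and exclusive usage is the diff of the inclusive value and the summed
inclusive values of the subs it called.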

Report max recursion depth and reci_time per sub in per-file reports.

Bug or limitation?: sub calls in a continue { ... } block of a while () get
associated with the 'next;' within the loop.
Also, test sub caller location for

  while ( foo() ) {  # all calls to foo should be from here
      ...
      ... # no calls to foo() should appear here
  }

Report could track which subs it has reported caller info for
and so be able to identify subs that were called but haven't been included
in the report because we didn't know where the sub was.
They could then be included in a separate 'miscellaneous' page.
This is a more general way to view the problem of xsubs in packages
for which we don't have any perl source code.

Investigate style.css problem when using --outfile=some/other/dir

Could save 'current subname' in sub profiler so we can say A was called by B
and not just A was called by line X of file Y. (Will need to SAVE* a link to
previous current subname and restore it on return from sub.)

Add per-package summary table like the per-sub stats to make it easier to see
a package where a lot of time is being spent in lots of different subs.

Add option to set processor affinity.