NAME
Devel::StatProfiler - low-overhead sampling code profiler
VERSION
version 0.54_01
SYNOPSIS
# profile (needs multiple runs, with representative data/distribution!)
perl -MDevel::StatProfiler foo.pl input1.txt
perl -MDevel::StatProfiler foo.pl input2.txt
perl -MDevel::StatProfiler foo.pl input3.txt
perl -MDevel::StatProfiler foo.pl input1.txt
# prepare a report from profile data
statprofilehtml
DESCRIPTION
Devel::StatProfiler is a sampling (or statistical) code profiler.
Rather than measuring the exact time spent in a statement (or subroutine), the profiler interrupts the program at fixed intervals (10 milliseconds by default) and takes a stack trace. Given a sufficient number of samples this provides a good indication of where the program is spending time and has a relatively low overhead (around 3-5% increased runtime).
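To illustrate the idea, here is a toy simulation; it is not how the profiler is implemented. A hypothetical timeline records which subroutine is running during each millisecond, and a sampler inspects it every 10 milliseconds. The names sample_timeline, fast_sub and slow_sub are made up for this sketch.

```perl
use strict;
use warnings;

# Toy model: one array entry per millisecond of run time, naming the
# subroutine executing during that millisecond (hypothetical data).
my @timeline = (('fast_sub') x 300, ('slow_sub') x 700);

# Look at the timeline once every $interval_ms milliseconds and count
# how often each subroutine is seen, as a sampling profiler would.
sub sample_timeline {
    my ($timeline, $interval_ms) = @_;
    my %samples;
    for (my $t = 0; $t < @$timeline; $t += $interval_ms) {
        $samples{ $timeline->[$t] }++;
    }
    return \%samples;
}

my $samples = sample_timeline(\@timeline, 10);
my $total   = 0;
$total += $_ for values %$samples;

# Each subroutine's share of the samples estimates its share of run time.
for my $name (sort keys %$samples) {
    printf "%-8s %3d samples (%.0f%%)\n",
        $name, $samples->{$name}, 100 * $samples->{$name} / $total;
}
```

With this timeline, 30 of the 100 samples land in fast_sub and 70 in slow_sub, matching the 300/700 millisecond split. A real run is noisier, which is why a sufficient number of samples matters.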
Options
Options can be passed either on the command line:
perl -MDevel::StatProfiler=-interval,1000,-template,/tmp/profile/statprof.out
or by loading the profiler directly from the profiled program:
use Devel::StatProfiler -interval => 1000, -template => '/tmp/profile/statprof.out';
-template <path> (default: statprof.out)
Sets the base name used for the output file. The full filename is obtained by appending a dot followed by a random string to the template path. This ensures that subsequent profiler runs don't overwrite the same output file.
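Because the suffix is random, a script that post-processes profiles has to discover the output files rather than hard-code their names. A minimal sketch, assuming only that each file name starts with the template followed by a dot; the helper name profile_files is hypothetical and not part of the module:

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);

# Return every regular file whose name is the template followed by a dot
# and some suffix. The exact suffix format is not assumed here. Note that
# glob metacharacters in the template itself would be interpreted.
sub profile_files {
    my ($template) = @_;
    my @files = grep { -f } bsd_glob("$template.*");
    return sort @files;
}

# Example: list the output files for the template used in the examples above.
print "$_\n" for profile_files('/tmp/profile/statprof.out');
```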
-nostart
Don't start profiling when the module is loaded. To start profiling, call enable_profile().
-interval <microsecs> (default 10000)
Sets the sampling interval, in microseconds (accuracy varies depending on OS/hardware).
-maxsize <size> (default 10MB)
Once the trace file grows beyond this size, a new one is started with a higher ordinal.
-source <strategy> (default 'none')
Sets which source code is saved in the profile:
- none
No source code is saved in the profile file.
- traced_evals
Only the source code for eval()s that have at least one sample during evaluation is saved. This does NOT include eval()s that define subroutines that are sampled after the eval() ends.
- all_evals
The source code for all eval()s is saved in the profile file.
- all_evals_always
The source code for all eval()s is saved in the profile file, even when profiling is disabled.
-depth <stack depth> (default 20)
Sets the maximum number of stack frames saved for each sample.
-metadata HASHREF
Emit custom metadata in the header section of each profile file; this metadata will be available right after calling Devel::StatProfiler::Reader->new.
-file <path>
Sets the exact file path used for the profile output file; if the file already exists, it is overwritten.
In general, using -template above is the preferred option, since -file will not work when using fork() or threads.
CAVEATS
goto &subroutine
With a sampling profiler there is no reliable way to track the goto &foo construct, hence the profile data for this code
sub foo {
# 100 milliseconds of computation
}
sub bar {
# 100 milliseconds of computation, then
goto &foo;
}
bar() for 1..100000; # foo.pl, line 10
will report that the code at foo.pl line 10 has spent approximately the same time calling foo and bar, and will report foo as being called from the main program rather than from bar.
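The underlying reason can be seen without the profiler: goto &foo replaces the current stack frame, so a stack trace taken inside foo no longer contains the subroutine that jumped to it. A small sketch using caller(); the names via_goto and via_call are made up for the example:

```perl
use strict;
use warnings;

my @calls;

sub foo {
    # caller(0) describes the call to foo itself; caller(1), if present,
    # describes the call to the subroutine that foo was called from.
    my @outer = caller(1);
    push @calls, @outer ? $outer[3] : '(main program)';
    return;
}

sub via_goto { goto &foo }   # replaces via_goto's frame with foo's
sub via_call { foo() }       # keeps via_call's frame on the stack

via_goto();
via_call();

print "foo called from: $_\n" for @calls;
```

When reached through goto, foo appears to be called directly from the main program; only the ordinary call leaves main::via_call visible on the stack.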
XSUBs with callbacks
Since XSUBs don't have a Perl-level stack frame, Perl code called from XSUBs is reported as if called from the source line calling the XSUB.
Additionally, the exclusive time for the XSUB incorrectly includes the time spent in callbacks.
XSUBs and overload
If an object has an overloaded &{} operator (code dereference) returning an XSUB as the code reference, the overload might be called twice in some situations.
changing profiler state
Calling enable_profile, disable_profile or stop_profile from an inner runloop (including, but not limited to, use, require, sort blocks and callbacks invoked from XS code) can have confusing results: runloops started afterwards will honor the new state, while outer runloops will not.
Unfortunately, there is no way to detect this situation at the moment.
source code and #line directives
The parsing of #line directives, used to map logical lines to physical lines, relies on heuristics that can fail.
Files that contain #line directives and have no samples taken in the part of the file outside the part mapped by #line directives will not be found.
first line of subs
The first line of subs is found by searching for the sub definition in the code. Needless to say, this is fragile.
sampling accuracy
Since the profiler uses nanosleep/Sleep between samples, accuracy is at the mercy of the OS scheduler. In particular, under Windows the default system timer has an accuracy of about 15.6 milliseconds.
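To get a rough feel for how far real sleeps drift from the requested interval on a given machine, something like the following can be used. This is a sketch only: it uses usleep from Time::HiRes as a stand-in for the profiler's own sleep call, the measurement includes a little loop overhead, and the helper name average_sleep_us is hypothetical.

```perl
use strict;
use warnings;
use Time::HiRes qw(usleep gettimeofday tv_interval);

# Sleep for $requested_us microseconds $rounds times and return the
# average wall-clock time each sleep took, in microseconds.
sub average_sleep_us {
    my ($requested_us, $rounds) = @_;
    my $start = [gettimeofday];
    usleep($requested_us) for 1 .. $rounds;
    return tv_interval($start) * 1_000_000 / $rounds;
}

my $requested = 10_000;   # same value as the default -interval
my $actual    = average_sleep_us($requested, 20);
printf "requested %d us, measured %.0f us per sleep\n", $requested, $actual;
```

If the measured value is well above the requested one, fewer samples are taken per second than the -interval setting suggests.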
AUTHORS
Mattia Barbon <mattia@barbon.org>
Steffen Mueller <smueller@cpan.org>
COPYRIGHT AND LICENSE
This software is copyright (c) 2015 by Mattia Barbon, Steffen Mueller.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.