Author image Jarkko Hietaniemi


Devel::CallStack - record the calling stacks


    perl -d:CallStack ...


The Devel::CallStack is meant for code developers wondering why their code is running so slow. One possible reason is simply too many subroutine or method calls since they are not cheap in Perl.

The Devel::CallStack records the calling stacks, how many times each calling stack is being called. By default the results are written to a file called callstack.out.

NOTE: recording the callstacks is a very heavy operation which slows down the execution of your code easily ten-fold or more: do not attempt any other code timing or profiling at the same time. The gathered information is useful in conjunction with other profiling tools such as Devel::DProf.


I got frustrated by Devel::DProf results that looked not unlike this:

  Total Elapsed Time = 1.892063 Seconds
    User+System Time = 1.742063 Seconds
  Exclusive Times
  %Time ExclSec CumulS #Calls sec/call Csec/c  Name
   13.8   0.241  0.426   2170   0.0001 0.0002  Foo::_id
   10.3   0.181  0.181   1747   0.0001 0.0001  Foo::Map::has
   9.18   0.160  0.434      3   0.0532 0.1448  main::BEGIN
   8.21   0.143  0.143   5205   0.0000 0.0000  Foo::Map::_has
   7.46   0.130  0.611      1   0.1299 0.6112  Foo::Map::new

I obviously needed to try cutting down the number of Foo::_id calls (not to mention the number of Foo::Map::_has and Foo::Map::_has calls), but the problem was that Foo::_id was being called from multiple places, there were more than one possible "hot path" that I needed to locate and "cool down".


For this file,

    sub foo { bar(@_) }
    sub bar { zog(@_) if $_[0] % 7 }
    sub zog { }
    for (my $i = 0; $i < 1e3; $i++) {
        $i % 5 ? foo($i) : bar($i);

running perl -d:CallStack will result in:

    main::bar 1 200
    main::bar,main::zog 2 171
    main::foo 1 800
    main::foo,main::bar 2 800
    main::foo,main::bar,main::zog 3 686

Meaning that the callstack main::bar was called 200 times, which makes sense since every fifth call out of 1000 should have been made to bar(). On the other hand, the callstack main::bar,main::zog was reached 171 times, which is the number of integers between 0 and 999 (inclusive) that are evenly divisible both by five and seven. The numbers in the second column are the callstack depths (the number of commas plus one).


Parameters are given in the command line after the -d:Callstack and a =:

    perl -d:CallStack=...

The available parameters are as follows:


The results are written by default to a file called callstack.out. This can be changed either with

    perl -d:CallStack=out=filename


    perl -d:CallStack=out=stdout
    perl -d:CallStack=out=stderr

which will output to a file called filename or the standard output or the standard error, respectively.


By default the calling stacks are walked all the way back to the beginning. This may be very expensive if the calling stacks are deep. To limit the number of frames walked back, supply the depth parameter:

   perl -d:CallStack=depth=N

or just

   perl -d:CallStack=N

Using callstack depth two for for our example:

    main::bar 1 200
    main::bar,main::zog 2 857
    main::foo 1 800
    main::foo,main::bar 2 800

Using the depth of one (or zero) gives the number of times each subroutine was called:

    main::bar 1 1000
    main::foo 1 800
    main::zog 1 857


By default the callstacks go from left to right, that is, the callers are on the left and the callees are on the right, the time flows from left to right. With the reverse parameter you can flip the order, which may fit your brain better. For our example:

    main::bar 1 200
    main::bar,main::foo 2 800
    main::foo 1 800
    main::zog,main::bar 2 171
    main::zog,main::bar,main::foo 3 686


By default only the names of the called subroutines (methods) are recorded. To record also the filename and (calling) linenumber in the file, use the full parameter:

   perl -d:CallStack=full

The filename and the linenumber are prepended to the subname, all concatenated with single colons, for our example: 1 800 1 200, 2 171,, 3 686, 2 800


Normally the output file is overwritten. To append instead:

   perl -d:CallStack=append


Normally the statistics are started from scratch for each run. To read in initial statistics from a file:

   perl -d:CallStack=in=filename

The =filename part is optional, the default filename is callstack.out. The input data needs to be in the same format (depth, full, reverse) as the current settings.

Combining parameters

To use several parameters at the same time, combine the parameters by using a comma:

   perl -d:CallStack=3,out=my.out,full

If you combine the append and in parameters, you get cumulative statistics.


To get a copy of the statistics accumulated so far, call

        my %C = Devel::CallStack::get();

The keys of the hash are the callstacks as comma-concatenated strings, and the values are the number of calls.

To set the statistics, call


To clear the statistics, simply call Devel::CallStack::set() with no argument.

To write out the statistics accumulated so far, call


This overwrites the existing output file (either callstack.out or whatever you used for the out parameter or the standard output or error streams) unless the append parameter is used. You need to do any needed file renaming yourself. write() is used by Devel::CallStack itself to output the statistics at the end of a run, by calling it from its END block.

To read in the statistics accumulated from a file, call


This merges in the data instead of replacing. If you want to replace the data, call set() yourself. read() is used by the in parameter. The input data needs to be in the same format (depth, full, reverse) as the current settings.


Now you've run your code with Devel::CallStack. Now what?

If you see that a method or a subroutine is called several thousand times while the upper layers are called only a few times:

  • First of all try your code with different input and with different amount of input: how does the number of calls vary? Linear, logarithmic, squared, cubed, random? Have you picked the right algorithm? You are not reimplementing something from the Perl core that might have either a better algorithm or simply a faster implementation? (For example sort().)

  • You may manually inline the method or subroutine code to its callers. The downsides include harder maintenance (remember to document/comment the inlining both to the callers and to the original code), the upsides include faster execution. Maybe you can somehow automate the inlining, for example via Perl source filters?

  • You may manually or (preferably) automatically cache the computation, whenever reasonable and possible. Use for example the Memoize module. The downsides include more memory usage, upsides include faster execution.

If you see some deep code paths having only a few callers (or maybe even just a single one):

  • Maybe you have several layers of subroutines calling each other always along the same paths - you could possibly collapse/inline several levels of these subroutines into fewer ones, or even just a single one. If you still need to have some of the intermediate functions separately, you may consider maintaining separate functions for those, but remember to document/comment the fact profusely.


Devel::CallStack unfortunately works only in 5.6.1 or later Perls, there is something different with the -d:Xyz option of older Perls that breaks Devel::CallStack, one would get something like this:

   syntax error at line 0, near "use Devel::CallStack="


Sébastien Aperghis-Tramoni for bravely testing the code in Jaguar.


"caller" in perlfunc; Devel::CallerItem, Devel::DumpStack, Devel::StackTrace, for alternative views of the call stack; Devel::DProf and Devel::SmallProf for time-based profiling; Devel::Cover for coverage.


Jarkko Hietaniemi <> 2004-2005


This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.