TIP 280: Add Full Stack Trace Capability With Location Introspection

Login
Bounty program for improvements to Tcl and certain Tcl packages.
Author:		Andreas Kupries <andreas_kupries@users.sourceforge.net>
Author:		Andreas Kupries <andreask@activestate.com>
State:		Final
Type:		Project
Vote:		Done
Created:	10-Aug-2004
Post-History:	
Keywords:	Tcl
Obsoletes:	211
Tcl-Version:	8.5

Abstract

This TIP proposes adding a new subcommand to the info command to get a list of all the frames on the current stack, with additional information about command location, type of execution, etc., rather than the limited list returned by info level. It is also related to Peter MacDonald's [86], or rather, to the info linenum feature proposed there. The base feature of providing location information for a command was extended, allowing the user to ask for the current location in all stack levels. Enough information is returned to consider this as an extended [211].

Rationale

The motivation for this feature is the debugging of scripts in situations where an error stack is wanted, to see where the problem occurred, but aborting the execution is not desired. To enable this a means of providing the essential information shown in an error stack is needed, which does not require an abort (and subsequent unwinding of the C stack) to assemble this information.

An example is the testing of scripts where an unexpected error should not fail the test case, nor the framework, yet still allow the recording of the problematic location and how it was reached in some log.

The original motivation for [211] was that there is currently no way to get a list of all the frames in the current stack managed by Tcl_PushCallFrame() and Tcl_PopCallFrame(). The info level command does not contain frames that are callers of uplevel, reporting only the frames that are accessible via another uplevel command. There are times when the lack of information can have a negative impact on code design.

This motivation asks in essence for an error stack as well, but limited itself to the returning commands themselves, and not the other information, like the line the command is on, its context, etc.

Other use cases, also found in [211]:

  1. tcltest, and other testing frameworks.

    The first case is with the core's Tcltest package, where the complete lack of ability to gain access to that information means it is impossible to gain information about a test without modifying the Tcltest code itself. Being able to find out the caller info would be very useful, especially for logging information. Currently, there is no way to get the caller's info, due to the fact that the code for the test is upleveled and, hence, not visible via info level.

  2. TestStubs Package

    The TestStubs package provides the ability to temporarily redefine commands, in particular for stubbing out or replacing functionality in a test case. There is a command in the package called chain, which is used within the code replacing a command (or part of a command) to call the original definition of the command. For example, one could do:

    stubs::stub ensemble array names {
        return [lsort [uplevel 1 chain names $args]]
    }
    

    However, since the chain command is (and should be) limited to only running from within a stub definition, it needs to call info level to find out if its caller is one of the stubbed commands, and what the name of that command is. With info level, it would not have access to the level that is running inside the stubbed procedure. Hence, either it cannot check this constraint, or stubs cannot be allowed to use uplevel when calling it (which means things like the above either cannot work, or need to be rewritten in a considerably less clear manner).

Specification of the Proposed Change

Tcl Level API

The builtin command info is extended to accept a new subcommand, frame. When this subcommand is called it returns information about the current command and its location. This information is available not only for the current stack level, but also the higher stack levels used to reach the current location.

The syntax of the new subcommand is

info frame ?level?

The new functionality will provide access to all frames on the stack rather than the current limited subset. This TIP does not propose to alter uplevel or upvar so that they can see these hidden levels.

If level is not specified, this command returns a number giving the frame level of the command. This is 1 if the command is invoked at top-level.

If level is specified, then the result is a dictionary containing the location information for the command at the level on the stack.

If level is positive (> 0) then it selects a particular stack level (1 refers to the top-most active command, i.e., info frame itself, 2 to the command it was called from, and so on); otherwise it gives a level relative to the current command (0 refers to the current command, i.e., info frame itself, -1 to its caller, and so on).

This is similar to how info level works, except that this subcommand reports all frames, like source'd scripts, evals, uplevels, etc.

Note that for nested commands, like "foo [bar [x]]" only "x" will be seen by an info frame invoked within "x". This is the same as for info level and error stack traces.

The result dictionary may contain the keys listed below, with the specified meanings for their values.

  • type

    This entry is always present and describes the nature of the location for the command. The recognized values are source, proc, eval, and precompiled.

    In some circumstances it makes sense to have this information extensible, i.e. to allow user-defined type names. For more about this topic see the discussion at the end of the document.

    * source means that the command is found in a script loaded by the source command.

    * proc means that the command is found in dynamically created procedure body.

    * eval means that the command is executed by eval or uplevel.

    * precompiled means that the command is found in a precompiled script (loadable by the package tbcload), and no further information will be available.

  • line

    This entry provides the number of the line the command is at inside of the script it is a part of. This information is not present for type precompiled. For type source this information is counted relative to the beginning of the file, whereas for the last two types the line is counted relative to the start of the script.

  • file

    This entry is present only for type source. It provides the normalized path of the file the command is in.

  • cmd

    This entry provides the string representation of the command. This is usually the unsubstituted form, however for commands which are a pure list executed by eval it is the substituted form as they have no other string representation. Care is taken that the pure-List property of the latter is not spoiled.

  • proc

    This entry is present only if the command is found in the body of a regular Tcl procedure. It then provides the name of that procedure.

  • lambda

    This entry is present only if the command is found in the body of an anonymous Tcl procedure, i.e. a lambda. It then provides the entire definition of the lambda in question.

  • level

    This entry is present only if the queried frame has a corresponding frame returned by info level. It provides the index of this frame, relative to the current level (0 and negative numbers).

A thing of note is that for procedures statically defined in files the locations of commands in their bodies will be reported with type source and absolute line numbers, and not as type proc. The same is true for procedures nested in statically defined procedures, and literal eval scripts in files or statically defined procedures.

In contrast, for a procedure definition or eval within a dynamically eval'uated environment count linenumbers relative to the start of their script, even if they would be able to count relative to the start of the outer dynamic script. That type of number usually makes more sense.

A different way of describing this behaviour is that we track file based locations as deeply as we can, and where we cannot the lines are counted based on the smallest possible eval or procbody scope, as that scope is usually easier to find than any dynamic outer scope.

The syntactic form {expand} is handled like eval. This means that if it is given a literal list argument the system tracks the line-number within the list words as well, and otherwise all line-numbers are counted relative to the start of each word (smallest scope)

The following other builtin commands are changed as well to support the tracking of line numbers:

  1. catch

  2. dict for

  3. dict with

  4. eval

  5. for

  6. foreach

  7. if

  8. interp eval

  9. namespace eval

  10. proc

  11. source

  12. switch

  13. while

Public C API

No changes are made to the public C API.

Examples

Note that this is not a complete set of examples covering all possible cases. Let us assume that the file is named EX. The "cmd" is always the "info frame ..." command, and "level" is 0 too, always, in this situation. This is left out of the result dictionaries to keep them small.

 puts [info frame 0]           ;# 01* - output: {cmd {info frame 0} line 1 file EX type source}
                               ;# 02
 proc foo {} {                 ;# 03  /foo remembers 3
     puts [info frame 0]       ;# 04*
 }                             ;# 05
 foo                           ;# 06  - output: {cmd {...} line 4 file EX type source}
                               ;# 07
 set script {                  ;# 08  1
     puts [info frame 0]       ;# 09  2*
 }                             ;# 10  3
 eval $script                  ;# 11  - output: {cmd {...} line 2 type eval}
                               ;# 12
 eval {                        ;# 13  1
     puts [info frame 0]       ;# 14* 2*
 }                             ;# 15  3  - output: {cmd {...} line 14 type source}
                               ;# 16
 proc fox {} {                 ;# 17  /fox remembers 17
     eval $::script            ;# 18
 }                             ;# 19
 fox                           ;# 20  - output: {cmd {...} line 2 type eval}
                               ;# 21
 proc squirrel {} {            ;# 22  /squirrel remembers 22
     eval {                    ;# 23  1
         puts [info frame 0]   ;# 24* 2*
     }                         ;# 25  3
 }                             ;# 26
 squirrel                      ;# 27  - output: {cmd {...} line 24 type source}
                               ;# 28
 proc fuchs {} {               ;# 29    /fuchs remembers 29
     proc dog {} {             ;# 30  1 /dog   remembers 30
         puts [info frame 0]   ;# 31* 2*
     }                         ;# 32  3
 }                             ;# 33
 fuchs                         ;# 34
 dog                           ;# 35  - output: {cmd {...} line 31 type source}
                               ;# 36
 eval {                        ;# 37  1
     proc wolf {               ;# 38  2 /wolf remembers 38
         puts [info frame 0]   ;# 39* 3*
     }                         ;# 40  4
 }                             ;# 41  5
 wolf                          ;# 42  - output: {cmd {...} line 39 type source}
                               ;# 43
 set scripted {                ;# 44  1
     proc deer {} {            ;# 45  2 /deer remembers 2
         puts [info frame 0]   ;# 46  3*
     }                         ;# 47  4
 }                             ;# 48  5
 eval $scripted                ;# 49
 deer                          ;# 50  - output: {cmd {...} line 3 type eval}
                               ;# 51
 puts [set a b                 ;# 52
       info frame 0]           ;# 53* - output: {cmd {...} line 53 type source file EX}
                               ;# 54
 puts [info \
       frame 0]                ;# 56  - output: {cmd {...} line 55 type source file EX}
                               ;# 57
 proc salmon {} \
 {                             ;# 59
     puts [info frame 0]       ;# 60*
 }                             ;# 61
 salmon                        ;# 62  - output: {cmd {...} line 60 type source file EX}
                               ;# 63
 set method frame              ;# 64
 puts [info $method 0]         ;# 65* - output: {cmd {info $method 0} line 65 type source file EX}
                               ;# 66
 proc trout {} {               ;# 67
     puts [info $method 0]     ;# 68*
 }                             ;# 69
 trout                         ;# 70  - output: {cmd {info $method 0} line 68 type source file EX}

Another example showing how to query the whole stack of frames. It assumed that the file is named EX. The output shown after the example was manually reformatted to explicitly ordered the keys and the columns vertically aligned, for readability and better comparability of the lines. The returned level information was removed, and level numbers for info frame were added at the beginning of the lines.

 proc setRes {result} {               ;# 01
   if {$result == "fail"} {           ;# 02
      set level [info frame]          ;# 03
      while {$level} {                ;# 04
           puts [info frame $level]   ;# 05
           incr level -1              ;# 06
       } ;# end of while              ;# 07
   } ;# end of if                     ;# 08
 } ;# end of proc setRes              ;# 09
 proc runTest {tc} {                  ;# 10
      # run the testcase              ;# 11
      uplevel 1  setRes fail          ;# 12
 } ;# end of proc runTest             ;# 13
 runTest TC0001                       ;# 14

Output:

 4 {cmd {runTest TC0001}        line 14 type source file EX}
 3 {cmd {uplevel 1 setRes fail} line 12 type source file EX proc runTest}
 2 {cmd {setRes fail}           line  1 type eval}
 1 {cmd {info frame $level}     line  5 type source file EX proc setRes}

To see the connection between the new feature and error stack traces replace the info frame command on line 3 of setRes with "return -code error X". This will generate an error trace and with a bit of reformatting the relationship can be seen easily:

Stack trace:

	X
	    while executing
	"setRes fail"
	    ("uplevel" body line 1)
	    invoked from within
	"uplevel 1  setRes fail"
	    (procedure "runTest" line 3)
	    invoked from within
	"runTest TC0001"
	    (file "EX.tcl" line 14)

Reformat the trace, with all output from a single command on one line, and some noise removed.

	X
	"setRes fail" ("uplevel" body line 1)
	"uplevel 1  setRes fail" (procedure "runTest" line 3)
	"runTest TC0001" (file "EX.tcl" line 14)

Now revert the order of the lines, and add vertical alignment. Now have in essence the ouput of the unmodified example, with same difference in the line numbers. But that is only because the error stack trace always counts lines relative to the script, and doesn't attempt to determine an absolute location in a file.

	"runTest TC0001"         (file "EX.tcl" line 14)
	"uplevel 1  setRes fail" (procedure "runTest" line 3)
	"setRes fail"            ("uplevel" body line 1)
	X

Child Interpreters

The current implementation of info level only returns levels up to the top of the stack for the current interpreter. Such an approach puts limitations on what information can be retrieved, but allows for a certain level of "security" when running code in child interps, especially safe interps.

Given the security considerations of safe interps, and consistancy with regards to what information is returned across multiple circumstances, the stack trace returned will only return information up to the top level of the current interp, the same limit info level is bound by.

Discussion

A point noted by Lars Hellstrom is that this TIP makes the builtin source command special. It is not possible anymore to re-create source purely in Tcl, doing so will loose the name of the file. We are still tracking line numbers, but they are then relative to the start of the eval'd script, withut a connection to the file.

In a similar way, user-defined control structures lose the ability of the builtin if, etc. to track line numbers in a literal code block relative to the enclosing file. The user-defined commands are again reduced to providing line numbers relative to the start of the code block.

To bring user-defined commands back to the same level as the builtins we need:

  1. An extension of info frame which delivers the line information for all words of the command we are inside of. As each word can start on a different line (because of continuation lines and preceding multi-line literals).

    This information is actually already available internally. The current implementation needs it for the tracking into literals.

  2. Extensions to the uplevel and eval commands which allow a script to provide them with location information for the script they are evaluating.

For example:

 proc newsource {foo script} {
   ... preprocess whatever ...
   set loc [info frame data for word 2]
   uplevel -location $loc 1 $script
 }

Internally we also need quite a bit more flexible data structures. The type names for example are defined through a C enum, and can be checked quickly and efficiently. The file information, i.e. path, is stored and used dependent on that, only for type source.

To keep this TIP and its implementation reasonably small the actual detailed specification and implementation of such extensibility is defered to a future TIP.

Reference Implementation

An implementation patch is available on SourceForgehttp://sourceforge.net/support/tracker.php?aid=1571568 .

Copyright

This document has been placed in the public domain.

Please note that any correspondence to the author concerning this TIP is considered in the public domain unless otherwise specifically requested by the individual(s) authoring said correspondence. This is to allow information about the TIP to be placed in a public forum for discussion.

History