First-Class Traces

Some development environments provide a tracer, which allows you to observe the entering and exiting of function calls[1].

E.g. in Common Lisp you can do:

(defun flatten (list)
    ((null list) list)
    ((atom list) (list list))
    (t (append (flatten (first list))
               (flatten (rest list))))))
(trace flatten)
(flatten '(a (b c) d))

which could output

0 FLATTEN > ...
  >> LIST : (A (B C) D)
  1 FLATTEN > ...
    >> LIST : A
  1 FLATTEN < ...
    << VALUE-0 : (A)
  1 FLATTEN > ...
    >> LIST : ((B C) D)
    2 FLATTEN > ...
      >> LIST : (B C)
      3 FLATTEN > ...
        >> LIST : B
      3 FLATTEN < ...
        << VALUE-0 : (B)
      3 FLATTEN > ...
        >> LIST : (C)
        4 FLATTEN > ...
          >> LIST : C
        4 FLATTEN < ...
          << VALUE-0 : (C)
        4 FLATTEN > ...
          >> LIST : NIL
        4 FLATTEN < ...
          << VALUE-0 : NIL
      3 FLATTEN < ...
        << VALUE-0 : (C)
    2 FLATTEN < ...
      << VALUE-0 : (B C)
    2 FLATTEN > ...
      >> LIST : (D)
      3 FLATTEN > ...
        >> LIST : D
      3 FLATTEN < ...
        << VALUE-0 : (D)
      3 FLATTEN > ...
        >> LIST : NIL
      3 FLATTEN < ...
        << VALUE-0 : NIL
    2 FLATTEN < ...
      << VALUE-0 : (D)
  1 FLATTEN < ...
    << VALUE-0 : (B C D)
0 FLATTEN < ...
  << VALUE-0 : (A B C D)
(A B C D)

As you can see, the macro TRACE only enables printing the trace. Printing a trace without having a way to access it directly/programmatically seems rather premature to me. It is evident from the above trace that traces form a recursive tree-like data type, that however is not accessible to the programmer. The entering and the exiting values from a function belong to one node of the tree (even though they are printed far apart), and the subtraces are subtrees.

Thus, continuing the above example, such a `trace' data structure might roughly resemble

(FLATTEN (A (B C) D) (A B C D)
  (FLATTEN ((B C) D) (B C D)
    (FLATTEN (B C) (B C)
      (FLATTEN B (B))
      (FLATTEN (C) (C)
        (FLATTEN C (C))  ; X
        (FLATTEN NIL NIL)))
    (FLATTEN (D) (D)
      (FLATTEN D (D))
      (FLATTEN NIL NIL))))

Since I could not find anything about this, I was wondering the following: are there any languages (or easy ways of implementing it yourself) that can reify traces as data structures such as above that you can manipulate? In case you are wondering, there are potentially many reasons why this might be useful. E.g. one might like to compare two traces, or store a trace for later, or even consider the trace itself as a kind of higher-level program (although I have not investigated this idea in depth).

Note that I am not referring to `stack traces' (a.k.a. `backtraces'). The stack trace at the execution of location X above might be a structure resembling

  (FLATTEN ((B C) D)
    (FLATTEN (B C)
      (FLATTEN (C)
        (FLATTEN C)))))  ; X

Debuggers usually already provide many ways of manipulating the stack trace after e.g. a exception or breakpoint (although I do not know which ones are able to reify it...). Also, as I recall, one simple way of implementing continuations is copying the stack (which is part of the continuation) to the heap. So stack traces are not the issue here, nor are the `traces' as the word is sometimes used in theoretical computer science (sequences of successive states of some state transition system).

[1] It is actually an interesting question what it would mean to trace operators other than functions, but I did not go into that here.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Just hijack the function, like normal trace

CL-USER> (defvar *current-trace* nil)                                                                       

CL-USER> (defun my-trace (function-symbol)                                                                  
           (let ((function (symbol-function function-symbol)))                                              
             (setf (symbol-function function-symbol)                                                        
                   (lambda (&rest args)                                                                     
                     (multiple-value-bind (return-values trace)                                             
                         (let (*current-trace*)                                                             
                           (declare (special *current-trace*))                                              
                           (values (multiple-value-list (apply function args))                              
                                   (nreverse *current-trace*)))                                             
                       (push `(,function-symbol ,args ,return-values ,@trace)                               
                       (apply #'values return-values))))))

CL-USER> (defun a (a)                                                                                       
           (1+ a))

CL-USER> (defun test (a b)                                                                                  
           (values (a a)                                                                                    

CL-USER> (my-trace 'test)                                                                                   
CL-USER> (my-trace 'a)                                                                                      

CL-USER> (test 1 2)                                                                                         
CL-USER> *current-trace*                                                                                    
((TEST (1 2) (2 2) (A (1) (2))))

You'd want to save the symbol's original function somewhere, possibly in its properties if you were lazy. I'm sure it's equally hackable in smalltalk. Generally speaking, abusing dynamic state at function calls is easy in late binded languages. It's when one wants to play with the static environment that things get messy.

Hijacking the function

Thanks, but this does not work completely, because it is not allowed to redefine the meaning of the symbols in the common-lisp package. Also, it uses side-effects to aggregate the trace, which should be unneccessary, I think, because a trace is simply a function of its subtraces.
P.S. Is it me or does the code in you post make the screen very wide?

You usually can redefine the

You usually can redefine the bindings of symbols in the CL package, although the incantation tends to be implementation-dependant (EDIT: also, the effect is similarly undefined, but you'll find most implementations react rather sanely at high debug/low speed levels). Note that the side-effects include no sharing and is on a dynamic variable. It is no more side-effectful than a tail-recursive function transformed into iterative form is (and the conversion back would be similar, with an additional explicitly threaded argument).

PS, i copied directly from my term, and it seems that the spacing until the end of the lines got pasted too. Unfortunately, I can't seem to find it in the edit box.

Trivial for aspects

Trace reification sounds like a 20-line hack for just about any aspect-oriented language, if you're into that sort of thing. Certainly in Aspect-J what you describe would be pretty trivial to implement.

Debuggers usually already provide many ways of manipulating the stack trace after e.g. a exception or breakpoint (although I do not know which ones are able to reify it...)

Since JDK 1.4 (about four years, now), all Java environments provide reified stack traces. These are available at run-time, and not just in the debugger, but are read-only to prevent security spoofing.


Quoting from Pickaxe 1st Ed. (Since it's online and I'm too lazy to type it out from 2nd):


Establishes aProc as the handler for tracing, or disables tracing if the parameter is nil. aProc takes up to six parameters: an event name, a filename, a line number, an object id, a binding, and the name of a class. aProc is invoked whenever an event occurs. Events are: c-call (call a C-language routine), c-return (return from a C-language routine), call (call a Ruby method), class (start a class or module definition), end (finish a class or module definition), line (execute code on a new line), raise (raise an exception), and return (return from a Ruby method). Tracing is disabled within the context of aProc.

Fairly simple in Perl 5

Doing this in Perl 5 is fairly simple. You can easily redefine functions with symbol table manipulation. Using the built-in caller() function you can get information about how deep you are in the call chain, as well as what function called the current invocation. Tracking arguments is also trivial, as is maintaining Perl's context sensitive return values. It also doesn't suffer from the issue the Common LISP example above does because you can redefine built-in functions using the CORE:: namespace (it's tricky for sure, but possible).

traces in Haskell

The Hat tracer for Haskell gives you almost exactly what you want. Your original program is transformed into a program that computes its own trace as a data structure. (The data structure is stored in a file rather than memory, but that is mostly for efficiency, since it is so huge.) Within the trace, every reduction has a link from the redex to the reduct (result), and also backwards from the reduct to the redex (parent link). At the moment, Hat has specialised tools to view and explore the trace, but in theory you could suck it back in as a Haskell value if you wanted. (The major issue again is the size of the trace.) There is also current research on developing a relational query language over traces.

Unlike the other languages mentioned above (Java, Ruby, Perl), the Hat traces are not just stack back-traces, but truly complete records of everything the program ever did.