Quotation and evaluation -- or, how pure is the pure lambda calculus?

By now theories of computation are a dime a dozen. What makes any such theory interesting, at this stage of the game, is the perspective such a theory offers on computation. One aspect of algebraic theories -- like the lambda or the pi-calculus -- that has long intrigued me is that they are typically predicated on a theory of names. They'll tell you what a computation is as long as you are willing to pony up some notion of a name. Moreover, this notion of name has got to come equipped with a notion of equality to effect things like substitution in lambda and pi or synchronization in pi. And, since having an infinite set of atomic entities over which you specify an equality leads to too much computational power, a notion of name, equipped with an equality, sneaks in a notion of computation -- the very thing the algebraic theory is attempting to give an account of. In this light, even the 'pure' lambda calculus doesn't seem so pure.

Of course, the process calculi suffer this critique more severely than lambda does because there are natural name-free accounts of lambda, e.g. SKI, while proposals for name-free accounts of pi are turgid at best and it is unclear whether they extend naturally to variants like ambients. Further, i would argue that SKI loses much of the flavor of lambda and, more importantly, doesn't set up the theory of computation to extend well to other programming notions, like reference and mutability, especially in higher-order situations. This may be a matter of taste more than anything else, but as i argued above, theories of computation are cheap at this point, and taste and sensibility seem to be the essential differentiating factors amongst them.

With this little preamble, let me give you, dear reader, a taste of a reflective account of computation that side-steps this little conundrum about names. The presentation below gives a reflective account of lambda, derived from a reflective account of pi published in a workshop at ETAPS in 2005. The reason i went back to treat lambda after giving an account of pi is that such an account for lambda turns out to be a wee bit trickier, because lambda is implicitly higher-order while plain vanilla pi is not. An account in which a lambda calculus emerges just out of term constructors, with a kind of strict alternation of name constructors and program constructors -- a feature whose rationale we will visit at the end of this little note -- turns out to be fairly tightly constrained, and therefore may be of some interest to this community.

So, without further ado, here's the syntax:

M,N ::= x        // terms are names
      | P        // or programs

P   ::= L        // programs are the ur-program, L
      | λx.M     // or abstractions
      | MN       // or applications
      | *x       // or unquoted or 'dropped' names

x   ::= ^P^      // names are quoted programs

Two things to note about this grammar immediately:

1. There's a production for names! Names are quoted programs.
2. There's a strict alternation between name constructors and program constructors, i.e. there are no terms of the form ^^...P...^^; rather, the quotes '^...^' must always surround a program, never a name.
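
To make the alternation concrete, here is one possible rendering of the syntax as Haskell datatypes. This is only a sketch of mine, not part of the calculus; the constructor names (TName, TProg, Lam, App, Drop, Quote) are illustrative:

-- terms, programs and names as mutually recursive datatypes
data Term = TName Name            -- M,N ::= x     terms are names
          | TProg Prog            --       | P     or programs
          deriving (Eq, Ord, Show)

data Prog = L                     -- P ::= L       the ur-program
          | Lam Name Term         --     | λx.M    abstraction
          | App Term Term         --     | MN      application
          | Drop Name             --     | *x      a dropped name
          deriving (Eq, Ord, Show)

newtype Name = Quote Prog         -- x ::= ^P^     names are quoted programs
          deriving (Eq, Ord, Show)

The strict alternation falls out of the types: Quote can only wrap a Prog, never another Name, so terms of the form ^^...P...^^ are simply unrepresentable.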

As usual, we calculate the free names of a term recursively, via

FN(L) = {}
FN(x) = {x}
FN(λx.M) = FN(M)\{x}
FN(MN) = FN(M)∪FN(N)
FN(*x) = {x}
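
For what it's worth, FN transcribes directly into the Haskell sketch above (this assumes those datatypes and Data.Set from the containers package; the function names are mine):

import           Data.Set (Set)
import qualified Data.Set as Set

freeNames :: Term -> Set Name
freeNames (TName x) = Set.singleton x                       -- FN(x) = {x}
freeNames (TProg p) = freeNamesP p

freeNamesP :: Prog -> Set Name
freeNamesP L         = Set.empty                            -- FN(L) = {}
freeNamesP (Lam x m) = Set.delete x (freeNames m)           -- FN(λx.M) = FN(M)\{x}
freeNamesP (App m n) = freeNames m `Set.union` freeNames n  -- FN(MN) = FN(M) ∪ FN(N)
freeNamesP (Drop x)  = Set.singleton x                      -- FN(*x) = {x}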

Next, we whack this set of terms down by the usual alpha-equivalence plus

^*x^ == x

Note, we could have achieved the same effect by moving *x from the production for P into the production for M,N; but, for uniformity (especially with the ETAPS presentation for a reflective pi) it was more natural to leave dequote (a.k.a. drop) among the program ctors -- because it is one.

Before giving the operational semantics, let me also point out the relationship between L -- another departure from the standard presentation of lambda -- and quotation. It turns out that to get names to emerge just out of quoted programs, you have to have a base, atomic program. In pi this is straightforward: it's the null process, 0. In lambda the choice of the base program turns out to be somewhat interesting; but, apart from its dynamics, you have to have one, and L is our base, atomic program. It's the first, or ur-, program. From the first program one can form the first name, ^L^. With this name one can form the first abstraction, λ^L^.L. With this one can form another name, ^λ^L^.L^, evidently distinct from ^L^, and you're off to the races. You've got -- at least -- the terms of a lambda calculus where the variables are a bit funny, but no funnier than de Bruijn numbers, to my taste, anyway.
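
Here is a small, hedged sketch of this bootstrapping, reusing the datatypes above (quote, n0 and n1 are my own illustrative names). The smart constructor also bakes in the equivalence ^*x^ == x, so that Quote (Drop x) is never actually built:

-- ^*x^ == x, enforced by construction
quote :: Prog -> Name
quote (Drop x) = x
quote p        = Quote p

-- the first few names, built out of nothing but the ur-program L
n0, n1 :: Name
n0 = quote L                                   -- ^L^
n1 = quote (Lam n0 (TProg L))                  -- ^λ^L^.L^, distinct from ^L^
-- ... and so on: each new abstraction yields a new, syntactically distinct name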

Equipped with a set of (equivalence classes of) terms over which to compute, let's write down the operational semantics. We have the usual beta reduction

(λx.M)N -> M[N/x]

but with a catch: M[N/x] denotes semantic substitution, which we give now.

x[N/x] := N
y[N/x] := y (y <> x)
(λx.M)[N/x] := λx.M
(λy.M)[N/x] := λz.(M[z/y][N/x]) (y <> x, z ∉ FN(M)∪FN(N)∪{x,y})
(M1 M2)[N/x] := (M1[N/x])(M2[N/x])

all bog standard, so far, but now we get to the fresh tofu

*x[N/x] := *x (N not a name, i.e. not of the form ^M^)
*x[^M^/x] := M
*y[N/x] := *y
L[N/x] := L

The upshot of these rules is that a dropped name is a place-holder to drop in a program frozen in a name. This is not terribly magical in lambda -- which is already higher-order -- but it is very powerful in pi and ultimately gives rise to a mechanism for interpreting the fresh name generator.
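
Transcribed into the Haskell sketch, semantic substitution might look as follows. This assumes the datatypes, the Data.Set import and freeNames given earlier; the fresh-name generator is one arbitrary choice among many -- since names are just quoted programs, it quotes ever-larger programs until it clears the avoid set.

-- subst m n x  computes  M[N/x]
subst :: Term -> Term -> Name -> Term
subst (TName y) n x
  | y == x          = n                      -- x[N/x] := N
  | otherwise       = TName y                -- y[N/x] := y            (y <> x)
subst (TProg p) n x = TProg (substP p n x)

substP :: Prog -> Term -> Name -> Prog
substP L _ _ = L                                          -- L[N/x] := L
substP (Lam y m) n x
  | y == x    = Lam y m                                   -- (λx.M)[N/x] := λx.M
  | otherwise = Lam z (subst (subst m (TName z) y) n x)   -- rename y to fresh z, then substitute
  where z = fresh (Set.unions [freeNames m, freeNames n, Set.fromList [x, y]])
substP (App m1 m2) n x = App (subst m1 n x) (subst m2 n x)
substP (Drop y) n x
  | y /= x    = Drop y                                    -- *y[N/x] := *y
  | otherwise = case n of
      TName (Quote m) -> m                                -- *x[^M^/x] := M
      _               -> Drop y                           -- *x[N/x]  := *x   (N not a name)

-- one way to pick z ∉ FN(M) ∪ FN(N) ∪ {x,y}: quote ever-larger programs
fresh :: Set Name -> Name
fresh avoid = head [ z | z <- candidates, z `Set.notMember` avoid ]
  where candidates = map Quote (iterate (\p -> Lam (Quote p) (TProg L)) L)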

Ok, so we've got beta-reduction sorted, but we also have other terms, in particular, L, that have to be given semantics. It turns out there are a lot of possibilities for the dynamics of L, but here is one that seems very natural: it's a lambda constructor. I.e.,

L^M^N -> λ^M^.N

Note that some care must be exercised here. If we were to have the rule

LMN -> λ^M^.N

then because names are terms -- lambda is implicitly higher-order -- you could form

L^M^N which would reduce to λ^^M^^.N

defeating the strict alternation of name constructors and program constructors.

Finally, we add a minimal context rule

M -> M' => MN -> M'N

which makes our calculus reasonably lazy.
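
Putting beta, the L rule and the single context rule together, a one-step reducer over the Haskell sketch might look like this (it assumes the datatypes and subst above; step and normalize are my names):

step :: Term -> Maybe Term
step (TProg (App (TProg (Lam x m)) n)) =
  Just (subst m n x)                                   -- (λx.M)N -> M[N/x]
step (TProg (App (TProg (App (TProg L) (TName x))) n)) =
  Just (TProg (Lam x n))                               -- L^M^N -> λ^M^.N
step (TProg (App m n)) =
  fmap (\m' -> TProg (App m' n)) (step m)              -- M -> M' => MN -> M'N
step _ = Nothing                                       -- everything else is stuck

-- iterate step until the term is stuck
normalize :: Term -> Term
normalize t = maybe t normalize (step t)

Note that, because the only context rule is the head rule, step never reduces under a λ or in argument position.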

So, this gives you a calculus into which you can embed the lambda calculus (built over the name set of your choice) by the obvious translation scheme. But, this calculus is pure. It doesn't demand that its user pony up a set of names. It's a complete theory in and of itself. A notion of name equality can be built, recursively, using the notions of computational equality and straightforward syntactic equality of the finite set of term constructors.

Another perspective on these same statements is that we have a 'machine code'. Expressions in this calculus are completely closed with no need for dereferencing of symbols. It is name-free, like SKI, and could be used to build a different sort of virtual machine that is in many ways like the categorical abstract machine, but still retains much of the traditional stack-machine shape. But there is another intriguing possibility. First, i leave it as an exercise to the reader to write down an encoding of this calculus into binary. But, here's an open question: is there an arithmetic formula that corresponds to the reduction rule? More formally, is there a binary encoding

B : M -> {0,1}*

and a function

R : {0,1}* x {0,1}* -> {0,1}*

s.t.

R(B(M),B(N)) = B(D) whenever MN ->* D and D doesn't reduce?

If such a function did exist it might be the basis for a really, really fast implementation of lambda.
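
As a small down payment on the exercise, here is one possible encoding B -- a simple prefix code over the constructors of the Haskell sketch above. It is only meant to show that B itself is easy to come by; the open question about an arithmetic R is untouched.

-- a prefix code: terms get a 1-bit tag, programs a 2-bit tag, names none
encodeT :: Term -> String
encodeT (TName x) = '0' : encodeN x
encodeT (TProg p) = '1' : encodeP p

encodeP :: Prog -> String
encodeP L         = "00"
encodeP (Lam x m) = "01" ++ encodeN x ++ encodeT m
encodeP (App m n) = "10" ++ encodeT m ++ encodeT n
encodeP (Drop x)  = "11" ++ encodeN x

encodeN :: Name -> String
encodeN (Quote p) = encodeP p        -- a name is just the code of its quoted program

Since every constructor is tagged unambiguously, decoding is unique; writing the decoder is a pleasant follow-on exercise.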

Finally, let me point out an intriguing connection with games semantics that lends some support to the alternation of name constructors and program constructors as a desideratum. i discovered this when thinking about adding quotation to set theory. The basic idea is that you can interpret program constructors as questions, name constructors as answers, and the role of player versus opponent as the dividing line between program and environment.

It's these connections and questions, apart from any real affection for 'purity', that make an account of this shape interesting to me and i hope to others as well.


A few questions

I'm having trouble understanding your system and the motivation behind it. To start off, perhaps a link to the workshop paper you mention would help the readers. Anyway, here are my questions and comments:

  1. I don't see what's so special about having a production for names in your term grammar. Any complete formalization of the λ-calculus with programs as ground terms in the meta-theory will include productions for names in its signature. Rather than using quoted programs, we could have used finite bit strings, and it's easy to equationally define == and <> on finite bit strings.
  2. I don't understand in what sense your system is reflective. As I understand it in the context of programming languages, reflection generally means an object program is able to access a representation of its state as present in the meta-level and modify its semantics. But your quoted terms are opaque: you can either pass around a constant quoted term, or turn it into a program by unquoting. You have no means to inspect quoted terms or construct new quoted terms. Doesn't this just amount to delayed evaluation? How is it any more powerful than normal order evaluation in the λ-calculus?

    While a quasi-quote mechanism as in Lisp is useful for meta-programming, the rules you've described are less powerful than that.

  3. This rule of substitution is confusing:

    *x[^M^/x] := M

    Since x is a quoted program, the above rule is equivalent to the following:
    *^P^[^M^/^P^] := M
    First of all, that means your substitution operator is not the same as α-renaming, since *^P^[^P^/^P^] := P, rather than *^P^.

    Secondly, the net effect is that your system is not confluent; it depends on a particular order of β-reduction. For example, consider the following:
    (λ y . (λ x . *x) y) z
    = (λ ^Q^ . (λ ^P^ . *^P^) ^Q^) ^R^
    = (λ ^Q^ . *^P^[^Q^/^P^]) ^R^
    = (λ ^Q^ . Q) ^R^
    = Q
    or alternatively,
    = (λ ^Q^ . (λ ^P^ . *^P^) ^Q^) ^R^
    = ((λ ^P^ . *^P^) ^Q^)[^R^/^Q^]
    = (λ ^P^ . *^P^) (^Q^[^R^/^Q^])
    = (λ ^P^ . *^P^) ^R^
    = *^P^[^R^/^P^]
    = R
    Since Q and R were arbitrary programs, this seems problematic. Is this the behavior you intended? If so, what is your assumed order of evaluation, and why?

  4. Is there any way you can construct a function in your system that has the same behavior as the L operator you've defined?

    L^M^N -> \^M^.N

    I can't think of a way you could define such a function; in fact, L doesn't act like a function at all, because it has to rebind occurrences of the name ^M^ within the body N. That essentially means α-equivalence does not hold in your system.

  5. How does the following rule "makes [y]our calculus reasonably lazy"?

    M -> M' => MN -> M'N

    All this rule says is you're allowed to simplify a function expression before applying it to its argument. But you already have to simplify the head expression until you get a λ-abstraction before you can β-reduce the function applied to its argument. Do you mean by this that you assume normal order evaluation?

  6. It's a complete theory in and of itself.

    Not quite... a theory is always interpreted in terms of some meta-theory; at most, what you might claim is that you are asking less of the meta-theory.

  7. I don't fully understand the following statement:

    A notion of name equality can be built, recursively, using the notions of computational equality and straightforward syntactic equality of the finite set of term constructors.

    The second part—equality defined recursively in terms of the term constructors—is fine. But are you saying you're also defining name equality in terms of computational equality, i.e., if programs P and Q both reduce to program R, then ^P^ == ^R^ == ^Q^? Wouldn't that make a single step of your reduction procedure undecidable?

  8. I don't understand the significance of the possible "arithmetic formula" you mention:

    But, here's an open question: is there an arithmetic formula that corresponds to the reduction rule? ... If such a function did exist it might be the basis for a really, really fast implementation of lambda.

    Why would expressing β-reduction in terms of arithmetic on bit strings necessarily, or even likely, be particularly faster than standard rewriting? Without further clarification, this just seems like mystical speculation.

  9. I'm not familiar with games semantics. Could you provide us with relevant background articles that help explain what you mean by the following?

    The basic idea is that you can interpret program constructors as questions, name constructors as answers, and the role of player versus opponent as the dividing line between program and environment.

    Perhaps this would be clearer once formalized in the appropriate framework.

Hmm, this ended up much longer than I'd anticipated. I'm sorry if it seems like a rant; I'm used to comprehensive reviews. I'm glad that you're thinking about these issues, and I hope this serves as useful feedback.

Great feedback

I'm having trouble understanding your system and the motivation behind it. To start off, perhaps a link to the workshop paper you mention would help the readers.

Here you go: L. Gregory Meredith, Matthias Radestock: A Reflective Higher-order Calculus. Electr. Notes Theor. Comput. Sci. 141(5): 49-67 (2005)

Your evaluation order question is great. There are two errors in your first evaluation.

(λ y . (λ x . *x) y) z
= (λ ^Q^ . (λ ^P^ . *^P^) ^Q^) ^R^
= (λ ^Q^ . *^P^[^Q^/^P^]) ^R^
= (λ ^Q^ . Q) ^R^
= Q

Unless you are using '=' for reduction these don't hold. i'm going to assume you meant

(λ y . (λ x . *x) y) z
= (λ ^Q^ . (λ ^P^ . *^P^) ^Q^) ^R^
-> (λ ^Q^ . *^P^[^Q^/^P^]) ^R^
-> (λ ^Q^ . Q) ^R^
-> Q

Unfortunately, as you may check, there is no context rule of the form

M -> M' => \x.M -> \x.M'

So, your first evaluation step is inadmissible. Only the second evaluation calculation actually obeys the rules of the system.

Is there any way you can construct a function in your system that has the same behavior as the L operator you've defined?

L^M^N -> \^M^.N

I can't think of a way you could define such a function; in fact, L doesn't act like a function at all, because it has to rebind occurrences of the name ^M^ within the body N. That essentially means α-equivalence does not hold in your system.

Could you help me understand what you mean by 'rebind'? If N has ^M^ as a free name then L^M^N reduces to a term that binds ^M^ in N. i don't know what rebinding is in this context. Also could you help me understand the violation of alpha-equivalence that follows from this reduction? i think you would have to produce terms which violate reflexivity or symmetry or transitivity and i don't see any. But, maybe i'm missing something.

Finally, you're absolutely right that there doesn't seem to be a way to define a function giving the L dynamics! That's the reason it is included as the "get off the ground" dynamics. You need some base program, call it L, to get the quoting mechanism to work out. So, it seems natural to explore a dynamics for L that does not embed in the system itself.

How does the following rule "makes [y]our calculus reasonably lazy"?

M -> M' => MN -> M'N

All this rule says is you're allowed to simplify a function expression before applying it to its argument. But you already have to simplify the head expression until you get a λ-abstraction before you can β-reduce the function applied to its argument. Do you mean by this that you assume normal order evaluation?

See the response i made to your previous evaluation. The presentation here is very conservative about what may evaluate and when. This rule is what allows the evaluation in head position. Without it a lot more terms are stuck.


It's a complete theory in and of itself.

Not quite... a theory is always interpreted in terms of some meta-theory; at most, what you might claim is that you are asking less of the meta-theory.

Actually, one expects that any meta theory will be no more expressive than a functionally complete theory. In this sense the meta theory will be isomorphic to the theory itself. Thus, by closing the loophole around names we have made the reflective tower.

I don't fully understand the following statement:

A notion of name equality can be built, recursively, using the notions of computational equality and straightforward syntactic equality of the finite set of term constructors.

The second part—equality defined recursively in terms of the term constructors—is fine. But are you saying you're also defining name equality in terms of computational equality, i.e., if programs P and Q both reduce to program R, then ^P^ == ^R^ == ^Q^? Wouldn't that make a single step of your reduction procedure undecidable?

We might use applicative bisimulation or contextual equivalence or some such notion of program equivalence. Given such a notion of program equivalence, ~, we may use P ~ Q => ^P^ == ^Q^. The way to get around decidability issues in reduction is to use the alternation of name constructor and program constructor. Quote depth in this system is always finite. Further, a program, P, can never contain a name that is syntactically equal to ^P^. So, we can take equivalence up to applicative probes or contexts that are bounded in the quote depth of the names they use -- in particular, bounded above by the max quote depth of the programs we are comparing.
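
To make the bound concrete, quote depth is a simple recursive measure -- here sketched in terms of the Haskell rendering of the grammar given earlier (the function names are mine):

depthT :: Term -> Int
depthT (TName x) = depthN x
depthT (TProg p) = depthP p

depthP :: Prog -> Int
depthP L         = 0
depthP (Lam x m) = max (depthN x) (depthT m)
depthP (App m n) = max (depthT m) (depthT n)
depthP (Drop x)  = depthN x

depthN :: Name -> Int
depthN (Quote p) = 1 + depthP p        -- each quotation adds one level

The proposal above amounts to comparing P and Q only with probes/contexts whose names have depth at most max (depthP P) (depthP Q).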

I don't understand the significance of the possible "arithmetic formula" you mention:

But, here's an open question: is there an arithmetic formula that corresponds to the reduction rule? ... If such a function did exist it might be the basis for a really, really fast implementation of lambda.

Why would expressing β-reduction in terms of arithmetic on bit strings necessarily, or even likely, be particularly faster than standard rewriting? Without further clarification, this just seems like mystical speculation.

First, note that i said it might be the basis for a fast implementation of lambda. The primary intuition behind the suggestion is that we have very good specialized hardware support for binary arithmetic. So, if the complexity characteristics of B and R are within certain bounds, you may get a speed-up relative to existing hardware/software implementations of lambda.

I'm not familiar with games semantics. Could you provide us with relevant background articles that help explain what you mean by the following?

The basic idea is that you can interpret program constructors as questions, name constructors as answers, and the role of player versus opponent as the dividing line between program and environment.

Perhaps this would be clearer once formalized in the appropriate framework.

Google is your friend! But, anything by Abramsky is good. Hyland and Ong is good. McCusker's dissertation is good.

Linking would be appreciated.

Hello Lucius,

Welcome to LtU.

Google is your friend! But, anything by Abramsky is good. Hyland and Ong is good. McCusker's dissertation is good.

While I for one very much appreciate your contribution I would however respectfully ask that you please provide more complete references, and links where possible (e.g. through http://citeseer.ist.psu.edu/). Googling is not so easy when the subject area is not immediately obvious. As the original poster you are best equipped to provide us with background materials to support your claims and for us to follow your line of reasoning.

Thank you again for your contribution.

Links to the games literature

Here you go:

http://citeseer.ist.psu.edu/abramsky95full.html

http://portal.acm.org/citation.cfm?coll=GUIDE&dl=GUIDE&id=362313

http://portal.acm.org/citation.cfm?id=224189&dl=ACM&coll=&CFID=15151515&CFTOKEN=6184618

Clarifications

Unfortunately, as you may check, there is no context rule of the form
M -> M' => \x.M -> \x.M'
So, your first evaluation step is inadmissible. Only the second evaluation calculation actually obeys the rules of the system.

Ok, so you're only performing rewrites at the top level, possibly recursing inward via the context rules. I was looking at your definition as an equational logic specification, which has no restrictions on context. Now the question is, can you redefine your system so that any order of evaluation is admissible? This seems to be more in keeping with the λ-calculus and functional programming.

Could you help me understand what you mean by 'rebind'? If N has ^M^ as a free name then L^M^N reduces to a term that binds ^M^ in N.

Consider the following expression:
(λ ^M^ . L ^M^ ^M^) ^P^
If we allowed reduction within the body of the λ-abstraction, you would be rebinding the second occurrence of ^M^. As you have explained, this is not allowed in your system. I'm still uncertain about allowing free names to become bound, since one of the uses for free names in the λ-calculus is to allow us to simplify subterms without concern for their context. You've removed this feature by only allowing simplification at the top level, and it's not clear that this is a good trade-off.

Finally, you're absolutely right that there doesn't seem to be a way to define a function giving the L dynamics! That's the reason it is included as the "get off the ground" dynamics. You need some base program, call it L, to get the quoting mechanism to work out.

If you just want to build names, L need not be a program. You could create a separate subsort for names, with the following constructors:
v : -> Name
_0 : Name -> Name (the _ indicates 0 takes an argument on the left)
_1 : Name -> Name
The problem I have is that L isn't a function at all in the traditional sense, because it can do things that functions which you can define cannot do. Yet you can pass L around as if it were a function. Thus either L doesn't do anything useful (after all, λ_._ already exists), or it does something useful (meta-programming), but we can't create variations on that useful feature in user-defined programs. If N were a quoted program and M a quoted name, and if we had quasi-quote primitives, then something like L could be defined as a function.

Actually, one expects that any meta theory will be no more expressive that a functionally complete theory.

That depends on what you mean by expressive. By including more "built-in" constructs in the meta-theory (such as a countable set of names or a notion of substitution on terms), you can express a particular theory more succinctly. Conversely, if you don't use those constructs, they need not be defined in the meta-theory. You could think of it as the complexity of the interpreter.

We might use applicative bisimulation or contextual equivalence or some such notion of program equivalence.

True, but if you want to automate such a decision procedure, what will you do when comparing two names which you cannot prove are equivalent or different? Does program reduction fail, or are such names considered different by default? Moreover, what useful properties does this buy you, as compared to syntactic name equality? I think a motivating example would help.

The way to get around decideability issues in reduction is to use the alternation of name constructor and program constructor.

I don't understand this. The standard λ-calculus with normal order evaluation is contained within your language, correct? So you could be forced to compare two syntactically different names, which are each quotations of standard λ-calculus terms (i.e., arbitrary programs), while still adhering to your alternation constraints on quotation.

In summary, the main problems I see with your system are (a) local simplification may have side effects (you've masked the side effects by requiring reductions to be global), and (b) using quoted programs as names does not seem to actually simplify things as you claim, and rather complicates the reduction procedure (depending on how you define name comparison). I can see what your motivation is in defining names as quoted programs, coming from the π-calculus, where only names are passed along channels; but the λ-calculus works differently: we already pass terms, so there is no need to disguise them as names. As another person mentioned, names in the λ-calculus can be seen as representations of graph edges, but terms (values) are "passed along" those edges.

Thanks for continuing to provide quality feedback!

If you just want to build names, L need not be a program. You could create a separate subsort for names, with the following constructors:
v : -> Name
_0 : Name -> Name (the _ indicates 0 takes an argument on the left)
_1 : Name -> Name
The problem I have is that L isn't a function at all in the traditional sense, because it can do things that functions which you can define cannot do. Yet you can pass L around as if it were a function. Thus either L doesn't do anything useful (after all, λ_._ already exists), or it does something useful (meta-programming), but we can't create variations on that useful feature in user-defined programs. If N were a quoted program and M a quoted name, and if we had quasi-quote primitives, then something like L could be defined as a function.

This comment illustrates, i believe, the subtlety of the construction and the philosophical point at stake. First of all, introducing the sort Name reopens the hole in the theory. We wish to have a closed theory. That is, we cannot have the theory parameterized by anything, but especially not by something that sneaks in a notion of computation.

Next, note that there are two independent functions of L. One is to provide a bottom for the quoting recursion in the grammar. (Actually, your signature v : -> Name also indicates the need for this: you're using a unit term, which must be added explicitly to any calculus that wants to have a well-founded quoting relation. i have also looked into non-well-founded quoting relations, but that is a very subtle theory indeed.) The other is really just a requirement of the completeness of the reduction relation. We are forced to give L some dynamics. We could, for example, make it inert. The point of the lambda-constructor dynamics was just to provide an example of interesting and potentially useful dynamics that is not embeddable in the system itself.

I don't understand this. The standard λ-calculus with normal order evaluation is contained within your language, correct? So you could be forced to compare two syntactically different names, which are each quotations of standard λ-calculus terms (i.e., arbitrary programs), while still adhering to your alternation constraints on quotation.

The point is that while general applicative bisimulation may not terminate, we provide inductive bounds (see my previous response) so that it will terminate. i take your point, though, and will present details in a subsequent posting. In the meantime, i simply point out that this is a general feature of this style of program equivalence. Such equivalences are effective, and dials on what constitutes an observation, as well as up-to techniques, can considerably change the complexity of the procedure.

In summary, the main problems I see with your system are (a) local simplification may have side effects (you've masked the side effects by requiring reductions to be global), and (b) using quoted programs as names does not seem to actually simplify things as you claim, and rather complicates the reduction procedure (depending on how you define name comparison). I can see what your motivation is in defining names as quoted programs, coming from the π-calculus, where only names are passed along channels; but the λ-calculus works differently: we already pass terms, so there is no need to disguise them as names. As another person mentioned, names in the λ-calculus can be seen as representations of graph edges, but terms (values) are "passed along" those edges.

As i mentioned in my original posting, SKI already provides a name-free account of lambda. The real question i am investigating is: what is the "vector" along which program design, analysis and evaluation travel in different name-free settings? To illustrate this idea, consider: is it likely that monadic-style structuring of computation would have been discovered if the dialogue about the structure of computation had been couched entirely in the SKI setting? i doubt it. i submit that monadic style grew out of the tension between the cleanliness of "pure" functional programming and the obvious, practical execution advantages of side-effecting programs. There was a natural vector or dialectic informing that development of our understanding of computation. Likewise, there is a certain natural arc to be found in the optimal reduction and graph-based techniques for lambda. These formulations shape the way one imagines computation. The quotation mechanism is another such structuring technique for framing our notions of computation.

To my way of thinking it makes explicit certain points that have been implicit in many formulations of computation, going all the way back to set theory. For example, it causes reference to be accounted for at the outset. More intriguingly, it identifies the possibility that the signifier contains all the information necessary to obtain the signified. This sort of locality is interesting if one is interested in physical realizations of computation and how physical systems embed notions of computation.

As for the trade-offs in complexity, i am in complete agreement with you regarding the evident introduction of complexity. i think that reference -- even of the form found in lambda -- is a very complex beast, and any careful account reveals even more structure than initially meets the eye. i have been very conservative in the introduction of features and made sure to document all the design choices i could see (hence the presentation with the context rule as it is). The system is minimal relative to the design desiderata and i find it instructive that there is so much complexity in a minimal system.

To attempt to bring this point home, let me summarize: if you want a name-free account that more or less preserves the term structure, has a well-founded alternating quote relation and a dynamics for the base program that does not embed in the system, then this system is minimal. Up to variants on the dynamics of L, there isn't a system with less complexity that does the job. i find this eye-opening.

i observe a tension between the quoting apparatus and lambda. Both provide a reference/dereference mechanism, but the former considerably weakens scoping requirements for manipulating reference/dereference. i find this interesting and hope that members of the community might as well.

Name-free graph-based representations

I tend to think of lambda-terms themselves as graph-like structures, where a reference to a variable is just an edge of the graph pointing back to the abstraction node.

In order to write down a serialized representation of that graph on paper, it helps to label the abstraction nodes with variable names, but I don't think they're at all essential to the lambda calculus.

I'm not so familiar with the Pi calculus, but perhaps a similar approach could be taken there?

Compositional account?

Can you give a compositional translation from lambda terms to graphs, or give a reference for one?

Lambda terms as graphs

The graph representation of lambda terms dates back to Bourbaki. For a modern overview, see Ariola's "Lambda calculus plus letrec": http://www.cs.uoregon.edu/~ariola/cycles.html.

Sharing graphs and bigraphs

I don't know what criteria of compositionality you have in mind, but Lamping's infamous sharing graphs provide a model of beta evaluation for the lambda calculus that is Levy-optimal, and also embeds various notions of continuation. I'll give refs later, if you want them.

Milner has been using bigraphs as a model of concurrency: I don't know much of the details, and can't comment on the relationship to the pi-calculus.