## A question on free variable capture.

When doing variable substitution in an application ((E1 E2)) it is fairly straight forward if E2 is a variable. When E2 is something more complex, like an abstraction, how do things work with regards to alpha conversion?

Do you first find all the free variables in E2, then rename those that exist free in E1? Or since E2 isn't a variable do not do an alpha conversion stage?

Anyone out there too familiar with lambda calculus, I'd appreciate a little assist.

-Chris

## Comment viewing options

### Don't worry, be happy

Do you first find all the free variables in E2, then rename those that exist free in E1? Or since E2 isn't a variable do not do an alpha conversion stage?

Unfortunately, you've asked a question that is quite complicated to answer (or quite simple, depending on your needs).

The simplest solution is to view free variables as a temporary concept useful only to bootstrap things, and to always use closed expressions, i.e. surround your expression with a lambda expression that introduces and binds the "free variable."

Somewhat more subtly, less formally and more pragmatically, you can just rig things depending on what you intend: are the two free variables intended to be the "same" or "different" values? If they are different just relabel them so they are distinct.

The upshot of alpha conversion is simply that the labels don't matter, the bindings do, and there are a number of formal tricks, such as de Bruijn indexes to formalize the same concepts without this problem.

If you just want to understand LC, don't worry about it too much, but if you want to do formal derivations and proofs you probably need to investigate some of the techniques for dealing with this.

### ... closed expressions

Take the example:

(\a.\b. b a) b

if I were to convert the free variable into a closed expression,
should my result look something like this:

(\a.\b. b a) (\b. b)
==> (\b. b (\b.b))

How does this avoid capturing b?

Edit: Oops, screwed up the substitution

### Close it on the outside

No. One way of thinking about a free variable is that it is just a bound variable in some bigger scope than the one you are currently working with.

(\a.\b. b a) b

You can turn it into:

\b.(\a.\b. b a) b

Then you can see how alpha-conversion can apply "legally":

\c.(\a.\b. b a) c

If the argument b was a more complex expression with free variables, repeat this until all free variables are bound. (This is the origin of the term "closure" used in FP; you are "closing off" all the free variables with bindings.)

### alpha conversion

Ahem .. alpha conversion doesn't touch free variables. It renames bound variables. To take cjh example:

(\a. \b. b a) b

is calculated by first alpha-converting the term to the equivalent:

(\x1. \x2. x1 x2) b

where x1, x2 are fresh variables, and then beta-reducing:

(\x2. b x2)

### alpha-rewrite algorithm

Does this simple algorithm seem correct then for finding a fresh name for a variable?

while new-term is bound in abstraction:
new-term = rename(new-term)

beta-reduce new-term for argument in body


where rename() just appends a ' to its argument.

### Good enough

Your algorithm is not clear enough for me to be sure what you mean in the details, so I can't really say, but I think you have the basic idea.

To be clear, the algorithm I'm proposing at a high level of abstraction is:

- make sure all your variables are bound, by adding closures as necessary
- make sure each bound variable has a distinct label from all others in the expression
- now beta reduce

### alpha conversion or substitution?

I don't really understand your original question, but maybe it will become clearer by going into a little more detail.

The first question to ask is: are you talking about alpha conversion (renaming a bound variable, without changing the behavior of a function) or variable substitution (replacing a free variable in an expression e by another expression e')? These are related, but different concepts.

Alpha-conversion can be defined as:

lambda x . e = lambda y . (e[x -> y])    where y not in fv(e)

Informally, what this says is that given a function (lambda x . e), we can rename the argument variable x to a new variable y, provided that (a) we consistently replace occurences of x in the body by y; and (b) we didn't already use y to mean something else in the body (if we did, we'd be shadowing or capturing y). Formally, this definition of alpha conversion depends on the definitions of two other things:

1. the set of free variables fv(e)
2. substitution e[x -> y]

A. We can define fv by cases as follows. Let x stand for any variable, and e, e1, and e2 stand for any lambda-calculus expressions.

1. fv(x) = x
2. fv(lambda x . e) = fv(e) \ { x }
3. fv(e1 e2) = fv(e1) U fv(e2)

In particular, (2) removes x from the set of free variables found in e, because it is bound by the lambda x .

B. Ok, next we can define substitution. I think this will address the confusion you had in your original post. I will present this in the general context of replacing free variable x by the expression e', since the notion of substitution is also useful for defining beta-reduction.

1. x[x -> e'] = e'
2. y[x -> e'] = y    where x != y
3. (lambda x . e)[x -> e'] = lambda x . e
4. (lambda y . e)[x -> e'] = lambda y . (e[x -> e'])    where x != y and y not in fv(e')
5. (e1 e2)[x -> e'] = (e1[x -> e'])(e2[x -> e'])

Explanations:

1. The first two cases serve as base cases: if you encounter x, replace it by e'

2. and if you encounter some other variable, don't change it.
3. In case (3) we don't change the expression because there are no free occurrences of x in (lambda x . e)—they're all binding or bound occurrences.
4. Note the side condition in case (4): we have to make sure the lambda y . doesn't accidentally capture a free variable y used in e'. This is because any free occurrences of y in e' refer to the value bound to y outside of the expression (lambda y . e). If the side condition is violated, you simply have to alpha-convert (lambda y . e), renaming y to some variable z that is not free in e'. If you want to avoid this sort of cascading alpha-conversion, what you can do is first gather up the set of all variables used in the original expression (not just free variables), and then pick a new variable that isn't in that set.
5. Finally, case (5) looks like what you mentioned in your original post. If you've gotten to the point that you're replacing the free variable x to the expression e' in the expression (e1 e2), all you have to do is independently recurse on the subexpressions e1 and e2. You don't need to worry about whether e2 is a variable or a lambda-abstraction or an application, because we've defined substitution for all of those cases.

### Some examples to clarify my point.

It's been a little over a week that I've been studying this topic and my understaing is very basic and cluttered since I've been piecing things together from disparate sources.

So I apologize that my questions are muddy, and I appreciate everyone who has the patience to deal with me as I try and clear things up!

I hope I can rephrase my original question now, so that it is a bit more clear.

Sameer, you were right to notice that I was slightly mixing up my topics. My question really does relate to both alpha and beta conversion. Specifically how they interact during reduction.

Let me see if I can explain.

Suppose you have a function:

\x. x y


In the simple case of beta reduction, if you were to take the following application:

(\x. x y) x


the reduction is straight forward:

(\x. x y) x              ; e1: \x y x  e2: x
==> (\x. x y) x'         ; alpha-convert e2 since x is bound in e1
==> x' y                 ; e1[x'/x]


Now let's add a new function into the mix:

\x. y x


If we take a new application:

(\x. x y) \x. y x


Is there a need to alpha convert the x, like so:

(\x. x y) \x. y x           ; e1: \x. y x        e2: \x. y x
==> (\x. x y) \x'. y x'     ; alpha-convert e2 since x is bound in e1


I'm going to stop there. When the next reduction step happens our reduced expression should be the following:

(\x'. y x') y


I become unsure of the next step when looking at the y variables. From what I have gathered from this discussion, they do not need alpha-conversion since they occur free. Is that correct?

### One little mistake

In the 'straight forward' reduction, you have to rename the bound x, not the free x. It should be (\x'. x' y) x.

In the following, you don't need to alpha convert the x in this case, but it does not give you wrong results either.

You're correct that the ys don't need alpha-conversion.

### beta-reduction

Beta-equivalence can be expressed by the following equation. Reading it left-to-right, this can be used as a definition of beta-reduction.

(lambda x . e) e' = e[x -> e']

Here e[x -> e'] is e, with free occurrences of x replaced by e', as defined in my previous post. Try applying the rules step-by-step on an example. Note that we don't have to do anything to e', because all of the free occurrences of x in e go away after performing substitution--they're replaced by e'. Thus in your example "(\x. x y) (\x. y x)", there's no need to perform alpha-conversion.

However, in the following example, we do need to perform alpha-conversion. Try it out and figure out where you need to do that. Remember that you can only alpha-convert bound variables.

(\x.\y. x y) (\x. y x)

More generally, note that the basic alpha, beta, and eta equivalences don't specify when or how you should apply them, simply how you can apply them. An evaluation strategy (such as either normal order or call by value) specifies a particular way to apply the rules.

### Follow-up

I've been pouring over this conversation, and various web resources now for the past week. And I think I finally have gotten my head around the concept of substitution.

I've built a lambda calculator that seems to work properly (alpha,beta and eta examples work. I've implemented AND, OR, NOT and IF, as well as FALSE, TRUE, ONE, TWO, THREE and PLUS).

I'll post it online sometime, when its a bit easier to use.

Just for completeness I've written down substitution as I understand it (and as I've implemented it), I'm including it here for comments and ridicule!

λ-calculus substitution (This damn concept took me forever to get right.)

	Substitution in λ-calculus is a rather involved process.
This is entirely because free variable can get captured during substitution.

Substitution is done recursively, and has the following rules:

Given application (λv.E M) we define E[M/v] substitute M for v in E as:

Variable Case
E ≡ v → E[M/v] → M
E is equal to v, substitute M for E
E ≡ x ∧ x ≠ v → E[M/v] → x
E is a variable other than v, no-change to E

Application Case
E ≡ (E1 E2) → E[M/v] → (E1[M/v] E2[M/v])
substitute recursively on E1 and E2

Abstraction Case
E ≡ λx. E1 ∧ v ∉ FV(E1) → E[M/v] → E
v is bound in E1, no-change to E

E ≡ λx. E1 ∧ x ≠ v ∧ x ∉ FV(M) → E[M/v] → λx. E1[M/v]
x is a variable other than v and x is not bound in M, substitute recursively on E1

E ≡ λx. E1 ∧ x ≠ v ∧ x ∈ FV(M) ∧ z ≠ x ∧ z ∉ FV(M E1) → E ↔α λz. (E1[z/x] → E1[M/v])
x is a variable other than v and x is not bound in M;
z is a variable other than x and z is not bound in M and E1, alpha convert x to z in E1 then substitute recursively in E1



The notation used above I've lifted from various formal descriptions I've encountered, but I use it with a heavy foreign accent. The comments hopefully will clear up my meaning.

### Simpler

You need only one rule for abstraction:

E = \x.E1 & z notin FV(E1,M,v) => E[M/v] = \z.E1[z/x][M/v]

[Edit: small correction]

### Neat

Thats the simplest way I've seen it described yet, danke.