Advantages of pointfree?

In the famous paper "Can Programming Be Liberated from the von Neumann Style?", Backus describes the programming language FP. It has many properties that are clearly advantageous:

  • Programs are written in terms of a well-understood set of combining forms that preserve termination properties
  • Data structures are persistent and IO is well-structured
  • Functions are not first class objects making certain forms of reasoning easier
  • Programs are written primarily in terms of function composition, not application, which makes manipulation of programs easier (or at least, it does in my estimation)

There is one more prominent property: FP is a pointfree language. My question is, how advantageous is this really? Obviously it does make manipulation of programs easier in some sense, but its benefits seem to pale in comparison to the benefits derived from other properties (such as its first order nature).

There is also a big limitation that results from not having any sort of "substitution form": It is not possible to "lift" objects into functions. For example, if one wants to write a function that takes some value X and a list and multiplies every element in that list by X, there's no way to "lift" that X into the 'map' combining form. One must instead use 'distr' to transform the list into a list of pairs where one element of each pair is 'X', after which the multiplication function can be mapped over the list. Besides being a bit opaque, this is needlessly inefficient.

Of course, one could always write a new combining form to handle this particular situation (if the language were to permit it), but there's no way to write it in terms of the existing 'map' form without substitution or higher order functions (which would permit partial application); it must be written from scratch, and as such, can't trivially inherit any properties of 'map'. In some sense, the lack of substitution seems to work against the idea of programming with a well-understood set of combining forms.

This is hardly an attack on the idea of pointfree languages. I'm well-versed in concatenative languages (which typically do offer variables as an escape hatch), I've written non-trivial programs in FP, and I utilize the pointfree style in Haskell heavily. I do very much enjoy pointfree programming; I simply am having trouble justifying it in an objective manner.

All thoughts and pointers to relevant literature will be much appreciated. (To be clear, I'm looking for materials that directly address the benefits of doing away with variables, not the other properties of FP or SQUIGOL-like languages such as programming with a rich set of combining forms or morphisms.) More specifically, if anyone were able to comment on what one would lose if a substitution form were added to FP, you'd have my gratitude.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Obligatory Comment

EWD692 ;-)

[On your example, I would think that optimizing compilers would do that transform automatically for you. Although ML experience says: just don't optimize - the few program fragments which need optimization will typically be optimized by the programmer.]

Dijkstra doesn't get it

Always wanted to write that subject...

In all seriousness, I'm not sure he is understanding a particular element of FP. He makes the claim that arguments are distinguished by ordinals instead of names, and then sarcastically questions why the ALGOL committee didn't require formal parameters be named 'par1', 'par2', etc.

In FP, the term '1' denotes a function that returns the head of the list passed to it. The term '2' is equivalent to the head of the tail, and so on. It's not the case that they're formal parameters as he seems to imply; they're simply normal functions.

I agree

I think a lot of his comments were ill-placed given what we know now. A substantial amount of his critique must have derived from trying to push Algol-like languages instead of math-like languages. Some of it sticks (why introduce all that Greekisms, it looks interesting, but here I agree, it is just a modern form of shamanism), some doesn't, (being able to reason by equality is way more intuitive than being able to reason by pre-/postcondition).

But he was way too smart not to understand. Maybe he just didn't like Backus?

[But the conclusion is right: unjustifiable aggressive overselling.]

[I think his comment on using '1' and '2' for functions/arguments has more to do with tidy programming where argument names should somehow relate to whatever the arguments denote (graph, db_connection, ...).]

Dijkstra doesn't get it (2)

In the point with the (par)1, (par)2, etc Dijkstra would have to be right, because natural numbers are little self-documenting. Backus generally did not use parameter names because he favored numbers as selectors. Variables contradicted the referential transparency and so he renounced it. But nothing speaks against instance variables (map data type / associative data element); these do not contradict referential transparency and can therefore be handled as local variables, except for slower processing. Immutable data types are generally a bit lame. Of course, variables are also a little bit fancier than #namex, #namey, etc, but that must serve as a distinction to the global function names.

// Example, "FP" with instance variables:
((#a+#b*sin°'2*pi*#c)<-a;b;c;):(100;50;0.3;)

Uneasy

I confess to be slightly uneasy about the application of the term "point-free" to FP languages, since this seems to mean that definitions are given in eta-contracted form, and don't make use of lambdas, and this seems to have only the most superficial relation to the point-free topologies that the term is supposed to be derived from. So I always think I am missing something...

But when I look at Backus' article, he gives such definitions in FP of the identity combinator as: id:x=x, which definitely doesn't look point-free to me.

My feeling that I am missing something only increases.

You're not

I don't believe you're missing anything. The programs themselves do not contain variables, but they certainly show up when reasoning about said programs.

Point-free: does it really mean stateless?

As I've been thinking about this exchange between Backus and Dijkstra, it has dawned on me that part of the problem we have understanding them is a loss of historical context.

We take it for granted that you can have "variables" that aren't really variable: "functional", "referentially-transparent", "stateless" variables. I suspect this idea hadn't taken hold in 1977. (To be fair, it still hasn't taken hold in most of the programming world in 2009... ;-) )

If Backus found it hard to imagine having a labeled "variable" that someone wouldn't "accidentally" assign a new, changed value to, you can see that he might feel that the only way to get functional programming would be to go "point-free", i.e. if there aren't any explicit label assignments, you can't re-assign them. In this reading, the virtues of "point-free" being expounded are really about the same as the virtues of "functional programming" as we currently understand it.

If this is true, Backus sounds odd because he is at the start of a line of development: this article caused others to think about the properties he was interested in, and find better ways to express and think about them. Since we are the inheritors of these later developments, we find it hard to understand the mind-set that produced the ideas and terminology that started the chain.

Conserve the points!

That's an interesting and very plausible hypothesis, Marc. I can understand Charles' complaint about terminology, although I don't truly appreciate it, as I am not familiar with the pre-existing topological definition.

Myself, I prefer a relatively point-free style. However, I also think that the difference is more stylistic than substantive, and taking the point-free approach too far is indeed "pointless." Yes, that is an overused pun, but it has a large grain of of truth behind it, as a completely "point-free" program is often obfuscated.

I find the clearest programs are a happy compromise between composition and explicitly named bindings. There has been much ado made over the fact that the point-free approach is supposed to make it easier to algebraically transform a given program, but I find I can usually perform (informal) algebraic transformations pretty easily either way.

My preference is in part due to the fact I hate making up and learning names. If a name must be given, it should be particularly suggestive, or at least idiomatic. One of the minor annoyances I have with Haskell is that there isn't a "case-lambda" analogue for anonymous functions, so you have to either introduce a name for the function, or use a (\x -> case x of {...}) construct.

A very loose rule of thumb I have is if a bound variable is used only once, then it perhaps it shouldn't be given a name at all. If I want to refer to something more than once, I tend to prefer to give it an explicit name. In particular, I don't like relegating complicated, "one-off" variable-shuffling to combinators. Above all, I prefer whatever I think is the clearest, which I admit is a very subjective criterion.

While I rather like Leslie Lamport's How to Write a Proof, I have a few complaints. For example, the "informal" English sketch tends to mirror the formal structure much too closely. In Figure 5, for example, I think it would have been much more clear to write "Assume \sqrt{2} \in Q" instead of "Assume r^2 = 2, r \in Q", however this stilted style of "informal" exposition permeates mathematics.

Not true

Backus was well aware that variables didn't have to be mutable. His FL language was released in 1989, and he certainly would've been familiar with ML-like languages by then. He very explicitly tried to differentiate between his "function level" approach and languages centered around lambda abstractions. He argued that the restrictions of the former enabled one to reason more easily about programs.

Search the paper linked to in my initial post from 1977 for the word "lambda" (which appears 27 times) and read what he had to say. I think you'll agree with my assessment.

A matter of interpretation

Search the paper linked to in my initial post from 1977 for the word "lambda" (which appears 27 times) and read what he had to say

I did just that, but it hasn't changed my sense that some concern that he hasn't quite managed to get out in the open is at the root of his qualms, and that this implicit problem is in fact the statefulness inherent in the von Neumann architecture.

His main purported reason for rejecting lambda expressions centers around the "complexity" of substitution rules that attend variables. At one point he explicitly lists:

"Each of
these requires a complex mechanism to be built into the
framework so that variables, subscripted variables,
pointers, file names, procedure names, call-by-value formal
parameters, call-by-name formal parameters, and so
on, can all be properly interpreted."

This grouping together of implicitly stateful and potentially stateless entities suggests that he isn't consciously aware of statefulness dimension of these things. To me, that makes it an odder coincidence that the "point-free" approach just happens to force all these into their stateless equivalents.

He also keeps relating the issues he is raising to the von Neumann architecture. He explicitly mentions the Funarg problem, a problem of implicit state that goes away when you introduce proper stateless closures.

It is quite possible that Backus had been exposed to the notion of stateless variables, but if so he is not explicitly weighting it here as a possible solution to his qualms. I think this is because he is is "thinking close to the machine", and the machine he has in mind is implicitly stateful.

From the point of view of the history of ideas, this is another interesting shift in mentality. Nowadays we take for granted the idea of a "virtual machine", that you might use to logically define the environment of a PL, independent of its hardware architecture.

But many of us will remember that as recently as the early 90s it was still quite common to think to debug programs by examining the values in stack, machine registers, etc. (Also, remember not-so-long-ago threads on LtU where FP newbies expressed incredulity that stateless languages could be implemented on stateful machines.)

Backus shows definite signs of not having made the shift in mental model between the logical and implementational aspects of his proposals, and again, I think this makes it harder for us to understand his concerns on the other side of that paradigmatic divide.

So let me turn this around, John. What aspect of Backus' concerns about "algebraic reasoning" are not addressed by a stateless language with explicit labeling?

Backus shows definite signs

Backus shows definite signs of not having made the shift in mental model between the logical and implementational aspects of his proposals

I simply can't agree that Backus definitely shows any such signs. You may want to watch Backus's explanation of FL (if you haven't already). In the video, he directly compares the function level approach with "the style of programming used in Lisp and in most functional languages" (4:07).

Ultimately, I think this issue is unimportant as I'm not sure what can be gained from diving into the psychology of the situation. I don't believe we have a need to guess what sort of "variables" he was talking about; he was very clear about what he was arguing for.

So let me turn this around, John. What aspect of Backus' concerns about "algebraic reasoning" are not addressed by a stateless language with explicit labeling?

Backus's main concern, it seems to me, was that the power of lambda abstractions as they commonly appear in functional languages is simply too great. Instead of being required to build up a program in a hierarchical manner, it's too easy to just write something from scratch. In the worst case, you many never even discover the patterns underlying your program at all.

I think there's strong evidence for this claim if you look at the concatenative programming languages and compare them to the situation in ML or Haskell. In concatenative languages, even the simplest of patterns are classified. For example, there is a family of combinators for applying multiple functions to a single element, another family for applying a list of functions to their corresponding inputs on a stack, et cetera. You don't get the same sort of fine granularity in your abstractions in other languages. (To be fair, Haskell does have the concatenative languages beat on larger abstractions at the moment, but this mostly due to Haskell's type system enabling them and large differences in the communities involved.)

To make this a bit more concrete, I'll give an example of a program. Say you want to write a function that, given two values, returns their sum and their product. In Haskell, you'd probably do something like this:

\x y -> (x + y, x * y)

In FL, you'd do the following:

[+, -]

Now obviously this is a bit of an unfair example, but I do think it demonstrates my point. Since FL does not offer variables in the sense that Haskell does, you have no choice but to write the program as above. In the Haskell version, you could write some construction combinator that, given two functions, applied them both to some input value. In practice though, you don't; or, at least, I've never seen it done. Instead, you use a lambda to implement everything from scratch and do not make the pattern of construction explicit.

(I should also note that, even if you wanted to make such patterns explicit in Haskell, things get pretty painful. First, the syntax doesn't give you a way to write with nearly the concision you can in FL. Second, and more importantly, the curried functions in Haskell get in the way; pointfree programming is much more comfortable when arguments are tupled.)

Essentially, I think Backus's thinking was that the lack of variables would force a hierarchal construction of programs. Such a construction would give enough structure to allow powerful -- and most critically for Backus, very simple -- reasoning about programs in a way you wouldn't otherwise have.

A little from column A, a little from column B?

It is a rather enjoyable video with an interesting historical context, thanks for posting it!

I'd say your speculations are also rather plausible, however, I would point out the motivations Backus gives for FL around 8:30 into the video. None of them really specifically apply to either functional languages or imperative languages, except one, namely his comment that "while" doesn't have nice algebraic properties, which would support Marc's thesis. But I also agree with you that he was trying to get at something deeper, whether or not he was actually onto something I'm a little unsure.

After all, almost every "functional" programming language at that point in time was not purely functional. Unfortunately we would unlikely to be able to definitively answer these questions, unless somebody who knew Backus personally can elaborate.

Backus's influence in the functional programming arena is undeniable. I find myself using point-free combinators from Control.Arrow more and more these days.

Understanding Backus's reasons for avoiding variables

Unfortunately we would unlikely to be able to definitively answer these questions, unless somebody who knew Backus personally can elaborate.

I worked with John on functional programming for several years starting in 1974. We were definitely familiar with stateless functional languages, e.g., ISWIM (Landin). John's languages were stateless at the time I worked with him, although I believe I influenced him to think about modeling systems with state. I also suggested allowing lambda variables when defining a new combining form such as "Apply to all", but he resisted. I was never able to fully understand Backus's reasons for avoiding all variables. I think John Rowan has come closest to understanding Backus's point of view.

Stateless?

Point-free must imply stateless, otherwise, probably, equational laws are broken. I assume a pointfree monadic variant must exist, it should be possible to handle state.

I think you should read Dijkstra's comment in another light: He just found it a badly written paper. Too many badly presented abstract ideas, too often repeated, with too many unjustified claims. It just didn't compare very well with a good article exhibiting a nice algorithm with concrete proofs of properties (like Knuth's articles).

All ideas can easily be broken down. Guess he was wearing his wrong hat that day.

Algebraic Optimizations

I think the idea of a pointfree language is great, but one needs defining constructs in such a language, and lambda abstractions are just too convenient to miss (and most defining pointfree constructs would translate to lambda terms anyway).

But it opens the way to algebraic optimizations, optimize by equality, which I always thought would be nice to have in a FP.

Optimization by equality

Could you expand a little on what you mean by optimize by equality? To me that sounds exactly the same as what a simplifier in a functional compiler does.

Map Fusion

The ability to state higher-order equalities which may not be derived directly by the compiler.

I think ghc has it build in with the RULES pragma. I am not sure how much it is used though.

[Googled it: look here]

Not sure

The rules used in GHC do mention the objects involved:

forall f g xs.  map f (map g xs) = map (f.g) xs

Accordingly, I'm not sure why you claim this is an advantage of pointfree languages. Perhaps it is simply a question of the degree of clarity involved. For example, the pointfree rule equivalent to the above is as follows:

forall f g.  map f . map g = map (f . g)

As I indicated in my initial post, it seems to me that the advantage of FP-like languages has more directly to do with working in terms of function composition rather than avoiding the use of variables per se. Perhaps such an avoidance is important simply because it is a precondition for the compositional style.

If that were indeed the case, an approach to variables that "lifted" them to functions would offer the best of both worlds. In FP for example, one could express the 'swap' function as a directed rewrite rule:

forall f g.  swap·[f,g] → [g,f]

Given the semantics of the functional form of construction:

forall f g x.  [f,g]:x → <f:x,g:x>

... the above definition for 'swap' would imply the following:

swap:<1,2> → <2,1>

I should note that Factor, a stack-based concatenative programming language, already does something similar to this. For example, the 'x' and 'y' in the definition below are composed as if they were functions even though they're understood to be objects:

:: hypot ( x y -- h ) x x * y y * + sqrt ;

Neither am I

Accordingly, I'm not sure why you claim this is an advantage of pointfree languages. Perhaps it is simply a question of the degree of clarity involved.

Yes, that is what I meant. Equational reasoning might be more intuitive than reasoning with pre-/postconditions. Though someone well versed in pre-/postconditions, like Dijkstra, apparently thinks different, and well, I have only seen academic point-free examples so far.

'Introduction to Algorithms' is still presented in a Pascal-like style, and for good reason. Point-free is a valid line of research, but until that book is rewritten in point-free style, I think it will remain a niche.

forall f g. map f . map g = map (f . g)

Don't see why this is point free. You are still mentioning the arguments? But I wonder what other automatic rewrites become possible. Some Knuth-Bendix based rewrite system?

But I don't know this field very well. I glanced at the (very nice) theses by Alcino da Cunha [fixed the name]. I really like the idea of point-free languages, but I don't find the definitions readable or intuitive.

You should discuss this with someone else.

Extra Comment

In my forthcoming *cough* language, I was partly inspired by point-free/combinator calculi. That's one of the reasons I used a block-style syntax for abstractions.

    def args: options -> text -> list text =
        [ oo, o ->
            aa = list.filter [ (o0, a0) -> o0 == o ] oo;
            aa = list.map [ (o,a) -> a ] aa;
            aa ]

I.e., a block denotes/is a combinator.

[But once you get entrenched in mundane problems, all academic niceties go out the window.]

a Pascal-like style

'Introduction to Algorithms' is still presented in a Pascal-like style, and for good reason.

What are those good reasons? They become less clear to me with every passing day.

Higher Abstraction

Higher abstraction comes with a significant cost. The more hardware is abstracted away, the harder it is to understand what is going on at the low-level. If you don't understand what is going on at a low-level, everything gets wishy-washy. Algorithms are not made to make sure that people (programmers) understand what they do, they are primarily made to ensure (run-time) efficiency.

I for one think that algorithms should be taught using LLVM assembly.

Algorithms are not made to

Algorithms are not made to make sure that people (programmers) understand what they do, they are primarily made to ensure (run-time) efficiency.

No comment!

Ha

Algorithms are not made to make sure that people (programmers) understand what they do, they are primarily made to ensure (run-time) efficiency.

I just opened CLRS to a random page. Line 2 of the algorithm on that page is "do determine j such that i \in K_j". What do you think the time complexity of "determine" is? And how is this not abstracting away far more than if this were written in Scheme or Haskell?

Most common denominator

... would probably be the best answer.

But also, best understood, clearest readability, cleanest operational model, best high-level language which allows low-level optimizations, simplest expressive language? Take your pick.

Some of the arguments are vague, sure, but would you really know a better paradigm to present algorithms except for the imperative one?

[Post ended up somewhere else. This is a response to why a Pascal-like syntax is used in Introductions.]

Two nice examples, one for, one against

Brent Yorgey has a nice, compelling example for the point free style in his Typeclassopedia article in the latest Monad Reader.

Namely he defines the operator "fish" (>=>) as

f >=> g  =  \x -> f x >>= g

and then the monad laws are quite nice, corresponding to the definition of a monoid:

f >=> return      ==   f
return >=> g      ==   g
(f >=> g) >=> h   ==   f >=> (g >=> h)

On the other hand, I have an example that I doubt has an insightful point-free description:

enQ e  = CorecQ' (mapContT (liftM (e:)) (modReader inc_len))

The above definition is mostly pointfree, which I like, and depends on a few non-standard definitions contained in the paper, but I don't think that eliminating "e" from the mix clarifies anything. Then again, maybe choosing a different set of functions from the outset would lead to a clear point-free definition.

to the beginning topic

Unfortunately no substitution found. (except perhaps lambda variables)
But maybe two helpful variants.

--- Variant A ---
Space efficient solution with lazy list.

lazydistr == (id lazy '((isatom°[0])->();((head°[0]),[1],),(tail°[0]) lazydistr [1]))°ee

Application:

(+ aa) ° '(10;20;30;40;50;) lazydistr '5

--- Variant B ---
Code generation before the map application. (aa == arg map head°term)

Application:

([0] map prop°'id,'+,(cquote°[1]),):((10;20;30;40;50;);5;)

[UPDATE]

--- Variant C ---
Combination of applytoall and distr.

daar == arg dmapr top ° term
dmapr == ((isatom°[0]°[0])->();([1] app ((head°[0]),[1],)°[0]),(((tail°[0]),[1],)°[0]) dmapr [1]) ° ee

Application:

(+ daar):((10;20;30;40;50;);5;)

*(using www.fp-system.org)

+/%#

I just summed up the average point made in this discussion.

Edit: This is subtle, dry humor.

We are amateurs and have little brains

FP/PF is like low code -- lack of anything unnecessary.
We like programming to be simple -- mathematical functions.
We love to program with handicap and retrenchment (referential
transparency and combinators), to be forced on the right path.
We want programming without context (no global and no lambda variables);
each context must be in the argument of the functions. (*)
-- The little spirit keeps order, only the genius overlooks the chaos. --
Avoiding lambda variables -- For people with big brains, it's like a joke.


(*) I don't know why, but John Backus wrote:

"... Advantages of FP systems. The main reason FP systems are considerably simpler than either conventional languages or lambda-calculus-based languages is that they use only the most elementary fixed naming system (naming a function in a definition) with a simple fixed rule of substituting a function for its name. Thus they avoid the complexities both of the naming systems of conventional languages and of the substitution rules of the lambda calculus. FP systems permit the definition of different naming systems (see Sections 13.3.4 and 14.7) for various purposes. These need not be complex, since many programs can do without them completely. Most importantly, they treat names as functions that can be combined with other functions without special treatment. ..."

How do you form your opinion?

Second Try

While browsing a Julia tutorial I came across the bullet point "scoping of variables".
That gave me an idea.
Variables need a scope in the program code. Associative data structures do not need a scope in the program code for the validity of their local instance variables.
If you want to introduce lambda variables in FP, for example with the following construct:

(<name> lambda <term with scope>) : <value>

This form is not chosen very well, but maybe someone else has a better idea.
Now, one would usually like to nest these "lambdas" as well, as this raises the question of the binding strategy. A dynamic binding strategy brings hard-to-find errors (example will be provided on demand).
So, a closure technique has to be chosen, which requires a complicated structure that is based on the function definitions; Simply substituting function names for their assigned value (for automatic reasoning?) might go back to dynamic binding. Who knows about this?

But another possibility is to use an associative data structure and its local instance variables instead of lambda variables that presuppose a scope:
You could play with the function definitions as a kind of "Goto" (with Tailrecursion).
It would not require the complete listing of all variables as is the case with lambda - but could continue to use the local instance variables in the other functions that you jumped with "Goto" call.

Sometimes ...

Depending on the language, there are many reasons why a lambda cannot be inlined, but sometimes it is indeed possible. Things that prevent it: (i) When the lambda itself can be used as a value that can escape the scope of its declaration; (ii) in the presence of recursion (or potential for recursion); (iii) when a pointer (or reference) to the lambda is passed as an argument.

In Ecstasy, for example, the lambda syntax looks like:

param -> expression
(param1, param2) -> expression
(Type1 param1, Type2 param2) -> {body}

So a nonsensical example of a function that takes an int and returns a string might look like:

function String(Int) f = n -> {
                              StringBuffer buf = new StringBuffer(); 
                              while (n != 0)
                                  {
                                  buf.append(n);
                                  n >>>= 1;
                                  }
                              return buf.toString();
                              };

You could invoke it directly as well (yeah it's a stupid example but you could do it):

String s = n -> {
                StringBuffer buf = new StringBuffer(); 
                while (n != 0)
                    {
                    buf.append(n);
                    n >>>= 1;
                    }
                return buf.toString();
                }(42);

To inline the lambda, you just drop the parameters and the invocation:

Int    n = 42;
String s = {
           StringBuffer buf = new StringBuffer(); 
           while (n != 0)
               {
               buf.append(n);
               n >>>= 1;
               }
           return buf.toString();
           };

The obvious side-effect is that the reference to the separate lambda function no longer exists.

the use of free variables and the Leibniz principle

Does the use of free variables violate the Leibniz principle of identity substitution?
(the Leibniz principle is important for "lazy evaluation")

An example:

the term

 ... ('5+'3*sin°2pi*id):0.7 ... 

can be replaced by

 ... '2.14683045111454 ... 

the term with a variable

 ... (x lambda '5+'3*sin°2pi*x):0.7 ... 

can also be replaced by

 ... '2.14683045111454 ... 

But what is that?
the term within the lambda expression with the (free) variable amp

 ... (amp lambda ... (('5+amp*sin°2pi*id):0.7) ... ):3 ... 

would be replaced by

 ... (amp lambda ... '2.14683045111454 ... ):3 ... 

and
the term within the lambda expression with another argument for the lambda variable

 ... (amp lambda ... (('5+amp*sin°2pi*id):0.7) ... ):42 ... 

would be replaced by

 ... (amp lambda ... '_34.94437368439645 ... ):42 ... 

Is that correct?

My point of view

... Is that correct?

and

... Implicit was an expectation that there should be a well-defined understanding of when two program phrases were semantically equivalent ...

The use of nested scopes is a common way in Lisp (see LET-function).
In Backus FP the terms are always closed, so the term substitution is always possible.
With free variables, the terms are not closed, so term replacement is not always possible.
Backus needed for his Algebra the possibility that the terms are always deformable / replaceable.
It is not that you have to completely renounce variables, there is the possibility to give instance variables
in a "similar format of closures" as arguments to the terms / functions.
But what John Nowak seeks, he will not get under the term of term deformation / term replacement.
It's my opinion.

Edit: Look at http://dirkgerrits.com/publications/john-backus.pdf#section.8 --> "Closed applicative languages (1971 - 1976)" ff

A point-free self-hosting Haskell compiler

https://www.ioccc.org/2019/lynn/hint.html

Why?
One-letter variable names abound in IOCCC entries, and for good reason. These tiny pieces of confetti are hard to read, and leave room for more code. Then why not go further and use zero-letter variable names? That is, tacit programming or point-free style.

I had been playing with an algorithm devised by Oleg Kiselyov that effortlessly and efficiently eliminates those pesky variables, leaving behind terms composed from a small set of combinators. No need for lambda lifting or supercombinators.

By adding a handful of lines of mostly parsing code, we get a Haskell compiler, or rather, a compiler that accepts a subset of Haskell sufficiently large to self-host. You might say I wrote a tool for this contest, then ran it on itself to make an entry for it.

it sounds strange

I think you are laughing in tail-recursion.

Heavily Labeled Point-free Programming

I like a physical 'code as material' metaphor in language design. Less like natural language and more like Lego bricks. The essential properties for this metaphor are locality and compositionality. Locality is our ability to decompose, model, and understand behavior with minimal syntactic context.

As a consequence, most languages I design are point-free. It isn't that point-free is intrinsically good. Rather, the semantics of lambda variable substitution, free variables, and alpha conversion are intrinsically non-local, and do not fit my metaphor.

However, giving up on lambda variables doesn't imply abandoning naming systems entirely. In his paper, John Backus describes a naming system that is essentially an association list with atomic symbol values (a row-polymorphic record, if we must assign a type). However, the nature of names will shift from 'variables' outside the model to 'labeled structures' within the model.

Although we don't "need" names, they remain convenient both for lightweight extensibility (e.g. it's often more convenient to add a field to a record than to change a pair to a triple), commutativity (order of fields within a record can be opaque or deterministic), and as a form of documentation. Any one of these conditions are sufficient to motivate support for label-based names.

My most recent language, Glas, heavily uses labeled structure. Instead of a combinatory logic, Glas is based on graph rewriting. Nodes in the graph have directed, labeled edges. There are two rewrite rules - application and unification. When nodes are unified, outbound edges with the same label are also unified. A function is a bounded subgraph with a unique outbound 'open' edge. When applied, the function body is copied and the open edge is connected to the applicand, enabling access to input-output parameters via unification.

Glas is a point-free language for some definitions of "point-free". There are no lambda variables, no substitution semantics, the nodes are anonymous, and function application is implicitly parameterized by location in graph.

OTOH, the graph can represent cyclic structure. When a cyclic graph is represented *textually*, we inevitably encounter the challenges of graph representation, e.g. using references by relative path or offset or temporary node names. Depending on how we define 'point-free', we could consider any graph-based PL that allows cycles to have 'points'. In that case, violation of "point-free" is just an artifact of using 1-dimensional syntax to express K-dimensional computations.

In any case, we can have rich naming systems without the semantic weaknesses of lambda substitution. I believe this is a path worth developing.