Disallow shadowing?

I came across an early post on the topic of disallowing shadowing (message quoted below):

In lambda calculus you can always rebind an already-bound name, but in
practice, there are lots of limitations imposed on rebinding in real
languages: top-level module bindings can't be redefined in mzscheme,
operations of type classes can't be redefined at the top level in
Haskell, etc. What happens if you take this to the extreme and disallow
shadowing entirely in a lexically scoped language? [1]

At first I thought this would cause problems for macros, but in a
hygienic system that renames everything to something fresh anyway, it
probably wouldn't matter.

Other than being an annoyance when you want to reuse a name for some
other purpose (which is arguably bad style anyway), what problems would
disallowing local rebinding cause?

Dave

[1] http://www.mail-archive.com/haskell@haskell.org/msg01268.html

There were a few follow-up messages, but they did not expand on the question much. Essentially I have the same question as Dave, and I hope to hear more comments on it.
What are the theoretical and practical consequences if, say, we forbid shadowing in the lambda calculus?

I noticed that CoffeeScript has moved in the direction of forbidding shadowing. Although this raised a lot of opposition, the designer insisted on it. Is his point solid?

GHC has a warning for name

GHC has a warning for name shadowing, which I use as a matter of course (as part of '-Wall'). Programming without name shadowing is not difficult, and (IMO) does improve clarity and resist silly mistakes. Errors caused by accidental name shadowing are rare (so I don't look for them by habit) and can be very difficult to spot without such warnings.
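
For concreteness, a minimal example of the kind of code this warning catches (compiled with `ghc -Wall`, which enables -Wname-shadowing):

    -- GHC reports that the inner binding of x shadows the one bound by scale:
    scale :: Int -> Int
    scale x = go (x + 1)
      where
        go x = x * 2   -- warning: this binding for x shadows the existing binding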

Of course, CoffeeScript's scoping has a lot of other problems, independent of shadowing. Scope should certainly be subject to local reasoning. Avoiding pervasive mutable state would also simplify detection of the problem.

Theoretically boring

I don't think there's much of interest here from a theoretical perspective. From a practical / engineering perspective, I think this is one of those cases where good IDE support can give best-of-both-worlds behavior:

- Symbols have a text name (what you press at the keyboard) which can shadow the text name of other symbols in scope.

- Symbols have a distinct visual appearance from other shadowed symbols.

- All new symbols must be explicitly introduced.

- Any ambiguity of reference is flagged by the IDE and resolution is required.

- Copying code just works. Shadowed symbols become visually distinguishable automatically.

Looking forward to some of

Looking forward to some of these features (maybe not disallowing shadowing ;) in DrRacket.

Coffeescript has no binders + syntax for warnings

The problem with CoffeeScript is not that it forbids shadowing (this can be a software-engineering hurdle, but it is a choice you're free to make as a designer, although a user-togglable warning would probably be more appropriate), but that it does not have a construct for introducing new variables. The scope of variables is inferred, and this (which is only possible in the absence of shadowing) is pure madness, dare I say equivalent to or worse than dynamic scoping by default.

(Of course you could alleviate the problems with inferred scope through good IDE feedback, essentially compensating for a gap in the term syntax at the tooling level, but that's wishful thinking, as to my knowledge no such CoffeeScript editors exist.)

I think we could be more careful about shadowing in our binders design; it's good to have a warning on shadowing (or tooling support that helps you notice shadowing), but you want some syntax (or tooling UI) to indicate intent and disable this warning when shadowing is intentional, such as

  let x = normalize x in
  ...

I've personally toyed with the following design: extend the pattern (binders) language with a construction `$p`, and the expression language with a construction `$e`, where `$p` and `$e` have the same dynamic semantics and static type as `p` and `e` respectively, and `$p` never warns about names bound by `p` that shadow names of the environment, and `$e` never warns about binders inside `e` that shadow outside names (shadowing of `e` binders by other `e` binders may still warn).

Structural editing comes to the rescue

Alternatively you can let names have no semantic meaning; they are solely for the purpose of the programmer. When you start entering a variable name, you get a list of variables in scope that gets progressively narrowed down as you enter more characters. If you have two variables with the same name in scope, then that list will have 2 remaining items even if the full variable name is typed, forcing the programmer to pick one.

I.e. Names shouldn't be

I.e. Names shouldn't be wasted on compilers (tm J. Edwards)

which is only possible in

which is only possible in presence of shadowing

I'm not sure what you meant by this parenthetical. I'm not having trouble imagining a language in which there is no shadowing but variables are implicitly bound to the outermost scope that they appear in.

I've never used Coffeescript, but Python's scheme of introducing a new shadowing variable in any scope with an assignment to that variable is also insane. I've only been bitten by it a handful of times, but when I am it's usually annoying to debug.

Hm

I meant precisely the contrary, of course: if there is no construction to signal the introduction of a new scope, there is a non-ambiguous reconstruction of scoping *if* you assume that no shadowing is taking place.

You are right that even this interpretation is not totally correct, as Python used to have a built-in notion of scopes (block scope, class scope, module scope) that scope inference would not cross, introducing the possibility of shadowing.

Hear, hear

The scope of variables is inferred, and this ... is pure madness, dare I say equivalent or worse to dynamic scoping by default.

This should be engraved on the hearts of language designers forever. All the worst and most confusing problems with Python, Ruby, Javascript and (apparently) CoffeeScript derive from scope inference. And it's made even worse by the fact that each of those languages has multiple scope chains to consider (object/class hierarchy, lexical, whatever). It's a complete mess and it would have been trivial to avoid. All it takes is one little keyword, var, let, I don't even care what you call it!

So, imperative language designers of the future, heed my warning: SHARPLY DISTINGUISH BINDING FROM ASSIGNMENT OR BE FOREVER DAMNED.

Example (gashe proposal)

So for example:

    let p = e in   (*1*)
    let $p = e in  (*2*)
    $let p = e in  (*3*)
    let p = e in   (*4*)

The first $ silences the warning that p#2 shadows p#1, and the second $ silences all warnings about shadowing p#2 in any binder, but p#4 warns because it shadows p#3. So the $ acts as a barrier?

Yes. $p on binders denotes

Yes. $p on binders denotes intentional shadowing (there are a couple of programming patterns where shadowing is sought-after, and not a mistake), while $e on expressions acts as an expression-wide barrier to help code mobility (typically when you cut-paste code into a new scope).

NetRexx has an interesting attitude to scope

Declarations, casts, and what I will call casts-without-expressions are all forms of assignment. The (lexically) first appearance of an assignment to a variable types that variable. If you write "a = 32", then a is of type Rexx (a hybrid between string and number, like a Perl scalar). If you write "a = int 32", where "int 32" is a cast of the value 32 (inherently of type Rexx) to an int, then a is of type int. If you write "a = int", then a is again of type int but is not initialized.

Hijacking

The problem with disallowing shadowing in some languages is that it leads to hijacking. Suppose you have a global scope with variables and functions, and in one function you shadow an existing variable x.

Fine: the compiler gives an error that x shadows a global, so you rename it and keep writing the function.

Now you've finished, and a year later another programmer comes along, sees you have 100 global variables and 200 functions, and adds one more variable, x, and another function ... and creates a bug in an unrelated function ... one they should not have needed to look into.

What they've done is hijacked the defining instance of the variable out of the function, so the one in the function is now shadowing. So the code is fragile, and the language design doesn't respect the holy grail of information hiding by functional abstraction.
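
To make the scenario concrete, here is a sketch in Haskell (module and names invented), using GHC's shadowing warning promoted to an error to stand in for a language that forbids shadowing:

    {-# OPTIONS_GHC -Werror=name-shadowing #-}
    module Hijack where

    -- Added a year later by the second programmer:
    x :: Int
    x = 42

    -- Previously clean, unrelated function: it no longer compiles, because
    -- its local x has been turned into shadowing by the new top-level x.
    scale :: Int -> Int
    scale n = let x = n * 2 in x + 1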

The hijacking is easy to fix

... one they should not need to look into.

... and the language design doesn't respect the holy grail of information hiding by functional abstraction.

I suppose the only thing this new programmer needs to do is alpha-renaming, which does not break abstraction and does not even require looking into the function.

Fully specified names

I mused, some time back, about the prospect of a language with fully specified names. Supposing each scope could be given a name, each reference to a named entity would give not only its local name within its scope, but the name of its scope, the name of whatever scope contains that, all the way up to and including libraries, so that the name is unique amongst all entities in the infosphere. I figured it would be handled by an editor that, most of the time, hides most of these monster-qualified names so the source code isn't rendered unreadable.

As a purely technical problem, one would have to contrive to handle anonymous blocks and anonymous functions, and at the largest scale it would seem to entail something not-dissimilar to IP addresses. A more daunting prospect might be interaction with versioning.

In the end, though, I tabled the idea (in the US sense, not the UK sense) because I decided, at the time, that free variables in a term are sometimes deliberately of unknown scope; as I recall, I was at the time studying Will Cook's treatment of inheritance as a fixpoint operation. Though the exercise has left me with a disconcerting sense that our whole notion of local names (essentially, the Lisp symbol type) is somehow flawed.

... free variables in a

... free variables in a term are sometimes deliberately of unknown scope

So that they could be captured later? Are you arguing for deliberate dynamic binding?

left me with a disconcerting sense that our whole notion of local names (essentially, the Lisp symbol type) is somehow flawed

In what sense? I do not see the relation to the Lisp symbol type. Do you mind elaborating a little bit more?

Are you arguing for

Are you arguing for deliberate dynamic binding?

It wouldn't have to be dynamic since, as (iirc) was alluded to somewhere here, changes to source code can similarly alter static bindings.

I believe the central phenomenon in my thinking at the time (this would probably be over twenty years ago) was inheritance: an object of a descendant class invokes a method that it doesn't itself define, and therefore it doesn't know exactly which ancestral scope the binding is in.

I do not see the relation to the Lisp symbol type.

The Lisp symbol type is the embodiment of an unqualified name; it exists as a type for the purpose of being understandable only in relationship to the environment in which it is evaluated and, in general, the entire ancestry of that environment.

re fully specified names

Though the exercise has left me with a disconcerting sense that our whole notion of local names (essentially, the Lisp symbol type) is somehow flawed.

I went through a similar line of reasoning and concluded that insisting on the universality of the conceit of "functions" (pure or otherwise) is the root cause of a lot of cognitive dissonance and confusion about programming.

universality of the conceit

universality of the conceit of "functions" (pure or otherwise) is the root cause of a lot of cognitive dissonance

Since plm has already asked for more exposition on John's philosophies, I'll ask to hear more about this, as well, if you're willing to write more.

re: universality of the conceit

I'll ask to hear more about this, as well, if you're willing to write more.

I think I'll be writing more about it later as blog posts or something.

Briefly:

In a side-effectful language, "functions" are a notation for writing subroutines, unified by a set of operational conventions: parameter passing, use of a return/parameter stack, return values, non-local exits from subroutines, exceptions, dynamic winding, closures, and continuations. (Possibly more I'm forgetting.)

There's no essential reason to build programs exclusively out of subroutines based on such conceptually heavy and arbitrary conglomerations of conventions -- we do it out of historic momentum.

We built up these overly-complex, overly-arbitrary subroutine conventions because we had to fit everything into our earlier choice of function notation. We keep extending and hairing up function notation to accommodate all those glommed together conventions. It's a vicious circle. A kind of collective neurosis.

From that perspective it seems to me that local variables and nested scopes within our function notations are solving a problem that doesn't exist until it's created by our choice of function notation. The topic issue of name shadowing is an issue only because of a strange succession of choices we made that create the problem in the first place.

Of course, the resulting "functions" in such languages, considered in their full generality, have no deep relation at all to mathematical functions of the lambda variety. On top of the historic weirdness of insisting on building programs out of these heavyweight subroutine conventions -- our choice of function notation falsely suggests we can use lambda-style reasoning much more than we really can or ought to.

Then the real kick in the teeth came for me when I started thinking of "favorite programs" of mine -- various unix utilities, text editors, games, and so forth. I thought about the conceptual models of these that I've seen in good implementations of them. Almost to a one they are organized around non-functional concepts like state machines. A reader of the source code has to step back from seeing an ocean of "functions" and find the real conceptual model to which the program is written.

So I became interested in looking for notations that might allow such programs to be expressed in a more straightforward, direct style. That pretty quickly led me to want to break away from the tyranny of "function" calling conventions in typical languages. And that's when I realized that complexities of the notation (like shadowing) are solving problems created only by the choice of notation itself.

As for actual "pure" functional languages:

They just don't interest me very much as practical tools -- which I know some people will find a ridiculous opinion but that's a whole separate discussion.

As a point of history I regard their main original motivation to be that of trying to make "functions" more resemble the lambda concept but I don't remember at any point experiencing that as a problem that really needed to be solved.

Parameterization seems fundamental

I agree with your point about there being a good bit of baggage in many languages around stacks and exceptions, etc. The core idea of functions, though - taking the thing we're building and making a particular part of it a parameter - seems like a fundamental operation that's going to be wanted in whatever notation we come up with. Also, to me, the point of purity is to find a clean semantic model for the higher level components we build. It's not an end in itself.

re: parameterization

I agree that parameterization is fundamental but I do not agree that parameterization is "the core idea of functions".

A mathematical function is typically a parameterized mapping from domain to codomain.

Therefore, for one thing, a "core idea of functions" is that of returning a value (or values) from some codomain. The programming concept of passing parameters is more general than that. One can pass parameters not expecting a value in return.

Sometimes in response to that people try to say that programming language functions are really just "goto with parameters" but in saying so, they are dragging in all the baggage of continuations. In particular, we are left requiring that "function calls" are not exactly goto-with-parameters because they pass along an implicit continuation. And we are left with the notion that a return is pretty much just like that kind of function call except we omit passing a continuation (or else ignore the continuation we fictively pass during a return).

In practice it's also a bit unclear where explicit parameters leave off, and where parameterization with the ephemeral environment and global state is important. We have traditional concatenative languages, at least, as an existence proof that the distinction between explicitly designated parameters and global state is not essential.

The programming concept of

The programming concept of passing parameters is more general than that.

That's an advantage of a pure language - parameterization is exactly what functions do.

Half of what functions do

Parameterization is, I would say, half of what functions do. The other half is, functions presume a bottleneck through which the results must pass. Eliminating the bottleneck is a puzzle; continuations I maintain do not eliminate the bottleneck, rather they submerge it into the deep structure of the control logic. I freely admit, I don't know how to solve this puzzle, but I do see that it is very deeply embedded in our thinking. Even the very concept of "side-effect" presumes a term-based notion of locality. So, I reckon, the bottleneck doesn't depend on whether or not the function is pure.

Named parameters to the rescue?

That was a little cryptic, but I wonder if my language addresses your concern with the named parameters I mentioned earlier.

Let me first elaborate on what I think you might be complaining about, as it's a common problem (apologies if I've misread you). The problem: If you have a function f(x) and you want to add another parameter, it's easy to switch to f(x,y). But if you then want to add another return value, common practice is to introduce tuples or something (get the result through "the bottleneck"). Doing so breaks f for existing clients, and even if you're willing to make the required changes, it complicates f for clients who don't care about the new result.

What we really ought to be doing, IMO, is introducing the new return result as g(x,y), leaving f(x,y) alone. But there are a couple of problems with that in most existing languages. For one, the obvious syntax problem: repeating the parameter list for every return value is annoying. For another, in most languages my proposed solution entails re-calculating any intermediate results that f produced that are also needed by g.

This is the main reason that I place an emphasis on named parameters in my language. Instead of:

print f(10,20) g(10, 20)

You'd write:

print f g where
    x = 10
    y = 20

Similarly on the defining side, you can set up the context of having an x and a y, and then provide definitions for f, g, etc. and let the compiler figure out how to handle the sharing (it's not a heroic effort).

Interesting

This is a <searches for word> nifty idea. I like it; it feels like the sort of thing that might turn out to have unanticipated uses — possibly smooth in the sense of my Smoothness Conjecture. As you suspected, at first blush at least it's not quite what I had in mind here... and yet I can't say with great confidence, off hand, that it isn't somehow related. In my own efforts with Kernel, I treated the whole multiple-value-return thing as a hallucination brought on by underpowered data-handling mechanisms, and the Kernel Report makes a case for that — but your solution as I understand it is, in fact, a more powerful data-handling mechanism; just not the same one Kernel, as a Lisp, naturally went for.

I like it. It seems more

I like it. It seems more robust than my idea to do everything by type. We could even make formal parameters fields that could be shared by multiple procedures...like branded types (x : int, the branded type is "x" but the actual type is int). As a bonus, you could document each parameter only once even if it was used in multiple procedures.

It does seem sort of similar to dynamic binding:

fluid-let x = 10, y = 20 in 
  print f g

The caveat is that dynamic binding maintains the binding down through multiple calls, whereas in your scheme, I suppose, binding is still lexical. But you can statically check dynamic binding (by ascribing in signatures which parameters must be bound). It would then be essentially equivalent, and flexible.

We could even make formal

We could even make formal parameters as fields that could be shared by multiple procedures

Yes, that's what I do. You declare parameters separately from definitions, which could be in another module (making functions more extensible / open). Declarations can specify parameters they depend on, but for definitions it's usually inferred.

But you can statically check dynamic binding (by ascribing to signatures what parameters must be bound).

Yep. Haskell actually has dynamic binding through a very similar mechanism, but it's a somewhat obscure feature, rather than a cornerstone of the language.
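
(For readers who haven't run into it: the mechanism being alluded to is, I believe, GHC's ImplicitParams extension. A rough sketch of the earlier f/g example in that style:)

    {-# LANGUAGE ImplicitParams #-}

    -- Each definition declares in its signature which named parameters it
    -- depends on; the caller binds them once, in a shared context.
    f :: (?x :: Int, ?y :: Int) => Int
    f = ?x + ?y

    g :: (?x :: Int, ?y :: Int) => Int
    g = ?x * ?y

    main :: IO ()
    main =
      let ?x = 10
          ?y = 20
      in print (f, g)   -- prints (30,200)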

I've spent a good bit of time thinking about the interplay of this feature with the rest of the language. It impacts quite a bit: context variables are important to a bunch of idioms, interaction with processes, functional reactive values (modeled via hidden state parameter).

Bottleneck supports local reasoning

Eliminating the 'bottleneck' is not difficult. Logic variables (or fractional types), term rewriting, constraint models, grammars with intersection - none of these introduce severe bottlenecks on how data flows through a computation.

But I think this isn't a problem to solve. That 'bottleneck' is also the basis for local reasoning and context-free equational reasoning. And local reasoning is awesome. Even for concatenative notation, I find it hugely useful to control inputs observed and outputs produced by each operation.

Not (just) concrete data

I didn't actually mean a concrete data-flow bottleneck, although that's one form it can take. An information bottleneck, yes, but the information involved isn't just concrete data. When you mention "the basis for local reasoning and context-free equational reasoning", that too I include in the bottleneck. It's kind of my point. The moment one thinks of a term as denoting a meaning, there's the bottleneck. For instance, if one then allows for the possibility of exceptions (iirc there's a recent LtU thread about that), suddenly the denoted meaning is a disjoint sum, which is an artificial way to force a singular denotation in a non-singular situation. And so on. To me, the very fact that one keeps wanting to go back to reasoning that embraces the bottleneck, suggests that the ways we supposedly 'eliminated' the bottleneck were really unsuccessful --- if they'd really worked, we wouldn't be so anxious to go back to it. Hence my view that eliminating it is difficult.

Grasp the bottleneck

that one keeps wanting to go back to reasoning that embraces the bottleneck, suggests that the ways we supposedly 'eliminated' the bottleneck were really unsuccessful

Were the attempts truly unsuccessful, or just unwise? Local reasoning has tremendous value, and we should be very wary about giving that up.

if one then allows for the possibility of exceptions, suddenly the denoted meaning is a disjoint sum

"Doctor! It hurts when I shoot myself in the foot!" :)

Should I imagine a wine bottle?

From John's comment I gather that I didn't really understand what he was talking about. Re-reading this exchange several times hasn't helped much. Do you (David) understand what the bottleneck is? I have only the foggiest impression of what it might be from the example of terms and exceptions.

Bottleneck on Abstractive Power

John Shutt's philosophy on PL is very structural, and much of his philosophy aligns well with my own. John's favorite language feature is abstractive power - an ability to manipulate the structure of the language in non-trivial ways from within the language. Among his favored tools for achieving abstractive power is the use of fexprs. Fexprs enable a term expression - a syntactic or computed structure - to be given different interpretations in different contexts.

A function has just one meaning, just one dataflow. In a sense, these can be understood as bottlenecks on John's favorite feature. We can apply a function, but it is difficult to achieve wide or deep influence on the language by use of functions, or in a context that pervasively uses functions. We could achieve deep change with, for example, global rewrite rules or AOP. But doing so would defeat much of the value of functions.

There is some subtext in our dialog. :)

I prize composition and local reasoning. Context-free semantics are essential for those properties. To me, abstractive power is not a high priority. I align more with you in that regard, e.g. a preference to model effects, exceptions, backtracking, etc. in a purely functional manner, leveraging monads and arrows, categories, EDSLs, algebraic effects, and so on. (I also leverage capabilities and substructural types. It is often useful to control access to a capability via monad or similar. And substructural types are very nice for enforcing structured behavior orthogonal to the monad/arrow structure.)

John's 'bottleneck' isn't inherently objectionable unless you have the same priorities as John.

Irons in the fire

I was looking forward to seeing what you'd say about me. :-)  That's a fair description of some of where I'm coming from... but I'm not altogether single-minded. I've more than one iron in the fire. For one thing, my interest in manipulating structure follows a prior interest in grokking it; I look for indirect evidence of a paradigmatic flaw (example), and try to deduce the nature of the flaw and what sort of alternative paradigm would resolve it. This doesn't usually come from observing a single anomaly — extrapolation wants more than one data point.

But here I see multiple data points. Continuations, which I explored in Kernel, appear to me to lack generality because they are inherently stack-based; and they have exceptions on one hand and parallelism on the other, an accumulation of interesting problems in control flow. Monads seem to me to be another thing that almost works until you look at it more closely. So I find myself wondering after some other kind of structure than the single-result bottleneck — though not, afaics, any of the alternatives I've seen, none of which have ever impressed me as having a compelling rhyming scheme.

Thanks

Thanks (and to David). I have a less foggy idea of what you mean now.

Control flow is paradigmatic flaw

Control flow is itself a paradigmatic flaw. From your blog:

What you hope to find is the natural structure of what you're describing — a structure that affords a really beautiful, simple description. When you strike the natural structure, a sort of resonance occurs, in which various subsidiary problems you may have had with your description just melt away, and the description practically writes itself.

But here's a problem that often occurs: You've got a structure that affords a pleasing approximate description. But as you try to tweak the description for greater precision, instead of the description getting simpler, as it should if you were really fine-tuning near the natural structure, instead the description gets more and more complicated.

Control flow demonstrates these properties. It may seem natural and beautiful in context of a single CPU. But when we try to tweak the description for open systems, greater concurrency, or other computation hardware (e.g. FPGA, GPGPU), or even greater precision (e.g. the implementation details on hardware), it does not become simpler. Instead, it becomes more and more complicated. Dataflow is the more natural paradigm, and has superior properties all around. However, for dataflow to replace control flow, we must find alternatives to control-flow as the basis for managing side-effects. Linear types or incremental functional structures (monads, arrows, automata, etc.) offer useful alternatives.

Continuations, at least as you've experienced them, seem to be control-flow oriented. You are capturing an implicit 'stack'. Delimited continuations at least offer a little formalism for what is captured. But if you instead model a delimited continuation monad, it becomes much more feasible to treat continuations as an abstraction of control flow, expressed as a data flow.
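
(As a small illustration of that last point, a sketch using the ContT transformer from the transformers package; the example itself is mine, not from the post:)

    import Control.Monad.Trans.Class (lift)
    import Control.Monad.Trans.Cont (evalContT, resetT, shiftT)

    -- The continuation delimited by resetT shows up as an ordinary function
    -- value k: control flow reified as data flow.
    example :: IO Int
    example = evalContT $ resetT $ do
      x <- shiftT $ \k -> lift (k 10)   -- k is roughly \v -> return (v + 1)
      return (x + 1)                    -- example evaluates to 11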

I don't agree with your article on monads, though I acknowledge the rigid, brittle nature of threading information through static nominative types (e.g. in monad type signatures). Any inherent problem with monads, however, should apply even to a dynamically typed implementation of monads. For flexibility with static types, structural row types or dependent types could help (cf. algebraic effects).

For the particular use-case of diagnostic messages (with no semantic effect, and no effect on optimizations) many pure functional programmers would find it acceptable to introduce some primitive 'debug' functions or use 'unsafe' IO. With a better IDE, it certainly wouldn't be a problem to probe a pure function to observe intermediate values. I think you cannot generalize from debug messages to the broader use of monads.

I agree that control flow

I agree that control flow, at least as traditionally understood, is a problem. It seems to me attempts to eliminate it are just as bad. I've been trying to pick apart the notion of time, to see what makes it tick (sorry, the pun was irresistible).

I tend to view delimited continuations as evidence of something going awry, though this doesn't keep me from being most interested in them.

I'm skeptical when someone says something isn't a problem because one can use automation to manage its complexity. There's a quote out there somewhere, that I'm most annoyed at having lost track of, to the effect that if your equations for cosmological physics (I think it was) take up more than half a page, don't even bother with them, they're wrong, because nature just isn't that complicated.

And yes, once you move off the well-trodden path, there's all sorts of other paths to try; as I may have remarked a time or two, I think it's great that people explore alternative approaches that differ wildly from each other. Hurrah for basic research! I'm aware you don't share my assessment of monads.

flow based programming

there's a non-empty subset of the FBP crowd (who I guess distinguish themselves from those others who say they are, like, in the Dataflow crowd) that is quite vocal about call/return being pure evil.

However, for dataflow to

However, for dataflow to replace control flow, we must find alternatives to control-flow as the basis for managing side-effects.

Or perhaps enshrine a single, most general control flow primitive we know of, such that any coherent composition of effects can be readily derived.

Conceit of impurity?

Exceptions, closures, continuations, coroutines, state machines and automata, effects, etc. can be quite beautiful when modeled (purely) functionally (e.g. by applicative, arrow, monad, pipe, zipper, lens). The real utility of pure functional languages is to enforce development of clean semantic models. It is not to eliminate effects and exceptions, but to make them (and the coupling to context) obvious and explicit. The utility certainly is not "to make 'functions' more resemble the lambda concept" - that's a means, not an end.

Even in an impure language, purely functional approaches are valuable. I feel purely functional models of automata or incremental processes (e.g. `µP.(a→(P*b))` or `∃s.(((s*a)→(s*b)) * s)`) are far superior to most impure models, e.g. with respect to abstraction, composition, process control, reuse.
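
(Rendered in Haskell, the two encodings mentioned above look roughly like this; the names are mine:)

    {-# LANGUAGE ExistentialQuantification #-}

    -- µP.(a→(P*b)): each step yields the next process plus an output
    newtype Auto a b = Auto { step :: a -> (Auto a b, b) }

    -- ∃s.(((s*a)→(s*b)) * s): a hidden state type, a step function, a seed
    data AutoS a b = forall s. AutoS ((s, a) -> (s, b)) s

    -- running such a process over a list of inputs is plain recursion
    runAuto :: Auto a b -> [a] -> [b]
    runAuto _ []       = []
    runAuto p (a : as) = let (p', b) = step p a in b : runAuto p' as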

Sadly, it seems to take a purely functional language to discover purely functional models. I conjecture this is due to path-of-least-resistance and path-dependence issues. In effectful languages, the path of least resistance (short term) is often to use effects, or to add a feature to the language.

Under POSIWID, the hypothetical historical 'purpose' of purely functional programming isn't especially relevant.

Exceptions, closures,

Exceptions, closures, continuations, coroutines, state machines and automata, effects, etc. can be quite beautiful when modeled (purely) functionally

This is really beside the point, from my perspective.

Remember that I've become "unconvinced" (was formerly convinced) that I really want to build programs out of exceptions, closures, continuations, and coroutines. I don't think pure functional programming languages have a monopoly on "beautiful" models for these things but in any event that a programming language has such models doesn't much impress me.

State machines and effects do strike me as practical concepts to use in programming but in my experience, in good programs, these concepts are generally deployed in simple ways that don't need a lot of "taming" by some discipline enforced by a programming language.

I feel purely functional models of automata or incremental processes (e.g. `µP.(a→(P*b))` or `∃s.(((s*a)→(s*b)) * s)`) are far superior to most impure models, e.g. with respect to abstraction, composition, process control, reuse.

It's pretty common to create little DSLs for that purpose. I'm not sure I see, there, an argument for a general purpose pure functional language.

Under POSIWID, the hypothetical historical 'purpose' of purely functional programming isn't especially relevant.

History is useful for tracing back how and when you came to care about something when in the present it no longer seems clear why you should.

For example, sometimes people still put a very high premium on programming language features that purportedly greatly increase the powers of abstraction and composability, perhaps in service of taming terribly complex programs or perhaps in service of turbo-charging "re-use".

I wonder in retrospect how much of that emphasis on those qualities is really driven by momentum left over from the days when people believed in a particular kind of looming "software crisis" that didn't quite materialize in the ways forecast.

History is useful for

History is useful for tracing back how and when you came to care about something

But history is not so useful for explaining why a newcomer should care, nor even why that same newcomer might care years later. Our reasons for caring can change.

State machines [..] do strike me as practical concepts to use in programming

If you want any composability or extensibility or reusability properties, grammars would be a better option than state machines.

in my experience, in good programs, these concepts are generally deployed in simple ways that don't need a lot of "taming" by some discipline enforced by a programming language

What proportion of programs, in your experience, would you qualify as 'good'?

In my experience, a lot of programs are poorly factored organic messes with high coupling and low cohesion. Academic software really needs a CRAP license. Software in industry is somewhat better, but it is still difficult to extract reusable value. Beyond that, there are contexts where security or safety are relevant, where we cannot assume even well-factored and well-architected code is 'good' because it might be malign or unsafe.

To me, the purpose of PL design is to guide the path-of-least-resistance to achieve greater software quality and productivity despite the hostile forces of industry or academia. It isn't the 'good programs' that demonstrate a need for taming.

re: history

What proportion of programs, in your experience, would you qualify as 'good'?

Some very small proportion.

the usual suspects

Remember that I've become "unconvinced" (was formerly convinced) that I really want to build programs out of exceptions, closures, continuations, and coroutines.

I only want a subset of executable behavior matching the usual suspects you cite, driven by a higher level running those as sub-programs. So far I haven't noticed you mention what you might want instead, and I'm quite curious.

If you can use verbs to interfere or to conflict in a description of what you want, that would be especially helpful from my perspective. We can use those as near synonyms, since both have similar etymologies: both come from roots meaning to strike, but with prefixes inter and con meaning between and together, respectively. (I point this out only to note you can use either term while mixing and matching when it improves clarity.)

When stacks were invented as a mechanism supporting recursion, the purpose was to avoid interference by using different state for new routine invocations, so they don't conflict with each other's computation. (Yes, I know I'm saying something almost tautological in there; ontology gets funny that way.) Lots of mechanisms we use are about organization to disambiguate state and behavior that would otherwise interfere, from semaphores to database transactions.

Abstractions

Do you have an alternative model? If you didn't have functions, pure or otherwise, how would you implement the idea of abstractions? Mathematical functions? Tree data structures? Recursion?

Alternatives to functions

There are quite a few powerful alternatives to functions, including:

  • term (or graph) rewriting, (cf. Maude)
  • temporal logic (cf. Dedalus)
  • linear logic
  • grammars (generative or analytical)

There are also alternative notations for functions, e.g. concatenative or combinatorial instead of applicative.

It turns out that grammars are truly excellent for modeling automata and state machines in an abstract, compositional way. (Grammars name paths/protocols, much like functions describe paths. State machines tend to name states, which is more difficult to compose.) But I've not encountered languages based on grammars. Probably because they're too expressive, too easy to lose control of performance.

All definitions global

The approach I've taken with names in my language is similar to this. In any body of source, there are a finite number of definitions and names are just a means of referencing those definitions. Jules will tell you that if you had just used a structural editor, then the names could have been labels in a drop-down box :). My point of view is similar.

The possibly-more-interesting thing that I do is decouple scopes (namespace management) from functional dependence. Consider this snippet (in pseudo-code):

def foo(x)
    def bar(y) = x*y
    ...

Because all definitions are global, you can reference bar outside of the definition of foo. What you see is that bar is a function of x and y. You can look at this as lambda lifting, but in a way that's exposed to the programmer and not just an implementation technique. (I'm curious if anyone knows of other languages that have taken this approach.)
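
(In conventional terms, the lambda-lifted reading looks something like this Haskell sketch, with an invented body for foo:)

    -- bar, lifted to the top level, takes its former free variable x explicitly
    bar :: Int -> Int -> Int
    bar x y = x * y

    foo :: Int -> Int
    foo x = bar x 3 + bar x 4   -- inside foo, uses of bar are applied to the local x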

you can reference bar

you can reference bar outside of the definition of foo

Can I call `bar` outside of `foo`? If so, how? I mean, how should I feed values to the free variables in the body of `bar`? The function-call syntax had better not be used for that. Probably we need a `let`: `let x = 1 in bar(2)`?

Yes

You have the right idea. Named parameters are fundamental in my language. I decided not to go into that because it's a whole nother topic but, yes, you'd call bar by passing in x by name (or by having an x available in the calling context).

I've been playing with a

I've been playing with a similar idea that I refer to as "call by type" parameter binding. Say you call a method that takes a "distance" and you have a distance in scope; calling that method will simply bind that distance implicitly. For this to work, you would need a very rich type system, to the point where you could tag integers and other primitives with richer type information that might not be useful in type checking. Given ambiguity, you would simply enhance the type of the variable you want to pass as a parameter (of course, this gets annoying since all expressions used as parameters have to be bound to variables).

Oh neat.

I’ve been doing just that recently, in an object-oriented language based on mixin graphs instead of inheritance trees. With rich enough types, you can get away with parameter sets instead of parameter lists. And in a record, most of the time you don’t really need to name the fields.

record Enemy = { Sprite, Position, Velocity, Strength }

def draw { Enemy }:
  // Implicitly the Sprite and Position of the Enemy.
  draw { Sprite, Position }

def collide { Player, Enemy }:
  // Juxtaposition = scope refinement = member access.
  Player Health -= Enemy Strength

def update { Enemy “e” }:
  // Naming things for clarity is fine too.
  e Position += e Velocity

You only need to disambiguate a value when there are multiple values by that name or type in scope. I think this makes scoping approach the naturalness of anaphora in natural language.

Component Entity Systems

Reminds me of Component Entity Systems. An entity is a set of components. Very nice.

Yep.

That’s precisely what got me thinking about it in the first place. The “mixin graph” aspect comes from the ability to have outside dependencies, much like pure virtual functions in the land of inheritance. Abstract entities are composed (by set union, basically) to form concrete entities that are instantiable. A more complete motivating example:

// Abstract entity (“mixin”).
entity Moving:
  needs Position
  needs Velocity

def update { Moving, ΔTime “dt” }:
  // TODO Quit using Euler integration in example code.
  Position += Velocity × dt

entity Visible:
  needs Sprite
  needs Position

def render { Visible, Display }:
  render { Sprite, Position, Display }

// Concrete entity (no unsatisfied needs).
entity Player:
  has Position
  has Velocity
  has Sprite
  is Visible
  is Moving

I still need to hash out some issues with time and control flow. Sean’s work is probably the thing to look at.

Also probably relevant,

Also probably relevant, Odersky's DOT.

Generalized: Some definitions nonlocal

Because all definitions are global, you can reference bar outside of the definition of foo. What you see is that bar is a function of x and y.

I've been thinking along these lines too. I would like to generalize this as a strategy which bar uses to capture the variable x.

For background, what I would use this for is to generalize markup and style languages. It's nice to write UI code in a way that structurally resembles the UI itself, because that way there's a visceral relationship between the way to interact with the program and the way to build and maintain the program. However, in a UI, almost every piece is open-ended, with many possible ways to extend it for accessibility, cultural localization, teaching, satire, layout with other UIs, etc. So I'm interested in declaring hundreds of modular extension points throughout a program without flattening the program structure across hundreds of module-level declarations.

What you're doing is related to that. You're exposing a module-level declaration based on a syntactically local definition of bar.

The reason I would talk about variable capture strategies is this: Once you have nonlocal variable declarations, you have expressions that can't actually use variables that appear in the vicinity, because these expressions are supposed to be meaningful at an outer scope where those variables don't exist. In your example, the global version of bar cannot use the local version of x, but the local version can, so the combined declaration can capture x in a different way than we're used to.

Your example could be macro-expanded to something like what I have in mind, and then simplified to something more traditional, like so:

(Sorry, this example may be very vague due to some unexplained notations I'm making up on the spot. Note that every (fn ...) here should have a &freeVarStrategy and a &paramScope for consistency's sake, but I'm leaving them implicit when they're not important.)

def foo(x)
    def bar(y) = x * y
    ...

#topLevel foo = fn (x) ->
    let bar =
        (#topLevel bar =
            (fn (y, &freeVarStrategy &locallyOptionalNamed) ->
                x * y))
    in
    ...

#topLevel foo = fn (x) ->
    let bar =
        (#topLevel bar =
            (fn (y, &paramScope #myParam) ->
                (#myParam &locallyOptionalNamed x = x) * y))
    in
    ...

#topLevel foo = fn (x) ->
    let bar =
        (#topLevel bar =
            (fn (y, &locallyOptionalNamed x = x) ->
                 x * y))
    in
    ...

#topLevel bar = fn (y, &named x) ->
    x * y
#topLevel foo = fn (x)
    let bar = fn (y, &optionalNamed x = x) ->
        bar(y, &named x = x)
    in
    ...

I don't think programmers will like to use the full generality of these scope gymnastics for most programs, because they often avoid JavaScript's var hoisting. Fortunately, like var hoisting, these gymnastics are all local to a lexical scope (e.g. all declarations of module-level variables are somewhere inside the module), so they're only potential obstacles or features for maintaining code, not for black-box program composition.

A wide open gap in this design is that I'm not sure what a strategy would actually be. Are there just a few of these strategies hardcoded in the language, or can they be defined as programs that manipulate first-class representations of the variables in scope? Do they have access to the variables' static type information? (If so, how does scope in a type expression work?) What about linear types and multiple occurrences of a captured variable? Meanwhile, the scope names (e.g. "topLevel" and "myParam") tempt me to have a system for scoping the scope names. Because of these gaps, I've been spending my effort elsewhere.

Something's wrong

I haven't grokked most of your post yet, but the following is not true for my language:

Once you have nonlocal variable declarations, you have expressions that can't actually use variables that appear in the vicinity, because these expressions are supposed to be meaningful at an outer scope where those variables don't exist.

Expressions are valid in a particular context (set of bindings) and because all parameters are globally available, you can always create a context that has the right bindings.

In your example, the global version of bar cannot use the local version of x

What does that mean?

Encapsulation of captured variables

Hmm, so for your language, what's the difference between declaring bar locally and declaring it at the top level? I was thinking the local version would encapsulate the value of x seen at bar's definition time, even if you passed bar into a higher-order utility that had its own x inside.

def foo(x)
    def bar(y) = x * y
    return map( bar, [ 1, 2, 3 ] )
def map(f, x)
    ...  // Can f be called with the right x here?

The languages newLISP and PicoLisp use dynamic scope, and I think they both make this scenario work by using namespaces. That way the x in foo and bar can be distinct from the x in map. Do you have something like this in mind?

(Edited to add: Oh, right, this is part of the topic of giving globally unique, qualified names to everything, so I can almost answer that question myself. I guess I carried myself off on a tangent regarding nonlocal variable definitions.)

If I want something that does encapsulate the value of x behind a programmatic interface, is that possible to build in your language (for better or worse)?

Once you have nonlocal variable declarations, you have expressions that can't actually use variables that appear in the vicinity, because these expressions are supposed to be meaningful at an outer scope where those variables don't exist.

Expressions are valid in a particular context (set of bindings) and because all parameters are globally available, you can always create a context that has the right bindings.

I think we might be using the word "variable" or "parameter" in multiple ways here. I'll try to elaborate...

I took your example to have Scheme-like lexical scope, except for a surprise: The local definition of bar is actually usable at the top level. So even though in Scheme the occurrence of x in bar would mean "the parameter to the call to foo where the definition of bar occurred," it actually can't mean that for all uses of bar. The definition of bar applies throughout the top-level, rather than to any particular call to foo.

One thing you're saying is that x means "the parameter named x in this call to bar." However, I'm hesitant to believe this is what it means for local uses of bar, since I think Scheme-like closure encapsulation would break, as seen in my map example above.

Map considered harmful

so for your language, what's the difference between declaring bar locally and declaring it at the top level?

I think you might have figured this out midpost, but the difference just involves what's in scope. It's a namespace issue.

That way the x in foo and bar can be distinct from the x in map.

Right.

If I want something that does encapsulate the value of x behind a programmatic interface, is that possible to build in your language (for better or worse)?

Yes, the point of this idea is just to decouple contexts (sets of bindings) from scopes (namespace management). If you want to block access to certain symbols from certain other parts of the program, that's doable. I have opinions on how this should work, but I won't go into them.

One thing you're saying is that x means "the parameter named x in this call to bar." However, I'm hesitant to believe this is what it means for local uses of bar, since I think Scheme-like closure encapsulation would break, as seen in my map example above.

I don't think that's what I'm saying. First, a nit: parameters are named at definition sites, not call sites (I'm sure you were speaking loosely). Anyway, try rewriting your example to use named parameters:

param x 
param y

def foo[x] -- square brackets indicates dependence on named parameters
   def bar(y) = x * y -- bar[x] is inferred

   return map where
       f = bar
       x = [1, 2, 3] -- Error: x already in context

param f
def map[f, x]
    ...

EDIT: Well, my original translation of your code was broken (mistyped), so I made some edits to make the map function take an ordinary style function with an anonymous parameter (with the round parens as syntax). These ordinary functions can be modeled by having a named parameter 'x' (or 'anon' or whatever), which is the same for all values of arrow type. Then 'def bar(y) = ...' converts the RHS, which is an expression depending on context y into a value that depends on this standard 'x' parameter. A function call expression like foo(bar) can be translated to (foo where x = bar). This name forgetting operation of shoving things into a lambda calculus is obviously frowned upon for general programming practice.

In this setup, it's clear that hoisting the definition of bar outside of the definition of foo wouldn't change the meaning of things, as no scoping issues are in play.

If you really want to shadow x in the context above, you need to be explicit about that. Also, there's no sharing of parameter x just based on its name. You had to decide to make map use the same parameter named x as did foo (a dubious decision, but hey, it's a toy problem). I think these two policies are almost a requirement of this programming style, lest it become an endless source of error.

Finally, since you brought up map, let me try to be provocative by claiming that map (and zip and others), as they are often used in functional languages, are bad style. Perhaps not as bad as the dip bloop blarp of concatenative languages, but still.

😒

Your named parameters seem related to a reader monad (or applicative). Do you have a formalization for them?

let me try to be provocative by claiming that map (and zip and others), as they are often used in functional languages, are bad style. Perhaps not as bad as the dip bloop blarp of concatenative languages, but still.

Okay, you've provocated. 😒

I use map, zip, etc. quite regularly, both on sequences and streams. They are very productive for quickly composing large behaviors (processes, data transforms, etc.) from smaller ones, i.e. without stopping to name every intermediate value. For similar reasons, I favor concatenative and point-free programming styles and data manipulation, even in non-concatenative languages (like C++ or Haskell).

Why do you believe this is bad style? Is it just personal preference, or do you have an objective reason?

😏

Your named parameters seem related to a reader monad (or applicative). Do you have a formalization for them?

Well, I have the sketch of a translation into my core logic, but it's straightforward. Instead of trying to think about how to model these things in e.g. Haskell, first consider how to translate to Haskell, which is much easier. Representing them in general is a meta-level problem.

Why do you believe this is bad style? Is it just personal preference, or do you have an objective reason?

My distaste is for using list/stream position to encode an ad hoc index into some data or tuple position to index ad hoc attributes of data. I can understand why you don't want to introduce new names for all intermediate results, but I don't think we need to. In the spirit of provocation (and because I don't want to spill on all my language choices this week), I'll leave it at that. 😏

Positional indices are great.

Positional indices have only one (relatively weak) disadvantage compared to symbolic indices: they aren't self-documenting to a human.

This disadvantage is mitigated considerably if we have types, because type descriptors offer an inferred kind of documentation. The disadvantage is mitigated further if we have graphical visualization; humans have good spatial memory for visual stimuli. And, of course, it is not difficult to alias a positional index using a symbol. Meanwhile, positional indices offer several advantages over ad-hoc symbolic indices: positions are dense, enumerable, computable, readily accessible for metaprogramming or refactoring. Positions also require no translation between human languages.

Of course, it is not difficult to model positional elements using symbols, nor vice versa. So the question is which we should encourage, which style we should favor. Do we want to emphasize marginal benefits of ad-hoc self-documenting structure, or would we prefer relatively simple metaprogramming and tooling even at the syntactic layer?

My own hypothesis is that named parameters will become vastly less useful as PL tooling improves beyond text editors.

But you forgot the killer

But you forgot the killer feature of names: they're sparse and thus support openness. Referring to 'x' in a variety of contexts is vastly preferable to having a different incantation to get at the 'x value' in each context. If you have a visual presentation that makes it clear which value is the 'x value' in each context and your editor automatically updates all of the various incantations that serve to get at that value throughout your code, then maybe the situation is not much worse than with names.

Not a killer

The ability to represent open data structures is a feature. So is the ability to represent closed data structures. Positional structures can be designed for either, without the disadvantages of 'sparse' symbols. One simply leaves a slot open, initially filling it with a default/empty value. Not so different from open hardware slots on your computer.

Referring to 'x' in a variety of contexts is vastly preferable to having a different incantation to get at the 'x value' in each context.

In most cases, we want to refer to values by the role they play in a particular interaction, not by absolute name. There is always a mapping step from values into roles - an 'incantation' that is specific to how a subprogram is invoked in a larger context. For the system you're proposing, this incantation will involve some form of ad-hoc renaming.

Mapping positional indices isn't difficult until there are a lot of positions to consider (more than ~5). And since there aren't so many positions, the mapping itself is subject to easy refactoring and reuse. For long-running roles, it is also not difficult to use a stable position off the main stack, so I don't see any particular disadvantage there.

If [lots of conditions] then maybe the situation is not much worse than with names

Even without an IDE (beyond a text editor and REPL), I don't consider the situation much worse than names. It is a lot different, but only a little worse. Concatenative style offers advantages that aren't part of your accounting.

There is always a mapping

There is always a mapping step from values into roles - an 'incantation' that is specific to how a subprogram is invoked in a larger context.

True, but the mapping step is implicit much of the time with named parameters. You could do something similar with types, though, as in the subthread between Sean, Jon, and Keean. I think you've mentioned related things before, too.

Even without an IDE (beyond a text editor and REPL), I don't consider the situation much worse than names.

With visual feedback from an IDE, I can imagine your system to be like vi: there's a learning curve but, once you've mastered it, it allows effortless and efficient interaction. Without visual feedback, I imagine it to be like vi, except with characters on the current line permuted in a way that changes from line to line according to byzantine rules. I'm exaggerating, of course, and am interested in trying your approach... particularly when you get visual feedback :).

We're re-covering well trodden ground now, though. Let me deescalate by retracting my negative claim about implicit indices being bad and replacing it with a positive one: I have in mind a nice paradigm that produces clean and flexible code that eschews map, zip, etc. and other forms of implicit ad hoc structural encodings.

Let me deescalate by

Let me deescalate by retracting my negative claim about implicit indices being bad and replacing it with a positive one: I have in mind a nice paradigm that produces clean and flexible code that eschews map, zip, etc. and other forms of implicit ad hoc structural encodings.

Is that something like array programming?

I think HOFs and callbacks are huge usability/understandability problems. If you could convert a map into a for loop, or just lift implicitly over a collection (so list.e.toString is the same as list.e.map(_.toString)), that would be a big win (like FRP does with time). But now I've gone back to array programming, and it shouldn't be surprising that lifting over time and space are relatively isomorphic (did I use that word correctly?).
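
Roughly, in Haskell terms (a sketch only; Employee, name, and employees are made up for illustration):

data Employee = Employee { name :: String }

employees :: [Employee]
employees = [Employee "Ada", Employee "Grace"]

-- what we write today: explicit lifting over the collection
names :: [String]
names = map name employees

-- the idea is that a "scalar" expression like employees.name would
-- implicitly desugar to the map above, the way FRP lifts ordinary
-- expressions over time-varying values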

This is too off topic, maybe you could start a new post?

If you could convert a map

If you could convert a map into a for loop, or just lift implicitly over a collection (so list.e.toString is the same as list.e.map(_.toString)), that would be a big win (like FRP does with time).

Yeah, this is definitely on the right track.

This probably isn't a very good format for me to try to pitch the idea, though. Let me try to put it in a blog post.

the mapping step is implicit

the mapping step is implicit much of the time with named parameters. You could do something similar with types, though

In my experience, it also is implicit much of the time for concatenative styles. This is part of what attracts me to concatenative style.

However, there is a caveat: we must encode compositional models, e.g. favor linear algebra instead of ad-hoc math expressions. When a model is not compositional, or if its composition is ad-hoc (as opposed to a small, universal alphabet of composition operators), it can be difficult to align the elements of the model with the concatenative language, and we end up doing a lot more data shuffling. OTOH, one of my explicit design goals is to pressure API developers to favor compositional models (in both the problem domain and solution space).
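
A small Haskell sketch of the contrast, with point-free composition standing in for concatenative composition (the combinators are invented for illustration):

scale :: Double -> [Double] -> [Double]
scale k = map (k *)

translate :: [Double] -> [Double] -> [Double]
translate = zipWith (+)

-- compositional model: a small alphabet of operators composes cleanly,
-- so the pipeline needs no intermediate names or data shuffling
pipeline :: [Double] -> [Double]
pipeline = scale 2 . translate [1, 1, 1] . scale 0.5

-- an ad-hoc expression such as a*x*x + b*x + c is not compositional in
-- this sense: x occurs three times, so a point-free rendering needs
-- explicit duplication and shuffling (or a local name for x)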

I am interested in type-driven approaches for 'smart glue' on the small scale, e.g. auto-wiring and deriving conversions. I have a few ideas there, but they'll not be exercised until I'm well beyond bootstrap.

Without visual feedback, I imagine it to be like vi, except with characters on the current line permuted in a way that changes from line to line according to byzantine rules.

Nah. It's more like Ed. You don't get to see the lines without explicitly requesting them, but they're readily accessible (i.e. via REPL), and users quickly gain a pretty good idea of what's where even without looking. But it is not the most user-friendly (and especially not beginner-friendly) of environments. I will continue to pursue the figurative upgrade to vi. Or even spreadsheets. :)

I have in mind a nice paradigm that produces clean and flexible code that eschews map, zip, etc. and other forms of implicit ad hoc structural encodings.

I can believe it. I look forward to you blogging about it. You should link your blog under your LtU profile.

Type-driven smart glue

I am interested in type-driven approaches for 'smart glue' on the small scale, e.g. auto-wiring and deriving conversions. I have a few ideas there, but they'll not be exercised until I'm well beyond bootstrap.

I've been working on that these last few months; see my blog posts from last winter on Singleton types for code inference. My impression for now is that while very strict type-driven approaches may work in very parametric libraries or dependently typed languages, for day-to-day code in current languages you will want a fuzzier approach based on scoring.

Or fuzzier type systems

Or fuzzier type systems based on scoring :)

Indeed

I also meant to mention your very pertinent remark that limiting oneself to well-typed guesses can be the wrong thing in a program that is still under construction, where the type level is therefore as likely to be partially incorrect as the term level.

Heuristic Techniques

The approach I'm considering is relatively simple, and does leverage heuristic techniques.

Developers would express a space of functions using a notation of the form `(foo | bar | baz)` to represent three choices, which may have different types. When such expressions are composed, the numbers of options multiply, so the space of candidate programs is exponential but finite. A compiler can easily filter this space for programs that have no obvious errors, including simple type errors and static assertion failures. It's up to developers to use idioms such that most errors are 'obvious'.

However, it isn't always feasible to filter the options down to a single choice. In that case, heuristics become involved.

Developers may annotate options with statically computable attributes, and may supply a separate heuristic function that takes a subset of attributes and estimates some bounds on the final score. Separating the heuristic function from the attributes is excellent for injecting high-level preferences and policies, or non-invasively exploring the solution space. I offer no guarantee that the 'best' solution is found, but I can typically find a few local maxima (a few good solutions) easily using hill climbing and iterated local search.
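
As a sketch of the shape of the search (exhaustive here rather than hill climbing, and with invented names):

import Data.List (maximumBy)
import Data.Ord (comparing)

type Option  = String     -- stand-in for one choice in (foo | bar | baz)
type Program = [Option]   -- one pick per choice expression

-- composing choice expressions multiplies the options (finite but exponential)
candidates :: [[Option]] -> [Program]
candidates = sequence     -- cartesian product of the choices

-- filter out programs with obvious errors, then rank the survivors by a
-- developer-supplied heuristic over their attributes
-- (partial: errors if no candidate survives the filter)
search :: (Program -> Bool) -> (Program -> Double) -> [[Option]] -> Program
search wellFormed score =
  maximumBy (comparing score) . filter wellFormed . candidates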

I've used variations on this technique before, mostly in plugin models (ways to select a subset of plugins that each require/provide a subset of services). One of my motivations was to support good defaults while allowing experimental plugins to take over. Another was automatically switching between GTK and Wx and other windowing models, based on external policy. I've been happy with the resulting flexibility, stability, control, and resilience.

Tagging definitions with domain-specific names

I'm not exactly sure this is what you mean, but if you mean that declarations in the environment would be tagged with domain-specific information (such as "distance", "remote", "fast", "stable"), and a query would then request that a combination of those features/names be present, then that is reminiscent of a line of work happening in Dortmund under the direction of Jakob Rehof; see for example Using Inhabitation in Bounded Combinatory Logic with Intersection Types for Composition Synthesis (Boris Düdder, Oliver Garbe, Moritz Martens, Jakob Rehof, Paweł Urzyczyn), 2013, on arxiv (good!).

Tags are useful. For

Tags are useful. For heuristics, weighted tags are also useful (e.g. the tags 'stable' or 'experimental' might have numbers attached). I did use weighted tags for my earlier plugin models.

My current language doesn't have a notion of 'declarations'. (It eschews both nominative types and variables.) Tags aren't really essential to my current design; I rely more on static assertions and type information. However, tags can still be useful, and may be modeled using structural conventions (such as pairing numbers with unit information) or discretionary value sealing (which operationally approximates nominative `newtype` declarations).

I'll check out the paper. Thanks.

Principal Typings

I am doing something similar, using a compositional variant of HM type inference. Function types look more like ML module types, capturing free variables as part of the type signature inferred for a code fragment:

f y z = (x z) y

has the type:

{x : (a -> (b -> c))} |- (b -> (a -> c))

Compositional typing helps because any code fragment is typeable in isolation, and any two code fragments can be composed. This copes with multiple occurrences of the same variable by allowing multiple copies of the variable to exist in the same environment; these are then either unified on abstraction, or generalised for polymorphic application.
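
For illustration, a compositional typing can be represented as the pair of the assumptions about free variables and the result type (a sketch only, not the actual implementation; unification and generalisation are omitted):

import qualified Data.Map as M

data Type = TVar String | TArr Type Type  deriving Show

-- a typing carries its own environment of assumptions about free variables
data Typing = Typing { assumptions :: M.Map String Type
                     , result      :: Type }  deriving Show

-- the typing of  f y z = (x z) y  from above, written as a value:
-- {x : a -> (b -> c)} |- b -> (a -> c)
fTyping :: Typing
fTyping = Typing
  (M.fromList [("x", TArr (TVar "a") (TArr (TVar "b") (TVar "c")))])
  (TArr (TVar "b") (TArr (TVar "a") (TVar "c")))

-- composing two typings unions their assumption maps, unifying the types
-- of any variable that appears in both (the interesting part, omitted here)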