A Growable Language Manifesto

This is my first LtU post, so I need a quick introduction. I've been an LtU lurker for years. My career has been very applications-focused, and the last compiler I wrote was in high school, but nonetheless I've always been interested in programming languages and their design. In previous lives I worked with Mark Miller (at Xanadu and Electric Communities) and Doug Crockford (also at Electric Communities).

Recently I've been seeing a large amount of interesting work on gradual typing, partial typing, aggressive alias analysis, type inference, static metaprogramming, extensible parsing, type qualifiers, modular grammars and compilers, and all kinds of other techniques. It seems to me that a lot of these fields of research are actually very synergistic. And moreover, it seems to me there's room for a language design that is based on these concepts from the ground up.

What would it look like to create an extensible grammar and declarative compiler environment that, for example, defined primitive integers as a modular type extension on top of a very small dynamic core, and built the whole language up incrementally? Or that supported adding multiple varieties of type qualifiers as an integrated part of the compiler's basic type analysis? Or that allowed type declarations to be inferred wherever possible, while letting the programmer choose whether to view the inferred types and/or augment them with explicit declarations (at module boundaries, perhaps)? In other words, what if you rolled all these different techniques together? What would you get?

I've written a fairly weighty blog post, A Growable Language Manifesto, that tries to relate these various fields in a bit more depth. The manifesto itself may or may not be persuasive -- I'm sure you'll tell me :-) But hopefully, if nothing else, the references will be interesting; I've tried to cite a fairly wide variety of sources (many of which have appeared here, but not collected).

I'm interested in any and all pointers to similar work. I'm not claiming originality here -- obviously these ideas are gaining traction in many places. I'm just suggesting they may work together remarkably well.

I also know that there are huge potential pitfalls with this kind of extensible system -- including rampant forking of the language (everyone creates their own mini-DSL and no one can read anyone else's code), inadequate explicit typing (if strong typing is optional people may not bother to use it at all and the inferencer may fall behind), and the "declarative penalty" (sometimes easy imperative optimizations or analyses are next to impossible to state declaratively).

But even still, the vision is compelling. (To me, anyway!) As with all manifestos, the goal is to be thought-provoking. I look forward to wherever the discussion goes!

Brief typo

I had a typo in the URL for about ninety seconds. Fixed.

What's really missing

I disagree with much of this, but I can understand where you're coming from because I believed as you did not too long ago. The more I read and the more software I write, the more I realize that extensible grammars, partial type systems, and all forms of unchecked dynamism and partiality are actually a burden.

I think there are only 3 features missing from statically typed languages which are preventing them from taking over the world:

  1. Standard support for safe dynamism and polytypism: by this I mean the ability to load and unload code and types dynamically, with the code properly checked against an expected type signature and safe failure semantics (a minimal sketch of the checked-retrieval part follows this list). Not a small thing to ask for, but very necessary. I group polytypism with dynamism, because many forms of dynamism can be achieved via polytypic functions. My personal opinion is that polytypism with runtime compilation subsumes all forms of dynamism, but I haven't proven that yet. ;-)
  2. Statically checked runtime compilation with garbage collected code: runtime compilation of some sort is becoming absolutely necessary for implementing certain types of efficient abstractions (embedded DSLs). This is somewhat related to #1, in that code loading can be achieved by runtime compiling a bytecode representation. MetaOCaml supports runtime codegen, but not garbage collected code; besides, it's still a research vehicle. In a non-multistaged language, this means the compiler is available at runtime as in Java/C#, but can be used safely. Oleg's recent Tagless Staged Interpreters for Simpler Typed Languages shows the way.
  3. Safe concurrency and distribution: there is currently no statically checked language with a good story for concurrency. JoCaml has just started up again, but they're missing important features. Concurrent ML has been around for a while, but who's really programming in that? And there's no distribution there. Manticore looks promising, but they're just getting started. STM is a non-starter, because I think the problems of concurrency and distribution are so closely related that you might as well solve both at once. This is also related to #1 and #2, in that distribution and mobility can be achieved via dynamism and runtime compilation.
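To give a minimal flavor of the "checked against an expected type signature, with safe failure" part of point 1, here is a sketch using Haskell's Data.Dynamic. It illustrates only the type-checked-retrieval half, not actual loading or unloading of code:

    import Data.Dynamic (Dynamic, toDyn, fromDynamic)

    -- A "plugin" value whose static type has been forgotten.
    plugin :: Dynamic
    plugin = toDyn ((+ 1) :: Int -> Int)

    main :: IO ()
    main =
      -- Recover it only by checking it against the expected type signature;
      -- a mismatch yields Nothing rather than a crash (safe failure semantics).
      case fromDynamic plugin :: Maybe (Int -> Int) of
        Just f  -> print (f 41)
        Nothing -> putStrLn "plugin does not have the expected type"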

All of the above features are provided by mainstream languages, albeit via unsafe, ad-hoc mechanisms, i.e., reflection, casting, runtime codegen, and threads and remoting respectively. No single statically typed language provides all of them, and any one feature that is available is incomplete, experimental, and/or unsupported. This is not intended as a knock against these languages, just a description of the reality that people who want to use these languages for general programming face.

If a language like OCaml maintains its current feature set and further provides the above features, there is absolutely no compelling reason to go with Java or C#, other than libraries; and we all know that the need for so many libraries is drastically reduced in a more expressive language like OCaml than in Java/C#.

Even if added incrementally, each above feature will attract a large swath of attention on its own, just as dynamic code loading is doing for the upcoming OCaml release. I think the answers to the above challenges are: 1. bytecode representation for code load/unload and polytypic programming via representation types à la RepLib, 2. multistaging with runtime support for code GC, or at least reifying the compiler at runtime via Oleg's Symantics interface, 3. Join calculus for concurrency and distribution. Each is simple, safe, and orthogonal to the other features. The whole is greater than the sum of its parts.
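For readers who haven't seen it, here is a cut-down sketch of what the Symantics interface looks like in Haskell (simplified from the paper's version; a compiling or staging back end would simply be another instance of the same class):

    -- One term, many interpretations: the term is written once against the
    -- interface, and the instance decides whether it is evaluated, printed,
    -- or compiled at runtime.
    class Symantics repr where
      int :: Int -> repr Int
      add :: repr Int -> repr Int -> repr Int
      lam :: (repr a -> repr b) -> repr (a -> b)
      app :: repr (a -> b) -> repr a -> repr b

    -- The obvious evaluator back end.
    newtype Eval a = Eval { runEval :: a }

    instance Symantics Eval where
      int n   = Eval n
      add x y = Eval (runEval x + runEval y)
      lam f   = Eval (runEval . f . Eval)
      app f x = Eval (runEval f (runEval x))

    example :: Symantics repr => repr Int
    example = app (lam (\x -> add x (int 1))) (int 41)

    main :: IO ()
    main = print (runEval example)   -- 42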

If you'll allow my soapbox for a moment, one issue which no language in widespread use gets right is security (E is not in widespread use ;-). Close the language's security holes as Emily does for OCaml and I couldn't ask for anything more.

As for extensible grammars, partial/optional typing, and any such form of non-structured extensibility: they degrade simplicity and undermine the use of code to communicate ideas. If programming has taught me anything, it's taught me that a language should be simple, expressive, and safe.

Try quantifying new features against those three properties. For instance, increasing the strength of typing can increase expressiveness and safety, but sometimes harms simplicity (see Haskell vs OCaml, or first-class labels); sometimes the tradeoff is worth it. Optional typing may sometimes increase expressiveness somewhat, but harms safety and simplicity. Extensible grammars again can slightly increase expressiveness while maintaining safety, but again degrade simplicity. Safe runtime compilation + safe dynamism is safe metaprogramming, so I'll agree with you on that point. :-)

Clean

Perhaps you should take a look at Clean.

I have.

I have.

1. No runtime compilation.
2. Concurrency is not integrated in the standard compiler (according to their language report). I'm aware of Concurrent Clean, but if it's not integrated, it's of no use to most people. That language report says they're working on distribution and concurrency based on Dynamics, so hopefully we'll see something soon.

I suspect the convenience of a portable bytecode may be a big bonus for most people, but it's probably not strictly necessary. I'm also somewhat skeptical that a lazy language will go mainstream, but I don't think it's a show-stopper if they can make it safe enough (by which I mean, the developer can reason about resource use).

I'm also skeptical of any non-free languages; given so many free alternatives, I think any language that requires a commercial license for commercial use will never become "mainstream". Which isn't to say the devs can't make a good living off it even if it stays a small player.

[Edit: section 8 on Dynamics says runtime compilation and linking are necessary for full Dynamics support, but a) full Dynamics are not available yet, and b) there's no way to control compilation à la staging.]

Clean is LGPL

I'm also skeptical of any non-free languages; given so many free alternatives, I think any language that requires a commercial license for commercial use will never become "mainstream".

Clean has a dual license. It's free under the LGPL. Note that the LGPL is more liberal than the GPL, under which many other systems are distributed.

If you don't want to be bound by the LGPL conditions you can buy a commercial license, for example if you don't want to distribute your changes as source code.

I was under the impression

For some reason I was under the impression that it was licensed similarly to Qt, LGPL for free software but with a commercial license for commercial software. I don't see that restriction on their licensing page though, so I believe you're right.

[A bit OT] Actual meaning of double licensing

Actually, when a product is dual-licensed with a free license and a commercial one, the adopter has more options, so it's hardly a restriction.
One option is choosing the free license: you have all its advantages and just forget about the commercial alternative. For you it could as well be just LGPL (or GPL or whatever the free license is).
On the other hand, in case you opt for the commercial license you can include the software in a closed-source product, which you may prefer for strategic reasons, company policies or because RMS stole your candies in kindergarten. In this case you are not bound by the obligation to share the source code present in the free license.

Cheers

Antonio

My understanding of the LGPL...

...is that, as long as you don't modify the Clean runtime itself (those pieces of copyrighted code which are included in a standalone compiled Clean application), you can publish programs written in Clean in pretty much any fashion you like (GPL, proprietary, etc.). Changes to Clean itself, or to its libraries and such, must be redistributed as source per the LGPL; otherwise you must purchase a license.

Is that how the Clean developers understand things? After all, the purpose of the LGPL is mainly to support free system libraries (OS, language runtime, etc.) which don't constrain applications written on top of them, while preventing proprietary forks of the underlying system.

3 and 4 from the perspective of mass adoption

3. no debugger and no plans for one. (see ghci.)
4. the ide is... i'll just say quaint, at best. (see eclipse support for erlang, scala, etc.)

Debuggers are somewhat

Debuggers are somewhat overrated, and safer languages reduce the need for such debuggers. Even a dynamically typed language like Ruby doesn't have a debugger, and it's garnering lots of attention.

I admit that "intellisense" support in an IDE is pretty darn useful, but if the language is safe and useful without it, lack of an IDE won't prevent wider adoption.

being pedantic re: Ruby

Ruby doesn't have a debugger

I think Ruby has had a debugger kinda forever via the "-rdebug" flag? And has even had development in that area to improve things over time.

Re: Debuggers overrated

It probably comes down to something of a semantics game regarding the idea of popular acceptance and up-take. My guess (based of course on my own list of things I require from a system) is that a system without a debugger won't be e.g. used for much serious production code these days. But I dunno that it can be much more than an "is too! is not!" kind of conversation?

There are only 3 features missing

naasking: I think there are only 3 features missing from statically typed languages which are preventing them from taking over the world

I totally agree with what you are saying - of course, because (sorry for the plug...) adding these features to an ML-like language in a safe and sane manner was precisely the motivation for the Alice ML project. :-) AFAICS, it provides most things from your list: type-safe module dynamism like you suggest, first-class components, lightweight concurrency and safe distribution, and also code pickling, mobility and garbage collection on the granularity of individual functions. What it doesn't have is staged compilation (but you can safely invoke the compiler at runtime), and full support for polytypism (it is there in a way, but not exposed to the programmer). And of course, we cannot really claim that it is a mature production-quality system.

[OT] Mac OS X PPC Alice Build

Andreas, when you can spare a moment, could you possibly e-mail me whatever you may know about building Alice from source on Mac OS X PowerPC? Thanks!

[OT] Email

Paul, did you get my mail yesterday? I wasn't sure about the email address.

Nice counter-manifesto, naasking :-)

Your reply was a not inconsiderable fraction of my manifesto, so I guess I succeeded in provoking some thoughts!

Edit: I see you already agreed with me that safe runtime compilation + safe dynamism = safe metaprogramming. So, yes, extensible static languages should support safe metaprogramming inherently. In fact, one of the goals of my manifesto is to point out the basic similarity between safe metaprogramming and static language extension -- they are inherently very similar.

I'm not familiar with the term "polytypism". Can you give some good references there?

Overall, I think one of my interests in extensible grammars / declarative compiler structures is just generally to facilitate easier development of compilers themselves. There's an interesting continuum between a compiler that is flexible enough to allow layered types (qualifiers, etc.) to be implemented using the compiler's own type structure, versus a compiler that is implemented with a declarative framework permitting new type systems to be implemented in that metalanguage. That's one of the other points of my manifesto: that extensible languages form a design space occupied at one point by the C++-oriented library-centric software design practitioners, and at another point by the extensible compiler work such as xtc, JastAdd, and Polyglot.

So your main issue with my manifesto seems to be that I'm advocating gradual / partial typing as a key component of a growable language. I agree this is probably the most contentious suggestion in there, and the one which raises the greatest risk, since it opens the door to voluntary divergence from almost all typing discipline. I think a more fundamental point is that a truly growable language should build all of its static typing infrastructure in an extensible and modular way, even if that infrastructure is not optional once constructed. (Another point is that partial typing enables undecidable type systems; I find the SAGE work on undecidable typing fundamentally interesting here.)

I'm not familiar with the

I'm not familiar with the term "polytypism". Can you give some good references there?

Polytypism is also known as "datatype-generic programming" and "generics" (not C# generics, which are really parametric polymorphism). Polymorphism generalizes a function to all types, and polytypism defines a generic function on the structure of types. So type-safe serialization, pretty printing, parsing, etc. are all polytypic operations. See: What is polytypic programming?
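For a concrete taste, here is the classic gsize function from the Scrap Your Boilerplate papers, written against GHC's Data.Data; one definition works over the structure of any Data type:

    import Data.Data

    -- Count the constructors in a value by recursing over the structure of
    -- its type; the same definition works for lists, tuples, trees, ...
    gsize :: Data a => a -> Int
    gsize x = 1 + sum (gmapQ gsize x)

    main :: IO ()
    main = do
      print (gsize ([Just 1, Nothing, Just 3] :: [Maybe Int]))
      print (gsize ("hi", True))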

I think a more fundamental point is that a truly growable language should build all of its static typing infrastructure in an extensible and modular way, even if that infrastructure is not optional once constructed.

Perhaps we have a disconnect in terminology then. This post may persuade you that a type system is essential to a language, rather than an add-on. If by "building a type system in a modular way", you mean being able to exploit the type system to verify certain invariants, you will be interested in Lightweight Static Capabilities, and "refinement types". For the latter, you have to start with a powerful type system.
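As a toy flavor of the lightweight-static-capabilities style (the module and function names here are made up for illustration): a small trusted kernel is the only code that can mint a checked index, so holding one is a lightweight proof that the check already happened.

    -- Hypothetical module: only this trusted kernel can construct CheckedIx.
    module SafeIndex (CheckedIx, checkIx, lookupChecked) where

    newtype CheckedIx = CheckedIx Int   -- constructor deliberately not exported

    -- The sole way to obtain the capability: pass the bounds check.
    checkIx :: [a] -> Int -> Maybe CheckedIx
    checkIx xs i
      | 0 <= i && i < length xs = Just (CheckedIx i)
      | otherwise               = Nothing

    -- With the capability in hand, no further checking is needed.
    -- (A fuller design would also brand the index with the particular list
    -- it was checked against, via a phantom type parameter.)
    lookupChecked :: [a] -> CheckedIx -> a
    lookupChecked xs (CheckedIx i) = xs !! i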

Scrap your boilerplate, eh?

I see. I actually am familiar with the extended generics work (I've read the Scrap Your Boilerplate series with great interest). Paraphrasing a bit, you seem to be saying that datatype-generic programming in that style amounts to traversing the abstract structure of a program, and hence is in some sense liftable to almost the status of a metalanguage (in which DSLs, new dialects, etc. can be defined). Very interesting proposition!

As for the "Curry vs. Church" post you link to, I'm not entirely clear that a powerful "Churchian" type system can't be built in an incremental way. That Atassonow post clearly states that the meaning of a program is defined by the typing derivation -- and, one might well say, the operational-semantics-based evaluation derivation -- of that program within that language. In this sense, defining an extensible language could be considered to be equivalent to defining an extensible operational semantics in which the features of the language were incrementally constructed and added. Again I cite the JastAdd framework, which could be considered to be exactly this sort of extensible infrastructure for incrementally extending the typing semantics of a language.

Now, this may not be an interesting project to you if you find highly evolved type systems (refinement types, dependent types) to be necessary for the kinds of software you want to build. And you may well be correct that such highly evolved type systems are necessary to achieve a level of expressiveness that obviates the apparent convenience of dynamic languages. As I said, I'm a language implementation novice; part of my interest here is in pondering whether an incremental approach to a language definition could lead to both a more declarative and readable implementation, and to an infrastructure that could scale to aggressive typing systems such as the ones you cite. It's very much an open question.

It's entirely possible that the resulting language would be cumbersome to maintain (the declarative style might wind up being excessively verbose), while having difficulty achieving the level of expressiveness you really need (assuming your thesis is correct). Still, it seems like a worthwhile experiment.

The Notion of an Extensible Operational Semantics...

...reminds me a lot of Semantic Lego. It also reminds me a lot of Figure 1 of Building Extensible Compilers in a Formal Framework [PDF]. I also see a connection to A Certified Type-Preserving Compiler from Lambda Calculus to Assembly Language (denotational semantics are the most easily extensible kind, I think) and to A Very Modal Model of a Modern, Major, General Type System ("Our approach is also novel in that we have decomposed the semantic types in such a way that the principles of safety, induction over the future, type preservation, and so on are each identifiable as a specific operator of the type system. Standard concepts such as (function) code pointers, subtyping, contractiveness, mutable and immutable references are constructed from these new primitives" and "Second, we enjoy the modularity and genericity of the semantic methods. We demonstrate this in the paper, with a nicely modularized proof of soundness which is applicable in principle to any calculus with a small-step semantics.")

Safety vs Flexibility

In this sense, defining an extensible language could be considered to be equivalent to defining an extensible operational semantics in which the features of the language were incrementally constructed and added.

...in a way that does not violate the invariants of the existing semantics, and so break existing code. This involves generating a proof, of which the typing is a weak one unless you're in a powerful language to begin with. This design space has been explored somewhat in Active Libraries and Universal Languages. It's a very powerful approach. You may also be interested in Formal Islands mentioned in the comments.

Now, this may not be an interesting project to you if you find highly evolved type systems (refinement types, dependent types) to be necessary for the kinds of software you want to build.

Oh no, at the moment I think GADTs hit the right trade-off between expressiveness and safety. Many useful patterns become viable with GADTs (representation types, straightforward message-passing and object encodings, etc.). Anything more powerful is still a tad too experimental and advanced for most people I think.
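(The standard small illustration of what GADTs buy: a typed expression language whose evaluator needs no tags, casts, or runtime checks.)

    {-# LANGUAGE GADTs #-}

    -- Ill-typed terms like AddE (BoolE True) ... simply cannot be constructed.
    data Expr a where
      IntE  :: Int  -> Expr Int
      BoolE :: Bool -> Expr Bool
      AddE  :: Expr Int  -> Expr Int -> Expr Int
      IfE   :: Expr Bool -> Expr a   -> Expr a -> Expr a

    eval :: Expr a -> a
    eval (IntE n)    = n
    eval (BoolE b)   = b
    eval (AddE x y)  = eval x + eval y
    eval (IfE c t e) = if eval c then eval t else eval e

    main :: IO ()
    main = print (eval (AddE (IntE 1) (IntE 2)))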

As I said, I'm a language implementation novice

As am I. I've been intensively investigating what's out there to try and find the right balance in the language I'm designing. When I considered all the incredible things I've read about, in the context of the software development I do on a daily basis, I formulated my "counter-manifesto". As mentioned there and in my previous stories here, I'm starting to focus more on safety.

I think the draw of dynamic languages is that the developer can run something immediately without being forced to pass the compiler's static analyses; I certainly felt annoyance when I first tried a real strongly typed language and struggled to get my program to compile... until it finally compiled and it ran without errors. I think such safety is necessary in the long run, as tests are weak existence proofs when dealing with large programs. I share your goal of expressiveness and readability, I'm just taking the path of strict typing and safety instead of partiality and flexibility; I can totally understand the draw of that flexibility, I've just been burned by it too many times (even by my own code!) to believe it's worth it anymore. Best of luck to the both of us. ;-)

representation types

are you able to give any references about representation types? It sounds interesting, and some searching hasn't turned up anything useful (that I recognise!)

See Stephanie Weirich's

See Stephanie Weirich's RepLib, and her publication page (many papers there, and her thesis). Polytypic programming was also investigated as "intensional type analysis" which is how I was introduced to it. Representation types show up in all sorts of different papers, like type-safe casts and type-indexed functions, all of which are a sort of polytypic/generic programming.
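A minimal sketch of the idea, in the spirit of RepLib and intensional type analysis (much simplified; real representation-type libraries are richer): a GADT whose values describe the structure of types, plus a function defined by recursion on that description.

    {-# LANGUAGE GADTs #-}

    -- A representation type: values of Rep a mirror the structure of the type a.
    data Rep a where
      RInt  :: Rep Int
      RBool :: Rep Bool
      RList :: Rep a -> Rep [a]
      RPair :: Rep a -> Rep b -> Rep (a, b)

    -- A type-indexed (polytypic) function: one definition, every representable type.
    gshow :: Rep a -> a -> String
    gshow RInt        n      = show n
    gshow RBool       b      = show b
    gshow (RList r)   xs     = "[" ++ unwords (map (gshow r) xs) ++ "]"
    gshow (RPair r s) (x, y) = "(" ++ gshow r x ++ ", " ++ gshow s y ++ ")"

    main :: IO ()
    main = putStrLn (gshow (RList RInt) [1, 2, 3])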

thanks

much appreciated

Quick Execution

I wonder if there is a more principled way to be able to quickly execute only partially type-correct code, perhaps slicing out a maximal type-correct fragment of a program, or replacing expressions hiding type errors with something like undefined that fails fast?
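Concretely, for a hypothetical program with one ill-typed call, the transformation I'm imagining would produce something like the following (Haskell used just for illustration):

    report :: Int -> String
    report n = "count: " ++ show n

    main :: IO ()
    main = do
      putStrLn (report 42)   -- the type-correct slice still runs normally
      -- the ill-typed expression (say, report "42") has been replaced by a
      -- fail-fast stub rather than rejecting the whole program:
      putStrLn (error "deferred type error: report applied to a String")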

Nice Manifesto

But be careful with expectations of extensible type features, metareflection, and dynamic typing. They conspire globally against type soundness and the universal properties that are desirable for program verification and concurrency.

Taken to their logical conclusion, you get things like LISP, Smalltalk, and C# with LINQ, and you lose sight of the extensional meaning of your code as it becomes intractably intertwined with its metareflective representation, pointer identities, the uncheckable exceptions it might throw, the interaction with macros and syntax extensions which may transform it into something entirely different, etc.

I wasn't advocating pure dynamism

I tried to be somewhat clear that a growable language (in my manifesto's characterization) should promote static metaprogramming, not unrestrained dynamic runtime self-modification, for exactly the reasons you cite.

It seems to me that extensible typing and static checking actually go hand in hand. After all, ownership type systems, integrated theorem provers (a la Matthew Might's LFA analysis, which you yourself provided the motivating example for!), and other kinds of augmented type systems are all conceptually extensions of the typechecker. So I'm confused about how those kinds of extensions fundamentally "conspire globally against... universal properties that are desirable for program verification". My entire point is that extensible type systems basically are layered program verifiers!

Also, I'm not clear on why you bundle Smalltalk together with C#/LINQ. Meijer is doing his best to formalize the C# / LINQ extensions and clarify their static meaning; they explicitly are not implemented by arbitrary runtime code modifications (as with Smalltalk / Ruby / Python metaprogramming). It's that kind of explicit, static internal transformation that seems to me to potentially enhance the language's extensional meaning.

Or are you saying that domain specific languages (e.g. SQL syntax mixed into VB.NET code) inherently compromise the extensional readability of your code? That's an interesting argument, and one that my manifesto explicitly avoids... as I said in the original post here, I'm aware that readability fragmentation is an inherent risk, but I'm not convinced that makes the challenge untenable.

Static metaprogramming

It comes down to a question of metaprogramming by AST translation versus metaprogramming by compile-time evaluation of values within the type system.

Would you have metaprogramming in the LISP style, exposing a representation of the program's abstract syntax tree to a compile-time evaluation phase that enables programmers to transform it?

I'm not a fan of that approach, because it obscures the meaning of the program. The reader has to consider not just the program text he sees, but whatever transformations are performed on it elsewhere to generate actual code that runs. LINQ queries can be transformed into iterator expressions or SQL queries submitted to a database. The decomposition is ad hoc, versus Haskell monad comprehensions which always translate to a specific set of calls to monad operations. So, to understand a comprehension, you only need to understand its underlying monad, not an arbitrary AST translation.
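For concreteness, a Haskell do-block or comprehension always desugars by one fixed rule into calls on the underlying monad's operations, whatever that monad turns out to be:

    -- The surface syntax...
    pairs :: [(Int, Int)]
    pairs = do x <- [1, 2]
               y <- [10, 20]
               return (x, y)

    -- ...always becomes exactly this, by a fixed translation; only the
    -- meaning of (>>=) and return varies with the monad.
    pairs' :: [(Int, Int)]
    pairs' = [1, 2] >>= \x -> [10, 20] >>= \y -> return (x, y)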

Or, would you expose metaprogramming through compile-time evaluation of certain program expressions using the same evaluation rules as the runtime? This is how many dependent typecheckers work; determining that array(t,i) is a subtype of array(u,j) requires compile-time verification that t is a subtype of u (an ordinary typechecking operation) and verification that i==j (which requires actual evaluation of values).

I like that approach, as it purely increases the expressive power of the language without reducing the soundness of the type system. The stronger the type system, the better. For example, a type system obeying the parametricity property (that, given a value of universally quantified type, you can use the value but can't derive any information from it) guarantees that you can't cheat and decompose arbitrary values by introspection. This is a key to security and gives more information about the behavior of a function from its signature alone.
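A rough Haskell rendering of the array(t,i) point above, with the length index encoded in the type (only a sketch, not full dependent typing):

    {-# LANGUAGE GADTs, EmptyDataDecls #-}

    -- Encode the length index in the type: Vec n a holds exactly n elements.
    data Z
    data S n

    data Vec n a where
      VNil  :: Vec Z a
      VCons :: a -> Vec n a -> Vec (S n) a

    -- Requiring i == j is now the typechecker's job: zipping vectors of
    -- different lengths is rejected at compile time, with no runtime test.
    vzip :: Vec n a -> Vec n b -> Vec n (a, b)
    vzip VNil         VNil         = VNil
    vzip (VCons x xs) (VCons y ys) = VCons (x, y) (vzip xs ys)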

MSP

A large swathe of Lisp-style metaprogramming can be covered by Multi-Stage Programming, especially as staged programs can be used to represent type-safe macros.

While dependent type systems are very powerful, adding useful syntactic abstractions, e.g. Haskell do notation, isn't really their forte.

I was definitely thinking

I was definitely thinking more of LISP-style metaprogramming, and you make an intriguing case against it. I'm not entirely convinced. I need to read up on Haskell monad comprehensions to understand more of why you consider them an easier extension mechanism -- it seems to me that a sufficiently clear declarative structure for AST transformations could approach Haskell monad comprehensions in readability, which seems to be your main concern here. (I'm not clear on what is "ad hoc" about a declaratively specified AST translation. If the translation is buried deep in the guts of an imperative compiler, I could see how that is "ad hoc", but getting away from that kind of language implementation is one of the main goals of this manifesto.)

I'm also slightly confused by how compile-time evaluation increases the expressive power of the language. Dependent types can be typechecked at compile-time or at runtime. Flanagan et al.'s SAGE system implements sophisticated range types. These are checked at compile time where possible, and at runtime elsewhere. But in either case, the type system's expressiveness is the same. You could turn off all compile-time evaluation and you'd still get the exact same safety guarantees and type system expressiveness. In other words, it seems to me that compile-time evaluation increases efficiency, but not expressiveness -- the expressiveness is built in, and the compile-time evaluation simply lets you use it without inordinate cost. Maybe you're saying that compile-time evaluation is critical to making a rich type system actually usable (e.g. without efficiency, the expressiveness is unusable)...?

Just out of curiosity, what do you think of Matt Might's work, particularly the LFA paper I cited above? Since you provided the motivating example, I'm curious what you think of his attempt to solve your problem :-)

Safety Guarantees, Quantifiers

You could turn off all compile-time evaluation and you'd still get the exact same safety guarantees and type system expressiveness.

Only if you don't think it is a safety guarantee to know that your functions will actually succeed rather than raising a type error.

I'm not sure how you could even check properties with nontrivial universal quantification at runtime. The value of a sound dependent type system is knowing that a function is the proof it promises to be without even running it, not that a little bit of normalization happens during type checking. What kind of runtime check could establish that a proposed lock-ordering function is actually a total order? How about checking injectivity?

I was under the impression...

LINQ queries can be transformed into iterator expressions or SQL queries submitted to a database. The decomposition is ad hoc, versus Haskell monad comprehensions

I was under the impression that LINQ queries are transformed into standard C# code via a set of rules conceptually similar to the rules for Haskell's "do" (albeit more complicated and less well typed). That's what the spec seems to say.

The C# 3.0 language does not specify the exact execution semantics of query expressions. Rather, C# 3.0 translates query expressions into invocations of methods that adhere to the query expression pattern.

The decision as to whether the result is a SQL call or iteration is left to the object being targeted just as in Haskell where the meaning of ">>=" and "fail" are left to the monad.

If I Understand Correctly...

... Tim's point is that monad comprehensions follow the monad comprehension calculus, and so an important amount of orthogonality with respect to implementation is enforced effectively for free (see Monad Comprehensions: A Versatile Representation for Queries (PDF)): there's a sense in which "the meaning of '>>=' and 'fail' are left to the monad" is accurate, and a crucial sense in which it is not: a monad must fulfill the monad laws, and so the ways in which functions involving monads can be composed is extremely well-defined.

"Methods that adhere to the query expression pattern" does indeed sound similar, of course. The question boils down to how that adherence is enforced, and for those of us familiar with the benefits of strong static typing and abstractions such as monad comprehension, the word "pattern" suggests the kind of hopefully-correct-by-convention workaround for the lack of genuine abstraction enforcement that patterns tend to consist of in other contexts.

Limited typing

a monad must fulfill the monad laws

But they don't as far as monad comprehensions are concerned. They just have to have the right type signatures. Fulfilling the monad laws is optional. Of course, breaking those laws likely leads to nonsensical results - I'm just saying that Haskell's monad comprehensions couldn't care less.
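For reference, the laws in question, written as ordinary (unchecked) Haskell properties; the type system only ever sees the signatures of return and (>>=), never these:

    -- The three monad laws as QuickCheck-style boolean properties.
    leftIdentity :: (Monad m, Eq (m b)) => a -> (a -> m b) -> Bool
    leftIdentity a f = (return a >>= f) == f a

    rightIdentity :: (Monad m, Eq (m a)) => m a -> Bool
    rightIdentity m = (m >>= return) == m

    associativity :: (Monad m, Eq (m c)) => m a -> (a -> m b) -> (b -> m c) -> Bool
    associativity m f g = ((m >>= f) >>= g) == (m >>= \x -> f x >>= g)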

Conceptually LINQ seems to be about the same. The type signatures are a bit more loose, granted, but Haskell's monad comprehensions are only making a slightly stronger semantic guarantee in the end.

[Edit: Also, I'm fairly certain that Tim was saying that the decomposition of a LINQ query depends in some way on the type of query being performed. I was just clarifying that, as far as I know, the decomposition rules are always the same.]

Good Point

It's true that Haskell can't, unaided, enforce the monad laws. Nevertheless, for some reason (cultural, historical, etc.) it does seem to be the case that, in Haskell, people can and do rely on the monad laws being fulfilled, and this allows a wide variety of compositions of code from different developers that behave exactly as expected. So even lacking enforcement, I continue to take Tim's point to be that, because the laws are at least articulated and employed in Haskell's standard libraries, this level of orthogonality is found there, whereas when you find it in LINQ, you're mostly getting lucky, if for no other reason than that LINQ is embedded in a much larger framework, the majority of which predates the interfaces necessary to ensure LINQ's proper behavior.

Of course, this is not necessarily an accurate interpretation of Tim's point, nor, assuming that it is an accurate interpretation, is it necessarily the case that it's a particularly good observation about LINQ. But I think you hit the nail on the head when you said "The type signatures are a bit more loose, granted..." which would seem to me to fall squarely into Tim's "The stronger the type system, the better" point. In other words, I don't actually see a disagreement here. :-)

Haskell can enforce the

Haskell can enforce the monad laws via the observer function in the Unimo framework, which I covered in a recent post. Haskell can't enforce the monad laws denotationally, but Unimo builds an operational monad framework.

Very Nice

Thanks for the reminder! I will have to look at Unimo more closely.

Counterexample?

Nevertheless, for some reason (cultural, historical, etc.) it does seem to be the case that, in Haskell, people can and do rely on the monad laws being fulfilled, and this allows a wide variety of compositions of code from different developers that behave exactly as expected.

ListT creates non-monad instances of Monad, and it's in the standard library. (Though maybe your point was that people do in general, not that there aren't exceptions)

Good Point

I should have offered the caveat that it seems to be generally understood that monad transformers don't necessarily create law-abiding monads and that, in general, monad transformers don't necessarily compose, which is why there's active research into other means of achieving the same semantic ends.

I suspect...

LINQ queries can be transformed into iterator expressions or SQL queries submitted to a database. The decomposition is ad hoc, versus Haskell monad comprehensions

I was under the impression that LINQ queries are transformed into standard C# code via a set of rules conceptually similar to the rules for Haskell's "do" (albeit more complicated and less well typed). That's what the spec seems to say.

I wonder if Tim was referring to a different feature of C# 3.0 that is crucial to the way that LINQ gets implemented. A closure expression in C# 3.0 can either evaluate to a delegate of the appropriate type, or it can evaluate to an expression object that is a representation of the AST for the closure. The choice is made automatically as part of type-checking.

LINQ implementations that interact with, e.g. SQL will then take this AST representation of the query predicates/operations and translate them into the lower-level query language for the target DB engine. I think this is the sense in which the user has to understand the transformations that take place before generating the actual code that runs.

If we on LtU have trouble noticing/remembering that LINQ relies on runtime transformation on ASTs to work, then maybe that is an argument that the approach is altogether too implicit.

[Disclaimer: I haven't read deeply on C# 3.0 in a while, so any or all of these details may have changed in the interim.]

Hmm, I may be recapitulating history

Many of the points in this "soft typing" thread parallel some of the arguments made in my manifesto. The "partial typing" work of Flanagan et al. seems to be exactly the kind of prove-what-you-can, cast-when-you-can't soft type system that Curtis W. didn't provide an example of in that LtU thread.

There is definitely an unavoidable tradeoff between mandatory strictness (which enforces a uniform type discipline, at the cost of convenience for some programmers), and flexible/soft/gradual/partial typing (which lets the programmer choose their own level of typing discipline, at the cost of consistently available type information). However, this entire tradeoff may be orthogonal to the question of whether a complex and expressive type system can be modularly constructed in a declarative manner. My manifesto posited that the static/soft tradeoff was related to the incremental-construction possibility, but perhaps that's not so.

Still, it's worth remembering that the soft-typing approach and the mandatory-static-typing approach become harder to distinguish when looking at analyses that are undecidable. If you care about a uniform level of typing discipline, with consistent adherence to a clear and enforceable typing structure (for reasons of documentation, maintainability, and extensional clarity), then undecidable type systems might themselves be anathema to your goals. Still, it seems inevitable that once you've proved everything that can be guaranteed to be statically provable, all that remains is proofs that CAN'T be thusly guaranteed....

Recapitulating History is Good!

Adding to it is better. :-) The recent work on gradual typing is exactly the sort of thing I'd been hoping for in previous posts of mine where I wished for a "typing slider" that could be set, ideally, on a case-by-case basis (presumably with annotations), or across a whole program. I'm certainly not prepared to give up on the idea, even if today I lean strongly towards the admittedly conservative type-inferencing and type-checking capabilities of today's most promising static type systems.

A type system that might not terminate being anathema, by the way, would already make C++ anathema, so I don't really see that as an issue. On the contrary: when I know I'm using a Turing-complete type system, I know that whatever I want to have the type system prove for me is either possible or not in that type system, and then I can just worry about how much pain I'm willing to undergo to have the type system make the guarantees that I want (hint: in C++, the answer is "not much," but that's much less true for, say, OCaml or Haskell, and if I'm really serious I'll work in Coq, where the only thing, realistically, preventing me from expressing what I mean in my types is my ignorance).

Still, it seems inevitable

Still, it seems inevitable that once you've proved everything that can be guaranteed to be statically provable, all that remains is proofs that CAN'T be thusly guaranteed....

Indeed, but as others have pointed out here before, the question remains whether there are actually interesting programs in that set. I think many here would say there aren't.

I would suggest to just

I would suggest to just follow Rob's idea and build what Paul Snively calls a "type slider" for a new language or a set of experimental languages. Researchers could then evaluate the behaviour of programmers using those languages and do experimental science instead of just leading scholarly discussions. I wouldn't be surprised if the results contained some surprises - something intellectual ping-pong is never going to yield.