Omega - Language of the Future

When I discovered Tim Sheard's Languages of the Future, I realized that PLs do indeed have a future (beyond asymptotically approaching CLOS and/or adding whimsical new rules to your type checker). Compared to languages like Lisp, pre-generics Java, and Python, the "futuristic" languages like Haskell and O'Caml seemed to mainly offer additional static verification, and some other neat patterns, but the "bang for the buck" seemed somewhat low, especially since these languages have their own costs (they are much more complex, they rule out many "perfectly fine" programs).

Ωmega seems like a true revolution to me - it shows what can be done with a really fancy type system, and this seems like the first astounding advancement over existing languages, from Python to Haskell. Its mantra is that it's possible to reap many (all?) benefits of dependent programming, without having to suffer its problems, by adding two much more humble notions to the widely understood, ordinary functional programming setting: GADTs + Extensible Kinds.

Sheard and coworkers show that these two simple extensions allow the succinct expression of many dependently-typed and related examples from the literature. Fine examples are implementations of AVL and red-black trees that are always balanced by construction - it's simply impossible to create an unbalanced tree; this is checked by the type-system. It seems somewhat obvious that sooner than later all code will be written in this (or a similar) way.

How to understand this stuff: my route was through the generics of Java and C# (especially Featherweight Generic Java, FGJω, and A. Kennedy's generics papers). Once you understand basic notions like type constructors, variance, and kinds, you know everything you need to understand why GADTs + Extensible Kinds = Dependent Programming (and, for that matter, esoteric stuff like "polytypic values have polykinded types").
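To give the idea a concrete flavor, here is a minimal sketch in GHC Haskell rather than Omega syntax (the names Nat, Vec, vhead are mine, not from the paper): DataKinds plays the role of extensible kinds by promoting a user-defined datatype to the kind level, and a GADT then indexes vectors by their length, so that taking the head of an empty vector is a type error rather than a runtime one.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- A user-defined kind of type-level naturals; DataKinds promotes
-- this datatype to the kind level (Omega's "extensible kinds").
data Nat = Z | S Nat

-- A GADT of vectors indexed by their length.
data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- vhead is total: applying it to VNil is rejected by the type checker.
vhead :: Vec ('S n) a -> a
vhead (VCons x _) = x

-- The index can always be forgotten when we don't need it.
vtoList :: Vec n a -> [a]
vtoList VNil         = []
vtoList (VCons x xs) = x : vtoList xs
```

Omega goes further by allowing type-level functions over such user-defined kinds (e.g. addition on Nat, to type vector append), but the ingredients are the same two extensions.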

It is my belief that you must understand Ωmega now! Even if you're never going to use it, or something like it, you'll still learn a whole lot about programming. Compared to Ωmega, other languages are puny. ;P

It seems somewhat obvious

It seems somewhat obvious that sooner than later all code will be written in this (or a similar) way.

I'm not totally convinced. I think abstract interpretation hasn't had nearly enough exploration, and could yield a programming language vastly simpler yet nearly as powerful as any dependently typed language or an approximation thereof.

Types aren't the only way!

But AI is not some kind of

But AI is not some kind of magic bullet. To use it in the types of examples in the paper it is still necessary to define abstract domains in such a way that the analysis can terminate with a sufficiently precise value. For complex domains this becomes quite a substantial programming task and the question that naturally arises is how to express those conversions and the rules for abstract analysis.

In short, you'd need some kind of extension to the syntax and semantics of a language to really support AI as a system for proving static properties. Without designing such a system it is kind of hard to say whether or not it would look as elegant as Omega. It also seems worth noting that the computations over kinds / types are already a form of AI for the lower layers in their hierarchy.

After a quick skim through the paper: wow! This does look very nice. We used dependent types a while ago to model quite a complex domain, and as the system became more and more precise we ended up writing larger and more complex mappings over the type parameters. In the end we used a small constraint language to make things work, the approach in Omega seems a lot more elegant and complete.

Oh, I know it's not a magic

Oh, I know it's not a magic bullet, my point was simply that AI has a lot of unexplored potential. The push towards making dependent types more friendly is great, but we shouldn't ignore other possible avenues.

To what extent are AI and types the same thing?

Types give a kind of abstract interpretation. And many kinds of abstract interpretation can be achieved nicely by appropriate use of types.

Types as Abstract Interpretations

This reminds me of the paper by Cousot: Types as Abstract Interpretations

The LtU thread discussing

The LtU thread discussing this, and other papers on AI and type checking.

As for types vs. abstract interpretation: yes, types would be involved even in a language using AI, and type checking and AI are formally connected. However, a type system weaker than dependent types is inherently limited in the properties it can prove.

The paper on Near-concrete Program Interpretation is designed to gather and export the value bindings generated from an abstract interpretation of a program, including values generated using general recursion, so that they can be imported into a theorem prover, and even more powerful properties of a program can be reasoned about.

This single, uniform analysis further permits flow, context and path sensitive analyses. I'm not aware of any single type system that allows this breadth of analysis, particularly in the presence of general recursion.

Dependently-typed data structures and reusability

Fine examples are implementations of AVL and red-black trees that are always balanced by construction - it's simply impossible to create an unbalanced tree; this is checked by the type-system. It seems somewhat obvious that sooner than later all code will be written in this (or a similar) way.

Users in the Coq community are experimenting with and discussing the use of dependently-typed data structures, and your position is not widely accepted. I have heard several times that dependently-typed data structures can be considered harmful for code reuse.

Consider your AVL example: you define a new data structure that is an "AVL-balanced tree" by construction, and a different data structure that is a "red-black tree" by construction. But in some cases you want to treat both as ordinary trees. For example, you will have to define a "size" function for each of these data structures, even though it is essentially the same operation. If you do things this way, you get code duplication.

A popular approach is to implement the data structure as a simple data structure, without dependent types embedding "proofs" inside terms, and to write proofs about its behaviour under particular uses separately; for example, you would write an "AVL" and a "red-black" predicate that check whether a tree is in fact AVL- or red-black-balanced. You can write a "cons-AVL" function that guarantees that it preserves AVL-ness, and another cons function that preserves red-black-ness. But you only have to write one size function. The general motto is to "separate computation and proofs".
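The "separate computation and proofs" style can be sketched even without dependent types (plain Haskell here; Tree, size, isAVL are illustrative names, not taken from any of the cited developments): one simple tree type shared by all clients, with the balance invariant expressed as an ordinary predicate rather than baked into the constructors.

```haskell
-- One plain tree type, shared by AVL, red-black, and any other use.
data Tree a = Leaf | Node (Tree a) a (Tree a)

-- Written once; no per-invariant duplication.
size :: Tree a -> Int
size Leaf         = 0
size (Node l _ r) = 1 + size l + size r

height :: Tree a -> Int
height Leaf         = 0
height (Node l _ r) = 1 + max (height l) (height r)

-- The AVL invariant as a separate predicate: in Coq this would be a
-- Prop, and "cons-AVL" operations would carry proofs of preservation.
isAVL :: Tree a -> Bool
isAVL Leaf         = True
isAVL (Node l _ r) =
  abs (height l - height r) <= 1 && isAVL l && isAVL r
```

In the by-construction style, by contrast, the invariant lives in the types of the constructors, and each indexed tree type needs its own size.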

This is only one of the positions regarding dependent types. Other Coq users have vocally pushed dependent types programming in Coq; see in particular Adam Chlipala's book : Certified Programming with Dependent Types.
Other approaches to dependent types may resolve the code duplication issue I described. I find the "generic approach" promising; the idea is to define data structures describing types, along with a function projecting these values into the world of types (note the crucial use of dependent types here). You can then define generic algorithms that operate on the reified data structures (see the Generic chapter of the CPDT book). This approach, when pushed further, might reconcile dependent types and reusability.

More generally, I think that when you add more expressiveness to your type system, you discover more ways to express things, and it is often difficult to evaluate the pros and cons of the different techniques and settle on a harmonized use of the given features.
In ML and Haskell, for example, there is a tension between the use of direct data types and the corresponding Church encodings of data. Should your function expect a list of items, or the corresponding fold/enumerator? You'll probably find software libraries for both representations (filter on lists and on folds, etc.).

Similarly, the "Programming in Omega" paper compares two different ways of witnessing types at the value level (3.5: Witnesses vs. type functions): (Even n) vs. (Proof {even n}).

Witnesses and type functions express the same ideas, but can be used in very different ways. [..] A big difference between the two ways of representing properties is the computational mechanisms used to ensure that programs adhere such properties.

These "more than one way to do it" situations create software engineering challenges that have yet to be discussed, understood and solved. I think that's one of the main challenges facing these type systems as the "language of the future".

Thanks

Thanks for the insightful comment and links! What I find so attractive about Omega is its parsimony of concepts: anyone who understands generics at some depth can understand the basics of Omega. IMO Omega is great precisely because it shows the power of Curry-Howard in such a simple setting. Solving pragmatic issues, such as the code reuse you mention, may well demand a lot more work, but that doesn't detract from Omega's eye-opening power. GADTs + Extensible Kinds seems to me like a "local optimum", where some benefits of dependent programming can be had without literally requiring a PhD.

Scaling precision

I agree completely. What's to follow is all pretty hand-wavy, so please excuse the quotes.

I think another way of solving the reusability problem might be to start by looking at the way people are proposing to interface statically typed languages to dynamic ones (see, e.g., Well-typed Programs Can't Be Blamed), and adapting it to interfacing "less statically" typed languages to "fully dependently" typed languages - or, rather, doing so within the same language: something like extending subtyping to allow upcasting ("approximating"/generalising) code that is "too precise" for reuse.

Of course, the need for casts is usually an indication of a lack of polymorphism, so, rather than casting away the precision, maybe we should allow polymorphism in the "precision" of these proofs, or the proofs themselves, or something else :-) (which, I guess relates to your paragraph about the Generic chapter in CPDT).

This is one of those things

This is one of those things that are cool, fundamentally simple, and yet can be utterly confusing. I think a quick, perhaps even bulleted, introduction would be helpful, and I don't recall seeing one. Something like: what is this good for (we ensure statically important properties/invariants); what is the shortest, simplest example; how is it possible (i.e. how can we do this statically); how is this done (via the type system); what is the cost to me, the programmer; where can I download it.

Why do type-level computations have to terminate?

Omega requires type-level functions to behave like term rewriting systems, with termination being a requirement. But why do type-level functions have to terminate if value-level ones don't have to? Is the reason for this the (implicit) paradigm of ahead-of-time compilation?

Let's say I allow type-level functions to be Turing-complete. Obviously now the compiler or type system may get into an infinite loop. But let's say I can live with that (in fact, procedural macros in Lisp and Scheme already introduce a Turing-complete compile-time). In interactive use, I'd use ^C when I encounter a type-level bug that causes an infinite loop in the compiler.

Do I lose guarantees of the type system when I allow Turing-complete type-level functions? It seems to me that existing Omega programs should type-check and run just fine and with the same semantics in a system where type-level functions may be Turing-complete. Or am I missing something?

Strong Normalization

Strong normalization is useful for logical soundness, not for some kind of practical guarantees about actual evaluation.

I'm not sure how often it matters at the type level, but in a total language it's not necessary to evaluate terms with types like "Equal a b" at all. The only reason a language with general recursion has to evaluate such a term (and perhaps thereby fall into an infinite loop) is that it allows infinite loops, so it has to evaluate the term to be sure it's really an equality proof and not just an infinite loop.

Here's a modest example showing that "total" doesn't mean you can run a program in any practical sense:

Fixpoint iterate (op : nat -> nat) (base b : nat) :=
  match b with
  | 0 => base
  | S b' => op (iterate op base b')
  end.

Fixpoint arrow (k a b : nat) :=
  match k with
  | 0 => a * b
  | S k' => iterate (arrow k' a) 1 b
  end.

Definition grahams_number := iterate (fun n => arrow n 3 3) 4 64.

Non terminating type-level -> Unsoundness?

Sorry for asking questions that may be naive, but do non-terminating type-level functions invariably cause unsoundness of the type system?

(Practically, a program for which the type-level runs into an infinite loop would never be run: since the compiler doesn't return, there's no way to run the executable.)

Type:Type

The Type:Type system first described by Martin-Löf is famously non-terminating; however, it remains type-safe by a general theorem concerning type safety of Pure Type Systems. The situation is similar if you add datatypes and general recursion. However, in addition to losing decidability of type checking, you also lose the advantages of the Curry-Howard logical view: types can no longer be seen as propositions in a consistent logic.

An example: in a dependent type system with general recursion (e.g. Epigram :) we may define the type

OrderedList = {l:List | is_sorted l}

But then ([1,0], proof_of_False) would be a valid inhabitant of this datatype, which, as you may suspect, seriously diminishes the value of using dependent types to specify the behavior of our programs.
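The same phenomenon can be shown in GHC Haskell (a sketch; Even, bogusProof, toNum are my names, not from the thread): with general recursion, any "proof" type is inhabited by a non-terminating definition, so a proof term means something only if it can actually be evaluated down to constructors.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Nat = Z | S Nat

-- A GADT of evenness witnesses: Even n is inhabited by a genuine
-- proof only when n is even...
data Even (n :: Nat) where
  EZ :: Even 'Z
  ES :: Even n -> Even ('S ('S n))

-- ...unless general recursion lets us "prove" anything: this claims
-- that one is even, and it type-checks, but it never evaluates to a
-- constructor.
bogusProof :: Even ('S 'Z)
bogusProof = bogusProof

-- Forcing a proof down to constructors is what distinguishes honest
-- proofs from loops (here, recovering the number being witnessed).
toNum :: Even n -> Int
toNum EZ     = 0
toNum (ES p) = 2 + toNum p
```

In a total language bogusProof would be rejected, which is exactly why a term of type Even n can there be trusted without being run.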

To put it naively...

The whole point of typechecking, whether it's Fortran's or Haskell's, is to be able to find errors by a process that involves only deductive reasoning, not debugging. If your typechecker is Turing-complete, finding type errors becomes as hard as finding errors at the value level, so what's the gain?

Staging

C's preprocessor, Scheme macros, and C++ templates are all Turing-complete. This isn't as troublesome as you might think. It only becomes a problem if we have no expectations about how long the typecheck, at a given layer or stage, should run.

Our 'gain' exists insofar as we use the stages differently.

Is the C preprocessor Turing-complete?

I was under the impression that it couldn't encode general recursion. A quick Google search seems to support this, as TM simulators have used an external tool to close the loop. I'm just curious, as this doesn't affect the point you were making.

Errata

I was reporting what I'd heard, but digging mostly turns up the external tool used to close the loop. So it seems I was in error on that premise.

cpp

I recall from the way-back-machine that the mini-m4 macro processor in "Software Tools in Pascal" (Kernighan and Pike? - don't recall exactly) supported looping via recursion, so I figured that cpp likely would as well.

Cf. Epigram

How does this compare with Epigram?

Re: Epigram

AFAIU, the point of Omega is to keep the standard FP model, and only augment it with some more or less modest extensions, whereas Epigram requires a very different programming model.
