Is there a language with the ability to write arbitrary type functions?

I'm looking for a usable, fully implemented language that supports a kind of 'programming in types'. That is, it should allow the definition of type functions that take types as parameters, return values that are (existing or new) types, and support compile-time logic to implement those type functions. The generated types should of course then be used to instantiate functions that use those types.

I know that some of this is possible in C++, but the underlying type system in C++ is not really suitable.

I know that anything is possible in Lisp, but again I'm really looking for a language with a modern type system and functions on those types at compile time.

For background, this is in support of my Andl project link here. The need is to define generic operations on tuple types as they are transformed by the relational algebra, with full type inference.


Constructive

Maybe what you are looking for is the constructive programming paradigm.

People have been struggling with this one for decades. I have not seen any implementation that would be good both for (1) programming and (2) theoretical knowledge representation. The nearest usable form I've seen is the functional programming paradigm, but it is not as popular as I'd like it to be. Maybe the reason is that FP lacks a module structure (it reminds me of the beginning of procedural programming, where all functions were on the same plane). OOP has nice modularization, and I think that is the main reason for its popularity. People like fancy trees, I guess, and OOP has a head, a body and a tail, which sounds right. FP these days still sounds messy, and I believe there is a lot of room for improvement.

(By the way, your link for Andl is not working, but I got there anyway by "view HTML source" and saw the error. If you replace the "<a http=...>" part with "<a href=...>", the link should work.)

[Edit]
Java uses generics for what you describe; maybe it's powerful enough for you. You just pass types as arguments all around and they get type-checked on later use.

Generative generics

Generative generics would just about do it. If T1 is tuple(a:int, b:string) and T2 is tuple(b:string, c:double), then the result of a natural join is of type T3 = tuple(a:int, b:string, c:double). I know that, but I can't tell the compiler that in Java, except by writing it out longhand or generating source code. I'm looking for enough metaprogramming to do that generically.
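To be concrete, the compile-time computation I want looks roughly like this sketch in GHC Haskell with closed type families (Member and Union are illustrative names, not anything standard; fields are type-level (name, type) pairs):

{-# LANGUAGE DataKinds, TypeFamilies, TypeOperators #-}

import Data.Kind (Type)
import Data.Type.Bool (If)
import Data.Type.Equality
import GHC.TypeLits (Symbol)

-- Does field name n occur in the field list xs?
type family Member (n :: Symbol) (xs :: [(Symbol, Type)]) :: Bool where
  Member n '[]             = 'False
  Member n ('(m, t) ': xs) = If (n == m) 'True (Member n xs)

-- The heading of a natural join: every field of xs not named in ys,
-- followed by all of ys.
type family Union (xs :: [(Symbol, Type)]) (ys :: [(Symbol, Type)]) :: [(Symbol, Type)] where
  Union '[]             ys = ys
  Union ('(n, t) ': xs) ys =
    If (Member n ys) (Union xs ys) ('(n, t) ': Union xs ys)

-- Union '[ '("a", Int), '("b", String) ]
--       '[ '("b", String), '("c", Double) ]
-- reduces, at compile time, to '[ '("a", Int), '("b", String), '("c", Double) ].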

Coq, Agda, Idris

Coq, Agda and Idris are the most mainstream systems (and they're still not very mainstream) that allow you to do this in full generality.

Beware that these languages tend not to commit to what the identity of types is. For example: are `natural * boolean` and `natural` the same type, or are they different types? Well, it depends on what your notion of "the same" means; you could say that they are in bijection with each other. For this reason, you do not usually define a function from a type to another type, but a function from some value (data) to a type -- and this function is evaluated at compile time.

For example, in Coq, here is how to define a function that takes a list of types [foo; bar; baz] and builds the type of nested tuples foo * (bar * baz):

Require Import List.
Import ListNotations.

Fixpoint tuple_of_list (tys : list Set) : Set :=
  match tys with
    | [] => unit      (* empty list: the unit type *)
    | [ty] => ty      (* singleton list: the type itself *)
    | ty :: tys => (ty * tuple_of_list tys)%type  (* pair it with the rest *)
  end.

You can then check that (1, (2, true)) has the type tuple_of_list [nat; nat; bool]:

Check ((1, (2, true)) : tuple_of_list [nat; nat; bool]).

Note that the input of this function is not a type; it is a list (a data structure) whose elements are types. In more general cases, the input may also be built from "codes": data values that represent a specific type but are open to manipulation, deconstruction, comparison, etc. Such codes have no special status in the language; any type foo with a conversion function foo -> Type can act as a code.
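A rough Haskell rendering of the same idea, assuming GHC's DataKinds and TypeFamilies extensions (Code and Decode are made-up names): codes are ordinary data, and an interpretation function maps them to the types they stand for.

{-# LANGUAGE DataKinds, TypeFamilies #-}

import Data.Kind (Type)

-- Codes are plain data values describing types; DataKinds promotes
-- them to the type level so Decode can compute on them.
data Code = CNat | CBool | CPair Code Code

-- The interpretation function from codes to types.
type family Decode (c :: Code) :: Type where
  Decode 'CNat        = Int
  Decode 'CBool       = Bool
  Decode ('CPair a b) = (Decode a, Decode b)

-- A value whose type is computed, at compile time, from a code:
example :: Decode ('CPair 'CNat 'CBool)
example = (1, True)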

Type functions in Felix

Felix can do this with types; it is not hard:

typedef fun pair (T1:TYPE, T2:TYPE): TYPE => T1 * T2;

On the surface this feature is about as useful as parametric polymorphism: that is, not very. However, two additional facilities increase its utility. Type classes in Felix can have type-function arguments:

class Monad [M: TYPE->TYPE] {
  virtual fun bind[a,b]: M a * (a -> M b) -> M b;
  virtual fun ret[a]: a -> M a;
}

[virtual functions are bound by parametric polymorphism, then instantiated by separate specialisations, as in Haskell]

The other interesting feature is the typematch type expression:

typedef fun dom (T:TYPE): TYPE =>
  typematch T with
  | ?D -> _ => D   // domain of a function type
  | _ => 0         // void if it's not a function
  endmatch
;

typedef fun integer (T:TYPE): TYPE =>
  typematch T with
  | int => 1
  | long => 1
  | _ => 0         // not an integer type
  endmatch
;

Type functions have their limits. Type expressions using type functions or type matches must reduce at binding time, i.e. during overload resolution. Functors (type functions) used as arguments to type classes must reduce at monomorphisation time.

Type functions therefore are not first class entities. They're useful as type constraints, which Felix also supports: a constraint is a type expression that reduces to either 0 (false, void) or 1 (true, unit). They can be used to reject overload candidates, and, more recently, select from effect indexed function types (a current experiment).

Jay's pattern calculus would be more general (it allows beta-reduction of patterns as well as expressions), but still not first class.

There is a plan for first-class types, that is, run-time construction of type descriptors for unboxed types. I know how to do this; it's just a lot of work. With boxed types, OCaml for one can already manipulate arbitrary types: you just can't construct them. However, this is unrelated to the kind of thing done by, say, Coq, which is interested in proofs, not manipulation of data.

I suspect, however, that programmers have little need for calculating with types: what is more interesting is calculations with data functors, and typing of control structures (which is not well understood).

expand?

"I suspect, however, that programmers have little need for calculating with types: what is more interesting is calculations with data functors, and typing of control structures (which is not well understood)."

Please expand on that, e.g. in a blog you can point us to?

blogless expansion

Sorry, no blog. There are papers on functorial polymorphism, e.g. by Jay.

There's also a recent post about work on the old APL system and rank polymorphism; I worked on Jay's FISh, which did that some time ago. Basically, an array is like this:

array[9,5; 3,2; 7] of double

meaning a 9x5 matrix of 3x2 matrices of length-7 vectors. Now you can write a polymorphic map or fold for NxM matrices, where the element type is 3x2 matrices of length-7 vectors of doubles.

And you can write a map or fold for 9x5x3x2 matrices as well, but it's a *different* function.

I am bored with the trivial repetition. I want to write a map or fold ONCE which can do BOTH of these and any other I dream up. In other words, it needs to be polyadic: polymorphic over the array data functor.

It can be done. In FISh you can write programs that work on arrays of any rank. For example, a test case I remember used finite element analysis to model heat flow in a steel bar. That's one dimension. Give it a plate and the program works, only now in two dimensions. And of course it works for a block, in three dimensions.

The trick is to be able to "reshape" the array by moving the commas and semicolons about, so your map maps over the data functor you want, with the left-over bit being the data.

Polyadic programming can be extended from arrays to products and sums, and thus to all polynomial datatypes. See Jay's paper "Functorial ML". It's old; his newer work on the pattern calculus goes even further.

My point is that we don't care about types. It's the functors that we want to manipulate.

There is a reason for this. In a categorical model, a type is an object, and objects have no properties at all. They're worth nothing. It's the arrows (morphisms) that connect them that are interesting.

So, for example, a pair X * Y is of no interest; what's interesting are the projection functions. In fact the object "type" is irrelevant: its only significance is the existence of the projections.

Types don't matter. It's the relations between them that matter.

So we don't need first class types or ways to calculate with types, but ways to manipulate the operators that they connect.

I don't care a hoot that you have an array of doubles, or matrices of doubles. I want to know how to reshape the array data functor so I can handle your data types by writing a SINGLE map or fold or iterator, or whatever. I'm very tired of writing maps and folds for every type I define in OCaml when I know it could be done automatically if only the language were smart enough.

The trick is simple enough: "pattern match" compositions of data functors so they can be split into pieces. But you cannot do this when the only data functors your language allows are constants (lists, arrays, trees) and it provides no system to compose and decompose them. The language's term calculus must be elevated to have functor variables, not just type variables.
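A faint echo of this is visible even in plain Haskell (a sketch only; real functorial polymorphism needs much more than this): the same nested data can be processed at different depths by choosing where to split the composition of data functors, here with Data.Functor.Compose.

import Data.Functor.Compose (Compose (..))

-- A 2x3 "matrix" as a list of lists.
matrix :: [[Double]]
matrix = [[1, 2, 3], [4, 5, 6]]

-- Split 1: the outer list is the data functor; each row is the data.
rowSums :: [[Double]] -> [Double]
rowSums = fmap sum

-- Split 2: the composition of both lists is the data functor and the
-- doubles are the data; a single fmap now reaches every element.
scaleAll :: [[Double]] -> [[Double]]
scaleAll = getCompose . fmap (* 2) . Compose

The point is that the split is chosen by hand here; a polyadic language would let you pattern match on the composition itself.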

sounds cool to me fwiw.

does seem like the children of APL don't get enough attention.

Generics

Two thoughts about this. First, I like the generic Haskell approach, where you have primitive operators for type sum and type product, and you can decompose types over these operators by pattern matching.
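For instance, in GHC.Generics (the current descendant of that idea) a generic function is exactly a set of cases over the sum (:+:) and product (:*:) operators; a sketch, assuming the DeriveGeneric extension and with made-up names:

{-# LANGUAGE DeriveGeneric, TypeOperators, FlexibleContexts #-}

import GHC.Generics

-- Count the fields of any derivable type by recursing over its
-- sum-of-products representation.
class GCount f where
  gcount :: f p -> Int

instance GCount U1 where gcount _ = 0        -- nullary constructor
instance GCount (K1 i a) where gcount _ = 1  -- a single field
instance (GCount f, GCount g) => GCount (f :*: g) where
  gcount (x :*: y) = gcount x + gcount y     -- product
instance (GCount f, GCount g) => GCount (f :+: g) where
  gcount (L1 x) = gcount x                   -- sum, left branch
  gcount (R1 y) = gcount y                   -- sum, right branch
instance GCount f => GCount (M1 i c f) where
  gcount (M1 x) = gcount x                   -- metadata wrapper

data Point = Point Int Int deriving Generic

countFields :: (Generic a, GCount (Rep a)) => a -> Int
countFields = gcount . from   -- countFields (Point 1 2) == 2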

Second, this is exactly what Stepanov does with iterators (and coordinate structures). Iterators classify the data-access patterns of algorithms. Clearly map and fold are not the limit of this, because some algorithms require random access; but map and fold can be defined in terms of iterators, and each data structure can provide an iterator.

I don't think you can get rid of iterators, because, taking a tree as an example, something has to determine whether you are iterating preorder, in-order or postorder. So, going back to the generic Haskell approach, just having pattern decomposition over type-sum and type-product is not enough, as it does not account for the semantics of the data represented: is it a tree or something else?

This sounds interesting to

This sounds interesting to me. I'm toying with building a relational language, and these issues hit me at the start.

If I understand correctly, all the arrays are stored flat, and separately there exists a tag that says "3*2", so map always works on the flat case internally?

What would the code to handle this look like? I tried to decipher the J/K interpreters before, but they are too dense for me to understand what they are doing. Baby steps in a more mainstream syntax could help a lot in popularizing the concepts, I think.

I.e., how wonderful it would be if something like http://andrej.com/plzoo/ existed, but for J/K/APL.
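For instance, I imagine something vaguely like this minimal Haskell sketch of the flat-data-plus-shape-tag idea (hypothetical, certainly not how the J/K interpreters actually do it):

-- The elements live in one flat list; the rank lives only in the tag.
data Arr a = Arr { shape :: [Int], flat :: [a] }
  deriving Show

-- map ignores the shape entirely, so it is rank-polymorphic for free.
amap :: (a -> b) -> Arr a -> Arr b
amap f (Arr sh xs) = Arr sh (map f xs)

-- reshape just replaces the tag, provided the sizes agree.
reshape :: [Int] -> Arr a -> Arr a
reshape sh (Arr _ xs)
  | product sh == length xs = Arr sh xs
  | otherwise               = error "reshape: size mismatch"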

re: code

maybe things like kona are less terse under the covers than the old school concatenatives, i dunno.

You looked at the code? Still

You looked at the code? It's still full of single- and two-letter variables all around...

dogged

yeah ok it does look bad.

Second class types are enough

Felix looks worth a look. For my purposes second-class types are enough, in support of strong typing but with all types known by runtime.

Are Turing complete type

Are Turing-complete type systems a thing? Or would such a system no longer be considered a type system at that point, being just a compile-time macro system?

Any type system with a type

Any type system with a type rule Type : Type can potentially diverge. (By Girard's paradox, such a system admits non-normalising terms, so type-level evaluation need not terminate.)

Is that the only way we can

Is that the only way we can define a type system, though? Is the term intrinsically tied to formal judgement-based systems?

Turing-powerful first-class types

Some years ago (IIRC, shortly before I hit on vau) I was dabbling in a Lisp with a multimethod-dispatch system based on first-class types. A type was an object with two methods: one for deciding whether an object belonged to the type, and one for deciding whether another type was a descendant/ancestor of the type. Arbitrary procedures could be written to implement those methods. I gave it up as probably a bad idea, on the grounds that it was too volatile; but it was certainly thought-provoking. I just never figured out a way of getting it to meet my "dangerous things should be difficult to do by accident" criterion.
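In outline, a type was something like this record of two procedures (a sketch with made-up names, in Haskell notation rather than Lisp):

data Value = VNum Double | VStr String   -- stand-in value domain

data Ty = Ty
  { contains :: Value -> Bool  -- does this value belong to the type?
  , ancestor :: Ty -> Bool     -- is the given type an ancestor of this one?
  }

-- Arbitrary code can implement the methods:
numbers :: Ty
numbers = Ty
  { contains = \v -> case v of { VNum _ -> True; _ -> False }
  , ancestor = const False
  }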

Interesting

Maybe I'll play with that at some point. I'm willing to discount the principle that "dangerous things should be difficult to do by accident" in order to have a new toy. It reminds me of playing the violin: one makes horrible sounds until one becomes skillful, but that doesn't make the violin unworthy of study.

I've been doing similar

I've been doing similar things with graphs lately; we can even get fairly complete type-like systems out of them (as I said before: generics, subtyping, and really good type inference). It makes me wonder strongly whether graph formalisms are a better idea for types than judgement formalisms.

Types as Graphs

I think types make more sense to me as graphs than as judgement formalisms, so that's the way I have been working, more or less, since the HList paper was published. Personally I think subtyping is a bad idea and prefer type classes, but I am prepared to accept that this is a personal preference; however, given a language with type classes, I see no need to add subtyping to it.

Graph formalisms, judgment formalisms

Judgments are a declarative way to describe algorithms of a certain shape, but they certainly operate on an underlying AST with a graph-like structure (because of backpointers from bound variables to their binding positions).

In my experience, the (sometimes non-trivial) work of shoe-horning a given analysis into the form of a system of inference rules has strong benefits in crystallizing the inductive invariants the system respects and in expressing it in a more modular way. But it may not be a good fit for everything: in my experience, systems whose correctness evidence requires inspecting all possible ways to do something (instead of exhibiting one way with its evidence) are convoluted in judgment form.

(There is also a benefit in having a standardized formulation, which lets designers quickly understand and get an intuition for others' proposed designs.)

I would be ready to believe that a more graph-based formalism would be more effective at describing certain forms of type system, but this claim should be carefully evaluated, for example by providing an equivalent inference-rule formulation that is as compact as reasonably possible, as a point of comparison.

Finally, one important thing about judgment-style formalisms is that they are very precise; they leave fairly little unsaid -- more than actual implementations, but less than most other means of description. When people claim that X is a much better way to describe their problems, it can be the case that X corresponds to their informal intuition better but sweeps many questions under the rug. A useful integration test for X is: does it support proving strong properties of the described system?

types as database ids

I've been having similar thoughts lately. In my current experiment, type objects are just an ID plus a closure over some environment, where the environment is essentially a Prolog database. I've got a lot more exploring to do around this, but so far it feels promising.

Clauses are type classes

I have been working for some time on something similar, where types are Prolog facts/values (or closures, if the type has parameters). If this is the case, then Prolog clauses are effectively type classes.
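To make the correspondence concrete, a sketch in Haskell with the Prolog reading in the comments:

-- Instance resolution already behaves like Prolog clause resolution
-- over types. Reading instances as clauses:
--
--   instance Show Int            -- fact:   show(int).
--   instance Show a => Show [a]  -- clause: show(list(A)) :- show(A).
--
-- Resolving Show [[Int]] chains the clause twice and the fact once,
-- exactly like answering a Prolog query.
example :: String
example = show [[1 :: Int, 2], [3]]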

implicits

Modules in Logic?

How do modules relate to this? Type classes map directly to Prolog clauses; what do modules map to? Modules can introduce undesirable things like subtyping, whereas type classes do not.

Edit: I realise I posted a while back about the relation between records, modules and type classes, which can be straightforward: modules are records with associated types, used explicitly; type classes are records with associated types, used implicitly. Modules would be a kind of namespacing, and probably something that looks like a Prolog module. With type classes a functor is just a function, and this flattening is possible if datatypes can have associated types, which could be a Prolog argument with 'out' mode on any parametric type.

Abstract types (strong ascription) are interesting, because I don't see any difference from non-abstract (weak ascription) at the Prolog level. I guess this is because a constructor function is not really part of the formalism. At the Prolog level a type is an atom, and there is no way to get from one type to another; in fact there is no real relationship with the value level at all. We need to tie the constructor and deconstructor functions (and their types) to the type-case. The easiest way to do this is probably with some facts with commonly known names.

The conclusion at this stage is that the Prolog program has to have a certain structure to be a valid type system. Put another way, a type system is a logic, plus a mapping of certain values within that logic to concepts in the value-level program, like constructors.

Hypothesis

Hypothesis: all type systems consist of a logic and a value-level mapping. By factoring out the logic, a type system can be expressed declaratively in terms of the mapping of value-level constructs to type-level values.

Clearly there are type systems with foundations in different logics (linear types from linear logic), but many type systems would map to the same logic.

type = id + env

"there is no way to get from one type to another"

An annoying property indeed, and one of the things that motivated me to explore this view. I've been thinking of a type as a pair of a unique ID and the environment in which that ID was resolved. This way, passing a type by itself is sufficient to include its related types.

Consider it a functional representation of a graph node, where the "node" is a pointer to both the whole graph and an individual node within it. With this representation, you can't easily take a node from one graph and put it into another unless you first strip the graph context, which may not be a sensible thing to do. Alternatively, if you had two nodes from separate graphs and you added an edge between them, you could merge the two graphs.

With this analogy in mind, I'm experimenting with representing environments as a stack of modules. Traditional symbol bindings are simply terms in the module with a particular constructor, so you'd have something like (environment (module (binding x 1) (binding y 2) ....) (module ..., and any unification query against an environment proceeds by querying each module in order, returning results from the first module that matches, which gives you back shadowing behaviour. The modules are multisets, so you can adequately represent ambiguity: for example, if you had multiple definitions of an ordering function for integers, a compile-time query against such a module for the type class would return an ambiguity error. Importing a module simply merges its definitions in.

In this world, a type would simply be a gensym. You could have something like:

(module
  (binding Point2D (type 1))
  (binding Float64 (type 2))
  (layout Point2D (struct (field x (type 2)) ....

sequent calculus

If I understand even remotely correctly, Shen uses a Prolog engine to run its type system.

Haskell and Rust

Haskell and Rust get pretty close to allowing arbitrary type functions with type classes, especially with common Haskell extensions like multi-parameter type classes and functional dependencies; and Rust now allows specialisation.
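A sketch of the multi-parameter-class-plus-fundep style in Haskell (Combine is a made-up name): the class is a relation on types, and the functional dependency turns that relation into a function, the argument types determining the result type.

{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}

-- The fundep "a b -> c" says the compiler may compute c from a and b;
-- each instance is one defining equation of the type function.
class Combine a b c | a b -> c where
  combine :: a -> b -> c

instance Combine Int Bool (Int, Bool) where
  combine = (,)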

Haskell can do it, Rust I'm not so sure

Haskell could do it, but Haskell doesn't really have record types (yet; they're working on it). And I don't really want to use Haskell, just steal some good ideas.

As for Rust, I'm not so sure how strong the metaprogramming is. More research required.

Rust Traits

Rust traits seem to provide power equivalent to Haskell type classes with type families (traits with associated types). Negation is problematic in both; I am aware of some workarounds in Haskell, but I am not sure whether the same can be done in Rust. Anecdotally, I think the type system used to be Turing-complete, but restrictions on trait matching may have removed this (essentially, Rust traits should monomorphise, so unbounded recursion cannot be allowed).
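The correspondence is direct; a sketch of the Haskell side, assuming the TypeFamilies extension (Container and Elem are illustrative names):

{-# LANGUAGE TypeFamilies #-}

-- An associated type: each instance fixes Elem for its own head,
-- much like an associated type in a Rust trait impl.
class Container c where
  type Elem c
  empty  :: c
  insert :: Elem c -> c -> c

instance Container [a] where
  type Elem [a] = a
  empty  = []
  insert = (:)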

Rust traits don't *always*

Rust traits don't *always* have to monomorphize; you can have trait objects for dynamic dispatch (https://doc.rust-lang.org/book/trait-objects.html). Apparently the original program did break, but people discovered other approaches.
Still, if somebody discovers that your type system is Turing-complete, you might not want to use it: apparently typechecking performance can be *actually* exponential, and that's hard to avoid; see
https://www.reddit.com/r/rust/comments/3on6cf/is_rusts_type_system_still_turing_complete/cvz30i6

/u/paholg found this with his dimensioned library, where types for higher numbers could easily take hours to compile.

Not sure if *that* post is still current.

Trait Objects

Traits, when used in generics, always monomorphise, which means that any recursion must be bounded. Trait objects are effectively Haskell's existential types.
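For the record, the Haskell side of that equivalence looks like this (a sketch, assuming the ExistentialQuantification extension):

{-# LANGUAGE ExistentialQuantification #-}

-- The analogue of a Rust trait object: the concrete type is hidden
-- and only the Show dictionary (the "vtable") travels with the value.
data Showable = forall a. Show a => Showable a

describe :: Showable -> String
describe (Showable x) = show x

-- Dynamic dispatch over a heterogeneous list:
mixed :: [Showable]
mixed = [Showable (1 :: Int), Showable True, Showable "hi"]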

One of the best statically

One of the best statically typed languages for doing this kind of arbitrary compile-time computation on types is the D language. With it you can also write the polyadic stuff skaller talks about.
Show examples of what you want to do (here, or better, in the D groups), and someone will try to suggest some solutions.

Yes, D is good

Yes, I suspect D can do it; it's pretty strong on the metaprogramming. One to look into. Nice language -- pity it wasn't invented by the G/A/M troika.