Type Differentials

I've stumbled across something in my compiler work and wondered if anyone has run into anything similar, or knows of any existing literature about it.

The problem I'm working on is dealing with the template bloat problem in an AOT language, without resorting to type erasure. I know that there are some compilers which, during the code generation phase, will merge functions and data structures which have identical machine representations, but I'm attempting to do this in a more formal and abstract way, and at an earlier stage of compilation.

In mathematics, a differential is a description of how the output of a function varies with respect to a parameter. In my scheme, a "type differential" is a description of how the meaning, behavior, and machine representation of a function changes with respect to changes in a template parameter. An analysis of the expression tree for a function yields a set of these differentials. Once you know all of the differentials for a function, you can then easily compute whether any of the template arguments can be weakened, creating a more generalized version of the function that has the same meaning.

A trivial example is the "CallingConvention" differential. Let's say we have a generic function F which has a generic type parameter T, and which takes a function argument also of type T. Different values of T will have different machine representations, which changes the function signature - however, many different types have identical machine representations, so the set of possible machine representations is much smaller. On the other hand, if we change the type of the parameter from T to ArrayList[T], things are a little different - since ArrayList is a reference type, it's always passed as a pointer, so the value of T no longer has any effect on the function signature at the machine level. So CallConv[List[T] == CallConv[Object].

Other examples of type differential objects include LocalVariable, InstanceOfTest, InstanceMethodCall, FieldOffset, and so on. In some cases, the presence of a differential can be used to decide whether the compiled code should contain a static reference to a type identifier, or whether the type identifier should be passed at runtime as a hidden field or parameter.

Java's type erasure can be thought of as a system in which the set of type differentials for all functions is the empty set.

Now, what's interesting about this is that I realized that there's a kind of algebra of differentials: There are operators that can be used to sum and compose them. For example, if I have a composition of two templates, it's possible to compute the differential of the result directly by composing the differentials of the individual pieces.

Anyway, I'm sure that there's probably some pre-existing term for this - which I'd be able to search for if I knew what to name it.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Universally Quantified Type Systems.

Universally quantified types express the idea of a polymorphic value whose type you do not know at function instantiation time. For example:

id :: forall a .  a -> a

So the 'id' function implementation is the same no matter what the type of 'a'. We might assume that id is implemented with a single machine-word that could be an int, float or pointer to a heap allocated object. The key is the type signature guarantees the function never tries to look at the object, because the type must be valid 'forall a', that is for every type that exists or may be declared (and we don't allow a base type for everything like Java's Object type).

This can be extended with type classes:

toString :: forall a . (Show a) => a -> String

"(Show a) =>" is a constraint on the type, so this is any type that implements the 'show' interface. The show interface has instances for each type that print the value into a string for example:

instance Show Int where
show i = intToString i

So now we know inside 'toString' it is safe to call show on the value with type 'a', but again we know nothing else about the type inside the 'toString' function. All we know is it's safe to forward the argument to 'show' but we don't care what the type actually is (If on a 64bit machine we assume all arguments are 64bit stack locations or registers, then we don't care whether its an int/float/pointer etc).

There are also parametric types like Array, List, Set, Map etc. Given a type like this:

lookup :: forall x . Array x -> Int -> x

Here we know nothing about the type of 'x' inside the definition of 'lookup', but we can still define lookup knowing only that it is an array.

The observation about composition is just higher order functions, so given:

lookup :: forall x . Array x -> Int -> x
insert :: forall x . Array x -> Int -> x -> ()

move :: forall x . (Int -> x) -> (Int -> x -> ()) -> Int -> Int -> ()
move src dst a b = dst b (src a)

move (lookup src_array) (insert dst_array) 1 2

Note the definition of move does not need to know the type of 'x' and thanks to partial-application, it also knows nothing about the type of container (it would work on any container with 'lookup' and 'insert' of the appropriate types.

In the assembly we can just assume 'a' and 'b' are single machine word values and call the appropriate functions. Really all move is doing is argument forwarding to other functions.

That's one way to think of it

In the assembly we can just assume 'a' and 'b' are single machine word values and call the appropriate functions.

But it's also valid to think of polymorphic values as sets of values indexed by type. Then the OP would like to conflate members of this set with "equivalent representations" and (I assume) equivalent behavior.

Making all values representable in a machine word does trivially let you conflate multiple polymorphic instantiations, but it may have other undesirable consequences ...

Implementation Details

I agree that you can view polymorphic values as sets of values indexed by type, but the 'move' function is still just forwarding its arguments. There are several solutions to this - you could pass everything by reference, so the arguments are always pointers like Java. You can make 'sizeof' a primitive operation on all types, so that you can write a generic copy function to forward arguments like 'C' (this is probably the way I would actually implement this). I was just trying to keep it simple. I prefer to think of a universally quantified type as 'any possible' type, so you cannot know anything about the type, except for the 'C' like case you would replace "forall a ." with "forall a . Sizeable a =>" which seems reasonable and allows a single instantiation of a generic function to handle all possible types.

Edit: a further thought is that universal quantification implies passing by reference, we can consider copying an optimisation where the type is sizeable. An alternative interpretation is that of beta-reduction, where we effectively remove the 'move' function by AST rewriting before code generation, for this we only require type identity.

Parametrized types can do the job.

Parametrized types can do the job.

For example,

Identity◅aType▻.[x:aType]:aType ≡ x

For example IdentityInteger▻.[3] is the same as 3.


ToString◅aType▻.[x:aType⊇Showable]:String ≡ x.show[ ]

where ⊇ indicates that aType must extend Showable, which implements the show message.

Incremental computation at the type level?

You might want to look, on implementations of polymorphism, at this article. C++ templates relate to monomorphization, you seem to be looking for some analogue of dictionary passing (though your dictionaries are empty, or only contain the result of sizeof on the type).

What you actually show does not look like a differential, just like some type-level function. You write CallConv[List[T]] == CallConv[Object]; in differential terms, I would say something like CallConv'[Object, List[T] - Object] = Nil. That is, the finite difference of CallConv, applied to the input Object and the input change List[T] - Object, gives a nil change (you seem to assume that nil changes don't exist, but algebraically they have the same advantages as the 0 number). Are you looking for the latter?