Defining Types not as Classes but as Mathematical Sets

Hi, long time browser, first time posting.

My background is largely in the C family of languages (specifically C, Java, and C#) but I've been reading into dynamic languages and it got me thinking about the advantages of dynamic type systems and bringing some of their advantages into the static world without sacrificing the bonuses static typing brings. An idea popped into my head when I asked what if an object's type and an object's class were different things and variables only refered to the former and were ignorant of the latter? Furthermore what if you viewed a type as a mathematical set of methods defining one method as equal to another method when their names and signatures are the same?

For example, in such a system the following Java code wouldn't result in an error.


public class A { // Implicitly defines Type A { int op1(int) }
 public int op1(int arg1) { return arg1; }
}

public class B { // Implicitly defines Type B { int op1(int), int op2(int) }
 public int op1(int arg1) { return arg1 + 1; }
 public int op2(int arg1) { return arg1 + 5; }
}

public class C {
 public static void main() {
  A a = new B(); // Not an error since Type A is a [proper] subset of Type B
  System.out.println(a.op1(2));
 }
}

Essentially you get dynamic type membership and static type capabilities.

So I'm pondering:
Does this lose any of the benefits of static type checking? Does this introduce any complications that would prevent a compiler or interpreter from being created? And finally is it dynamic enough?

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

I belive this is called

I belive this is called structual subtyping, and exists allready in Ocaml.

O'Caml

What you just described is the object system used in O'Caml and is referred to as "structural typing". The type of an object is exactly the set methods its supports (name and type signatures). For example, Class B from your example has the type < op1: int -> int; op2: int -> int > in O'Caml. To translate your example:

class a = object
    method op1 (arg1: int) = arg1
end

class b = object
    method op1 (arg1: int) = arg1 + 1
    method op2 (arg1: int) = arg1 + 5
end

let _ =
    let a = (new b :> a) in
    Printf.printf "%i\n" (a#op1 2)

class in O'Caml does two things: it creates a class in the traditional sense, and creates a type alias: e.g. here, class a creates a type a which is an alias for the type < op1: int -> int >. Because this the type returned by new b (the type I described above) is a subtype of a, it can be coerced with the :> (coercion) operator.

Interestingly, because object types exist independently from classes in O'Caml, objects themselves can also exist independently. You could instead write this:

type a = < op1: int -> int >

let _ =
    let a =
        (object
            method op1 (arg1: int) = arg1 + 1
            method op2 (arg1: int) = arg1 + 5
        end :> a) in
    Printf.printf "%i\n" (a#op1 2)

and the compiler will be just as happy (though you won't get the benefits of inheritance classes give you). You can even elide the a type entirely and replace it with its defintion.

The benefits of structural typing are enormous (as you may expect): a class doesn't need the foresight of what interfaces might be required of it, and it speeds up rapid prototyping. (In fact there's no special support for "interfaces" in O'Caml as there is in Java.) This goes hand-in-hand with O'Caml's type inference, which removes the requirement that you need even declare any datatype beforehand.

[Skip these next few paragraphs if you're tired; I've been doing a lot of thinking about structural typing lately...]

Unfortunately there are also two major drawbacks to structural typing: (1) objects which appear to have the same type may in fact have different semantics (as in your example), and (2) objects with seemingly different types (e.g. differently named methods) may in fact have the same semantics. (1) is purely due to lack of expressivity in the type system... in languages such as Coq where the type of a function can completely capture its semantics this is not a problem. Nominally-typed languages such as Java avoid this problem since they don't care about a method's type. The best way to avoid this problem in structurally typed languages is to use descriptive method names.

(2) is the same problem exhibited in Java, that one can't know in advance which interfaces to conform to, just on a different (slightly less sinister) level. This is not so much a problem if simply-named methods with simple types are used (since there is a higher chance two programmers will make the same choices), but it seems that that requirement conflicts with the fix for (1). We can also easily work around it with the adaptor pattern, at the cost of a little more keyboard work. Again, the ideal solution is a richer type system... the we can throw out the identically-named methods requirement entirely and just say two objects have the same type if they each have methods with corresponding types (semantics). When an object is cast to a different type its methods are renamed to their semantically-identical counterparts in the target type.

There is another problem, related to (1), which affects both structurally- and nominally-typed languages. (My group partner in my Software Eng. class pointed this out to me, which got me thinking about it.) An object can't implement two interfaces which share a method (name and type) in common but disagree about its semantics. In nominally-typed languages it's possible to accomplish this using a mechanism whereby the programmer specifies which of an object's methods implement which parts of an interface, so the interface-dictated method names are only used when the object is cast to the interface's type. (I believe this is how Haskell's type classes work; someone correct me if I'm wrong.) I can't think of any good solution for structurally-typed languages that doesn't amount to a kludge. Of course, the best solution is, as for (1), a richer type system :)

To sum it up (now that I've told you more than you ever wanted to know about interfaces :))... structural typing is extremely convenient in the same way type inference is, but, in the absence of very rich type systems, nominal typing scales better in real-world systems.

Oh, and check out O'Caml; you'll really like it :)

Haskell?

Another take on it is Haskell's type classes. Type classes kind of make me think of CLOS, in a type safe way.

Russell?

... what if you viewed a type as a mathematical set of methods...

I don't know if this is of interest to you, but Russell, created in the late 1970s, has datatypes that happen to be precisely sets of functions, and are themselves values (e.g. can be passed as arguments to procedures). Because all datatypes are treated this way, all of them are abstract (there are no primitive datatypes whatsoever).
Sadly, the language sank into oblivion, but there are an implementation and some publications available: see e.g. here and here. The following: “H. Boehm, A. J. Demers, and J. E. Donahue, ‘A Programmer's Introduction to Russell’, TR 85-16, Dept of Comp. Sci., Rice University, 1085” can be found in the implementation package, in two files with slightly different content.

Duck Typing