Duck Typing vs. Structural Subtyping

I ran into structural subtyping [vs. nominal] this summer with Cannasse's Motion-Types description. While talking to somebody about a joke ["Genetic code is dynamically typed and favors the "duck typing" philosophy. This creates an enormous amount of security holes, where special ducks ("poisons") with appropriate appearance but malicious behaviour could be introduced into the system."], we realized we didn't know the precise difference between the two terms. Duck typing seems to have a dynamic connotation, whereas structural subtyping is one of the gems of the static OCaml world. A Google search actually hit the LtU forums, with someone hinting at subclassing vs. subtyping.

I have a couple of ugrad years left and would like to know this one before I'm out :) Thoughts?

- Leo


I Can't See Any...

...difference between "duck typing" and "structural typing." See the "Static vs. Duck Typing" subthread of this thread. As you can see, at least one poster offered a definition of "duck typing" that isn't reflected in any programming language, but instead goes into features that only exist in description logics. Again, there's nothing wrong with that; I love description logics, but they aren't what people are talking about when they use the phrase "duck typing." I haven't yet seen a definition of "duck typing" that isn't identical to "structural typing," so I continue to maintain that they are, in fact, identical.

The essence of duck typing

I take duck typing to mean the strategy of applying runtime coercions to make values fit into the types required by operations. For example, some languages let you add the string "1" to the integer 2 to yield the integer 3. In C++, you can add the integer 1 to the floating-point value 2.0 to yield the floating-point value 3.0. Duck typing includes the strategy of loosely typing records/objects so that two such types are deemed equal if they contain the same set of field names and the fields are of coercible types. This is similar to the structural subtyping principle.
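
For concreteness, here's a tiny sketch of that coercion idea, written as an OCaml toy interpreter (my own illustration, not anything from a real duck-typed runtime): a loosely-typed "add" coerces both operands to integers before adding, so the string "1" plus the integer 2 yields 3.

    type v = VInt of int | VStr of string

    (* coerce any value to an integer before using it arithmetically *)
    let to_int = function
      | VInt n -> n
      | VStr s -> int_of_string s          (* "1" becomes 1 *)

    let add a b = VInt (to_int a + to_int b)

    let () =
      match add (VStr "1") (VInt 2) with
      | VInt n -> Printf.printf "%d\n" n   (* prints 3 *)
      | VStr _ -> ()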

There's nothing fundamentally unsound or dangerous about structural subtyping. You can reason about structural records in the same way as you reason about tuples: tuples map indices to values of some index-dependent type, while structural records map names to values of some name-dependent type. But of course there's no abstraction there!
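
OCaml's object types are structural in exactly this sense; a minimal sketch (the objects and names below are made up purely for illustration):

    (* any object with a [describe] method fits, however it was built;
       the inferred type is < describe : unit -> string; .. > -> string *)
    let describe_twice x =
      x#describe () ^ " and " ^ x#describe ()

    let duck  = object method describe () = "quack" end
    let robot = object
      method describe () = "beep"
      method battery = 42                  (* extra members are irrelevant *)
    end

    let () =
      print_endline (describe_twice duck);
      print_endline (describe_twice robot)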

To gain compile-time security from type abstraction, you definitely want a mechanism to augment structural typing. Nominal typing and partially opaque types are both well-understood solutions to this problem.
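
For instance, an OCaml signature can make a structurally plain type opaque; a hedged sketch (the Temp module is invented for illustration):

    module Temp : sig
      type t                               (* abstract: representation hidden *)
      val celsius : float -> t
      val to_float : t -> float
    end = struct
      type t = float                       (* structurally just a float inside *)
      let celsius c = c
      let to_float t = t
    end

    let freezing = Temp.celsius 0.0
    (* let bad = freezing +. 1.0    <- rejected: Temp.t is not float outside *)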

I agree

I agree with this definition. Duck typing is trying to fit a runtime value into a "type hole" by making appropriate coercions. For objects you don't even need coercions: you "let the object pass" and wait for a field-access failure.

OTOH, structural subtyping is a compile-time check that enables more flexibility in programs: see this other post. I recently added some kind of structural subtyping to haXe as well, but only in the form Class :> Anonymous Object.
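
Roughly, in OCaml terms (a sketch, not actual haXe syntax), that corresponds to an explicit upcast from a class to an anonymous object type listing only the members a consumer needs:

    class point x y = object
      method x : int = x
      method y : int = y
      method to_string = Printf.sprintf "(%d, %d)" x y
    end

    (* the "anonymous object" view: just the fields we care about *)
    type readable = < x : int; y : int >

    let show (p : readable) = Printf.sprintf "x=%d y=%d" p#x p#y

    let () =
      let p = new point 1 2 in
      print_endline (show (p :> readable))  (* explicit structural upcast *)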

Bits and Pieces

Hi Tim!

I hadn't taken the concept of coercions to be fundamental to duck typing, instead thinking of them as essentially artifacts of the notion of "duck typing" apparently coming almost exclusively from the dynamically-checked language world. So I suppose my question, assuming that I've understood your point correctly, would be: what's the relationship between "duck typing" as you understand it, and "structural typing" coupled with "extensional polymorphism," as found e.g. in G'Caml? If nominal typing and partially opaque types resolve the tension between type abstraction and structural typing, is A Nominal Theory of Objects with Dependent Types a worthwhile thing to study?

I remain hopeful that within the next few years we'll see a new language that's statically-typed but inferred, dramatically simplifies the development of correct concurrent and distributed software, and supports the E project's "object capabilities" and "auditors" but in as static a fashion as possible (but no "staticer," pace Einstein).

Extensional Polymorphism

I don't fully understand G'Caml, but I'm not a fan of its basic thesis (that abandoning parametricity brings real benefits). For example, the author's really cool generic printing example can be achieved with union types provided that one can distinguish between obviously disjoint types -- e.g. knowing that integers are not lists, and lists are not floating point values. That certainly doesn't require violating parametricity.
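
As a rough sketch of that alternative (my reading of the suggestion, not actual G'Caml code): describe the printable universe as an ordinary variant, and the "generic" printer becomes a plain function over that union, leaving parametricity untouched.

    type value =
      | Int of int
      | Float of float
      | Lst of value list

    let rec show = function
      | Int n   -> string_of_int n
      | Float f -> string_of_float f
      | Lst vs  -> "[" ^ String.concat "; " (List.map show vs) ^ "]"

    let () = print_endline (show (Lst [Int 1; Float 2.; Lst [Int 3]]))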

I'm a fan of the view that a nominal type just bundles up a value of some type (such as an integer) with some sort of name (in the spirit of LISP atoms). Haskell's "data" declarations do something like this. As an aside, most of the literature on nominal types confuses names with pointer-like identity and runs into confusion over dependent functions, e.g. should List(Int) differ from List(Int) because they're separate function calls generating different type names? That is silly, of course.

So in my worldview, you only need structural types, names, and a means for bundling up types and values with names, from which you can derive nominal types.
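
In OCaml terms (a sketch of the idea, not a concrete proposal), a single-constructor variant is exactly such a bundle, much like a Haskell data or newtype declaration: the two types below wrap the same structural content but stay distinct because their names differ.

    type celsius    = Celsius of float
    type fahrenheit = Fahrenheit of float

    let warm = Celsius 21.0
    (* let oops : fahrenheit = warm   <- rejected: same structure, different name *)

    let describe (Celsius c) = Printf.sprintf "%.1f degrees Celsius" c
    let () = print_endline (describe warm)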

Is the Confusion Fundamental?

Tim Sweeney: I don't fully understand G'Caml, but I'm not a fan of its basic thesis (that abandoning parametricity brings real benefits).

So do I understand your thesis to be that parametric polymorphism and an ML-style module system are "sufficient" in some sense? That would tend to imply that Standard ML, O'Caml, and/or Haskell are adequate. If that's the case, I'm sympathetic, as you know—I'm having a great time with O'Caml. But I strongly suspect that you're making a claim that I'm not yet able to derive from the observation above.

Tim Sweeney: For example, the author's really cool generic printing example can be achieved with union types provided that one can distinguish between obviously disjoint types -- e.g. knowing that integers are not lists, and lists are not floating point values. That certainly doesn't require violating parametricity.

I'm curious as to whether this bears some relationship to polymorphic variants in O'Caml. From Code Reuse Through Polymorphic Variants:

It is interesting to see what makes it hard to encode the above example into traditional variants. There the standard way to produce the union of two types is to embed them in a type containing a constructor for each of them. In table 5 we give types for such an encoding, and a sample of the code.

One first problem, which you can see immediately by looking at the type definitions, is that this approach does not support multiple inheritance properly: lexpr contains two types of variables, Lambda(VarL "x") and Expr(VarE "x"). This is due to the fact that traditional variants provide only a disjoint sum, while the union of polymorphic variants provides a coalesced sum, which is required to handle the sharing of a common ancestor in multiple inheritance.

I think I'm on the right track. From Programming with Polymorphic Variants:
Not only can the same variant tag be used in many different variant types, but any polymorphic variant may be viewed under different types. Variant types form a subtyping hierarchy, and both variants and their acceptors (functions receiving them) have multiple views according to this hierarchy... a and b are both subtypes of c. They may be used several times, at any of their supertypes. For acceptors, see show both in section 2.
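
A toy sketch of that coalesced sum (my own reduction, not the paper's table 5): the `Var tag is shared, so the union of lambda-terms and arithmetic expressions has a single kind of variable rather than the Lambda(VarL "x") / Expr(VarE "x") duplication described above.

    type var    = [ `Var of string ]
    type lambda = [ var | `Abs of string * lambda ]
    type expr   = [ var | `Num of int ]
    type lexpr  = [ lambda | expr ]        (* coalesced union: `Var appears once *)

    let rec free : lexpr -> string list = function
      | `Var x      -> [x]
      | `Abs (x, b) -> List.filter (fun y -> y <> x) (free (b :> lexpr))
      | `Num _      -> []
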
Tim Sweeney: I'm a fan of the view that a nominal type just bundles up a value of some type (such as an integer) with some sort of name (in the spirit of LISP atoms). Haskell's "data" declarations do something like this. As an aside, most of the literature on nominal types confuses names with pointer-like identity and runs into confusion over dependent functions, e.g. should List(Int) differ from List(Int) because they're separate function calls generating different type names? That is silly, of course.

Stated that way, it is rather obviously silly... so why does the confusion seem so pervasive, and persist?

Tim Sweeney: So in my worldview, you only need structural types, names, and a means for bundling up types and values with names, from which you can derive nominal types.

I perceive a lot of devils in the details here, Tim. :-) "A means for bundling up types and values with names" sounds like it could be records, or an object system, or a module system. "Names" can be in multiple spaces or not, interned or not, and so on. Is it still "nominal typing" without "pointer-like identity," whatever problems pointer-like identity may introduce? If so, don't Standard ML, O'Caml, and Haskell all qualify as "nominally typed" given their module/typeclass systems? What should I infer from the observation that G'Caml and Acute both offer type-safe marshaling and unmarshaling, with all that that implies in terms of names and types and abstraction boundaries, and that G'Caml violates parametricity, but Acute apparently does not? I recall that you're concerned about the number of new primitive concepts that Acute introduces, and now you're concerned about G'Caml violating parametricity. Any further insight you can provide into how a language designer might unify systems like these in a principled way would be greatly appreciated.

Update: edited to clarify that G'Caml violates parametricity while Acute (AFAICT) does not.