OOP Parallel class hierarchies

I'm curious if anyone has thought about or knows of any languages that are aimed at solving the parallel class hierarchy problem. If you're unfamiliar, it's a term the GoF book uses a lot -- it refers to when you delegate part of a class's responsibilities to another class, so you end up with class hierarchies like:

A -> B
  -> C

AWindow -> BWindow
        -> CWindow

ARenderer -> BRenderer
          -> CRenderer

In this trivial example, nested classes can help improve encapsulation, but parallel classes still always have a sloppy feeling to me.

More generally, does the public/protected/private scheme make parallel class hierarchies inevitable? Does AOP handle this problem somehow?

Mixins or traits

Sounds like an application for mixins or traits.
I don't have any links handy, but I think I've seen some papers turn up from a search along those lines.
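To sketch what that might look like (a minimal illustration of my own, not from any particular paper): with mixin-style inheritance, the rendering responsibility can be folded into each window class instead of living in a parallel Renderer hierarchy. OCaml's virtual classes can play the mixin role; all class names below are invented.

class virtual renderer_mixin = object (self)
  (* the mixin needs a name to render; each window supplies one *)
  method virtual name : string
  method render = print_endline ("rendering " ^ self#name)
end

class b_window = object
  inherit renderer_mixin
  method name = "BWindow"
end

class c_window = object
  inherit renderer_mixin
  method name = "CWindow"
end

(* one hierarchy instead of parallel Window/Renderer trees *)
let () = List.iter (fun w -> w#render) [ new b_window; new c_window ]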

good

Parallel hierarchies are a good thing; why would you want to get rid of them? For example, in dynamic languages, parallel hierarchies implementing one feature each can be used to solve the expression problem.

Issues in Parallel Class Hierarchies

Some reasons that come to mind as to why there are issues with parallel class hierarchies include, but are not limited to:

  • How does someone other than the original author extend one of the classes safely?
  • How does one extend one of the classes in the absence of available source code?
  • Does the parallel scheme support separate compilation?
  • If the language in question is statically typed, how is type-safety ensured?
  • Can the classes actually be extended independently, or if one is extended, must its parallel be extended also?

Treatments of these issues generally refer to family polymorphism, virtual types, or virtual classes. As has been discussed before, there's a not-insignificant challenge for statically-typed languages in dealing with the subject correctly, but the challenge seems to be getting addressed fairly consistently in recent language designs: Scala, of course, is all over it, and O'Caml seems to have gotten it right from the outset (about a decade ago).

How does someone other than

How does someone other than the original author extend one of the classes safely?

How does one extend one of the classes in the absence of available source code?

Create another class, of course :)

Does the parallel scheme support separate compilation?

In a dynamically typed language it does.

Can the classes actually be extended independently, or if one is extended, must its parallel be extended also?

I suppose so, although I'm not sure when this would ever be useful.

Problem Statement

Let's try to put some parameters to these questions. One such parameter is that the question is being asked in the context of statically-typed languages such as the ones that most working programmers use to deliver commercial software: C, C++, Java, C#. Another parameter is that, again in a commercial setting, you rarely have the source code to the existing hierarchy, and so you're at considerable risk of violating one or more contracts of the intertwined hierarchy if you just willy-nilly subclass and override. Here's the complete list of desiderata from "Independently Extensible Solutions to the Expression Problem:"

  • Extensibility in both dimensions: It should be possible to add new data variants and adapt existing operations accordingly. Furthermore, it should be possible to introduce new processors.

  • Strong static type safety: It should be impossible to apply a processor to a data variant which it cannot handle.

  • No modification or duplication: Existing code should neither be modified nor duplicated.

  • Separate compilation: Compiling datatype extensions or adding new processors should not encompass re-type-checking the original datatype or existing processors.

  • Independent extensibility: It should be possible to combine independently developed extensions so that they can be used jointly.
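To make these criteria concrete, here is a minimal O'Caml sketch (my own illustration, not from the paper) of why they are hard to satisfy simultaneously: adding a new operation over a fixed set of data variants is easy, but adding a new variant forces edits to the original code.

type expr =
  | Num of int
  | Add of expr * expr

let rec eval = function
  | Num n -> n
  | Add (a, b) -> eval a + eval b

(* a new processor is easy: just another function *)
let rec show = function
  | Num n -> string_of_int n
  | Add (a, b) -> "(" ^ show a ^ " + " ^ show b ^ ")"

(* a new data variant (say, Mul of expr * expr) is not: the type and
   every existing function must be edited, violating "no modification
   or duplication" *)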

Hopefully this helps elucidate why people are putting so much effort into addressing these issues.

One such parameter is that

One such parameter is that the question is being asked in the context of statically-typed languages such as the ones that most working programmers use to deliver commercial software: C, C++, Java, C#.

When did the OP say this? Regardless, this technique hinges more on having some sort of open or any type. For example, it would be completely possible to do in a language like Java if it had, say, an 'any' type.

Another parameter is that, again in a commercial setting, you rarely have the source code to the existing hierarchy

Isn't that the premise behind the expression problem?

Hopefully this helps elucidate why people are putting so much effort into addressing these issues.

It's not terribly difficult. Solutions have been around in dynamic languages for decades, since before the expression problem was even formalized. Of course, dynamic languages don't help in the compile-time checking area, but soft languages can, so I believe that fulfills all the requirements of a solution.

Self-Selection

Curtis W: When did the OP say this? Regardless, this technique hinges more on having some sort of open or any type. For example, it would be completely possible to do in a language like Java if it had, say, an 'any' type.

All I meant to say is that, since the problem obviously doesn't occur in a dynamically-typed language, the "solutions" under discussion are obviously in the context of static typing—you can think of this as a variant of the Anthropic Cosmological Principle applied to software engineering. :-) Similarly, since the problem equally obviously doesn't exist given a "top" type, the solutions don't talk about that. That's the strict meaning of "strongly statically typed." You can "solve the problem," e.g. in Java, by making everything an Object and attempting downcasts everywhere, but even Java die-hards will blanch at that "solution."

Curtis W: Of course, dynamic languages don't help in the compile time checking area, but soft languages can, so I believe that fulfills all the requirements of a solution.

Except that even a "soft" language involves weakening the type discipline, which not everyone is willing to do. It's important to remember that the question as defined by the list of requirements I posted includes strong static typing. That's admittedly an even stronger requirement than "works in C++, Java, or C#," which are weakly statically typed, but that's why having actual solutions in Scala or O'Caml is so compelling: it turns out that you can have independent extensibility of interdependent classes without sacrificing orthogonality, separate compilation, or strong static type safety. Given that that's the case, I see no reason to needlessly increase my users' risk level, assuming that I can choose what language to use. Of course, pragmatically, my day jobs tend to remain in Java or C++, so I, and more importantly my users, are screwed along multiple dimensions...

All I meant to say is that,

All I meant to say is that, since the problem obviously doesn't occur in a dynamically-typed language, the "solutions" under discussion are obviously in the context of static typing—you can think of this as a variant of the Anthropic Cosmological Principle applied to software engineering. :-)

Oh, ok. My mistake :)

Except that even a "soft" language involves weakening the type discipline,

That's a somewhat subjective point. I don't consider soft typing to be any weaker than static, considering that soft typing is basically static, structural subtyping with implicit "casting."

It's important to remember that the question as defined by the list of requirements I posted includes strong static typing. That's admittedly an even stronger requirement than "works in C++, Java, or C#," which are weakly statically typed, but that's why having actual solutions in Scala or O'Caml is so compelling: it turns out that you can have independent extensibility of interdependent classes without sacrificing orthogonality, separate compilation, or strong static type safety.

Ok, I see. It's not so much that the type system be strong and static, but that the type system be comprehensive in catching bugs, correct? I see no reason to narrow it down further than that.

Except that even a "soft"

Except that even a "soft" language involves weakening the type discipline,

That's a somewhat subjective point. I don't consider soft typing to be any weaker than static, considering that soft typing is basically static, structural subtyping with implicit "casting."

I would say that many consider casting to be a weakness in the type discipline.

Well, you always have to use

Well, you always have to use some form of casting when you're dealing with unknown input. For example, OCaml has pattern matching that can be incomplete, which is exactly the same as using casts.

No, casts are practical

No, casts are practical, but I wouldn't consider them necessary, mostly because the input is never completely unknown. There is always some kind of format for it, even if it is just a string or integer. You don't have to cast it to some other format; it's just practical to do that.

And sure, allowing incomplete pattern matching is a weakness in the type system.

I'm not claiming that you should always have as strong a type system as possible. Actually I feel rather strongly that soft/polite/hybrid typing could become safer than traditional static typing.

Miscasting :-)

Let me just add that it's true that not all coercions/casts are problematic; it's really only "downcasts" that are. O'Caml doesn't support downcasts, but it does support essentially every other kind of type coercion, because these other kinds are safe.
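For concreteness, here's a minimal sketch of the distinction (class names are mine, purely for illustration):

class animal = object method name = "animal" end
class dog = object inherit animal method bark = "woof" end

let d = new dog
let a = (d :> animal)     (* explicit upcast: always safe *)
(* let d' = (a :> dog) *) (* a downcast: O'Caml rejects this at compile time *)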

Isn't OCaml's equivalent of

Isn't OCaml's equivalent of downcasting pattern matching? Unless you're talking in terms of its object system, but I don't really know how that works.

Pattern Matching vs. Coercion

Short answer: no.

Medium answer: no, because coercion in O'Caml is never implicit.

Longer answer: Pattern matching is most often used to distinguish among variants of a single variant type (that is, most pattern matching consists of a combination of "or patterns" and "variant patterns"); there is a form of pattern-matching that allows you to explicitly state a type-constraint, but a type-constraint is not a coercion, and if O'Caml can't prove that the portion of the match in question unifies with the type in the constraint, it won't compile. The details on pattern-matching in O'Caml can be found here.
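As a tiny illustration of the type-constraint point (my own example): the constraint below must already unify at compile time; it converts nothing.

let describe p = match p with
  | ((x : int), y) -> Printf.sprintf "%d, %s" x y

(* describe (1, "one") is fine; describe ("1", "one") won't compile *)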

O'Caml does indeed have a class-based object system with multiple inheritance, etc. that C++/Java/C# programmers find very familiar-feeling... except that it has no downcasts. There are essentially two reasons for this. One is that most object-oriented downcasts are simulating variant types, and O'Caml has variant types, so there's no need to simulate them. The other is that downcasting is also often used to overcome mainstream OO languages' confusion of subclassing and subtyping. O'Caml doesn't have any such confusion, being structurally typed, so entire classes of situations where in other OO languages you'd have to have an inheritance relationship and a downcast can be dealt with in O'Caml by not bothering with the inheritance relationship and simply relying on the structural typing to allow the value in question to be operated on by the function/method in question—but obviously you know this. Further details on coercion in O'Caml can be found here if you're curious.
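Here is a minimal sketch of that structural-typing point (names invented for illustration): no inheritance relationship between the two classes, and no downcast, yet one function accepts both.

class ship = object
  method fly = print_endline "ship flying"
end

class bird = object
  method fly = print_endline "bird flapping"
  method sing = print_endline "tweet"
end

(* accepts any object with a fly method, by structure alone *)
let launch (o : < fly : unit; .. >) = o#fly

let () = launch (new ship); launch (new bird)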

Medium answer: no, because

Medium answer: no, because coercion in O'Caml is never implicit.

One is that most object-oriented downcasts are simulating variant types, and O'Caml has variant types, so there's no need to simulate them.

If they're simulating variant types, there's obviously a connection between the two, the direction of which is unimportant. This is basically what I meant when I was talking about pattern matching effectively being OCaml's downcast mechanism in cases where structural subtyping doesn't take care of it already, e.g.


if(object->type == SHIP) ((Ship*)object)->fly();

is equivalent to:


match obj with
  | Ship x -> fly x;;

Reflection

The direction is important insofar as one is known at compile time ("match obj with") and one is only known at runtime ("if (object->type == SHIP)"). Also, you're assuming a reflection API so that the downcast is safe, which isn't available in the general case. The comparison that you originally said you were making was between "((Ship*)object)->fly()" and "match obj with Ship x -> fly x", and the differences are rather clear: in the former, "object" can be literally anything and if it isn't a Ship* you'll probably crash at runtime (assuming this is C++ code), whereas in the latter the code won't even compile if "obj" isn't known, at compile time, to be of a variant type with a "Ship" constructor.

But I think you've simply made a terminological error, because given the example with reflection, I think your analogy is a good one: reflection essentially gives you a "typecase" construct, and there is a parallel to non-exhaustive variant/or pattern matching: in the pseudo-C++ case, missing a possible object->type results in heaven-knows-what behavior, but in any case it's certainly not intended by the programmer, and non-exhaustive variant/or pattern matching in O'Caml potentially results in an exception being thrown; see this for a carefully-drawn parallel to NullPointerExceptions in Java, with the author admitting that he should win some kind of "pedant" award for constructing it. :-) The differences that I see are:

  • The pseudo-C++ case doesn't generate any warnings; the O'Caml case generates a warning at compile time.
  • The behavior of the pseudo-C++ is undefined; the O'Caml case throws an exception that can be caught and, per Oleg Kiselyov, even resumed.

So despite the analogy between typecase plus downcasting and variant/or pattern matching having merit, there remain clear, distinct, compelling advantages to variant/or pattern matching in practice.
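For the record, a minimal sketch of the warning-plus-exception behavior just described (my own example):

type craft = Ship | Rock

let fly = function
  | Ship -> print_endline "flying"
  (* the compiler warns here: Rock is not matched *)

let () =
  try fly Rock
  with Match_failure _ -> print_endline "caught at runtime"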

Ok, I see what you're

Ok, I see what you're getting at. Pattern matching isn't casting, but rather it subsumes its capabilities. The kind of downcasting I'm talking about is the same kind that pattern matching handles, which led to confusion.

Correct, I Think

In particular, since we're talking in hypotheticals about a soft type system, I have to imagine that a typecase construct could be made exactly as static as pattern-matching, and you could pretty easily eliminate the issues with "if (object->type ==...)," at least giving you a nice compile-time warning. Presumably you could also throw an exception vs. having undefined behavior in such a system. So my remaining arguments are those from my other post, which basically revolve around there being no such soft type system (that I'm aware of—please correct me if I'm in error here!) for me to use.

Catching Bugs When?

Of course, the point is that a static type system catches bugs at compile time instead of runtime, i.e. you catch them instead of your users catching them. A soft type system, particularly one that does implicit casting, is strictly weaker/riskier than a static one in this sense.

Well, you're somewhat

Well, you're somewhat correct. The fact that soft typing throws fewer errors at compile time is true by its very purpose. The conclusions you draw from this, though, are quite a leap; specifically, you assume that complaining more at compile time is a good thing, which depends on a few things.

If the compiler finds an error at compile time and it's absolutely guaranteed to be an error in the programmer's mind, that's obviously a good thing. On this point, both soft and static type systems are compatible, and for good reason--the sooner you get an error, the better off you are. The differences start to show when you get into possible errors. These occur in situations where the exact type of an expression isn't known at compile time--in the OCaml example, this is analogous to not knowing what the type constructor for a given expression is. The two different approaches to dealing with this--flagging it as an error or inserting runtime checks--really depend on one thing: whether the programmer knows what he's doing. Basically, soft typing favors experienced programmers who listen to warnings, whereas static typing favors situations where the programmer's ability can't be trusted.

I suspect most programmers are the former, and the ones that actually can't be trusted would probably gain more if they actually learned to become better programmers, rather than using crutches in the language.

Programming is a team sport

Basically, soft typing favors experienced programmers who listen to warnings, whereas static typing favors situations where the programmer's ability can't be trusted.

I know I'm getting a rep as repeating this point, but the idea about static typing (or design-by-contract, or nullity annotations, or indeed structured programming) isn't that you can't be trusted, it's that your teammates can't be trusted. In the wild, most programmers worth hiring have teammates who, sadly, weren't. Soft typing, in those circumstances, isn't worth a project manager's damn.

Once we have a wiki

Once we have a wiki available we're just going to have to write up Joe and Moe, aren't we?

Trust issues

...isn't that you can't be trusted, it's that your teammates can't be trusted.

If we exchange the term trust (which can be very subjective and controversial) for reliability or vulnerability, I think the issue is clearer: programs rely heavily on their programmers and are highly vulnerable to them. Without any other mechanisms in place, the software is defenseless.

Independent verification is a way to reduce the risks of this vulnerability, be it code reviews, unit-testing, GADTs, PCC, runtime checks, DbC, etc. We roughly have three dimensions along which to evaluate these flavors of verification: coverage, soundness, and phase. There's also the issue of proof vs. evidence: do we need proof that the system works, or do we only need enough evidence that it works?

IME it's pretty clear that you always need independent verification for software, even if it's only runtime checks and running the script and evaluating its results.

I prefer static analysis (i.e. verifications that have complete coverage, are sound, and happen before deployment) because I don't want my software to fail later if it can fail earlier, and I don't mind having to rewrite an algorithm to make its contract explicit enough for the static checker to assert its correctness. I'm also a strong fan of unit-testing (with TDD) and DbC, but these tools (like other tests that aren't complete and/or sound, e.g. DbC with only runtime checking) only increase the confidence that the software works (which is essential information IMO) without proving that there are no errors lying undetected.

Right On

In an ideal world, I'd be able to say, on a language-feature-by-language-feature basis, what misapplications I want to flag as an error at compile time, and which ones I just want warnings for—in previous threads on the subject, the notion of a "slider" from static typing to dynamic typing has come up, and I essentially like the idea.

But all languages in the real world necessarily commit to a static and a dynamic semantics. Hopefully it's obvious that, from the user's point of view, the dynamic semantics are the only place that errors can occur (that is, errors of static semantics are by definition caught and corrected before the software gets to the user). So again from the user's point of view, it seems reasonable to want to ensure that the static semantics capture as many classes of errors as possible.

The counterargument that we should trust the programmer not to make mistakes, I regret to say, doesn't hold water with me after over 25 years in the business, some of which were spent at a company responsible for a debugger in which only one bug was ever found (and that was due to an error in a Motorola spec), some of which were spent at Apple, some of which were spent consulting for the third largest administrator of 401(k) portfolios in America, and so on. The simple truth of the matter is that even extremely intelligent, experienced, committed, motivated, hard-working programmers—and it's been my great pleasure to work with many dozens of them—make errors that fall into classes that strong static typing routinely prevents, and that's just in reference to current languages such as O'Caml 3.09 or GHC 6.4, never mind what we find in still more forward-thinking languages like Epigram or Acute or TyPiCal or FlowCaml.

After all, people aren't doing language design research today just because it's fun; the point is precisely to allow programmers to express what they mean, but also to get asymptotically closer to only what they mean—that is, to minimize the impact of the Law of Unintended Consequences. It will never be a perfect science—thanks to Gödel it can't be—but we can certainly help the lion's share of programmers in the world avoid imposing the unintended consequences of huge swaths of their errors on their users by continuing to expand the set of classes of errors that simply never make it to the users. Until someone comes up with a soft typing system that actually does allow us to decide on a case-by-case basis how static or dynamic to be, strong static typing continues to be the best currently-available tool for improving the overall quality of software, with limiting programming to only those who are capable of developing zero-defect software coming in a very distant second.

But all languages in the

But all languages in the real world necessarily commit to a static and a dynamic semantics. Hopefully it's obvious that, from the user's point of view, the dynamic semantics are the only place that errors can occur (that is, errors of static semantics are by definition caught and corrected before the software gets to the user).

Yes, but that's not to say that errors don't occur, of course. This statement also suggests that all of the "errors" static typing checks are actual errors--more on this later.

The counterargument that we should trust the programmer not to make mistakes,

No, the counterargument is that the programmer doesn't need to be forced to do something, in this case correcting an error which might not even be one.

The simple truth of the matter is that even extremely intelligent, experienced, committed, motivated, hard-working programmers—and it's been my great pleasure to work with many dozens of them—make errors that fall into classes that strong static typing routinely prevents, ...

I suspect using a soft type system would've been fine in those cases. The programmer doesn't have to be perfect; he just has to care enough about what he's doing to read warnings. Let's take a hypothetical example: say we have a new programmer on the team at, say, Microsoft, and they've switched to OCaml for the possibility of fewer bugs. Let's say he gets an error and he just can't figure out what it is. He looks around and finally figures out how to fix it, so what does he do? He writes the least amount of code that will get it to compile--who cares whether it runs well or not? This is a perfect example of how forcing someone to do something, especially when it's not even obvious that they need to be forced, doesn't necessarily have any advantage over simply telling them. It's impossible to force someone to write bug-free code, and I suspect the type of people who would ignore warnings wouldn't bother writing bug-free code regardless, so I see no reason to cripple the language to cater to a group that screws you regardless.

Backwards

I don't know what to say to this other than that it's exactly backwards: static type checking doesn't give you "errors that aren't really errors;" if I work around an O'Caml compile-time error using, e.g. Obj.magic, when that code path is hit at runtime, my program will behave incorrectly, probably even crashing. On the contrary, the reason that most languages generate warnings in some situations as opposed to errors is that they can't be sure that the behavior is in error; it might be perfectly OK, in context, to use a variable of a certain size in such a way that it loses information—using a long as a short, for example. Of course, in dynamic languages, there are no "compile-time errors" other than syntax errors; you can say all kinds of meaningless tripe in them. Soft type systems are, again, a valuable step in the right direction, but I would reiterate that I only know of a few of them (DrScheme's and IronPython and StrongTalk and...?) and they make their own assumptions about what constitutes an error without letting me be the judge of that. You might think that the situation with a statically typed language is even worse, but as I said above, it isn't: I know that my statically-typed language guarantees progress and preservation ("safety = progress + preservation"), so the errors I see are real.
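A minimal sketch of the Obj.magic point (emphatically not something to do in real code):

(* Obj.magic silences the type checker; the error resurfaces at runtime *)
let n : int = Obj.magic "not an int"
let () = Printf.printf "%d\n" (n + 1)  (* prints garbage or crashes *)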

Curtis W: It's impossible to force someone to write bug-free code, and I suspect the type of people who would ignore warnings wouldn't bother writing bug-free code regardless, so I see no reason to cripple the language to cater to a group that screws you regardless.

First of all, while it's true that it's impossible to force programmers to write bug-free code, the richer the type system, the harder it is to write buggy code, which is why I was extremely careful to say "asymptotically closer to only what they mean." Secondly, you're forgetting the issue of well-intentioned, smart people writing buggy code by accident. Finally, statically-typed languages aren't crippled relative to dynamically-typed languages or the barely-existent softly-typed languages; they're all capable of computing anything that can be computed. And again, having come from decades of working in the granddaddy of dynamic languages, Lisp, I can assure you that I don't find my productivity adversely affected by switching to a statically-typed language such as O'Caml, although my productivity in Java and C++ is certainly quite a bit worse. :-)

I don't know what to say to

I don't know what to say to this other than that it's exactly backwards: static type checking doesn't give you "errors that aren't really errors;"

Actually, now that I think about it, you're right in terms of OCaml. So really, now the only difference between OCaml's type system and a soft type system is that downcasts are explicit in the former (pattern matching) and implicit in the latter. That's not to say that one's visible and the other isn't, though, as a decent IDE would show downcasts in the soft typing case, meaning they're practically identical to the programmer. Well, except that he has to write them manually in the first case rather than letting the compiler do it.

Secondly, you're forgetting the issue of well-intentioned, smart people writing buggy code by accident.

No, I'm not. Accidents happen, but not listening to warnings is not an accident if they're presented to you in an understandable and direct manner.

Errors in dynamically typed languages

Of course, in dynamic languages, there are no "compile-time errors" other than syntax errors

This is false. For example my Kogut compiler reports at compile time the following errors and warnings, besides syntax errors and some context-free warnings:

  • references to unknown names
  • conflicting definitions or imports of the same name
  • assignment to a constant
  • calling a function with a bad number of arguments (this may be a dynamic error too)
  • using a name when its definition can't possibly have executed (this may be a dynamic error too)

"Syntax"

You're right—I should have made clear that I was defining "syntax error" to include the kinds of modest checking of the environment that you're referring to, which you also tend to see in any reasonable implementation of even the most dynamic of the dynamically-typed languages such as Common Lisp or Scheme.

Your scenario.

What is the point of your hypothetical scenario? If the same code were dynamically typed the programmer's error might go uncaught until demonstrating the software to the client. At which point the terminally lazy/stupid programmer would go back and do the same thing as he did with static typing, namely to apply a non-fix... which would only be discovered the next time the software is demonstrated to a client. In short, dynamic typing gets you nothing over static typing in this scenario.

I think you're arguing against something which people aren't claiming -- namely that static typing is a substitute for testing. Testing should always be performed and would hopefully catch all the bugs whether or not the language is dynamically typed. The advantage of static typing here is that non-bad(*) programmers don't have to wait for testing to get feedback for whether their program might fail -- test procedures may take a very long time, so this is a real concern in many cases. Another advantage is that the programmers also don't have to write test cases for things which the type system already proves cannot occur.

Furthermore -- and I think this is generally underestimated -- once you get to a certain level of proficiency at exploiting the static type system to prove general properties of your code, you will often (not always) find that "if it compiles, it works" actually applies! (Note: I'm mostly talking about type systems with at least the expressive power of OCaml's or Haskell's here. The type system in, say, C++ is too weak to express much of interest unless you go all-out with templatized code.)
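A small sketch of the kind of property-encoding I'm alluding to (my own illustration, using a phantom type parameter to track a channel's state):

type opened = Opened
type closed = Closed

type 'state chan = { fd : int }

let open_chan fd : opened chan = { fd }
let send (c : opened chan) msg = Printf.printf "fd %d <- %s\n" c.fd msg
let close_chan (c : opened chan) : closed chan = { fd = c.fd }

let () =
  let c = open_chan 3 in
  send c "hello";
  let _closed = close_chan c in
  (* send _closed "oops" is rejected at compile time *)
  ()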

Finally, the "static typing is too restrictive" bit seems highly subjective. I have programmed extensively in both dynamically typed languages (Python mostly) and statically typed languages (C++, OCaml), and in the vast majority of cases typing errors caught at compile time were actual errors. In the very few cases that weren't, I was able to find some other way to express what I really wanted once and put it into a module/library for reuse. Your ratios may be different, but that's my experience.

(*) That is, non-stupid, non-lazy, and most importantly not afraid to ask his coworkers for help.

What is the point of your

What is the point of your hypothetical scenario? If the same code were dynamically typed the programmer's error might go uncaught until demonstrating the software to the client. At which point the terminally lazy/stupid programmer would go back and do the same thing as he did with static typing, namely to apply a non-fix... which would only be discovered the next time the software is demonstrated to a client. In short, dynamic typing gets you nothing over static typing in this scenario.

I agree. If you read my post, though, you'd find I was talking about soft typing, not dynamic typing :-)

Furthermore -- and I think this is generally underestimated -- once you get to a certain level of proficiency at exploiting the static type system to prove general properties of your code, you will often (not always) find that "if it compiles, it works" actually applies!

You can get the same thing with a decent soft type system, as well.

Finally, the "static typing is too restrictive" bit seems highly subjective. I have programmed extensively in both dynamically typed languages (Python mostly) and statically typed languages (C++, OCaml), and in the vast majority of cases typing errors caught at compile time were actual errors. In the very few cases that weren't, I was able to find some other way to express what I really wanted once and put it into a module/library for reuse. Your ratios may be different, but that's my experience.

In a soft type system, you don't need to rewrite what you're trying to do to get it to compile. Again, the difference isn't that big, but soft typing requires, well, less typing.

I did read it...

... I just have a mental block when it comes to the soft/dynamic distinction.

[Myself:] "If it compiles, it works!"

You can get the same thing with a decent soft type system, as well.

That depends very much on your "decent soft type system" -- a notion which is far too vague to discuss further. Generally though, any time the programmer can subvert an abstraction boundary by telling the compiler "shut up, I know what I'm doing" all invariants about that abstraction may go out the window. Sure, it's the programmer's fault, but I prefer my abstractions to be ironclad.

In a soft type system, you don't need to rewrite what you're trying to do to get it to compile.


Sure, but again the hypothetical stupid and lazy programmer just says "ignore this potential error" instead of actually trying to understand the issue. So what's gained by soft typing in your example? AFAICT nothing.

That depends very much on

That depends very much on your "decent soft type system" -- a notion which is far too vague to discuss further.

My notion of soft typing is basically dynamic typing with the obvious errors caught at compile time. If it's guaranteed to be an error, it should be caught at compile time. Otherwise, just tell me that it might be an error and leave me alone.

Sure, but again the hypothetical stupid and lazy programmer just says "ignore this potential error" instead of actually trying to understand the issue. So what's gained by soft typing in your example? AFAICT nothing.

I think you misunderstand my argument. The stupid person doesn't gain anything, but the people who care about correctness do.

Your notion.

Is way too vague to discuss without a specific language definition or examples. What are "obvious errors"? What is "guaranteed" to be an error?

Otherwise, just tell me that it might be an error and leave me alone.

My fear would be that any soft typing system applied to projects of any non-trivial size would just swamp you with warnings (rendering them useless), or not provide enough warnings to be useful.

I've said what I wanted to say, so I guess I'm done.

Is way too vague to discuss

Is way too vague to discuss without a specific language definition or examples. What are "obvious errors"? What is "guaranteed" to be an error?

Soft typing is identical to structurally subtyped static typing except with implicit downcasts. That's not vague at all, and I seem to recall giving this definition before (might've been somewhere else, though).

My fear would be that any soft typing system applied to projects of any non-trivial size would just swamp you with warnings (rendering them useless), or not provide enough warnings to be useful.

Not any more so than you get "swamped" with pattern matches in OCaml, although to be fair most pattern matches are used for polymorphism, not casting.

downcasting / pattern matching?

Your understanding of pattern matching and downcasting is giving me difficulties. What would you make of the following O'Caml type?

type tree = 
    Branch of tree * tree
  | GreenLeaf of int
  | YellowLeaf of int

"Implicit downcasting" would conflate GreenLeaf and YellowLeaf, no?

Ultimately, I do not think casting and pattern matching are related at all. On the other hand, if my tree is to be "used for polymorphism," could you provide an example of a cast and explain the difference?

[On edit, removed unneeded parameter to the tree type.]

"Implicit downcasting"

"Implicit downcasting" would conflate GreenLeaf and YellowLeaf, no?

No, maybe you're thinking of structural subtyping?

Ok, so I'm desperately

Ok, so I'm desperately confused. Would you consider the following two functions to be downcasts?

let greenint_of t =
  match t with
      GreenLeaf i -> i
let yellowint_of t =
  match t with
      YellowLeaf i -> i

The type of greenint_of, according to OCaml, is tree -> int:

# greenint_of;;
- : tree -> int = <fun>
# (GreenLeaf 4);;
- : tree = GreenLeaf 4
# let j : int = greenint_of (GreenLeaf 4);;
val j : int = 4
# let j : int = yellowint_of (GreenLeaf 4);;
Exception: Match_failure ("Tree.ml", 19, 2).

So, greenint_of is a "cast" from tree to int and in fact is a "downcast". But what happens if you have "implicit downcasts" and I write:

# let j : int = (GreenLeaf 4);;

Is j set to 4, or do I get an exception?

you're...

So, greenint_of is a "cast" from tree to int and in fact is a "downcast". But what happens if you have "implicit downcasts" and I write:

# let j : int = (GreenLeaf 4);;

Is j set to 4, or do I get an exception?

This is more related to subtyping policy than downcasting as far as I can tell, so I guess it really depends on the language.

In Fairness...

I agree with all of this—my only issues are:

  • Soft type systems are more hypothetical than real—I don't know of any that I can actually use today.
  • Soft type systems draw an arbitrary line in the sand as to what they check statically vs. what they let pass to runtime, requiring considerable experience on the part of their users to avoid engendering a false sense of security. In some respects, this alone prevents me from paying much attention to them; with static typing I know I have progress and preservation guaranteed and as long as I understand the language's dynamic semantics I know the scope in which errors can occur; with dynamic typing I know that it's all on my shoulders to be correct. Soft typing requires a great deal more intellectual effort because it's in the middle somewhere.
  • The previous point could be mitigated by having some level of programmer control over how much "has to be" static and how much "is allowed to be" dynamic, but this is somewhere between excruciatingly difficult and impossible to implement in practice.
  • Soft type systems obviously target languages that don't lend themselves to static typing in the first place, making it even harder to accomplish the stated objectives.
  • Getting "more staticness" out of such languages (here I'm thinking of SBCL and its type inferencing) tends to require excessive annotation relative to a type-inferred statically-typed language such as any of the members of the ML family, loosely defined so as to include, e.g. Haskell and Concurrent Clean.

So I dunno. As much as I like soft type systems conceptually, the lack of any to play with and the points above leave me shrugging my shoulders whenever they come up in discussion—and I should add that I check DrScheme again at every new release to see if they're ever actually going to ship MrFlow.

Soft type systems draw an

Soft type systems draw an arbitrary line in the sand as to what they check statically vs. what they let pass to runtime, requiring considerable experience on the part of their users to avoid engendering a false sense of security.

Arbitrary? I find it pretty straightforward:

"...soft typing is basically dynamic typing with the obvious errors caught at compile time. If it's guaranteed to be an error, it should be caught at compile time."

The key thing about soft typing is that, if code that could fail at runtime is completely correct behaviour in the user's mind, no errors are flagged. Compared to static typing, soft typing follows the user's mental model more closely and thus produces more obvious results.

Soft type systems obviously target languages that don't lend themselves to static typing in the first place, making it even harder to accomplish the stated objectives.

Or they target languages which were designed with soft typing in mind.

Too Much...

At this point I have to beg off, because while the discussion has been fruitful, here I find simply too many appeals to "obviousness" and "user's mental model" to be meaningful. If things were really so obvious, and there were such readily-apparent mappings between a user's mental model and a piece of software, programming language design in general and type theory in particular wouldn't be open areas of research! So perhaps you could refer to a soft type system that actually exists, and provide some concrete examples of problems that it addresses that aren't equivalently addressed by existing statically- or dynamically-typed languages. Otherwise, I'm afraid, you're simply guessing as to what's possible. Not that there's anything a priori wrong with guessing; it's just hard to establish a basis on which to ascribe pragmatic desirability ("I'd love that; where can I get it?") to a guess.

Adding apples and oranges


def add(x,y):
    return x+y

The "mental model" is almost always wrong about the world i.e. it relies on hypotheses that are either too general or too narrow. But it is seldom plain wrong as long as it has a single valid use case.

And here we go:

>>> add(3,5.5)
8.5

The "mental model" can get stuck immediately ( just numbers? ) but become enhanced very soon:

>>> add(apples, oranges)
<instance fruitsalad: ingredients apples+oranges>

C: Obviously apples and oranges are addable.

P: I don't think it is a good idea to talk about "adding" apples and oranges. This is misleading to many people. But anyway I accept it for a moment. What I'm really interested in: is it known statically that these apples are really addable apples and oranges are really addable oranges?

C: I don't care. They might not be addable initially but just in the presence of someone who wants a tasty fruit salad. Someone who wants them addable makes them addable. That's it.

P: It makes the semantics of cooking according to recipes incredibly complicated when I don't know how to use apples and oranges by their basic specification but everything is weirdly ad hoc.

C: It makes composing fruit salad complicated when I have to care for all the things that could happen and how information is propagated across the whole chain of actions from the beginning of the world till now. The only thing I want to control is the moment where apples and oranges are actually added to the fruit salad, and the result of course.

P: You cannot prove that no wrong ingredients are in the fruit salad.

C: I don't want to prove it. I just make the salad, test it and offer it to other people after believing it tastes fine.

P: But you don't really know it.

C: I know that there will always be people who dislike it and would select other ingredients and would prepare the salad completely different than me. I have no hope that this will ever change.

P: I hope there will be an answer to this problem in the not so distant future. Much research has been done on it. I believe in scientific progress. Finally we will have proven fruit salads and individuals who know they will like it before testing it. Right now we already have certain methodologies for achieving fruit salad with proven ingredients. Why don't you use them? If you and many others start using these methods we will all eat better fruit salad in the end.

C: I checked out one of your methodologies but I got a "type error" when I tried to add apples and oranges and there was no easy way to make them addable.

P: As I said above. You shouldn't "add" them anyway, because it is confusing.

C: To whom? To the pot and the spoon?

P: It is semantically confusing. You shouldn't really want it. It is particularly confusing to many bad cooks who feel encouraged to create bad salads.

C: Those who don't use your future proof methodologies?

P: I didn't say this...

C: Thanks for the talk. It is hot. I guess I need some beer.

Target domain semantics

Using a ""logic" "programming"" approach with target domain semantics we can make the idea of adding apples and oranges completely clear along this the meaning of salid. One can add apples and oranges in a salad. My only question is why don't we do this more often? A type system with definitions of apples and oranges and salad could do the same thing. Not that I think this is necessary, however target domain semantics does make the purpose of a program clear.

Latent salad

Your conversation is strikingly realistic. However, as with the many real conversations it's modelled on, it misses an important point.

P: ... What I'm really interested in: is it known statically that these apples are really addable apples and oranges are really addable oranges?

C: I don't care. They might not be addable initially but just in the presence of someone who wants a tasty fruit salad. Someone who wants them addable makes them addable. That's it.

My response to this is:

A: But when you write add(apples, oranges) you're essentially making a static assertion that you believe that apples and oranges are addable. That's interesting, isn't it? What are the static program properties that you're relying on in your own mental reasoning in order to ensure that you can write something like add(apples, oranges) and be right, most of the time, as opposed to spending all your time debugging "runtime type errors"? Isn't it worth thinking about these static properties, and perhaps reasoning about them consciously? When you do that, you might realize that your mental processes are very similar to the algorithms used by an automatic type inferencer. Perhaps that can be exploited to good effect, e.g. to save some time during testing, speed up refactoring, or prevent some kinds of runtime errors?

And in fact, I think that Curtis is aiming in that direction, so I don't think the mock conversation accurately captures the issues in the current discussion.

Hypothetical deduction

I think we are very close here. Methodologically speaking, I'm expecting most benefits from type feedback combined with testing in the presence of latent typing. That's first of all a departure from a certain tradition of axiomatic deductive reasoning about types and a shift towards a Popperian methodology of "hypothetical deduction". The axiomatics is always as if, i.e. we want it to be revocable through further experiments/observations.

Let me explain briefly. Each test is an experiment. The tester can hold a set of assumptions about the types of the program (either nominal or structural types). That's the "type profile". A test generates a trace through the program where types and structures may be recorded at runtime. This updates the type profile. At any development stage the programmer may believe he has collected enough information to express a stable hypothesis: he freezes the type profile and merges the source with the profile. From now on he has a program that may be statically checked. But this merge is revocable. He might add a new test that would succeed but cannot pass the type-checker. The succeeding test disproves his current type assumptions, but the type profile gets enhanced immediately after the test and he can proceed programming at a new stage and decide again whether to freeze. It might be possible to augment this method with type inference or handcrafted type declarations. Code coverage will always show the holes. I guess this can be quite far-reaching, and subtle invariants may be declared and checked.

This process is of course open ended. There is no closure over progress or change.

At this point I have to beg

At this point I have to beg off, because while the discussion has been fruitful, here I find simply too many appeals to "obviousness" and "user's mental model" to be meaningful.

This argument is meaningless in the first place when you're comparing to languages like OCaml, which doesn't even have strict pattern matching. Since pattern matching can be used to create downcasts in the cases when it's not emulating polymorphism, the only real difference between my notion of soft typing and OCaml's type system (besides OCaml's funky union types) is really that, what you would need to write in OCaml is simply inferred and displayed to you in my type system.

Reasons Not to Continue

Curtis W: Since pattern matching can be used to create downcasts in the cases when it's not emulating polymorphism, the only real difference between my notion of soft typing and OCaml's type system (besides OCaml's funky union types) is really that, what you would need to write in OCaml is simply inferred and displayed to you in my type system.

See, this is where we go off the rails: pattern matching can't be used to create downcasts. Pattern matching can't be used to do coercion at all! And there's nothing "funky" about O'Caml's union types; they're bog-standard throughout the entire ML family. So this is why your recent comments raise eyebrows among some of us—they're speculative and hand-wavy, which isn't so bad in and of itself—we're into speculation and hand-waving here, actually :-) but there are still no references to published papers or books or extant research work at any institution. Between your errors of understanding relative to, frankly, very basic existing language technology, and the lack of anything to substantiate your claims as to what soft typing can do, what is anyone reading you supposed to think?

Novelty is good—it just needs to be backed up by the scientific method.

Two Other Reasons Not to Continue

We've had at least parts of this conversation twice before: http://lambda-the-ultimate.org/node/1527 and http://lambda-the-ultimate.org/node/1514.

An example

I would think that Curtis W means something like this (in SML):


datatype intOrReal = I of int | R of real

fun castToInt (I n) = n

This to me seems very much like a cast, and it is accomplished by incomplete pattern matching.

Yes, that's exactly what I

Yes, that's exactly what I mean. The variant type as a whole represents a base class, of sorts, and pattern matching allows you to work on specific details depending on the type via matching the type constructor.

I think it's worth noting

I think it's worth noting that it's far more common for OO-style polymorphism to be used to emulate this than the other way round.

No downcasting in the sample ML

Perhaps your confusion is due to the fact that you assume that there are two different types in the example? But there is only one type that is called intOrReal with two constructors - one which happens to take an int and one which takes a real. There is no casting taking place in the pattern matching (up or down). There is only a compare that dispatches based on the constructor of I or R. And there definitely is no idea of subclassing going on here - which is where downcasts come into play.
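Restating the point in code (OCaml syntax rather than SML; my own illustration): both constructors build values of the one type, so there is no smaller type to cast down to.

type int_or_real = I of int | R of float

(* one type, two shapes: this list is well-typed as is *)
let xs : int_or_real list = [ I 3; R 3.0 ]

let to_float = function
  | I n -> float_of_int n
  | R r -> r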

You could argue that I and R

You could argue that I and R are latent subtypes of intOrReal.

The dispatch is latent (i.e. at runtime)...

...though it is statically guaranteed to be correct - if you assume the static semantics of the language are guaranteed, then it can never dispatch on the wrong constructor. It should also be noted that the intOrReal type is contained within an abstraction which cannot operate on the contained data lest it pass through the constructor guards. That is, there is no operation which applies to both constructors, unless the operation is specified for each constructor.

Yes

You could argue that I and R are latent subtypes of intOrReal, but you'd be wrong, by which I mean that the definition of ML tells us what is and is not a subtype, and I and R are not subtypes of intOrReal.

A major point here that you and Curtis seem to be missing is that you don't get to make analogies, even good ones in some human sense, and then claim that they represent reality. Here's another good counterexample, from Peter Norvig's wonderful Java IAQ:

Q: Is null an Object?

Absolutely not. By that, I mean (null instanceof Object) is false.

That's crisp, that's definitive within the language in question, that's the answer-in-advance to comebacks of the form "But I can pass null anywhere I can pass an Object," etc. Claims about what is or is not a subtype in a programming language must adhere to the same level of precision in order for the discussion to be meaningful.

Multi-system reasoning

You could argue that I and R are latent subtypes of intOrReal, but you'd be wrong, by which I mean that the definition of ML tells us what is and is not a subtype, and I and R are not subtypes of intOrReal.

A major point here that you and Curtis seem to be missing is that you don't get to make analogies, even good ones in some human sense, and then claim that they represent reality.

Sure, but you do get to define the context. In a discussion like this one, the context isn't necessarily simply the definition of ML. The discussion involves comparing types across systems, and so the definition of more than one system comes into play.

For example, consider the standard embedding of a latently-typed program in a statically-typed language, by encoding all values in a universal type. Now compare that to what a soft typing analysis of that program produces: it doesn't show a single universal type at every term, but instead shows more specific types at many of the terms. The soft typer might identify some terms as int, and others as real, whereas in the ML-embedded version of the program, those same terms would have the universal type.
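A minimal sketch of the embedding being referred to (my own illustration): every value of the latently-typed program is injected into one universal type, and every primitive checks tags at runtime.

type univ =
  | Int of int
  | Str of string
  | Fun of (univ -> univ)

(* the latently-typed (lambda (x) (+ x 1)), under the embedding *)
let add1 = Fun (function
  | Int n -> Int (n + 1)
  | _ -> failwith "runtime type error")

let apply f x =
  match f with
  | Fun g -> g x
  | _ -> failwith "not a function"

let () =
  match apply add1 (Int 41) with
  | Int n -> Printf.printf "%d\n" n
  | _ -> ()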

In fact, a soft type analysis can be applied to the ML program, and similarly identify more specific types than the ones which the definition of ML say the program has. In this sense, ML programs can have latent types. Those types don't exist relative to the definition of ML, but they can exist relative to some other type system definition, which is essentially what was being proposed here.

Fair Enough

As usual, all perfectly valid points, Anton, but as I believe you would concede, that isn't what Curtis was doing; he made specific claims about O'Caml's semantics without providing even a notional mapping to some system that he was attempting to embed in O'Caml.

In fairness to Felicia, she did at least say "latent subtypes," which I suppose could be interpreted in such a way as to engender the follow-up question "What does that mean?" and perhaps I should have asked. But once again, the fuzziness around the understanding of how sum types and subtypes actually work in either Standard ML or O'Caml has left me leery of receiving a coherent response. In particular, if one were to attempt to embed the notion that variations of a variant type are subtypes of the variant type in some type system written in Standard ML or O'Caml, I'd fully expect the embedding not to compile—at least not without appeal to unsafe operators in the underlying ML implementation. That's not necessarily bad—theorem provers such as Coq that support extracting code from proofs do exactly that—but I'd certainly want to see the proofs of soundness before using such an embedding!

Anyway, this is all a shame, because I'm completely serious when I say that I look for MrFlow at every new DrScheme release, and in general I wish I had a soft type system that I could actually play with, especially to contrast with SBCL's type inference. So my bottom line is that I share your concerns about concreteness, and all I'm trying to add is the observation that when fairly fundamental errors of fact are presented regarding extant languages and type systems, it doesn't bode well for receiving sound descriptions of brand new, state-of-the-art developments in wide open research areas such as soft type systems.

I was going to ask if you

I was going to ask if you had looked at Sage, but then I went to the LtU story about it and saw that it was published by you, so I guess the question is moot. But anyway, the refinement part of types in Sage seems to be a good vision of what a soft type system can and should be like.

No argument

all I'm trying to add is the observation that when fairly fundamental errors of fact are presented regarding extant languages and type systems, it doesn't bode well for receiving sound descriptions of brand new, state-of-the-art developments in wide open research areas such as soft type systems.

I'm sure you know I agree. I was really responding to a delimited subset of the discussion, starting with Felicia's comment (or should that be Li's comment? Username changes confuse me ;)

As for MrFlow, I don't know how that's going, but there's other work in that general area going on amongst the PLT (Scheme) folk - see the last sentence of this post (the rest of the post provides context), and a tiny bit more here. Note the OCaml connection...

Yes, That's the Analogy

I responded at some length to this observation here. The summary is that choosing a variation of a variant type isn't a type coercion at all, let alone a coercion-to-subtype, and that at least non-comprehensive pattern-matching gives a warning at compile-time and an exception at runtime if the execution path follows the unhandled match. Particularly by way of comparison to the ((Ship*)object)->fly() example that Curtis gave, the differences are quite striking!

See, this is where we go

See, this is where we go off the rails: pattern matching can't be used to create downcasts. Pattern matching can't be used to do coercion at all!

Sure it can; take a look at my reply further down: the variant type represents the base class and the separate type constructors represent derived types. You make it sound as if OCaml completely eliminates the need for downcasts, which is simply not true. They might be in a different form, but they're still there, extra type checking or not.

And there's nothing "funky" about O'Caml's union types; they're bog-standard throughout the entire ML family. So this is why your recent comments raise eyebrows among some of us—they're speculative and hand-wavy, which isn't so bad in and of itself—we're into speculation and hand-waving here, actually :-)

While my replies are theoretical in nature, simply because of the sort of discussion this is, they can certainly be proved or disproved by analysing them. The mental-model post is an exception, but that doesn't apply to this discussion at all, as I've stated before.

Back to the Books

Curtis W: ...the variant type represents the base class and the separate type constructors represent derived types.

No, they don't. Please see section 11.9 of TAPL, "Sums," and the entirety of Chapter 15, "Subtyping."
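
To make the distinction concrete (the num type below is invented): applying a constructor builds a value of the sum type itself, not a value of some subtype of it.

type num = Int of int | Float of float

(* Int 3 has type num, full stop. There is no subtype relation
   between int and num, so this would be rejected:

     let bad = (Int 3) + 1      (* type error: num vs. int *)

   The payload must be extracted explicitly instead: *)
let ok = (match Int 3 with Int n -> n | Float f -> int_of_float f) + 1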

Curtis W: You make it sound as if OCaml completely eliminates the need for down casts, which is simply not true.

"Indeed, narrowing coercions would be unsafe, and could only be combined with a type case, possibly raising a runtime error. However, there is no such operation available in the language."

Curtis W: They might be in a different form, but they're still there, extra type checking or not.

It's incoherent to talk about downcasts with "extra type checking."
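
Indeed, the only coercion operator OCaml has, :>, works solely in the widening direction; narrowing is rejected at compile time. A quick sketch with polymorphic variants (the type names are invented):

type basic    = [ `Int of int ]
type extended = [ `Int of int | `Float of float ]

(* Widening is accepted, because every basic value is an extended one: *)
let widen (x : basic) : extended = (x :> extended)

(* The narrowing direction is a compile-time error:
     let narrow (x : extended) : basic = (x :> basic)   (* rejected *)
*)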

Curtis W: While my replies are theoretical in nature simply because of the sort of discussion this is, they certainly can be proved/disproved by analysing them.

That's precisely what we're waiting for.

This discussion is really

This discussion is really going off on a tangent.

Let's take an example. Let's say we have a function that reads a value from the user and evaluates it, similar to the input function in python. Let's see what using it would look like in C and OCaml:


/* C: input() returns an untyped pointer, and the caller downcasts. */
void* input();
int val = *((int*)input());


(* OCaml: note that `val' is a reserved word, so we bind `v' instead. *)
let v = match input () with
  Int x -> x;;

Disregarding the fact that C doesn't insert runtime checks for this sort of thing, the two do the exact same thing. Whether or not you consider the latter to be "casting" is of no importance to me, just assume that this is what I mean by it.
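
For reference, a self-contained version of the OCaml side might look like this (the value type and the stub input are my assumptions; the fragment above leaves them undeclared):

(* Hypothetical result type for an eval-style input function. *)
type value = Int of int | Float of float

(* Stub standing in for "read and evaluate user input". *)
let input () : value = Int 42

(* Deliberately non-exhaustive: a Float here raises Match_failure,
   playing the role of the failed cast on the C side. *)
let v = match input () with Int x -> x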

Anyway, I should note that this is a branch of a discussion that itself was a diversion from the main topic. A new thread would probably be the best place to continue this conversation, although I've no idea whether or not you'd like to continue.

They don't do the exact same

They don't do the exact same thing. Where's this magical return type for the OCaml input coming from?

The reason you're getting prodded about this is that it's really, fundamentally, not the case that pattern-matching on sums is "just safe casting". It'd be more accurate to describe safe casting as a way to implement pattern-matching on sums if you don't have it natively. And that becomes important the moment you're discussing any mildly complex use case.

They don't do the exact

They don't do the exact same thing. Where's this magical return type for the OCaml input coming from?

How, exactly, do they not do the same thing, ignoring the difference in error handling? I'm not sure I understand your last statement, nor what relevance it has to the conversation.

The reason you're getting prodded about this is that it's really, fundamentally, not the case that pattern-matching on sums is "just safe casting".

I don't believe I've ever said that. I was talking about how pattern matching can replace downcasts to relate what I've said previously about soft typing to OCaml.

It'd be more accurate to describe safe casting as a way to implement pattern-matching on sums if you don't have it natively.

Regardless, my intention should be clear now, so we can continue with the previous discussion.

In your examples you

In your examples you simulate casting with OCaml's union types and pattern matching. It's not the same as casting, but you can use pattern matching to simulate it.

And yes, that's a difference. In your C example there is a universal type, void*, which can be cast to whatever you want. In OCaml there is no such type; if you want one, you have to simulate it and use it where you need it.

On some level all Turing-complete languages can do the same things. But that doesn't mean they are really equal, because you need different ways of programming in each language to accomplish the same thing.
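
For what it's worth, that "simulate it where you need it" is a known OCaml idiom; here is one sketch of a universal type built from exceptions (the module names are invented):

(* A "void*"-like universal type, simulated with exceptions. *)
type univ = exn

module Univ (T : sig type t end) = struct
  exception E of T.t
  let inj (x : T.t) : univ = E x
  let proj (u : univ) : T.t option =
    match u with E x -> Some x | _ -> None
end

module UInt = Univ (struct type t = int end)

let u = UInt.inj 3      (* "cast" into the universal type          *)
let i = UInt.proj u     (* Some 3; projecting a non-int gives None *)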

In your examples you

In your examples you simulate casting with OCaml's union types and pattern matching. It's not the same as casting, but you can use pattern matching to simulate it.

I'm seeing the word "simulate" used way too often in this discussion. It implies a sense of subordination that simply isn't present. Casting, to me (ignoring C/C++'s incompatible-conversion casts), is changing the interpretation of a given value or object. Both of the examples I gave do exactly that; whether or not a built-in casting mechanism is used is unimportant. If you disagree with this, fine; it's my definition. This discussion isn't about what a cast is or is not, but rather about what advantages soft typing has over static typing. I've already clarified what I meant by "cast" several times, so this topic no longer needs to be discussed. Speaking of which, I think I will start a separate topic. Maybe it will help get past this mindless side argument.

Pattern-matching on

Pattern-matching on algebraic datatypes is more akin to accessing fields in a record. In fact, it's dual to it. So yes, "simulate" is an appropriate term - there is no change in interpretation happening.
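
The duality is easy to see side by side (types invented): a record always has every field, so projection needs no case analysis; a variant holds exactly one alternative, so extraction must say which one it expects.

(* Product: both components are always present. *)
type point = { x : int; y : int }
let get_x (p : point) = p.x                       (* total, no cases *)

(* Sum: exactly one alternative is present. *)
type coord = X of int | Y of int
let get_x' = function X n -> Some n | Y _ -> None (* must branch *)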

Different values

casting to me [...] is changing the interpretation of a given value or object.

Except that this is not what is happening in the ML example. The values n and Int(n) not only have different types, they are also different values, semantically as well as in the implementation. Pattern matching extracts the argument value n from the constructed value Int(n).

Casting, as usually understood, is different, because it does not change the underlying value, but as you say, only its interpretation.
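
In code (the single-constructor type is just for illustration):

type num = Int of int

let n = 3        (* an unadorned int                       *)
let v = Int n    (* a different value: n wrapped in a tag  *)

(* Pattern matching does not reinterpret v; it takes it apart
   and binds the argument that was stored inside: *)
let m = match v with Int k -> k    (* m = 3 again *)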

Example is confusing


void* input();
int val = *((int*)input());

On the surface, this looks more like a case of Coercion rather than Downcasting. Typically, an input function will return a character or a string of characters. Applying the (cast) operation doesn't do downcasting or upcasting; it just does casting (i.e. coercion).

let v = match input () with
  Int x -> x;;

Typical user input does not return a number type (Int x), but rather it usually returns a string. In that case, the function wouldn't be a pattern match, but rather a conversion function along the lines of

val v = valOf (Int.fromString (input ()))

Which would more closely map to the coercion applied in the C code.

[Edit Note: Yes, I realize you could construct an input() function that would return a Number type that could be either Integer or Float. But the example is confusing because it uses the analogy of keyboard input, which just returns characters - which would make neither the C nor the ML code legal. Perhaps specifying the type of the input() function might help, though it would likely end up being somewhat similar to the intOrReal datatype given above].

On the surface, this looks

On the surface, this looks more like a case of Coercion rather than Downcasting.

It's both. A coercion changes one value into another, or in this case one pointer type into another. A downcast is a specific kind of reinterpretation, going from less specific to more specific. In this case, you're coercing from void*, the less specific type, to int*, the more specific type. To make this obvious, you could replace void* and int* with class types and get the same thing, similar to downcasting from Java's Object base class.

Typical user input does not return a number type (Int x), but rather it usually returns a string. In that case, the function wouldn't be a pattern match, but rather a conversion function along the lines of

val v = valOf (Int.fromString (input ()))

Which would more closely map to the coercion applied in the C code.

[Edit Note: Yes, I realize you could construct an input() function that would return a Number type that could be either Integer or Float. But the example is confusing because it uses the analogy of keyboard input, which just returns characters - which would make neither the C nor the ML code legal. Perhaps specifying the type of the input() function might help, though it would likely end up being somewhat similar to the intOrReal datatype given above].

"Let's say we have a function that reads a value from the user and evaluates it, ..."
Just bear with me and assume there's a reason you're using an input function that evaluates whatever the user types. Maybe you'll want to extend it later to allow for string input; who knows.

Still leaves open the question of....

...what type is actually returned by the ML input() function? The example is confusing because there is no standard ML definition of directly taking an input() from the user and having that function return a type that was constructed in the form of Int n. I suppose you could define a function that is named input() that returns the type, but the example is confusing precisely because it uses the analogy of free-form input to construct a type. Ok, so if we start with:

datatype intOrFloat = Int of int | Float of real
fun input () = Int (valOf (Int.fromString "123"))   (* : intOrFloat *)
fun extractInt (Int n) = n    (* : intOrFloat -> int; match nonexhaustive *)

Then it might be more obvious what this is actually doing (sans the idea that the user somehow mysteriously constructs a value).

But this operation is not a cast (up, down, or coercion). It is simply a function which takes a value of type intOrFloat and returns an int. The compiler will warn in this case that the match does not provide full coverage. And, similar to the C++ case, it will raise an exception at runtime if the value was not constructed with (Int n).

In terms of types, perhaps the following function would be informative:

fun inputConst (Int n) = 123

This would produce the exact same error as the extractInt function above if the value was constructed with the (Float n) constructor. Yet it lets us see that extracting a value from a constructor is not the cause of the problem. Indeed, the type of the value does not change at all - it's just a function that takes one type of value and returns another.

...what type is actually

...what type is actually returned by the ML input() function? The example is confusing because there is no standard ML definition of directly taking an input() from the user and having that function return a type that was constructed in the form of Int n. I suppose you could define a function that is named input() that returns the type, but the example is confusing precisely because it uses the analogy of free-form input to construct a type. Ok, so if we start with:

The input function I was using is a direct copy of the python input routine for demonstration purposes. To clarify, it looks like this in python:

def input():
    return eval(raw_input())

(Not exact, that's what I assume it looks like.)

But this operation is not a cast (up, down, or coercion). It is simply a function which takes a type of intOrFloat and returns an int.

Let's take a look at this. intOrFloat is a more generic type than int. A function that converts an intOrFloat to an int is basically going from generic to specific. What do you think a downcast is? Anyway, what you think it is doesn't matter: I've already clarified what I mean by "cast" to the point where this discussion should no longer be necessary.

Also, I started a new topic to continue the soft typing discussion in.

nope these are abstract types

intOrFloat is a more generic type than int.

Actually it is not, as there is an abstraction to it. None of the functions defined for int is by default defined for intOrFloat. They are two completely different types, just as anotherInt* isn't an int.


2 + 2

would typecheck, but

Int 2 + Int 2

would not, without overloading + to handle anotherInts too.

* datatype 'a anotherInt = Int of 'a

Actually it is not, as

Actually it is not, as there is an abstraction to it. None of the functions defined for int is by default defined for intOrFloat.

How do you know? Furthermore, what does this have to do with the discussion at all? Two types don't have to share an interface to be related. For example, just because a parent class is empty doesn't suddenly negate the fact that it's still the parent.

Presumably Felicia knows by

Presumably Felicia knows by having a basic understanding of sum types. It's true in ML that a type with a single data constructor that takes an Int as a parameter is fully distinguishable from the Int type, and that they're only related insofar as that constructor takes an Int. You might like to think of naming the data constructors as a form of nominal typing in systems where sums have structural subtyping.
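
OCaml's polymorphic variants illustrate both halves of that remark (a small sketch):

(* Structural sums: any type built from these constructors fits.
   Inferred type: [< `Float of float | `Int of int ] -> int *)
let to_int = function `Int n -> n | `Float f -> int_of_float f

(* An ordinary, nominal sum, by contrast, is related to int only
   through its constructor's argument: *)
type wrapped = Int of int
let unwrap (Int n) = n    (* wrapped -> int; a wrapped is not an int *)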

No, I meant how did she know

No, I meant how did she know that I hadn't, say, written a library of functions that do work on intOrFloat.

Well, I said with the

Well, I said with the exception of overloading, a.k.a. polytypism. And such a library would by definition have to include polytypic functions. But to my knowledge OCaml doesn't have any support for polytypic functions.

But if you could, sure, int could be a structural subtype of intOrFloat; but really any type could become a structural subtype of any other, so I don't see what this has to do with anything.

You can't dereference an

You can't dereference an integer ;)

Fixed.

Fixed.

Vapor types

the only real difference between my notion of soft typing and OCaml's type system (besides OCaml's funky union types) is really that, what you would need to write in OCaml is simply inferred and displayed to you in my type system.

Is this something your type system does now, or something you're planning to make it do? Either way, without having a reasonably detailed description of your type system, or code to experiment with, comments like the above are simply frustrating — it's not possible to agree or disagree with them, because we don't have sufficient information.

I've already talked about

I've already talked about this further up. To understand what I'm talking about you need to take my posts as a whole. The "...inferred and displayed to you..." part, which I assume is what you're referring to, refers to the idea of the compiler warning you rather than throwing an error or forcing you to acknowledge it yourself. I can understand that it is somewhat hard to follow what I'm talking about, although I'm pretty sure I've covered most of the major points.

More concrete

You apparently misunderstand my concern. I understand what you're suggesting, in broad terms, but your descriptions are so broad as to be almost useless. The most substantive comments you've made in this thread about the soft typing system you have in mind are "soft typing is basically static, structural subtyping with implicit 'casting'," and "soft typing is basically dynamic typing with the obvious errors caught at compile time". These statements leave a huge laundry list of open questions unanswered.

How those questions are answered defines an entire universe of possible soft typing systems, the members of which could differ from each other in substantial ways. Speculating about this largely unexplored space in general, abstract terms isn't going to get very far: for this subject in particular, the devil is in the details. Hence my question about whether this is something you've actually done, or are just speculating about.

Perhaps it would be useful to contrast what you have in mind with soft typing systems such as SoftScheme, MrSpidey, MrFlow, or the soft typing system for Erlang, for example.

Still, I'd slide in the other direction...

You definitely have a defensible position.

While we are talking about hypothetical systems though, I'll add my opinion, which isn't really that different from yours.

If we have a static type system that is very expressive and lets us type things very accurately (like the Ontic-like systems that have been brought up occasionally), then you can pretty much write the same programs as you can in a dynamic/soft system, and in the same way (though, I think, with a considerably different notion of type inference). I'd apply the "slider" at the boundaries to dynamically loaded or runtime-generated code, and quickly merge back into the statically-typed semantics.

The difference is that when your code has compiled, you'd know it has been proven correct, in its entirety. When you use a third-party library, you'd know the guarantees it makes hold; and so on.

Let's not forget that

Let's not forget that parallel hierarchies, by definition, introduce coupling, with all the disadvantages that entails.

Maybe you should explain

Maybe you should explain your concerns in a little more detail. In the case you describe I would assume a standard OO design where AWindow and ARenderer are both abstract classes (or interfaces) and AWindow holds a reference to an ARenderer. In C++ this could be encoded as:

class ARenderer;    // forward declaration: AWindow refers to it below

class AWindow
{
public:
    virtual ~AWindow() {}
    virtual void update() = 0;   // pure virtual requires 'virtual'
private:
    ARenderer* renderer;
};

class ARenderer
{
public:
    virtual ~ARenderer() {}
    virtual void draw(AWindow*) = 0;
};

A client object will hold a reference to an AWindow. When a BWindow instance is created, it also requires a concrete Renderer instance.
The parallel hierarchies follow directly from the abstractions/specifications. Maybe I'm overlooking something, but what do you consider "sloppy"?

I've sometimes encountered

I've sometimes encountered the case where there is some kind of mutual relationship which isn't expressed clearly by using parallel hierarchies. For example:

class Doc { 
	void register(View view) { ... }
	void notify(Notification n) { ... }
}

abstract class View {
	abstract void paint(Doc doc);
	abstract void update(Notification n);
}

Then there has to be a relationship between the doc and the view such that every doc can only register a matching view. And in the View.paint method there has to be a cast to a specific document class to retrieve the data the view has to paint.

class MyDoc extends Doc {
	void register(View view) {
		if (!(view instanceof MyView)) throw new IllegalViewError(...);
		super.register(view);
	}
}

class MyView extends View {
	void paint(Doc _doc) {
		MyDoc doc = (MyDoc)_doc;
		... // paint with 'doc'
	}

	void update(Notification n) {
		...
	}
}

My actual solution to this problem is to use inner classes for Doc and View and an outer class 'DocView' which handles the registration, notifications, etc. Not a perfect solution, but better than using totally unrelated hierarchies where the relationship is only expressed in the documentation.
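
To keep with the ML thread above: in OCaml one could instead push the mutual dependency into the types, parameterizing each side by its partner so that registering a mismatched view simply fails to typecheck. A rough sketch (all names invented; an approximation of the idea, not a worked-out design):

(* Each side is parameterized by the concrete type of its partner. *)
class type ['view] doc = object
  method register : 'view -> unit
end

class type ['doc] view = object
  method paint : 'doc -> unit
end

(* One concrete family ties the recursive knot: *)
class type my_doc = object inherit [my_view] doc end
and my_view = object inherit [my_doc] view end

A my_doc then cannot register anything but a my_view, which is the guarantee the instanceof check above enforces only at runtime.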

Another idea is 'capture of recursion'. It's described in Object-Oriented Type Systems.

And of course there is the 'Scala approach', but the Palsberg/Schwartzbach approach looks to me like the most elegant one.