Typed X (Racket, Clojure, Lua) just doesn't pan out?

I have used Typed Lua, and I have kept an eye on Typed Racket and Typed Clojure. Overall I currently get the impression that they somehow don't quite clear the hurdles that would let them really shine. And thus people who wanted to love and use them are leaving them instead.

A post today re: Typed Clojure echoes this previous one:

In September 2013 we blogged about why we’re supporting Typed Clojure, and you should too! Now, 2 years later, our engineering team has made a collective decision to stop using Typed Clojure (specifically the core.typed library).

I come not to bury Typed X, but to praise them. Can somebody please work on figuring out what it is that is missing or needs to be tweaked to make them more usable? (Anything from speed to culture.) Or should we truly conclude that trying to augment / paper over a dynamic ecosystem is just for the most part doomed to fail?


Performance problems and "Is Sound Gradual Typing Dead?"

For starters, have you seen "Is Sound Gradual Typing Dead?" in the "Progress on Gradual Typing" thread (http://lambda-the-ultimate.org/node/5292)? The point is not quite that the project is dead because of performance issues in some scenarios, but that those issues raise some serious questions.


Yes, sorry, I utterly failed to draw any links to previous things. I will go back and re-read those topics. Mea culpa.

I'd still like to suggest that even if there are new amazing things out there in the Land of Gradual Typing, I want to know:

  • what went wrong, usability-wise, with efforts to date
  • that the new approaches acknowledge each and every one of those issues
  • which of those issues they specifically address
  • and how they will be user-tested, to see if they really make a qualitative difference in the end.



I guess the problem is that whenever there's an interface from untyped to typed code, the data has to be walked and its type verified, which is a pointless waste of computing.

So we need a model for programs where that NEVER happens.

So I guess the solution is that gradual typing can't be "type variables one at a time"; it has to be "make a given type/class typed everywhere in the program all at once," so that no references to that type EVER need to be walked and verified.

That suggests a more limited way to program.
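The walk-and-verify cost is easy to see in miniature. Below is a minimal sketch in plain Python, with hand-written checks standing in for a gradual type system's generated contracts: every call from untyped into "typed" code re-traverses the whole argument.

```python
def check_list_of_str(xs):
    """Boundary contract: O(len(xs)) work on EVERY crossing."""
    if not isinstance(xs, list) or not all(isinstance(x, str) for x in xs):
        raise TypeError("expected a list of strings")
    return xs

def typed_total_length(xs):
    xs = check_list_of_str(xs)       # repeated per call, needed or not
    return sum(len(x) for x in xs)   # the actual work, also O(n)

words = ["gradual", "typing"] * 50_000
# Untyped code calling in a loop pays the full check each time, even
# though the data never changes -- roughly doubling the traversal work.
total = sum(typed_total_length(words) for _ in range(10))
```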

Offloading typechecking costs to the past or future

I guess the problem is that whenever there's an interface from untyped to typed code, the data has to be walked and its type verified, which is a pointless waste of computing.

So we need a model for programs where that NEVER happens.

I think with languages that are fully statically typed, that "waste of computing" happens at compile time. I don't expect anyone to make it so that this check never happens.

Still, I would like to have better control over when the performance costs of typechecking are paid, when checking does need to happen. Sometimes we might want to typecheck some parts of the code early to save the trouble later, and sometimes we might want to speculatively interpret the code first, while a typechecker keeps processing it in the background.

I think the "typecheck some parts earlier" use case benefits from the state of the art in separate typechecking and modularity. Or perhaps the interpreter might take in a hybrid of typed and untyped syntax so that previously typechecked code can be embedded in the next layer of code. Either way, I think this is largely a straightforward problem that's just waiting for a good implementation and an application where it actually matters. :)

As for the "speculatively interpret" use case, I think it would benefit from something more exotic: An incremental self-interpreter, in a language where dynamic branching can be upgraded to static knowledge. (By "upgraded to static knowledge," I mean the way props-as-types dependently typed code can make a proof object available to each conditional branch. Typed Racket's occurrence typing seems to serve this purpose without first-class proof objects.) The incremental interpreter would be run alongside a typechecker, and once the typechecker was finished, the latest incremental state of the interpreter could be converted directly to a typed value. If necessary, the interpreter itself could be a primitive like "eval," but I'd prefer a metacircular interpreter. That way programmers can fork it to implement various incremental reduction strategies corresponding to their specific performance needs.

As the mention of Typed Racket might suggest, I think gradual typing could develop toward that kind of incremental interpreter. If it's languishing in low adoption right now, and if this low adoption is really caused by performance concerns, then I think that's something to pursue.

(I've actually been working on an incremental metacircular interpreter myself, but it's all untyped. That way it's a lot easier to reason about the cost of loading code -- no typechecking at load time at all! -- so I feel like it's a justified place to start. I'm going to try to integrate a type system at some point.)

make a given type/class typed everywhere in the program all at once, so that no references to that type EVER need to be walked and verified.

What types would you like to restrict this way? I get the impression that you're talking about passing, for example, a list of strings to typed code and watching as the program exhaustively checks every element of the list to make sure it's a string. But in that case, would you want to restrict lists? Strings? Would you want to put the whole list into some wrapper value and restrict that?

Actually, in Typed Racket, can't you put the list into a wrapper value and then just not export the wrapper type, so that the clients have to call your (typed) code again when they want to unwrap it? That would seem to enforce that kind of restriction, although I guess Typed Racket might not notice this encapsulation, so it might still check that the wrapper hasn't been tampered with just in case.

Even if I'm not sure of your use for it, the kind of restriction you're going for seems pretty elegant to me in a broader context: Viewing the untyped side as though it's unityped, I can think of contracts as being untyped values that wrap up a typed value and ensure that it can't be used improperly. Maybe those restricted types of yours simply have no support for being wrapped up in an untyped wrapper, so they can never be passed to untyped code for that reason.
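That opaque-wrapper idea can be sketched like so (illustrative names, not Typed Racket's actual API): the deep check runs once at construction, and every later crossing back into typed code needs only a constant-time check on the wrapper itself.

```python
class SealedStrList:
    """Opaque handle; untyped clients get no sanctioned way to unwrap it."""
    __slots__ = ("_xs",)
    def __init__(self, xs):
        if not all(isinstance(x, str) for x in xs):   # one-time deep check
            raise TypeError("expected strings")
        self._xs = list(xs)

def typed_make(xs):
    return SealedStrList(xs)     # the only walk of the data happens here

def sealed_total_length(sealed):
    if not isinstance(sealed, SealedStrList):   # O(1) boundary check
        raise TypeError("expected a SealedStrList")
    return sum(len(x) for x in sealed._xs)      # no per-element re-check
```

This corresponds to not exporting the unwrapping operation: untyped code can pass the handle around freely, but must call back into typed code to do anything with its contents.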

Opportunistic static typechecking?

Could you (type)check "untyped code", relative to a typed call site, to see whether there's any actual chance of a type error, and elide the typechecks at the boundary?
Say typed code makes call (1) to a closure with an untyped body, which in turn makes call (2) into typed code. The arguments of call (2) will be typechecked at runtime, whether that's needed or not.
However, such an analysis might figure out that some of those checks could be elided.

Of course, typechecking is designed for programmers to produce error messages; you'd probably need to convert typechecking into abstract interpretation (or some other analysis) instead. I've never read "Types as Abstract Interpretation", but it seems relevant:
I'd guess type errors could be turned into "unknowns", but some types would stay known.

Of course, I don't know how often this would produce a measurable speedup and be worth the overhead.

On a different point, how many standard optimizations are done? If call (2) is inside a loop and some arguments are produced outside, it'd be good to hoist the check.
The problem, I imagine, is that you need a JIT that inlines, at runtime, a function wrapper from the heap (assuming immutability, or deoptimization traps on mutation), and IIUC this happens to be uncommon since most JITs aren't optimized for immutable languages.
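The hoisting point in miniature, again with hand-written checks standing in for compiler-inserted ones: when an argument to a checked call is loop-invariant, the check can be paid once, before the loop.

```python
def check_str(x):
    if not isinstance(x, str):
        raise TypeError("expected str")
    return x

def naive(sep, parts):
    out = []
    for p in parts:
        check_str(sep)                    # loop-invariant, yet re-checked
        out.append(check_str(p) + sep)
    return "".join(out)

def hoisted(sep, parts):
    check_str(sep)                        # checked once, before the loop
    return "".join(check_str(p) + sep for p in parts)
```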

EDIT: after posting, I saw that people are doing something similar for C to good effect — see this comment on another thread:

Type checking as "abstract" interpretation

Blaisorblade wrote:

Of course, typechecking is designed for programmers to produce error messages; you'd probably need to convert typechecking into abstract interpretation (or some other analysis) instead. I've never read "Types as Abstract Interpretation",

First of all, there is the work by Cousot with exactly the title "Types as Abstract Interpretations". (In general, one can bet that there is a Cousot paper with the title "X as abstract interpretation" for any reasonable X.)

I think however that the following paper may be closer to what Blaisorblade had in mind: "Interpreting types as abstract values: A tutorial on Hindley-Milner type inference".

Just saw more similar work

Just saw more similar work (conditionally accepted) at PLDI 2016: "Run-time Static Type Checking for Dynamic Languages". While only the abstract is available, it seems highly relevant, even though it *does not* mention gradual typing in particular (so I don't know what happens with "runtime static type errors").



Dynamic languages have many compelling benefits, but the lack of static types means subtle errors can remain latent in code for a long time. While many researchers have developed various systems to bring some of the benefits of static types to dynamic languages, prior approaches have trouble dealing with metaprogramming, which generates code as the program executes. For example, Ruby on Rails is a powerful framework that relies heavily on metaprogramming.

In this paper, we propose Hummingbird, a new system that can type check Ruby code even in the presence of metaprogramming. In Hummingbird, method type signatures are gathered dynamically at run-time, as those methods are created. When a method is called, Hummingbird statically type checks the method body against current type signatures. Thus, Hummingbird provides thorough static checks on a per-method basis, while also allowing arbitrarily complex metaprogramming. For performance, Hummingbird memoizes the static type checking pass, invalidating cached checks only if necessary. We formalize Hummingbird using a core, Ruby-like language and prove it sound. We also implemented Hummingbird and applied it to six apps, including three that use Rails. We found that all apps typecheck successfully using Hummingbird. By applying Hummingbird to earlier versions of one app, we also found several type errors that were introduced and then fixed. Finally, we found that Hummingbird’s performance overhead is reasonable.


Isn't "run-time static type checking" an oxymoron? Isn't that simply dynamic type checking?

It appears to type check functions when they are called, which is fairly dynamic. Why not type check the metacode generation instead? It would seem reasonable to only allow well-typed metacode to be produced.

Possibly misuse of terminology

If "static" type checking really means it happens before run-time, then "run-time static" is self-contradictory. (The word "oxymoron" refers to a very special linguistic device that gets its power from being not quite self-contradictory.) If I've not misunderstood, they want something that means "checked on a whole compound function before running that function"; perhaps "run-time whole-compound-function type checking"? "run-time pre-compound-call type checking"?

A Modest Proposal

Replace "dynamic type checking" with "tag checking", "static type checking" with "type checking", and things start to make sense again, I think. In particular, "run-time type checking" is not the same as "tag checking". What I understand from the above abstract is that Hummingbird type-checks method bodies and thus source code in the usual syntactic way, but at run-time, using information collected dynamically. This is very different from so-called "dynamic type checking", which works by computing with untyped but tagged values.
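The distinction can be illustrated with a toy (the "body check" below is a crude AST scan, nothing like Hummingbird's actual algorithm): tag checking only fails when the offending operation actually executes, while checking the method body at load or call time flags the error before either branch runs.

```python
import ast

src = """
def method(flag):
    if flag:
        return len("ok")
    return len(42)   # ill-typed, but tag checking leaves it latent
"""

def body_mentions_bad_call(source):
    """Toy 'static' pass over the body: flag len() applied to an int literal."""
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "len"
                and node.args
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, int)):
            return True
    return False

namespace = {}
exec(src, namespace)                 # "load" the method, dynamically
method = namespace["method"]

assert method(True) == 2             # tag checking: the bug never surfaces
assert body_mentions_bad_call(src)   # body checking: caught without running it
```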

Yeah but what's the difference in results?

Other than its being a weaker check (i.e., there are times it can't be done), I don't see what the difference in outcome/results is between checking code and checking tags when running primitives.

Maybe it gives more opportunities to lift out tests?

Safety, efficiency?

It's difficult to answer such questions since the full paper is not available yet. I can only hazard some guesses.

One important benefit is the ability to detect errors earlier than tag checking and to reject more incorrect programs. This is how I understand their abstract: their system offers a new expressiveness/safety tradeoff compared to both statically type-checked languages and untyped languages with tag checking.

Another, more hypothetical, benefit of their "run-time" type checking might be efficiency, by allowing the generation of type-specialized code. In a sense, this is what advanced JIT compilers such as V8 already do, probably in a slightly more ad-hoc manner. But the last sentence of the abstract suggests that this possibility has not been explored yet.

The post lays out the issues

It's not a rejection of static typing or even statically typed Clojure. There are two complaints: the implementation is inadequate (too slow, not enough coverage) and the network effects are working the wrong way (upstream libraries aren't typed, so they have to maintain types for them).

So that says: make your type checker imperceptibly fast, make it check the whole language, and make it so easy to adopt that essentially everyone adopts it. Easy-peasy. :-)

Point taken

I guess then I wonder whether we think those are doable things. :-) And I suspect there are of course always further issues, e.g. ones I've seen mentioned in passing on the TR list but can't go back and find now.

(It bugs me that it comes across as if the perspective TX people have is "oh, obviously this will be great; we don't need to think ahead to what might really be buzz killers." I mean, I was hoping for that, too, but oh well.)