## Facebook releases "Flow", a statically typed JavaScript variant

The goal of Flow is to find errors in JavaScript code with little programmer effort. Flow relies heavily on type inference to find type errors even when the program has not been annotated - it precisely tracks the types of variables as they flow through the program.

At the same time, Flow is a gradual type system. Any parts of your program that are dynamic in nature can easily bypass the type checker, so you can mix statically typed code with dynamic code.

Flow also supports a highly expressive type language. Flow types can express much more fine-grained distinctions than traditional type systems. For example, Flow helps you catch errors involving null, unlike most type systems.

## Comment viewing options

### a nit: misleading ltu title

"a statically typed JavaScript variant" makes it sound like e.g. Dart when in fact this is more like e.g. Dialyzer or FindBugs, no?

### Depends on definitions

Many valid JavaScript programs won't be valid Flow programs because they won't type of course, but that's true about FindBugs/Dialyzer/et al.

But also any valid Flow program with any type annotations won't be a valid JS program. To me that implies a variant rather than just another static analysis tool.

Mechanically there's a transform from Flow to JS which we can think of as "compiling" even if the compiling step is pretty trivial.

### Not true of Dialyzer

Dialyzer only rejects Erlang programs that it can prove will generate dynamic type errors at run time, so every valid Erlang program is a valid Dialyzer program. This is what makes it different from conventional static analyzers, which reject every program that cannot be proved not to generate a dynamic type error at run time.

### Philosophical difference

Only if you choose to define a program that (eventually) generates a dynamic error as "invalid". In general, that still rules out "useful" programs, because they might still do an unlimited amount of useful stuff before throwing that error. So I'd argue that the difference is mostly a philosophical one.

### This is the first time I've

This is the first time I've heard "can have false positives" and "can have false negatives" as being only philosophically different.

### You probably misheard

That wasn't what I said. What I said is that this kind of checking can have false positives as well, up to a philosophical argument about how to define falsity.

### Also, dialyzer rejects

Also, dialyzer rejects programs which never generate dynamic errors. For example, if you enumerate the naturals until you find one that refutes Fermat's last theorem, and then do something type incorrect, dialyzer will reject the program, even though it never has a dynamic error.

The best way to think about dialyzer is as an unsound static analysis with a particular interesting set of goals. This is distinct from Flow, which I believe is intended to be sound if you don't have any untyped JS code.

### Flow still is unsound

Unfortunately, Flow is unsound, see e.g. the lower half of the page on Objects (Adding Properties and particularly, Objects as Maps). I also wonder about subtyping rules and intrinsics, which do not seem to be discussed at all in the language reference.

### Unconvinced

After thinking about it more, I'm not convinced by your point here.

If someone said "dializer-like tools are fundamentally *better* than type systems because they never reject the programs you want to write", I think your answer would be appropriate. Dynamic-typing proponents argue for much more than "never reject correct programs", they also want to be able to run programs that will crash only after an unberably-long time or under provably-impossible path conditions; or even programs that they know will crash in a few minutes, but they're interested in what happens at the start and they haven't debugged the rest yet. (So they want to be able to run analyzed-incorrect code, regardless of whether it was rejected by a sound-type-system-like or dializer-like tool).

On the other hand, I think there may be important practical difference between those kind of analyzes (to the point that we may want to run both simultaenously on the same programs). In particular, trying to preserve the invariant that we only reject programs that we know will lead to an error gives a system design that is conductive to actually *showing a counter-example* to the programmer -- this is how the exhaustivity-checker of OCaml pattern matching works (on the well-behaved subset that doesn't use arbitrary boolean conditions) and I feel it is an important aspect of it.

I would say that answering that "the difference is philosophical" in your way is adequate to react to points that were "philosophical" (for some sense of the word) in the first place. But when talking about practical interactions with the program analysis tool, I think we can say more than that, and the details may matter in practice.

One could conjecture that proposing a counter-example is much more importants for results of analysis that are hard to believe (showing a counter-example is a good way not to get the user angry at your negative result and build trust in the tool). Interestingly, sound type-sytems or analyses have ventured very far in abstraction and specification power, in a way that I'm not aware counterexample-providing-analyses have.

PS: to make an analogy, whenever we discuss proving termination someone points out that "what if your program terminates in time longer than the age of the universe, it's not observably different from non-termination and allowed by your system". Sure, but that shouldn't distract us away from the practical benefits of termination checking -- in fact, we could answer this question by also reasoning about complexity and costs, which is *even harder* than termination and will build upon this termination work, it goes in the right direction.

### Fair enough

Fair enough, but I was specifically replying to this claim:

Dialyzer only rejects Erlang programs that it can prove will generate dynamic type errors at run time, so every valid Erlang program is a valid Dialyzer program.

My point was that the second half of this sentence relies on a rather subjective definition of "valid". Wouldn't you agree? (The first half is probably incorrect as well, as it would generally require either being conservative to the point of being useless, or solving the halting problem.)

### No, I think that's right.

Valid programs will never, under any circumstances, be among the ones that Dialyzer can prove generate type errors at runtime. If Dialyzer had a bug that could "prove" a type error would occur at runtime, when in fact none can, then it would generate a false negative result and reject a program which will never generate a type error.

When you say that's a poor definition of validity, I think you may be responding to the possibility of a false positive -- that is, Dialyzer accepts programs it can't prove will generate runtime errors, and sometimes they do. As the people who program it make it better, we can expect the rate of false positive results to fall, but mean old mister Godel, and the Halting Problem, both militate in favor of the perpetual existence of a few false positive results; programs Dialyzer can't prove valid, which in fact are not.

It's rather the reverse of the usual type checker; most static checkers are trying to prove that a program *is* valid; Dialyzer is trying just as hard to prove that they *aren't.* Both return a binary result, and both sets of tests are testing for the same thing; programs that will produce runtime type errors. But mean old Mr. Godel (and Turing's Halting Problem) say that systems of the first type will always return a few false negatives. And the very same set of results also assures that systems trying to prove the opposite proposition will always have a few false positives.

What it comes down to is what you can prove. There are a large class of programs that you can prove will never generate a type error at runtime, and there are a large class of programs that you can prove *will* generate a type error at runtime. You can try to prove a program is in the first class before running. You can try to prove a program is in the second class before running. But there will always be a few programs that are in niether class.

### I spent a bit of time at

I spent a bit of time at Coverity, which has a pragmatic tool that sports both false positives and false negatives. High false positive rates were considered much worse than high false negative rates as they would annoy customers with extra noise. Selling these tools was about finding enough problems to show value while not burdening programmers with much more work.

A tool without false positives but sporting false negatives is definitely valuable; and so vice versa (as to eliminate false negatives, you have to reduce the expressiveness of your analysis/type system, hence accept false positives).

### Not about false positives or negatives

No, my remark wasn't about false positives. Let me try to clarify again. Take the following degenerate program for computing travel routes:

  ComputeAndDisplayTravelRoute(inputs);
"boo" - 1;


This will throw a type error on the second line, and a tool like Dialyzer would (correctly) diagnose that (it's obviously trivial in this case). However, before this error is raised, the program actually successfully completes its designated job, namely computing a travel route and displaying it to the user. Yet such a program is defined as "invalid". I'm asking why.

Edit: Before you answer, compare to

  ComputeAndDisplayTravelRoute(inputs);
if Condition then "boo" - 1;


Is that also invalid? Or valid? Or does that depend on the condition? Or what you can prove about the condition?

If your answer is "always valid" then that seems like a completely arbitrary distinction from the first program (because the condition could literally be "true"). Practically speaking, that would be a false positive, but you're defining it to be none. If, on the other hand, your answer is "always invalid" then that would clearly include many false negatives. If your answer is "valid if the condition is always false" then that's undecidable. If your answer is "valid if I can prove that the condition is always false" then you have bought into both false positives and false negatives.

According to the example Sam mentioned, Dialyzer chose the last option, and would hence reject some programs although they never actually can create a runtime error. That is, contrary to other claims in this thread, it actually creates both false positives and false negatives.

### But Dialyzer never says a program is valid.

Dialyzer might look at the above and say "I can't prove that it's invalid." That is not the same as claiming that it is valid or that it isn't.

If it can prove that the condition is always true, hence it can prove that the program definitely will throw a type error, it will call it an invalid program. Otherwise, whether or not it can prove that the condition is always false, it will simply say that it can't prove the program is invalid.

Under no circumstances will it ever claim that any program is valid. It can't have a false positive because it never makes ANY positive claims.

### Now what?

Now we're into definitional games. John prompted my original comment by making a statement about "valid Dialyzer programs". You have effectively just defined such programs away. That would directly contradict John's statement, unless there are no valid Erlang programs either. Now what?

Anyway, I realise that my edit above has again distracted from my main point, which is about the definition of "valid programs". Forget Dialyzer for a moment. Regardless of what it does, which of the above programs would you define as valid, and why?

### I'm pretty pragmatic about the valid/invalid thing.

I am purely pragmatic in the matter. If a program can never produce a runtime crash, executes no statements which are ambiguous or nonsensical due to type conflicts, and does what I wanted it to do, I call it a valid program.

If unreachable code is ambiguous or nonsensical due to type conflicts, I call it valid but ill-formed.

I prefer to do type checking before runtime, and happy when I get a proof that my code has a complete and consistent type solution. But when it can take hours for some of my programs to compile, I'll often start one up using an interpreter that does runtime type checking while waiting for the statically checked version. Sometimes I find errors before the compiler gets to them because I know exactly what code I've been working on and can execute it under runtime type checks immediately.

I think that what you're taking as a definitional game, was in fact the point; Dialyzer rejects only those programs which it can prove are invalid. It's not claiming, nor even trying, to prove a program is valid. It is literally the dual of the usual static type checker, which accepts only those programs which it can prove are valid.

### Pragmatic validity is unproveable

To state the "obvious", pragmatic validity is unproveable, because it depends on the inherently subjective notion of what a program should do. The standard approach is to choose an arbitrary formal criterion, call it "validity", and then make a huge deal out of whether or not it can be proven. Realistically, proving it cannot guarantee pragmatic validity. In order to guarantee this less-than-pragmatic-validity formal criterion, the price you pay is whatever pragmatically valid programs do not satisfy the formal criterion. Whether or not that price is too high depends on how useful you believe the formal criterion to be: what advantages does one get from the formal criterion, and what disadvantages would one take on by not insisting on it. Alas, these rather fascinating questions tend to get suppressed through over-enthusiastically conflating one's formal "validity" critierion with pragmatic validity.

### Formal Specifications

It seems an advantage to be able to express what a program should do, and English is not a very precise tool for doing this, as it is open to cultural and habitual contextualisation. A formal specification language allows the cultural and habitual elements to be factored out, giving something that would be interpreted the same by a programmer with a different background or culture (which is useful, as nobody has exactly the same experience, and hence no two people interpret English with exactly the same meaning).

Given the usefulness of a formal specification to express what a program should do, it seems obvious that mechanical checking of how it does it against what it should be doing is advantageous, and saves the programmer much effort in doing so manually.