Facebook Introduces ‘Hack,’ the Programming Language of the Future

From Wired, Facebook Introduces ‘Hack,’ the Programming Language of the Future

You can think of Hack as a new version of PHP. It too runs on the Hip Hop Virtual Machine, but it lets coders use both dynamic typing and static typing. This is what’s called gradual typing, and until now, it has mostly been an academic exercise. Facebook, O’Sullivan says, is the first to bring gradual typing to a “real, industrial strength” language.

Hack is open source and is available at hacklang.org. It supports type annotation, generics, lambdas and host of other features on top of PHP.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Gradual typing is also in

Gradual typing is also in dart and typescript, which are hardly academic.

Edit: unless of course, I'm convoluting gradual typing with optional type annotations.

Dart

As I understand it, Dart doesn't actually check any types -- it leaves that up to your IDE. (Which, on the face of it, sounds like a bad idea since there will be differences between how different IDEs see your code, plus it pushes a lot of non-trivial type checking work over to IDE developers who may not be as strong on type theory as compiler writers would hopefully be.)

Typescript does indeed do checking at compile time, so would fall under the rubric of gradual typing, I think.

Perspectrive

TypeScript is a level 1 gradually typed language, according to Jeremy Siek.

Also, there is a definite false dichotomy between "IDE developers" and "compiler writers" being (not) "strong on type theory". Pushing the theory into the IDE can only help pull theorists away from the compiler and into the IDE. Sometimes the law pushes society along, and sometimes society pulls the law along, and so it is with mathematics and common practice.

Also, who chooses what code makes it into popular IDEs is extremely political. You only have to track the history of Danny Dig's micro refactorings work in Eclipse to understand the politics involved in many open source projects, and that it has absolutely nothing to do with code that is "strong on theory" or written by people "strong on theory".

Also, if you are not familiar, please see Gilad Bracha's arguments on why types should not influence program evaluation, namely see the LtU op-ed piece by Erik Meijer, Gilad Is Right.

I don't want to rehash the

I don't want to rehash the dynamic/static thing yet again, but Tim Sweeney and Tim Foley both had pretty convincing (to me, at least) counterarguments to Gilad/Erik in that thread. One of the biggest advantages I personally see in static typing is the use of monads to handle the distinction between effectful/pure code -- very few languages can give you the algebraic reasoning power of, say, Haskell.

I tend to use both dynamically typed (JavaScript, mostly) and statically typed languages -- and while I prefer static typing (Haskell, Scala) -- but what really perplexes me about Dart is the addition of completely unchecked type annotations. It would seem that they're of pretty limited use and could even become actively misleading over time during maintenance. If you're going to eschew static type checking, then I'd rather have a large set of test cases as documentation than type annotations (which could easily be out of date).

Anyway, this is getting off-topic so I'll leave it there.

Effects are ok, we can have

Effects are ok, we can have them and be referentially transparent without a static type system; it only matters how you apply them and when they are seen.

Examples?

Do you have concrete examples of a dynamically checked language which separates effectful and pure computations? I'd be really interested to learn a little more about what such a language would look like.

Referential transparency (RT) is easy if you control all your code and you never make mistakes -- which isn't a very practical set of assumptions. I'm thinking more along the lines of: Say you're writing a function, f, that you want to be RT. You discover that it would be really handy to call another function, g, inside the body of f. How do I know if that's permitted, i.e. if g is referentially transparent without going through all the source code for g and its transitive dependencies to verify that it indeed is RT? The power of effect/pure separation in Haskell comes from the ability to look at nothing but a function's declaration and tell immediately if it's RT or not.

Of course this assumes that one agrees with the premise that RT is an important concept when coding.

(I realize that there are some loopholes in Haskell like unsafePerformIO, etc., but let's just stipulate that We Don't Do That.)

control of effects

In a dynamically typed language, one would control use of effects by means other than type - such as use of object capability model. Look into E language and Newspeak.

Leveraging sealer/unsealer patterns, we can also enforce staged use of effects where authority is distributed separately from activation. A sealed value can enable users to model abstract data types or first-class classes.

Syntax can also control effects by structuring the computation into multiple stages. It isn't a method I use much in the last few years, but it was my preferred technique five years ago.

Substructural types can also be useful to control effects even if enforced dynamically. Though, these are more relevant with respect to concurrency and control of aliasing, and separating concerns for structured behavior vs. structured code.

this assumes that one agrees with the premise that RT is an important concept when coding

We consider RT good because it simplifies some forms of reasoning and refactoring (or program transform). Which is to say, RT is not an end in itself, and there may be other approaches to similarly effective ends.

Values in Haskell are pure /

Values in Haskell are pure / RT without help from the type system. Haskell doesn't use the type system to tame impurity as does e.g. Clean. You can encode a similar scheme of encoding effects as instructions of what to do in a pure dynamic language.

Hm

That's not true of the ST monad at least, which I think is a very important part of the "effects" story. I do agree that RealWorld-passing (for IO) would work in non-statically-typed languages. In fact Clean-style linear worlds would probably also work, provided you enforce linearity dynamically and failed on reuse of a world resource.

Not true of ST?

I'm prepared to accept that I'm wrong of Haskell, but I'll need a reference or argument of some sort. The implementations of ST or IO is obviously going to rely on impurity, but why do the semantics need to? I'm guessing that you're thinking about the role of the type system in guarding references, but maybe you're thinking about strictness...

runST runs effects

Even if you model a `ST r a` as a "list of actions to produce an `a`", the fact that `runST : (forall r . ST r a) -> a` exists as a term-former means that term exist that have an effectful semantics. The type system prevents this from breaking referential transparency (RT), but if you compose (potentially ill-typed) expression formers and execute them without type-checking first, they may break RT if they use runST.

I'm not following the

I'm not following the argument. The State monad has a run function and it can have a pure implementation. You can do the same thing with ST, but now you have to worry about references leaking out. But how does this break purity? If each runST keeps track of its own dictionary of references (modeled as integers, say), then bringing references in from some other ST just produces spurious results. Purity doesn't seem in jeopardy. No?

Hm

Thinking about this more, I think you may be correct: it's probably possible to do a kind of dynamic region sealing where you assign fresh nonces to each computation `runST`, give that nonce to each created reference (for the newSTRef example), and check that the reference nonce corresponds to the runST nonce when evaluating read/write effects.

and check that the reference

and check that the reference nonce corresponds to the runST nonce

You can't do this part in a pure way unless you provide the runST nonce as an argument to runST.

I think Matt's point is simply that we aren't controlling side-effects using static types. The ST existential world type is preventing a certain class of errors (sharing references between different runST instances), but those errors are categorically different than side effects.

Nonces are fine

Indeed, using nonces means that the implementation of the language would use effects. But those effects are not observable for correct programs: if a program succeeds without dynamic errors, then it means all nonces tests succeeded, so it is equivalent to the same program where a constant id is used instead of fresh nonces -- it is observationally pure. If the program fails with a dynamic error, well, dynamic failure is an effect anyway, as (as soon as it gives an error location) it allows to observe the evaluation order for example -- breaks transparency.

nonces with assertEq

I agree that if the only operation on your nonces is `assertEq`, essentially treating failed comparisons as a dynamic type error, you could usefully protect some properties in a dynamic model while maintaining a useful degree of 'observational purity'.

Though, we'll still lose some useful equivalences. Such as `runST foo` is not obviously equivalent to another expression of `runST foo` if any nonces are exposed. If we attempt to compare the results, we could have an assertion failure despite equivalence.

Semantics of errors

The semantics of ST when you use a reference from a different ST is the same as if you use a string or boolean in place of a reference -- it's an error. The relevant question here seems to be what an implementation is required to do in the presence of such errors. I was considering that "undefined behavior" is a good enough answer. I think David is correct that to actually detect this undefined behavior dynamically requires impure constructs in the implementation (generation of those nonces), but I don't consider such detection to be part of the semantics of ST.

A possible dynamic implementation.

For a dynamic language to do this you must prevent any side effects from escaping. One way to do this would be to use the equivalent of 'eval'. Ie run the code inside in a separate interpreter that only implements permitted operations, in this case pure functions and a single bounds checked memory array. I think you could allow arguments to the function to be evaluated to be mutable, but only read access to non-parameter variables in scope, and any errors terminate the interpreter and return a default value specified in the 'eval' call.

How do you represent

How do you represent references and how do you prevent them from leaking out? (Returned from one call to runST and used later in another)

Its a State Machine in a Box

So the model semantically is a state machine, where it is impossible to change the state from outside the box. Only the commands (monadic functions) can affect the state. This means it is safe to allow mutable references to be passed into the 'box' and have their values mutated as a result of the state inside the box, but it must be impossible to return a mutable reference to something inside the box. I also think it would not be semantically very nice to have a read-only reference where the content changes over calls into the box. So I would implement copy semantics for things returned from inside the box.

I understand how monads

I understand how monads work, but I'm still missing something about your idea. Namely,...

it must be impossible to return a mutable reference to something inside the box

How are you going to do that without types? How are you going to detect the following code dynamically?

ref = runST $ do
    ref' <- newSTRef 0
    return ref'

print runST $ do
    readSTRef ref

EDIT: Markup ate the last half of my example code

Interpreters as values

If you make the only way to use a reference to use a nested interpreter, then the outer language is pure anyway, so there is no problem, you just pass interpreters around as values. runST causes that value to be executed as a program which results in the return value.

This is pretty much the same as 'eval' in that the argument to eval is a (read only) string that represents a program, the result is another value. If the interpreter run by eval has no IO, just access to a single mutable array, from the outside it has no side-effects.

runST world inside a world

runST provides side effect control by creating a box that behaves as if it were pure. The ST monad cannot 'go wrong' by bad pointer access like in languages like 'C', so from the outside you cannot tell if the function inside runST is pure or impure. The key point is the state storage reference is existentially contained inside the monad. This allows faster imperative implementations to be used inside, whist the user retains RT. Think of something like a matrix or linear maths library. Obviously this is enforced by the type checking, and obviously you can't execute expressions without type checking.

The language semantics are defined by the type system, not the other way around.

Give me a few days, and then

Give me a few days, and then my position will be more clear. Rather than control what effects a function can perform, effects can be managed by the programming model to ensure that they are relatively sane.

Which, on the face of it,

Which, on the face of it, sounds like a bad idea since there will be differences between how different IDEs see your code,

Those differences show up in practice in most other languages. Think all of the warning settings in GCC and Clang. Likewise, consider third party static analysis tools like Coverity.

It's important for a language to have a well-defined baseline set of static warnings that all tools agree on. Otherwise, it's difficult for users to write warning-free libraries that will work in the IDEs of their users.

But, beyond that, it often seems to be a feature that different tools provide different warnings and analyses that may be more helpful to one user than another.

plus it pushes a lot of non-trivial type checking work over to IDE developers who may not be as strong on type theory as compiler writers would hopefully be.

The Dart SDK includes an implementation of the type checker (well, two actually). It's available as a library, so tools can just use it as-is. Of course, IDE vendors may choose to write their own as a competitive advantage, but it's really handy to have a nice code analysis library come with a language.

TypeScript yes, Dart no

I think Dart is not gradually-typed in any real sense -- it neither features the interoperation that is the hallmark of gradual typing since Jeremy Siek and I independently started this work, nor does is gradually evolve from untyped to typed. Dart is just a language with a weird and unsound type system.

TypeScript no, Dart no

Well, TypeScript is weird and unsound, too. For example, function types are bivariant in their arguments. Let alone generic type constructors, whose arguments can vary freely under subtyping. Arguably, that's even worse than Dart's covariance.

(And to be honest, I'd be pretty surprised if Hack wasn't weird and unsound either.)

Needless to say that I find this state of affairs rather sad.

Maybe instead of having a bottom value

Such languages should have a frown face?

P.S. Hate to say it, but as a practitioner, TypeScript provides me with the best safety/readability/modularity when working with JavaScript. Worse is better, no matter how sad it makes you feel.

Hack is intended to be sound

The developers of Hack have said that they intend the type system to be sound. I think they might do it, as well, since they want to use the types for optimization, where soundness is required.

Can you explain what you

Can you explain what you mean by interoperation here? It is certainly possible to write both untyped and typed Dart code, and the two interact with each other easily.

It's also been my experience that many Dart libraries do in fact start out untyped and gradually evolve towards fuller annotations.

Just to note that we have

Just to note that we have two languages on the home page at the moment, one for helping with privacy and the other one from facebook...

Jeeves and Facebook

I think Facebook showed an interest in Jeeves or the concepts behind it… since Jean Yang worked there the summer after the first Jeeves paper was presented at POPL '12.