Should let be generalized?

I came across this paper proposing that Let Should Not Be Generalized (TLDI 2010, Vytiniotis, Peyton Jones, and Schrijvers).

While generalisation for local let bindings is straightforward in Hindley-Milner, we show that it becomes much more complicated in more sophisticated type systems. The extensions we have in mind include functional dependencies, GADTs, units of measure, and type functions. The only technically straightforward way to combine these developments with let-generalisation has unpalatable practical consequences, that appear to be unavoidable.

I'm actually fairly convinced by this. Among other things, it provides a fairly clear path to eliminating the monomorphism restriction:

Our [] proposal completely subsumes the [Haskell monomorphism restriction] for local let bindings, so the MR would then only apply to top-level bindings. In that case the arguments for its abolition, or for generating a warning rather than rejecting the program, would become very strong. A nasty wart would thereby be removed from the face of Haskell.
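For readers who haven't run into the MR, here is a minimal sketch of what it restricts (hypothetical names; the failing variant is shown only in comments):

```haskell
-- 'sq' is a pattern binding with a Num constraint. Under the
-- monomorphism restriction, leaving off the signature pins 'sq' to a
-- single monomorphic type, so this is rejected:
--
--   sq = \x -> x * x
--   both = (sq (2 :: Int), sq (2.5 :: Double))  -- error: sq has one type
--
-- An explicit signature restores generalization:
sq :: Num a => a -> a
sq = \x -> x * x

both :: (Int, Double)
both = (sq 2, sq 2.5)
```

With GHC's NoMonomorphismRestriction extension the unsigned version is accepted as well.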

Are they right? Are they wrong? Has anyone laid out the case for the defense?


Outside ACM Paywall

Same paper via Microsoft Research.

None of the above

None of the options considered in that paper seem like the right one to me. I think we want full generalization over constraints, but this example should be an error:

data S a where
    MkS :: Show a => a -> S a

fs :: a -> S a -> String -- Fixed return type error from paper
fs x y = let h z = show x -- Error: No instance for Show a
    in case y of
        MkS v -> show v ++ h ()
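For contrast, a standalone sketch of a variant that does typecheck under current GHC: `show x` is moved under the `MkS` match, where the pattern brings the `Show a` dictionary into scope:

```haskell
{-# LANGUAGE GADTs #-}

data S a where
    MkS :: Show a => a -> S a

-- Accepted: both uses of 'show' sit under the MkS match, so the
-- Show a evidence packed inside the constructor is available.
fs :: a -> S a -> String
fs x y = case y of
    MkS v -> show v ++ show x
```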

It's not a matter of qualifying over the constraint or not. Consider this example with multiparameter type classes:

class Foo a b where
    foo :: a -> b -> Bool

instance Foo Bool Bool where
    foo x y = (x == y)

data S a where
    MkS :: Bool -> S Bool

fs :: a -> S a -> Bool
fs x y = let h z = foo x z -- Warning?: No instance Foo a anything
    in case y of
        MkS v -> h v -- Error: No instance Foo a Bool

This should also be an error. The type of 'h' should be 'forall b. Foo a b => b -> Bool', but the 'a' that occurs there shouldn't ever unify to Bool when the GADT is opened. Type class selection should be about choosing the instance that best matches the local pattern seen, not about global constraint solving.
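As a concrete check on that claim, here is a sketch using a monomorphic constructor `MkS :: Bool -> S Bool` (so the match refines `a` to `Bool`) and annotating `h` with the quantified type above. Current GHC appears to discharge the `Foo a Bool` obligation at the call site by rewriting it with the local equality `a ~ Bool`, which is exactly the global constraint-solving behaviour this comment objects to:

```haskell
{-# LANGUAGE GADTs, MultiParamTypeClasses, ScopedTypeVariables #-}

class Foo a b where
    foo :: a -> b -> Bool

instance Foo Bool Bool where
    foo x y = x == y

data S a where
    MkS :: Bool -> S Bool

fs :: forall a. a -> S a -> Bool
fs x y = let h :: forall b. Foo a b => b -> Bool
             h z = foo x z
         in case y of
                -- The match introduces a ~ Bool; GHC uses it to turn
                -- the wanted Foo a Bool into Foo Bool Bool and solve it.
                MkS v -> h v
```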

Good point

At least your first example is a very insightful point. I can't be sure about the second example, but only because I don't quite understand it.

It's not immediately apparent to me how to go about specifying or implementing this rule, though. Combined with the existing complexity of the MR, the relative rarity of polymorphic lets, and the fact that polymorphic lets with explicit type signatures would still work, that makes me wonder whether the approach in the paper is simply easier, even though your approach would admit more programs.

Very interesting, thanks Matt.

GADTs are bizarre, type classes aren't relations, and Tim's right

In the setting I'm working in, types denote sets of terms. In this setting, GADTs don't make any sense. That you can learn about one parameter by inspecting a different parameter is rather bizarre to me, unless the first parameter is dependently typed to the second. So I don't really encounter the same issues as in the paper.

In general I think the viewpoint of type classes as a global relation on types is more harmful than helpful. It doesn't explain local or overlapping instances, and I think it contributes to making this example confusing. I think framing the mechanism as implicit parameter inference is more useful.
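A sketch of that "implicit parameter" reading, with the dictionary made explicit (hypothetical names):

```haskell
-- A class constraint is, operationally, an extra record argument the
-- compiler fills in: a 'Show a =>' context becomes a 'ShowDict a'
-- parameter.
newtype ShowDict a = ShowDict { showD :: a -> String }

showIntDict :: ShowDict Int
showIntDict = ShowDict show

-- 'describe :: Show a => a -> String' desugars to roughly:
describe :: ShowDict a -> a -> String
describe d x = "value: " ++ showD d x
```

Local or overlapping instances then just correspond to passing a different dictionary at a given call site, rather than consulting a single global relation.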

Regarding why we should generalize, I agree with Tim that the top level shouldn't be special, as it hinders refactoring and is just in general ugly. Also, look at the example of a local swap definition (mentioned in the paper):

swap (x, y) = (y, x)

I think it's clear that you'd want that to generalize, and the paper acknowledges as much. The argument from the paper is just that they tried everything they could think of to make this work smoothly with the other cases and the various alternatives failed. So that doesn't preclude another solution they didn't consider. I think type classes (or similar) should be used pervasively, and thus it's important to get the details right.
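A minimal sketch of why: using a local `swap` at two different types in one body only typechecks if the let binding is generalized:

```haskell
-- If 'swap' were pinned to a single monomorphic type, these two uses
-- (at (Bool, Int) and at (Char, Bool)) could not coexist.
f :: (Int, Bool)
f = let swap (x, y) = (y, x)
    in (fst (swap (True, 1)), fst (swap ('a', False)))
```

As it happens, GHC's eventual MonoLocalBinds rule still generalizes bindings like this one, whose free variables are all closed.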

Right on

I think you may have hit the nail on the head. Maybe it's GADTs that should not be generalized, and the monomorphism restriction should just be a warning about potentially unexpected performance characteristics.

Top-level vs other

Haskell's structure endows top-level code with various magical properties that don't apply to code in other contexts, such as groups of declarations inside let-bindings or groups of fields in a record-declaration. I don't much like this magic.

For example, I had a large block of top-level Haskell declarations that I wanted to move inside a function returning a record containing those declarations. The declarations could no longer see each other recursively; type signatures were no longer allowed; and even after jumping through syntactically ugly hoops to work around those problems, the monomorphism restriction applied in subtle ways that required elaborating the type signatures.

Module capabilities in Haskell further separate top-level code from other code. This makes it hard to move something from a module to a record in a module, or vice-versa. In general, it forces an unduly flat structure on the program.

Ultimately, I feel all code should be endowed with equal rights and responsibilities, whether it's top-level or otherwise. Without this property, transporting code between contexts becomes excessively tricky.

Is this really a problem with generalized let?

Is this really a problem with generalized let, or is this really a problem with static parameterized typing of variables?

In fact, it's a problem that appears when you have both, but which is not manifest in either one.