Inference of Polymorphic Recursion

In the following (Haskell) example, the type annotation on f is required:

f :: a -> (Int, a)
f x = (g True, x)

g True = 0
g False = fst (f 'a') + fst (f 0)

main = do
    print (fst (f True))

I can understand why in general, but I wonder if we could just decide (arbitrarily) to generalize in the order that declarations appear, so that in this case the type of f would be inferred, but if you switched the definition order you'd get a type error. When f is generalized, g would be constrained to Bool -> b, where b would be unified after generalization. Is this something that might work (but isn't done because it's arbitrary and makes definition order matter), or are there hard cases I need to consider?
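
For concreteness, here is the swapped order I mean. Under the proposed rule (and only under it; GHC itself treats the binding group order-insensitively, and rejects both orders without the annotation), this version would fail, because g would be generalized first while f is still an ungeneralized monotype used at two different types:

g True = 0
g False = fst (f 'a') + fst (f 0)   -- f used at Char and at a numeric type

f x = (g True, x)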

Thanks


monomorphism restriction

It's on by default in Haskell, but you can add the NoMonomorphismRestriction language option to turn it off. See also:

https://wiki.haskell.org/Monomorphism_restriction
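
For reference, a minimal sketch (my own, not from the thread) of what the restriction itself does: plus below looks like a value binding, so the restriction would pin it to a single monomorphic type, and the two differently-typed uses then clash unless the pragma is on.

{-# LANGUAGE NoMonomorphismRestriction #-}

plus = (+)   -- a simple pattern binding: the restriction would monomorphise it

main = do
    print (plus (1 :: Int) 2)          -- plus used at Int
    print (plus (1.5 :: Double) 2.5)   -- and at Double: needs the pragma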

I don't think this is the

I don't think this is the monomorphism restriction (and compiling with -XNoMonomorphismRestriction doesn't help) - note that neither f nor g looks like a value binding. I think this has to do with requiring explicit type signatures for polymorphic recursion: inference in the presence of polymorphic recursion (Milner-Mycroft typability) is undecidable, though checking an annotated program is not.
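
The textbook illustration (the usual nested-datatype example, not from this thread) is a function whose recursive call is at a different instance of its own type; inference would have to guess the fully general type before having computed it, so GHC insists on a signature:

data Nested a = Nil | Cons a (Nested [a])

nlength :: Nested a -> Int   -- delete this signature and the occurs check fails
nlength Nil = 0
nlength (Cons _ xs) = 1 + nlength xs   -- recursive call at type Nested [a]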

I see

It seems to me that the whole "let's not solve a problem for common trivial cases because the general case is undecidable" philosophy is harmful to programmers. Theorem provers should be able to apply all sorts of ad-hoc programmer-provided tactics before turning back to the human for a little extra help via annotations or hints.

Bad?

You could say that is bad, because types are no longer predictable. When you have type inference and the programmer has no idea what types are being inferred by various ad-hoc rules, how is the programmer supposed to reason about the code? There is a clear school of thought that a programmer should be able to write a program on a piece of paper and decide whether it is well typed. This becomes almost impossible if ad-hoc rules are involved. No developer wants to memorise a bunch of ad-hoc typing rules that someone else decided seemed sensible to them.

context

Types are still predictable, I think, in the sense that any code that I, as a human, could accurately analyze and predict to be correct should also correctly be proven by a proper computer that operates at a billion times my speed. Automatic type checking with resource limits becomes "unpredictable" only when I start working with programs so large or convoluted that they'd almost certainly also be incomprehensible to me as a human. And in that case, we can easily tell a human that the type checker gave up and to please provide some extra annotations or better tactics.

Also, the 'school of thought' you mention seems a bit unrealistic in the context of modularity. If I write `import xyzzy; main = foo bar baz` on a piece of paper, I cannot locally decide whether it's well typed. I need a lot of information from non-local context - e.g. I need to look at the `xyzzy` import. We'll frequently have type abstractions and aliases. I think ad-hoc tactics aren't a new problem, assuming those rules are suitably presented in the context, subject to the same import and abstraction mechanisms as everything else.

Module import

It might be better if you had to provide the type you expect xyzzy to have as part of the import statement; that would preserve local reasoning.
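
Even without new syntax, you can approximate this in today's Haskell by pinning an import to a local binding with the type you expect, so a library change surfaces here rather than at some distant use site (Xyzzy and foo are hypothetical stand-ins for the names above):

import Xyzzy (foo)

-- our locally declared assumption about the import;
-- compilation fails here if Xyzzy's foo doesn't match
fooLocal :: Int -> String
fooLocal = foo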

Yeah but

"Don't select language features to win internet arguments" is one of my core philosophies.

Reading and Understanding

Well, that made me laugh :-) I wouldn't have posted it if I didn't think there were some merits to the idea. Personally I find searching through source files for definitions tedious, especially when you have many source windows open. It would be genuinely good for debugging to have all the information you need to understand a module in a single place. As debugging is harder than coding, would it not be sensible to make it easier, at the cost of a little more typing when coding? A language optimised for reading and understanding algorithms (rather than writing them) seems a good idea to me.

Type annotations for imports

I agree with this, but I think there are two kinds of imports (and modularity) to think about here.

If some information is part of the language spec, then that information is available to use for local reasoning/typechecking. If we import something else with great confidence in its stability, then that import is effectively a language declaration, and its undeclared specifications might as well be used throughout our local reasoning/typechecking just like the language's are (which means the module will need to be available at compile time for the typechecker to consult, so it isn't exactly separate compilation).

For a less confident import, when we don't want to take its stability for granted, we can still use it in local reasoning/typechecking if we declare our assumptions about it in its type signature.
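
GHC's Backpack is one concrete realization of the second kind: a signature file declares our assumptions about a module we haven't committed to, and whatever implementation gets mixed in later must match. A minimal sketch (the wiring via Cabal mixins is omitted, and Xyzzy/foo are hypothetical):

-- Xyzzy.hsig: a signature, not a module - assumptions only, no definitions
signature Xyzzy where

foo :: Int -> String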

Agree in general, but not in this case

I agree with this sentiment for proving correctness properties, but not for inferring semantics. And I see generalization as more of a semantic issue (what are the parameters of this symbol) and so I'd rather have it decidable and fully predictable by a simple rule. I'm proposing a different simple rule than the one Haskell uses.

Singularity

I think that programming languages in general are far from any singularity point, not because they are bad, but because such a point probably can never be reached. There will always be this or that example where users see room for improvement. I'm sure the noted problem will be solved in a future version of Haskell (or maybe implemented in some other language). What we users can do is make language designers aware of solvable ideas.

However, there was one clear example where a technology reached its singularity at the very beginning, and that is mathematics. Its base laws are so clear and simple that even now, after all this time, they have not been shaken a bit, regardless of misinterpretations at some higher level. The base is as solid as a rock, but as things get more complicated, debates break out about how the laws should be interpreted.

The noted Haskell problem has some degree of complexity, and as such, it will take time to implement a solution. In my humble opinion, regarding the stated example, the problem is solvable without an explicit type annotation.

[edit] I'm of the opinion that we should try to partially solve this problem, regardless of its general undecidability. I hold the same opinion about the undecidability of the halting problem: we shouldn't give up at the start. We should isolate the cases for which the problem is undecidable and provide a solution for the solvable part of the domain.

Alternative grounding

It also compiles if you ground the loop at the other side:

g :: Bool -> Int

If you don't ground the terms in the type somewhere, why would the solution be constrained to Ints and Bools in the sub-terms? I think this is roughly the same as asking how the compiler can prove that f and g are not polymorphic.
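
Spelled out, filling in the rest of the thread's example around that annotation:

f x = (g True, x)   -- f is now inferred, and generalized, on its own

g :: Bool -> Int    -- the signature takes g out of the inference group
g True = 0
g False = fst (f 'a') + fst (f 0)

main = do
    print (fst (f True))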

You're right that adding an

You're right that adding an explicit annotation for g solves the problem, too, because it eliminates the need to generalize g, letting f be generalized in isolation. But the compiler is already willing to make the assumption that f and g aren't polymorphic. This works:

-- f :: Char -> (Int, Char)
f x = (g True, x)

-- g :: Bool -> Int
g True = 0
g False = fst (f 'a') -- use f monomorphically

main = do
    print (fst (f 'b'))

Because f and g are recursively defined in terms of each other, they are generalized together.

A modified H-M

Without type annotations, my (modified) H-M can infer the following from your post's example:

f : ∀α. ∀β. α → (β, α)

and

g : Bool → Int

Visible here:

https://repl.it/FTwz/16

Without changes, the same implementation can also infer more interesting types for both createList and removeList in the "Isomorphic Compositions" example of [Hallett & Kfoury]'s paper (page 13).

In that example of theirs, they infer:

createList : int → int list
removeList : int list → int
comp : ∀α. ∀β. ∀η. (β → η) → (α → β) → α → η
appComp : int → int list → bool

while I infer:

createList : ∀α. α → α list
removeList : ∀α. α list → α
comp : ∀α. ∀β. ∀η. (β → η) → (α → β) → α → η
appComp : ∀α. α → α list → bool

But, nicely enough, if I define:

( createList ( ( x ) => ( ( + x 0 ) : ( ) ) ) )

instead of:

( createList ( ( x ) => ( x : ( ) ) ) )

I then infer:

createList : int → int list
removeList : ∀α. α list → α
comp : ∀α. ∀β. ∀η. (β → η) → (α → β) → α → η
appComp : int → int list → bool

[Hallett & Kfoury]

http://itrs04.di.unito.it/papers/hk.pdf

Sound?

f : ∀α. ∀β. α → (β, α)

That type (specifically the β) doesn't appear sound. Can you explain?
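
(For concreteness: that type would let a caller instantiate β at any type - e.g. fst (f True) :: String - even though f's body always puts g True, an Int, in that position; under parametricity, only a diverging computation could deliver every β.)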

Unsound, because of under-reported generalizations

Indeed, in the strict sense it isn't, if we are talking about a single type for a single f in the environment.

But what actually happened in my modification to H-M is something we could maybe call a sort of type erasure:

two overloads of f were actually considered, say f1 and f2, and then inferred independently, for the polymorphic applications of "fst" to the return values of those f's:

f1 : α → (Int, Char)

and

f2 : α → (Int, Int)

The confusion comes from the fact that the implementation (my bad, of course) didn't even bother reporting that it was generalizing them into that single (and thus possibly -- and here, actually -- unsound) definition.

So it would be possible (and, I suppose, desirable) to keep track of all such overloading-driven generalizations and report them as such, instead of only showing the weird f : α → (β, α) that came out as a by-product.