Algorithms for inferring most specific types?

In my reading so far on ML type inference algorithms, the goal seems to be to infer the most general type of expressions. If I have the terminology correct, this is the "principal type".

Can someone point me at algorithms for inferring the most-specific or least-general types in ML programs? I'm interested in analysis methods, type inference being one of them, for finding information about a program. So rather than the most general types, which encourage code reuse, I'd like to find the most-specific types, to see exactly and only the ways the functions are being used in a program.
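To make the distinction concrete, here is a small illustrative sketch (not any established algorithm): a Python decorator that records the concrete types a function is actually applied to at run time. A function like `identity`, whose most general (principal) type is `'a -> 'a`, may be used at only one type in a given program; the recorded usage approximates its most-specific type there. The names `observe_types` and `identity` are invented for the example.

```python
# Hypothetical sketch: approximating a function's "most-specific type"
# by observing the concrete types it is actually applied to.
from functools import wraps

def observe_types(fn, _seen={}):
    """Record (argument types -> result type) for every call to fn."""
    @wraps(fn)
    def wrapper(*args):
        result = fn(*args)
        sig = (tuple(type(a).__name__ for a in args), type(result).__name__)
        _seen.setdefault(fn.__name__, set()).add(sig)
        return result
    wrapper.seen = _seen
    return wrapper

@observe_types
def identity(x):          # most general type: 'a -> 'a
    return x

identity(3)               # in this program, identity is only used at int
identity(7)

# The observed, most-specific usage is int -> int:
print(identity.seen["identity"])   # {(('int',), 'int')}
```

A static analysis would compute this kind of information without running the program, of course; the decorator only illustrates what "exactly and only the way the functions are being used" means.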

thanks in advance!
bill


It sounds like you want an

It sounds like you want an effect analysis, à la "type and effect" systems, rather than just a type system.

Perhaps what you are after

Perhaps what you are after are intersection and union types. You can check out an old paper by Tim Freeman and Frank Pfenning, Refinement types for ML, which describes a type system based on abstract interpretation to infer the precise types of functions, with respect to some finite lattice structure on datatypes (e.g., we can declare even and odd as refinements of the integers, and infer that the type of addition is the intersection of even * even -> even and even * odd -> odd and odd * even -> odd and odd * odd -> even).

Note that this pure-inference-based approach is seen as the wrong approach to designing refinement type systems--precisely because of the modularity issue you cite (as well as decidability in the general case with union types)--but it seems like you are after analysis methods, rather than type systems. If you want to verify that a program has a very precise type, you should check out some of the more modern work on refinement type systems.
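The parity example above can be sketched as a tiny abstract interpretation. This is only a minimal illustration in the spirit of Freeman and Pfenning's approach, not their algorithm: each refinement is abstracted by a representative element, and the abstract transfer function for addition is evaluated on every combination of argument refinements, yielding the arrows whose intersection is the refinement type of +.

```python
# Minimal sketch: inferring the refinement type of addition over the
# finite parity lattice {even, odd} by abstract interpretation.
PARITIES = {"even": 0, "odd": 1}   # representative element of each refinement

def abstract_add(p, q):
    """Abstract transfer function for + on the parity lattice."""
    return "even" if (PARITIES[p] + PARITIES[q]) % 2 == 0 else "odd"

# The inferred refinement type of + is the intersection of these arrows:
arrows = [(p, q, abstract_add(p, q)) for p in PARITIES for q in PARITIES]
for p, q, r in arrows:
    print(f"{p} * {q} -> {r}")
# even * even -> even
# even * odd -> odd
# odd * even -> odd
# odd * odd -> even
```

The real system works over arbitrary user-declared finite lattices of refinements on datatypes, and handles higher-order functions; the two-point parity lattice just keeps the sketch small.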

A points-to analysis tells

A points-to analysis tells you what an abstract location (e.g., a variable) may point to. You can invert the relation (giving a flows-to analysis) to see where a value goes, such as to call sites. There are lots of open-source implementations for enterprise languages out there.
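The inversion step is mechanical. A hedged sketch, not any particular tool's API (the variable and location names are made up): given a points-to map from variables to abstract locations, build the reverse map from locations to the variables they flow into.

```python
# Sketch: inverting a points-to relation into a flows-to relation,
# so we can ask where a given abstract value goes.
from collections import defaultdict

points_to = {                 # variable -> abstract locations it may point to
    "x": {"loc1"},
    "y": {"loc1", "loc2"},
    "arg_at_callsite_7": {"loc2"},
}

flows_to = defaultdict(set)   # abstract location -> variables it flows into
for var, locs in points_to.items():
    for loc in locs:
        flows_to[loc].add(var)

print(sorted(flows_to["loc2"]))   # ['arg_at_callsite_7', 'y']
```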

There isn't enough information about the desired task to say whether typed approaches are even appropriate (points-to analysis, for example, generally isn't type-based).

Control-flow analysis...

...judging from the variety of responses you've gotten, you can see that the question you've asked can be interpreted in many different ways. So here's another:

ML-style type inference gives you a kind of upper bound on the contexts you could possibly use an expression in. Turning this upside down, a useful lower bound would be the smallest complete set of contexts in which the expression is actually used. Obviously, this is uncomputable in general, but what you can do is run an interprocedural control-flow analysis (such as 0-CFA) on your whole program to figure out the places an expression could be used, and then look at the types each of those places is expecting.
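The last step can be sketched as follows. This is an illustration, not a real 0-CFA: assume the flow analysis has already produced, for some function, the set of use sites and the type each site expects (the data below is hypothetical). If every site agrees, that single type is the least-general type; otherwise the expression would need an intersection of the expected types.

```python
# Sketch: combining the types expected at the use sites a control-flow
# analysis discovered into one most-specific description.
def combine(expected_types):
    """Return the single expected type if all use sites agree,
    otherwise an intersection of the distinct expectations."""
    uniq = sorted(set(expected_types))
    return uniq[0] if len(uniq) == 1 else " /\\ ".join(uniq)

use_sites = {                     # use site -> type that context expects
    "call at line 12": "int -> int",
    "call at line 40": "int -> int",
}

print(combine(use_sites.values()))   # int -> int: the only way it's used
```

The hard part, of course, is the flow analysis that produces `use_sites` in the first place; the combination step is the easy tail end.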

That's essentially the

That's essentially the points-to / flows-to ;-)

Edit: in practice.

I thought you had something

I thought you had something fancier in mind, since I think of pointer analysis as a fancier analysis than control flow analysis. :)

I think 0-cfa is a common

I think 0-CFA is a common pointer analysis for Java. Fancier versions are for folks who want more publishable results :)

I am curious if there are substantial differences in rich languages, however, either theoretical or empirical.

Aren't most of the

Aren't most of the suggestions here the same, or at least related? Effect systems are a control flow analysis too.

1) Don't region and effect

1) Don't region and effect systems typically lose context sensitivity? Perhaps you can play games with dependent typing etc., but I don't think they're normally explored in such a context.

2) The analyses we're talking about generally work in the context of languages with features that don't play nicely with type-and-effect systems. More strongly, they *work really well* -- they're the foundation of many basic compiler optimizations, while effect systems still seem to be coming into their own (speaking as an outsider).

3) When casting regions or effects as refinements, as Noam pointed out, to make it look like an analysis you'd need to phrase it as inference... and how to do that is very unclear.

You're determining things about terms, so saying 'types' is great -- I can claim the type is the may and must sets -- but that's not really fair. How you get these sets (e.g., abstract over them) is the interesting thing.

Edit: not to say some type approaches -- like regions -- don't play along with low-level languages (which was much of the motivation!). The question is granularity. The deterministic Java project is an interesting modern experiment that, in part, explores how deployable these ideas are.