¿How can a dynamically typed language not actively prevent static checking?

With all due serious respect to: TypedRacket, TypedLua, Dialyzer, TypeScript, Flow, et. al., I see most if not all attempts to add some form of static type checking to a dynamic language to be indictments of the whole venture. Sure, they might get close to having something, but the something is so far away from 'the goal' that it is proof of the tilting at windmill nature of it all. From my Day Job Joe Programmer Static Typing Bigot perspective.

Cf. the long and complicated list of Flow issues on GitHub.

I realize that building a language with typing from the ground up is empirically hard to do, or empirically hard to conceive of doing anyway. Thus instead of asking for Magic Pixie Dust Retroactive Static Typing libraries, I think I would like to ask: How can dynamic language creators at least make their dynamic language more rather than less amenable to some possible future static typing analysis?

(Of course the answer is to use some static typing engine to lint the grammar/possible productions/whatever, which is obvious and sad and ironic and undermines any such dream ever being fruitful.)

Or should we just admit that anybody who thinks we can have our dynamic cake and eat it too is barking up the wrong tree?

Comment viewing options

A few heuristics come to mind

Most grammars are fairly neutral as between static and dynamic typing, provided there is some extensible mechanism such as decorators, annotations, or pragmas that allow explicit typing information (broadly construed) to be embedded in the dynamically typed code. Other points:

1) Have a proper boolean type, not a list of random falsy and truthy values.

2) Avoid weird objects like Common Lisp's NIL, which is the only object that is both a list and a symbol (as well as the only falsy object). Wherever possible, make sure your type lattice is a tree, with the possible exception of a bottom type, preferably uninhabited.

3) In the standard library, avoid functions that return union types like "an integer or false": this can be done by providing option objects. Likewise, exercise caution when adding procedures with ad hoc polymorphic arguments, like Scheme's SRFI-13 string searching procedures that accept a predicate, a single character, or a character set.

algebraic types

The can return number or false sounds like an algebraic type.

Maybe those are a much better fit for a dynamic language than regular strong typing.

Callable sealed tuples are all I need

I'm working on a language (Cene) that has a kind of sealed tuple type. I rationalize it as a kind of content negotiation: If I don't know how to interpret some document, I can invoke it as code, passing it a representation of the intended use case as its argument. If that code doesn't know how to interpret (unseal) that use case, it can invoke it right back, passing back something that I hopefully do know how to interpret.

With a callable sealed tuple type like this, I technically don't need any other type to do programming. I've been taking advantage of this to develop a unityped language that completely revolves around callable sealed tuples.

In practice, I haven't actually taken advantage of that content negotiation support; I always keep callable values and unsealable values separate. On top of that, any value I unseal has only a handful of possible sealer/unsealer tags it can possibly have, so I'm basically dealing with algebraic data types.

I think if I add syntaxes for function types, union types, recursive types, and a type family for each sealer/unsealer tag, I'll be able to derive types that completely formalize that functions-and-ADTs-like usage pattern. This would be enough to keep programming in mostly the same way I have been, but with type annotations taking the place of some of the explicit error messages. I haven't tried this yet, so I'm mostly speculating here.

What does this mean?

Can I ask you to explain something from your language in concrete terms with examples so I can understand it?

Assume that I'm a schemer but haven't used Haskell or taken CS courses more modern than 30 years ago... And you can assumed I understand CPS.

Since Cene makes continuation-passing style so much more palatable, Cene's side effects are monadic, with asynchronous callbacks if they're observable to the outside world. If they're not observable to the outside world, they use a world-passing style instead. Cene calls worlds "modes" to call to mind modal logic. (TODO: That's probably an incorrect usage. Just use the name "world" instead.)

Cene documentation is a work in progress

Thanks for the interest. I tried giving a comprehensive answer to this question, but I wasn't sure where to stop. Could you make your question more specific, perhaps?

In the meantime, the code below that paragraph is already one example of what it's talking about: The bind-effects and no-effects operations are the monad operations, and each of the mode variables holds a representation of the current state of the world. The (input-path-blob-utf-8 mode in) operation takes the current state of the world and reads a file from it, for instance.

There are no good examples of asynchronous effects in Cene yet. I think I might have deleted the only asynchronous effect I had in the language, so it will be difficult to talk about that until I use Cene for a substantially different purpose than I'm using it for now.

Right now I'm using Cene as an alternative to writing JS code. I've found Cene-style concurrency to be obstructive to the JS event loop micromanagement I want to do there. When I want concurrency and I don't care about that kind of micromanagement, I think I'll revisit those techniques.

sounds like a nice start of a list of considered harmful's

• instanceof style runtime type checking conditional code (cf. overloads, and also non-Option returns).
• implicit type casting (cf. multiple truthy/falsy things).
• non-tree type lattices.

Context

The "takes different sorts of inputs" which you called "implicit type casting" is just how you do overloading in a language that lack type matching at compile time.

It's not "harmful" it's how you get a higher amount of information per unit of code (in an information theory sense) - meaning of a symbol depends on context.

That makes code more readable.

It's a GOOD thing.

Taking different sorts of input

Taking different sorts of input leads to intersection types which is generally not a very usable type system, and certainly type inference is not decidable. You therefore have to add a lot of type annotations which puts people off using the static typing (it feels like a bad fit for the language). Ad-hoc overloading is not as bad, but you end up with something like C++, and again not much in the way of type inference. Overloading constrained by type-classes (like Haskell and Rust) seems the best compromise between flexibility and type inference to me.

Ceylon does pretty well with

Ceylon does pretty well with intersection and union types.

Type inference and expansion-variables.

Perhaps I should just say it doesn't work very well with type inference, is undecidable and the mechanics of expansion-variables needed to make it work properly are complex. I am not sure how far Ceylon goes in this way? Does it include expansion-variables in its type system?

I think Ceylon takes a

I think Ceylon takes a unique approach to intersection and union types that avoids some of the typical difficulties, but I don't remember the specifics of why. It's definitely worth a top-level post for discussion since they integrate it nicely with subtyping and type inference.

Unification

Unification with expansion variables in undecidable, and expansion-variables seem to be the best way to handle intersection types in unification. I cannot find a clear definition of the Ceylon type system, so I would be interested to see how they handle this.

When I wrote a type-checker for intersection types two things put me off, the undecidability of unification, and that it made type inference almost useless, as it would allow almost any use of an argument for example:

\x . (x "hello", x 3)


would have the type (String -> a) /\ (Num -> b) -> (a, b) so there is almost nothing we can type-check about a function until we know what type is passed as the argument.

It doesn't look like Ceylon has universally quantified types or partial application, which may be the reason it does not run into these problems.

Ceylon does have ordinary

Ceylon does have ordinary generics/parametric polymorphism on types and methods, and type constraints on those parameters. Unless you meant something else by universally quantified types?

I believe it requires type annotations on methods though, so this probably alleviates much of the difficulty you encountered.

Parametric vs universally quantified.

I don't believe these are the same. A universally quantified type can have an unknown type variable, for example: forall a. a -> a, this is a concrete type that can be instantiated (in implementation terms you can instantiate a type schema). With parametric types, you cannot instantiate a type without knowing all the type parameters, so: <a>(a) -> a (using Rust syntax) cannot be instantiated without a concrete type like 'String' as the parameter. To work around this languages like this provide subtyping and types like ' Anything'. However it should be clear that forall a . a -> a is very different from (Anything) -> Anything. The former instantiates to a function that returns the same type for any type, the latter could take any type and return any other (or the same) type. Rust has added specific syntax for universal quantification of lifetimes for<'a> f<b>(&'a b) -> &'a b which makes it explicit that this is not the same as f<'a, b>(&'a b) -> &'a b.

Often identified

Many core languages / logics don't distinguish between the two: a quantified type requires a witness (aka a parameter) be passed. System F, calculus of constructions, etc. work this way.

Witness Types

The question is when the witness must be provided. In a universally quantified type system the intermediate steps of type-inference or type checking can propagate non-ground type variables up and down the AST, resulting in the need for expansion variables with intersection types. With parametric types this is simply not necessary as you cannot refer to a type without knowing its type parameters. In other words a parametric type is not a type before its type arguments are supplied, it is a type-template or some other sort of thing, not considered an actual type. It may seem like a small difference, but I think it makes a big difference to the implementation techniques you have to use to implement type-inference or type-checking.

I'm lost

I didn't follow all of that. I thought you were saying something simpler. Disregard my comment.

It sounds like you're

It sounds like you're talking about higher rank polymorphism. Ceylon indeed does not have that from what I understand, but higher rank polymorphism != universally quantified. The existence of any universal quantifiers qualifies as universal quantification, no matter where they appear in a term.

Not higher ranked.

No i'm not talking about higher ranked polymorphism, just intersection types. Try this: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.5.6857&rep=rep1&type=pdf for a look at an intersection type system. How does Ceylon avoid needing expansion-variables?

I know the paper but I don't

I know the paper but I don't see its relevance. We're talking here about universal quantification, and parametric polymorphism contains universal quantifiers in leftmost position.

Your examples of embedded forall a.a... quantification are examples of higher ranked polymorphism, but you just said that's not what you're referring to, so I'm still confused what sort of universal quantification you think Ceylon doesn't have. You're referring to a "universal quantification" that isn't parametric or higher rank polymorphism, so it's a type of universal quantification that I'm not familiar with.

Maybe higher ranked

I didn't have any explicit forall in the type system, but still needed expansion variables. Maybe I was working with implicitly higher ranked types as each function has an implicit 'forall' in the leftmost position. I think I need to go through my old code and see exactly what problems I was having. All I was really saying is that either Ceylon avoids the problems I was having, or they are still there but not apparent, and I don't know which.

Parametric vs Universally Quantified.

One difference is that parametric types can be monomorphised completely, whereas the universally quantified ones cannot. You cannot define polymorphic recursion with parametric types.

How about with a tracing jit

Basing your theories on before-hand compilation is becoming out of date.

I believe this qualifies as

data Nested a = a :<: (Nested [a]) | Epsilon
infixr 5 :<:

nested = 1 :<: [2,3,4] :<: [[4,5],[7],[8,9]] :<: Epsilon

-- length requires polymorphic recursion
length :: Nested a -> Int
length Epsilon    = 0
length (_ :<: xs) = 1 + length xs


re: context sensitive code

I'm not inclined to agree that "symbol's meaning varies by context" is "a GOOD thing". It implies that I cannot understand, debug, test, etc. a subprogram in isolation. Latent bugs might go undiscovered until I use the subprogram in a new context.

To me, making code more 'readable' is valuable primarily insofar as it makes code more 'comprehensible'. I believe tradeoffs that aim to improve readability at the cost of comprehensibility to be rather dubious. Why do we read code, if not to understand it? As a programmer, I want to understand my tools, develop and master a portfolio of software components that can be adapted to many different problems. For adaptable components, I do need some variant of generic programming, though not necessarily one based on overloading. For understanding and mastery, it's convenient if each component has one very predictable, context-free meaning and behavior.

Techniques exist to improve readability (avoid overly verbose symbols, etc.) without overloading meaning. Overloading is convenient for text-editor environments where we cannot easily supply metadata without cluttering our code. But even there, we have other options like namespaces or language support for EDSLs. With a projectional editor, we can additionally leverage nicknames, legends with color or iconography, fading out less important code, and a variety of other techniques to enhance readability.

Algebraic data types are also a nice boon for predictability. A lot of type ideas like ad-hoc type unions or nullable types seem unnecessarily context sensitive. E.g. (Int union Int) collapses to (Int) whereas (Int + Int) does not collapse; (throws Foo union throws Foo) typically collapses to (throws Foo), losing information about origin; (Nullable (Nullable Bar)) collapses to (Nullable Bar), hindering development of generic code that operates on Nullables. Algebraic data types have exactly one context-free meaning.

Useful context

Useful context sensitivity generally means
"has the same meaning, but operates on different types"

generic code

There are a lot of ways to have code operate with the same meaning on multiple types. For example, collections-oriented programming styles are good at this (e.g. folding over lists).

With overloading, I'm asked to assume the meaning is the same. That assumption might work out if I'm careful and control the codebase, but I know from experience that, for example, multi-method based overloading is rather difficult for me to track and control, esp. in a team programming environment. I have more confidence in other mechanisms for generic code, where the meaning and behavior is inherently stable.

Related: The Power of Irrelevance. It's easier to comprehend and debug code if we can quickly isolate which parts of it we need to read to understand a given behavior. If we must debug a 'same meaning' property, suddenly we've made a lot more code relevant to our search.

Information theory.

Well I just like to consider abstract ideas about programming and my theory is that programming is:
1) a language for communicating algorithms
2) the primary objective is to communicate succinctly so that the programmer isn't overloaded with detail.
3) any complexity resulting from that objective should be handled by the software, not the programmer.

succinctness

Avoiding unnecessary details is a good goal, IMO. A programming environment that can resolve human ambiguity could feasibly help with avoiding unnecessary details. And overloading is one well known means of resolving human ambiguity. (Of course, there are many alternative means to avoid or separate details, such as edit-time resolution, staged programming with constraint propagation, EDSLs, effect handlers, etc.)

I have a different view on the nature of programming - not 'communicating algorithms' so much as 'a medium for encoding and manipulating data and behavior'. Single-use code doesn't strike me as any worse than single-use post on this forum, or a single-use document. A consequence of my different view is that, rather than pursuing generic overloaded algorithms, I'd tend to favor malleability - that code can be refactored, reshaped, repurposed with confidence. We can make code reusable with hindsight, after noticing a pattern.

I know from experience that,

I know from experience that, for example, multi-method based overloading is rather difficult for me to track and control, esp. in a team programming environment. I have more confidence in other mechanisms for generic code, where the meaning and behavior is inherently stable.

Do you have an example?

If you're asking for an

If you're asking for an example of 'other mechanisms for generic code', I have so far mentioned: collections-oriented programming styles, use of algebraic data types, and effects handlers (aka interpreted effects). Favoring free monads instead of typeclass-based monads would be an example that aligns with several of these alternatives.

Typeclass based monads

All the implementations in the linked free monad example were implemented using type-classes. I guess if you hard coded the particular type classes for Functor, Applicative and Monad into your language, you could define new monads with the free monad without explicitly seeing any type-class machinery, but it would still be there, or am I missing something?

concrete monads

The free monads presented by Kiselyov are concrete. They use GADTs, not typeclasses. He does provide typeclass instances for them to demonstrate how they fulfill the normal interfaces. But those instances aren't essential - i.e. you could delete the instance lines and just work with concrete functions.

If we have FFree - or preferably a variant with better performance properties (e.g. with tree-structured continuations for efficient composition) - then that's the only monad instance we need. We can define other monads as simple parameterizations of our free monad without relying even on a Functor instance. Our monadic syntactic sugar (do notation, etc.) can bind to concrete symbols in the namespace. Typeclasses can be eliminated entirely from this role.

Addendum: There may be some confusion because Oleg also describes the older Free monad, which did require a Functor instance for each effect type. I'm primarily referring to FFree (the Free-er monad) which doesn't impose any typeclass dependencies on the effects.

Only Monads?

If I understand this correctly, the type-classes are replaced by a data-type and a GADT based interpreter? Type-classes provide controlled overloading, can all uses of type classes be replaced in this way? Is this a convenient way to write overloading in general? For example could a type class like an AdditiveMonoid be implemented in this way so that definitions of addition can be provided for arbitrary numeric types?

not just monads but

Not every use case for type-inferred overloading is conveniently and efficiently handled by free structures (at least, not within my knowledge). We need other techniques for other use cases.

To add arbitrary numeric types without typeful overloading, the most direct way is to parameterize the 'add' function. Preferably in a manner that allows an optimizer or staged processing to inline the adder code. Less directly, we may treat addition as a monadic effect. In that case, the overhead is going to hurt performance unless we're working with very big numbers or we've got a beyond state-of-the-art compiler (one that supports Futamura projection).

Specialisation

I do not believe compiler optimisation can be efficient in all cases. Beyond that, a single parametric implementation of addition seems impossible if we include vectors and matrices. We need specialisation to allow optimal algorithms to be selected based on the data type. I am not convinced that free structures provide an easier (for the programmer) way to think about interfaces and overloading than type-classes, however there is definitely something interesting going on I need to think about further.

Looking at Rust recently has reinforced my view that type-classes (Rust traits) are the right abstraction for generic programming. Now Rust has added multiple-dispatch, specialisation and function-traits I have been able to write all the examples I have tried so far from Stepanov's "Elements of Programming" very neatly. Looking at the way generic algorithms fit into the type-class framework, and the zero-overhead abstraction you get from monomorphisation, along with the selection of the optimal algorithm for datatype properties, I have a hard time imagining another technique could replace type-classes as elegantly.

Re: specialization

Typeclasses are typically implemented by implicit parameterization. Of course, they also are frequently inlined and specialized when types are known statically. I've experienced huge performance hits with GHC using typeclasses through dynamic module boundaries. When a function parameter is known statically, it also can be inlined and specialized to the same extent.

In any case, you seem to be making an assumption I'm not comfortable with: that our goal is to "think about overloading" or (earlier) "a convenient way to write overloading". Reusable code is a goal (not always a worthy one). Overloading is at best a means. I mentioned free monads not as a way to write overloading, but rather to accomplish generic programming without any overloading.

Generic Programming

I am not sure what GHC does, but traits in Rust are monomorphised and are very fast.

I think there is a requirement to write specialised implementations in order to write generic programs, and I don't see how to do that without overloading. The classic example is a sort function that depends on the type of iterator, so collections that only efficiently support forward-iterators use one sort, whereas collections that support random-access use a different sort algorithm (note: the most efficient algorithm changes depending on collection type, so this is beyond what can be achieved with optimisation). It is clear that the most efficient algorithm depends on the properties of the types being processed, and yet to write truly generic code you do not want to bother with knowing what sort of collection you have been passed. I just want to write  y = sort(x)  and have the most efficient sort algorithm available for collection type 'x' to be used. To me this seems like overloading. I don't understand how you would provide equivalent functionality?

Two comments

- Haskell combines overloading and type-directed meta-programming into a single mechanism, but I think they are conceptually different and the type based meta-programming is really the important feature that's needed for this example. Overloading is where you have several intended meanings for a name or symbol. Haskell doesn't really support ad hoc overloading, except that it can kind of be shoe horned into the type class mechanism sometimes.

- If you separate semantics from representations, then this is a straightforward exercise. There can be one declarative definition of sort and then efficient representations can be extracted for specialized purposes. This is the ideal way to model this situation, IMO. Mucking up the semantics with optimizations should be avoided.

Algorithms

I like the way Haskell does it with type-classes, and it facilitates a nice generic programming style. I like C++, but templates are too verbose and unsafe. From what I have seen of Rust, it seems to be the best option for generic programming yet, with its traits providing type-class like generic programming. I think any attempt at a generic programming language has to cope with the code in Stepanov's "Elements of Programming". I think the algorithm selection by type-class / trait is a key requirement.

I don't think you can have one declarative definition of sort, as there are many different algorithms to chose between that may be optimal in different circumstances (depending on iterator-type, mutability etc). I think algorithms are fundamental, and the idea that a compiler can optimise a single declarative sort definition into quick-sort or merge-sort depending on the circumstances is a pipe-dream. I don't deny writing a single parameterised definition is theoretically the best way, but I don't think it would actually work. The compiler would effectively have to 'discover' the sort algorithms by some sort of search operation, or have hidden definitions of the alternative sort algorithms inside the compiler, which both seem like bad ideas to me.

Type classes, representations

There are some things I don't like about type classes. There are lots of things I don't like about C++, mostly stemming from the focus on ad hoc mechanisms instead of clean semantics. I've only read a bit about Rust, but generally I think I'd rather have sugar over modules than over traits.

Regarding representations, I'm not suggesting a heroic compiler. The representation mechanism will be under programmer control. But rather than supplying a bunch of optimized code as the definition of 'sort', I'd rather define 'sort' in an obviously correct way and then work with the compiler / IDE to prove an implementation correct. You can achieve something similar in this case if you can put a precise specification of 'sort' in the definition of your type class. It looked tempting to me to try to reuse my module/type class mechanism for this purpose, but I ended up concluding that it wasn't a good fit for that.

Traits

The proof requirements could be part of the 'trait', so that any implementation has to include a proof that it meets the requirements of the trait it is implementing.

The point with sort is that there are different algorithms that you want to use depending on the type of object being sorted. Not that you want to implement a single sort algorithm.

Modules have the problem of making the configurations explicit and not automatically choosing the best implementation based on specialisation. The number of module parameters quickly spiral out of control if you have algorithms that depend on multiple things.

To me the point with sort is

To me the point with sort is that the programmer shouldn't be binding to a particular algorithm, but to a particular semantics (bind to what sort does, not how it works). There's a subtlety, though, because you may not want to specify in the definition of 'sort' what to do when elements are equal. Any reference implementation will overspecify sort in this case. If you use sort in a general enough setting, an optimization that relies on changing the equality behavior at different types might not be semantically sound (e.g. might violate parametricity).

You can view type classes as a mechanism for automatically building the modules that you want. I prefer this viewpoint to the view of type classes as relations over types.

Large Programs

Having written many large programs, I find myself drawn to the style of programming in Stepanov's "Elements of Programming". Languages either make this style easier or harder. My experience is type-classes are a good fit for this kind of programming. I think its telling that the STL has been far more successful than any object-oriented class based standard library for C++, so much so that the STL is now the official C++ standard library.

Yeah, you've referenced

Yeah, you've referenced Stepanov's book a number of times. I'm not familiar with it, but from how you've described it, it would probably find it mostly agreeable. Though I may be more focused on semantics and than on algorithms, I am a big fan of generic programming.

Elements of Programming

It's not a long book, so I would highly recommend reading it. I don't think you can seriously critique this style of generic programming without reading it. It would be like a mathematician being unaware of Euclid's elements.

Edit: Here's a link to Amazon for Stepanov's "Elements of Programming" http://amzn.com/032163537X

Maybe some day

Well my training was in mathematics and I haven't read Euclid either. It's too far down the list. What did Mark Twain say about classics?

From mathematics to generic programming.

If your training is in mathematics, then I think you would enjoy starting with "From Mathematics to Generic Programming" http://amzn.com/0321942043 Which effectively presents the idea that programming a branch of abstract algebra.

I read that

and found it in no way useful.

Of course I'm a programmer, not an academic.

I'm a programmer too.

I am a programmer too, with lots of experience of building commercial software in teams of all sizes, and several different languages for many clients, and I found it useful. However I would recommend reading "Elements" first for programmers.

The "Elements" book which covers pretty much the same material but from a programmers perspective rather than a mathematical history approach, changed the way I write code, and renewed my interest in C++ after all but abandoning it in favour of Haskell for many years. Now I am finding that Rust might be my language of choice, getting the best bits of C++ along with a Haskell like type system and safe memory management without GC. Rust is still not the language I would develop myself, but it's getting pretty close.

I'll look into it

I'm interested in how to structure libraries since my current hobby is, you guessed it, writing a language.

Someone here asked me how I would fix scheme after complaining about it, and that turned into a long series of answers.

Since my own language is being built on a lua/luajit backend rather than scheme (not out of love but for practical reasons), I suppose iterators might be something I should look into. I would also like to see about more functional approaches to library design though.

Lua can do functional programming (it has closures and guaranteed tail call optimization) but usually isn't used that way, due to coming with no library and having a base type that has the kitchen sink so thrown in that it has numerous abstraction leaks - and people may avoid copying it because it's not possible to copy in 100% of possible cases (such as having proxies and other monstrosities) and has no built-in copy function.

To me the point with sort is

To me the point with sort is that the programmer shouldn't be binding to a particular algorithm, but to a particular semantics (bind to what sort does, not how it works).

There are a few sorting properties though, like stable .vs unstable. If your "sort semantics" don't parameterize over such properties, you have a lot of duplication over sort, but that seems like a lot of complexity of its own.

Right, that's what I meant

Right, that's what I meant by "behavior on equality." Having sort parameterized to specify that behavior (or leave it unspecified) seems appropriate to me.

Parameterisation vs Specialisation

I agree this kind of behaviour is better expressed parametrically, but this is a different kind of thing to selecting an external or in-place sort depending on the mutability of the container to be sorted. I think both ways are required.

Mutability is a bad example

I'm having trouble imagining code that uses "sort" in a way that's agnostic to the mutability of the underlying container. When it comes to mutability, the road towards C++ or Rust is a dead end.

Sort Specialisation

The point would be to provide two different sort algorithms depending on the mutability of the container. If you are using Rust style ownership and borrowing, when you know you own the container, it is safe to destructively sort (as you consume the container with move semantics). In this case you are free to chose an in-place mutating sort if possible, or an external sort if the container is read-only. You would return either the original mutable container, or a new read-only sorted container. In Rust this is completely safe as when you take ownership nothing else can have borrowed the value. This specialisation can be provided by traits / type-classes, allowing algorithms generic with regards to mutability to be written.

Overloading and

Overloading and specialization are not coupled in any general sense. Many forms of overloading (interfaces, multimethods, etc.) and overloading in many contexts (e.g. dynamic languages, dynamic libraries) hinder support for specialization. And specialization is just optimizing with statically known data - it can be accomplished, for example, with constant propagation across a function call boundary. If Rust protects specialization of its traits, that is certainly convenient. But there are other ways to protect specialization or staging, e.g. an annotation that some argument or value should be statically computed.

If you want to write y = sort(x), that requires either overloading or that x carry relevant metadata (e.g. sort(x)=x.sort() being the extreme). If your goal is "generic programming" in a more general sense, then insisting on y = sort(x) is overly constraining your design space. Modulo universally quantified computations, I only rarely feel need for generic programming, so it isn't a hassle to add a parameter. Specialization, OTOH, is much broadly useful even within a type (e.g. pow(x,2)).

Overconcerned with implementation in an abstract sense

That's my characterization of this discussion.

My own point of view that languages are meant to convey means that I don't care how a language figures out which algorithm to pick, I don't care about whether it's typed or typeless, has classes or typeclasses or algebraic types or templates or manual case statements.

I just want it to understand what I mean when I say "sort(y)" and I want it to do that without me having to give it extra hints.

I don't usually want the fastest dispatch, I want the method that never gets confused, gives me an error or makes me solve a problem other than the algorithm. I just want it to stay out of my way.

yes but but

The devil is in the details. Your idea of what "sort(y)" should do is not the same as everyone else's. Let alone that of all the sundry programming languages.

"Because extensibility" is usually a main reason, I feel.

Do what I mean, not what I say...

Seems you are optimistic about what a PL (or any language) can accomplish, especially in a team programming environment. I hope that works out for you. For all of us, really.

To be honest

I don't program in a team environment, and the best "team" I worked with had the odd idea that we simply broke our program in huge very loosely connected parts gave each part to ONE and only ONE person and kept our work from interconnecting much with each other.

So we avoided problems.

I like the idea of solitary programming and languages that support that.

Probably not coincidentally...

The best team I ever worked with had exactly the same approach. Each module had an "owner." One guy whose job was to know everything about that module, do most of the design and coding work relating to it, coordinate with owners of other modules to decide its interfaces. etc.

This is probably not a coincidence.

No generality without specialisation.

I don't think you can write a generic program without specialisation. If you consider something simple like addition. We can clearly see the elegance of being able to define:

double x = x + x


As a generic function, we only have to write once and we can use double on any addable type. The existence of this generic function absolutely requires the specialisation of addition. If the '+' operator were not overloaded with specialised add functions for each numeric type, it would be impossible to write the generic double function. As such generality and specialisation are duals and one is not really useful without the other.

In Rust the dependency on being addable would have to be explicit:

fn double<T : ops::Add<Output = T>>(x : T) -> T {
x + x
}


conflating efficiency and generality

You probably can't make a generic program efficient without function specialization, e.g. inlining code for + rather than performing a call with attendant overheads and lost opportunities for contextual optimizations. But PL has a long history of implementations of generic programming with widely varying degrees of efficiency.

Consider: It's easy to understand function declarations such as your double as taking two parameter sets, i.e. <static parameters>(dynamic parameters). There's no fundamental reason these need to be separated with angle brackets vs. parentheses (as opposed to some other form of annotation). If we were interpreting the code at runtime, rather than compiling for performance, we'd manage the static parameters dynamically. We'd ultimately have exactly the same properties with respect to "generic programming" but we'd lose the "specialization" (and performance would likely be awful).

Consider also: Some compilers (such as GHC) make heuristic decisions whether or not to specialize a function that has typeclass dependencies. You can enforce specialization with a pragma, e.g. {#SPECIALIZE foo :: Specific Type}, and GHC will frequently specialize code if it looks profitable. But it isn't necessarily the case that generic code is specialized.

I think we should not conflate these ideas of specialization and generic programming, even if they are coupled in some programming languages. Specialization - constructing a unique block of machine code that accounts for statically available nitty details - is a feature of compilation, not a feature of generic programming. And generic programming does not forbid explicit parameterization, be it static or dynamic, though it is certainly more syntactically convenient and 'succinct' if you can implicitly (perhaps typefully) infer some of the parameters.

Addition and specialisation.

Which generic addition operation can add integer and floating point, or complex numbers, vectors, quaternions, and matrices? No hardware provides such operators. I just don't see how you can write anything generic without specialisation, although in some languages that specialisation is limited and implicit inside the compiler, limiting the usability of the generics.

Do you have an example of a generic algorithm that does not require specialisation to work, whether that is explicit or implicit specialisation? Obviously the identity function springs to mind, but I think that is a trivial and unimportant counter example.

Edit: a further thought is that generality without specialisation only applies to continuous domains. The domain of algorithms is discontinuous, therefore to have generality over algorithms requires specialisation. As all operations on a computer are algorithmic, even primitive things like addition and multiplication are implemented with algorithms, their properties are by nature discontinuous. As such there is no generality without specialisation.

I'm beginning to think you

I'm beginning to think you mean something different by "specialization" than how I have been using the word. That add op in your double example is a parameter. It might be an implicit, inferred, static parameter. But it's still fundamentally a parameterization on a single algorithm.

In compilation, specialization refers to a form of partial evaluation or partial application with constant parameters. This can be static or (via JIT) dynamic. This is what GHC's specialize pragma is doing. Your first mention of the word began: "Specialisation. I do not believe compiler optimisation can be efficient in all cases." So I assumed you're speaking of specialization the compiler optimization, which is pretty important for efficiency of generic code (but shouln't be conflated with it).

Template specialization refers in some PLs to an instance of overloading, i.e. using a single symbol (template name) to refer to an open set of algorithms. If this is what you mean by "specialization", it's more or less covered by "overloading".

Weirdly, however, I suspect you're trying to use the word "specialization" to refer to template-layer parameterization. I'm not sure why we'd want to use different words for parameterization at different stages. Though I suppose the two ideas could have become confused and conflated because template parameterizations are usually implemented by compiler specialization of code (i.e. binding the statically known calls, inlining where appropriate).

Unless we have time or space constraints, there is no generic algorithm that requires specialization (the optimization) to work. And overloading can in many cases be avoided via parameterization, which is what I've been saying.

Parameterisation vs Specialisation

Well you could parameterise all operations, but then generic algorithms would take a lot of parameters, for example the algorithm to calculate the average would require the add operation and the divide operation as parameters.

The thing is that there is probably only one valid specification for averaging a list of integers, to pass the floating point add function would be an error (and I believe removing the opportunity to make an error is a better programming experience than allowing it then reporting the mistake). The type system can enforce this, but the code still turns into a spaghetti of parameters being passed down to low level functions. For example my genetic accounts module that uses averages will end up parameterised by the add and divide functions as well as the average function to use. This is the dependency injection problem.

average (addfn :: a -> a -> a) (divfn :: a -> a -> a) (vs :: [a]) = divfn (fold addfn 0 vs) (length vs)


However, how do you chose the optimal add and divide algorithms for the type of objects in the collection? You really want the author of the 'average' function in the maths module to be making those decisions, not the user of the average function. Yet you also want extensibility, so that new as yet unthought of objects can be averaged.

What I am calling specialisation perhaps better called type-constraint based dispatch. I guess specialisation is just the mechanism you use in C++ and Rust to achieve this. What we want to say is:

If type T implements bidirectional-iterator dispatch to the version of "rotate" that uses a bidirectional iterator.

If type T implements forward-iterator dispatch to the version of "rotate" that only uses a forward-iterator.

The specialisation here is that these often overlap, with collections providing both a bidirectional and forward iterator. In fact a bidirectional iterator probably extends the forward iterator, so that they always overlap. We want to chose the most specialised implementation, so we use bidirectional if it has it, forward if the bidirectional is not available. Both C++ and Rust call this specialisation, so there is some agreement in this use of the term.

Or simple type based dispatch:

If T is a float dispatch to add_float

If T is an Int dispatch to add_Int

So it's not parameterised in the sense that the selection is implicit at the point of use based on the type or type-traits (type-classes or interfaces) implemented by the type.

When I was talking about compiler optimisation, I was meaning that a compiler is unlikely be able to optimise a bubble-sort into a quick-sort (without a library of algorithms and heuristics), and in any case it needs to be extensible and able to cope with new algorithms and heuristics.

Does that help move us forward?

Parameterization is what you want

You just need sugar to help automatically hook some things up. Look at Scala implicits. I think that's the right direction.

Modules and Implicits

Yes, that was the direction I was taking with my own language, you could define multiple modules parameterised on the same types, and explicitly pass the module, or you could bring modules into implicit scope, providing you maintained coherence. This has the advantage that the use site chooses which modules are in the implicit environment, so changing compilation order or including other modules does not change the meaning of code (unless you bring those modules into implicit scope in your own module).

That sounds pretty good.

I'm all for structural guarantees like "does not change the meaning of code", and precise local control of such things, via parameterization and explicit control of an implicit environment.

Different paths.

It sounds like we agree on the destination, but our paths to get there and the priorities we attach to the contributing features are different. For me I must have implicit selection (Rust traits, Haskell type-classes or C++ concepts), and the ability to explicitly select makes it better, avoids some corner case problems, and improves modularity. I get the impression you would put those the other way around, with explicit parameters as your must have, and implicit selection as the nice-to-have.

Thanks. I better understand

Thanks. I better understand what you have been trying to say.

I don't believe working with large sets of parameters is inherently problematic. We have mature ways to name and package such things, e.g. in your average example consider creating some variant of type Number a = { div: a → a → a, add: a → a → a } then you have average : Number a → [a] → a and you might add mul and neg and a few other functions to cover a wider set of algorithms with a negligible cost to the set of number types you support (esp. if your language supports row polymorphism, such that a larger package can fulfill a smaller one). This is more or less the same as explicitly parameterizing whole typeclasses.

I grant that search and constraint models can be very convenient and can work nicely with staged programming. I've explored a lot of such models, e.g. monads (or applicatives) that allow me to specify requirements, soft preferences, and providers, then search and 'compile' a result from disparate pieces. But I've come to prefer modeling these features myself, explicitly, such that I can better control, tune, modularize, parameterize, and understand the subprogram.

Towards that end, I'd favor good language support for staged programming and programmer guided compilation. The work on Surgical Precision JIT Compilers (slides, since paper is paywalled) seems relevant.

casual vs technical 'specialize' meanings

I am widening what is meant by verb "to specialize", depending on context of usage. (Yes, this is slightly funny, since it is also about context dependent usage.) But mostly I agree with this:

What I am calling specialisation perhaps better called type-constraint based dispatch.

Some devs say specialize to refer to type-specific interface variation in pursuit of the same semantic end goal. So that usage is present in normal casual conversation.

Often this is synonymous with overload, but you can overload and still violate the law of demeter. But specialize implies you intend to go to the same place, perhaps with a shortcut for efficiency. You want the same result, but either types or circumstances vary enough to cause a wrinkle (or performance loss) if you force things into an old interface.

This usage still applies when no language dispatch mechanism is involved. In C, manually calling a different function that suits another argument type is still specialization when "the same thing" is intended for the new type. It's a semantic characterization of interface: that it specializes arg types.

It seems less object oriented than role oriented. Some construct X has the role of handling some activity or computation Y, and new things keep coming up where Y needs to be done, but new stuff fits the old api poorly or inefficiently. So you specialize the X interface so it also accomodates new things.

Every time you generalize an interface to handle new inputs or outputs, this means specializing the new things in the interface. It's a different angle of view on a situation that generalizes.

Is that really specialization, from the programmer's POV?

Whether this even *is* specialization or not, as far as the programming language is concerned, depends on the selection of types in use.

For example in some PLs floating-point, integer, etc, are not distinct types. They are all values of type "number." So regardless of what hardware-type inference the compiler (or interpreter) does or which instructions it emits, or whether it needs to do dynamic dispatch under the hood in particular cases, the programmer need not think about specialization because in terms of the type system the programmer is using there is none.

Non extensible

I don't like it when some operators (like numeric operations) get some secret special ability (like being overloaded on type) hard coded into the compiler or interpreter. If the language can cope with both floating point and integer addition using the same operator (function) then I want the ability to do the same thing with the functions and types of my choosing.

I agree some programmers do not need to think about specialisation, however others do. Take a linear algebra library, I would expect the library developer would like to specialise matrix multiplication for different circumstances, yet the user of the library does not need to know about it and just calls the function with whatever argument types.

A general purpose programming language has to cope with the needs of the library author as well as the needs of the application author.

Linear algebra is a good example

of something _I_ want:

The ability to have regions in which a different dialect applies.

I want to be able to say "in the following region, arithmetic operators mean arithmetic over a polynomial, or matrix, or special field or whatever I'm working with" - and after a delimeter, we're back to the usual dialect

No secret special ability. It's a NUMBER.

That's the whole point though; there is no secret special ability. It is a value of type Number. If in the programming languages all values of Number have exactly the same semantics, then NO overloading is taking place.

Overloading is a semantic construct, not an implementation construct. Having one numeric type or several, with different semantics, is a semantic construct. But if all numbers have the same semantics, then they are the SAME semantic type, and any distinction under the hood is purely an implementation construct.

Not sure.

There are languages out there that allow infix operators like '+' to be used with floating point or integer types, and yet do not provide any mechanisms for the language user to define new functions that can apply different algorithms to different types. This is what I was referring to as a 'secret special ability'.

I agree if the language only supports 'Number' types then no overloading is taking place, but most languages allow at least integer and floating point arithmetic, and for which types, addition behaves differently.

Does "collections-oriened"

Does "collections-oriened" mean any structure that can be folded over?

The comparison to mulimethods just seems strange so I'm trying to understand the context in which the alternatives you proposed would overlap. And do modular type checked multimethods not address your concerns?

collections oriented

"Collections-oriented" isn't much better defined than "object-oriented". It is ostensively defined through a history of use. So the best I can answer is "more or less". Programming with unordered sets or relations can still be collections oriented without necessarily offering a well-defined 'fold' (because fold assumes a natural total ordering on the collection).

In many cases, generic programming can be understood in terms of intermediate representations, e.g. iterators are an intermediate representation between a collection of data and a processor of data. This results in the normal (NxM)→(N+M) definitions for front-ends to back-ends. As I tend to favor purely functional code, 'iterators' tend to get replaced by Huet zippers, lenses, or streaming. But the idea still holds. Collections-oriented programming styles are amenable to development of reusable intermediate representations and, via this, type-generic algorithms. More so than other kinds of values or objects.

I think collections-oriented and overloading/multimethods tend to achieve the goals of generic programming (efficient, usefully reusable code across types and other variables) in different ways. Sort of like the expression problem, where FP and OOP are both extensible but naturally support different kinds of extension. Either gets the job done, just with different idioms and patterns.

I haven't experienced "modular type checked multimethods", but the answer to your last question is probably: "it depends on the form of modularity". I'd want the ability to control, parameterize, the set of definitions (or typeclasses, etc.) exposed to a subprogram. This would let me unit-test and regression-test subprograms with arbitrary sets, more readily isolate problems when debugging, and precisely control which subprograms are open to extension with new multi-method overloads (and which types are open to extension, etc.). Even better if I could override the compiler's algorithms for deciding how to glue the pieces together.

Context

Actually, in PL design "context overloading" is where a different function is called depending on the type of return value expected by its continuation. Think of Perl for a minute and the way some expressions return arrays in an array context or the number of elements in one in a numeric context.

"Argument overloading" is what you're thinking of, where you call a different function depending on the types of the arguments, and that's the way most C++ descendants do it.

Few languages do both.

I once wrote a C++ template

that overloaded function calls on both return type and argument type.
Seemed cool.

Can you type arbitrary LC terms?

I wouldn't know, but the answer is likely: No. Or, at best, yes you can, but you get a nonsensical type as the answer.

Typing dynamic languages ad-hoc may very well be a pipe dream.

Typing arbitrary LC terms

The answer is yes, and they can have principal typings. You simply have to extend types to include a context, where the undefined types are tied to variable names. For example: \A . (A B) C would have the type {B : a, C : b} (a -> (b -> c)) -> c.

The answer should be NO

Untyped LC doesn't have the strongly normalizing property most typed LCs have. So, this doesn't make sense, since if an untyped term can always be typed, it should be normalizing too.

Link?

Recursive Types

I think when you introduce recursive types μx . T then you no longer have strong normalization. You need such types to type something like \x . x x x which would have a type like {} ((a -> (a -> b) as a) -> b) . We can combine types for recursive terms with free variables: \x . x x y x : {y : a} ((b -> (a -> (b -> c)) as b) -> c) . I believe type inference and type checking are both decidable with such a type system, as (first-order) rational-tree unification is decidable.

Edit: There is a close relationship between these kind of types and parametric types, in effect we have both universal quantification and type parameters. The type above could be written as a parametric type (noting the common use of angle brackets for type parameters): forall b c . <a> (b -> a -> b -> c as b) -> c however whilst this might make the behaviour clearer, it loses the fact that the type parameter a is tied to the free variable y.

Fully type-definite operations, your choice of verbs or nouns.

I think every type needs a "no value" entity unique to that type. You should never have a function that "returns a boolean or throws an exception" because booleans and continuations are different types. You should never have a function that returns "an integer or true" because integers and booleans are different types.

For consistency every type therefore probably needs a specific value like #NaN or whatever standing for "no such value is defined". What do you get when you test an inequality involving a NaN? You get NaB. Because you're bound to return a boolean, damnit, and you've just asked a question which doesn't have a proper boolean answer unless NaB by whatever name is included in your langauge.

If you have constructs whose semantics differ according to type of arguments given or according to type of return value demanded, you're going to have to compile separate versions for every combination and statically figure out how and where to link them all. Stalin took this to the Nth degree.

Finally you have to figure out where your language comes down on the "expression problem." Most dynamic languages go with verb abstraction - thousands of functions and procedures defined on at most a few dozen value types. Most static languages go with noun abstraction - trying to build everything and the kitchen sink into individual values with extensive user-defined types and type-sensitive functions and operators.

Having both is difficult, which is basically the same as saying the expression problem exists.

I think that's just options

#NaN isn't a good example, because despite its name it truly is a numeric object if not a representation of a number (when you model floats as intervals, #NaN is the whole real number line), and the IEEE standard defines #NaN == #NaN (where == is numerical equality) as false. Languages ignore that at their peril.

Providing a #NaB just amounts to always using Option[Boolean] instead of Boolean. Sometimes you want one, sometimes you want the other. The insight of Option objects (which is independent of static vs. dynamic typing) is that you are compelled to unwrap the object before doing anything, which means you need to explicitly handle the Nothing case.

Dynamic languages now commonly do provide extensible value types.

In any case, "language design is library design" applies nowadays in a different sense than Stroustrup probably intended, as designing a new language involves only a few "language" features and a vast number of "library" features. It's to the design of those that my remarks are mostly targeted. In Scheme there are only two places where the language itself interacts with the provided run-time types, both of them pain points: the fact that if treats #f as falsy and everything else as truthy, and the use of (mutable) lists to represent variable-length argument lists. Racket makes them immutable, but at the expense of being Not Quite Scheme.

Here's an idea: reinterpret Boolean, True, and False as syntactic sugar for Option[Unit], Some(Unit), and None respectively, and type the predicate part of if as Option[_] where None is falsy and all the rest are truthy (and anything else is an error). That seems to provide the advantages of "integer or false" return types, where in a conditional context we don't care what the integer is, without dragging in the full complexities of type algebra.

Much of the benefit of

Much of the benefit of dynamic languages is in the experimental/prototyping stage where types "don't get in the way." In that case, cake eating should totally be possible as types are added incrementally.

Without a type system, it is possible, of course, to write highly polymorphic code that defies checking in any reasonable type system. But in any case, such code is possible in a typed language by ignoring the type system and using dynamic casting features.

I think that eliminating

I think that eliminating implicit type casting is the most important single action we can take to make dynamic languages more type-friendly. Designing the language to favor context-free algebraic types is also useful.

This leaves dynamic languages with a lot of freedom for 'dependent' types that we might not easily infer, e.g. a record can be understood as a map of text→value where the value type depends on the text's value. But we can try to incrementally improve the idiom coverage of a dependent type-checker or abstract interpreter.

Doing away with conventional code-hiding modularity would hypothetically be useful. Assuming visible-source dependencies we could support ad-hoc and rich static analysis. We could always annotate against deep analysis explicitly when we want to control it.

Recently converted a dynamically typed language to static...

I went through this experience recently with http://strlen.com/lobster

It was originally a dynamically typed language, but decided I wanted to see if I could make it statically typed, without doing too much damage to the dynamic nature of the large amounts of code already written.

It required an expressive type system (not just type inference, but also flow-sensitive function specialization). It still feels like pretty much the same language. Some details on the type system: https://htmlpreview.github.io/?https://raw.githubusercontent.com/aardappel/lobster/master/lobster/docs/type_checker.html

I was "lucky" that the dynamically typed version didn't have any "crazy" features like being able to dynamically insert new fields into arbitrary objects or generate code on the fly, so there wasn't too much code I had to limit in semantics.

Links

Lobster

Type system

Mooi dat je weer eens wat post, Wouter!

Dank.. het leek relevant :)

Dank.. het leek relevant :) Welke Marco is dit?