Research vs implementation, theory vs practice, and LtU

There's a conflict between doing research and creating usable programming languages, and different people have different interests.
However, some LtU regulars seem to believe some variation of the claim that PL research is somehow pointless. (I won't try to offer an exact summary; it's too hard, so this might well be a strawman.)

Instead of discussing this inside another topic, it seems this deserves its own thread, because the question is interesting. I'm not sure there's anything new to say on the topic; this is an old discussion. And I guess that some of us will still not be personally interested in research. But I'd like not to hear the wrong arguments against research. I don't want to hear that research will never be useful in practice. I'm fine with hearing, say, that if you want something now, theoretical research is often not what you should look into; theoretical researchers figure out what you should use in a few years or decades.

I'll reserve my actual answers for the thread.


I realize that I'm not 100% sure what "theory" means

I personally am interested in all topics that might be summarized as different methods of implementing software. Sometimes "theory" might be something as concrete as a way of programming. For instance "functional logic programming" is a bunch of related capabilities for a programming language. It could include topics like "lazy evaluation", "needed narrowing", "delayed evaluation", "specialized solvers".

There is also theory that means something very dry and far from methods of programming. Clearly talk about foundations of mathematics has absolutely no mapping to programming methods.

Just as bad are programming models that exist only to make proofs. Those are so simple and unexpressive as to be useless to engineers - except where a proof becomes a new algorithm. And then those algorithms should be translated into a notation that an engineer can use. So no engineer wants to be made to write programs in languages that are restricted so as to match a model of the lambda calculus, any more than any engineer wants to program a Turing machine.

And a lot of other talk may be about things that don't create any new methods; they just try to automatically prevent errors. I'm not interested in those - some of them will be of use to industry some day, or at least will be tried, but they sometimes do as much good as restricting programmers to toddler scissors and non-toxic crayons. You can't get hurt, but you can't accomplish much either.

I would love to see more talk about logic programming of various kinds. Of search methods. Of reactive programming. Of embedding tools of all kinds into programming languages.

I don't care quite as much about optimization. When I find a useful algorithm, I often realize that it was in a direction I had avoided because it involved things that sound inefficient. The old law that every problem can be solved by another layer of indirection applies.

Also, new languages out of academia are often exploratory. They deliberately leave useful tools out; they deliberately simplify down to an area of research and force programmers to use new and restricted methods. They turn users into guinea pigs instead of supplying a full set of tools. So logic languages and functional languages leave out very useful imperative dialects - and make themselves hard to use. So experiments in typing leave out less-typed dialects.

Also every subculture of computer language and academic subject makes up ridiculous rules defining "good programming" and "bad programming" based on elevating minor problems or limitations that only show up in some contexts. One culture's idea of "unacceptable" is another culture's idea of "best practice"... We should face the fact that any method, no matter how strange, can become a new paradigm or be the best tool in a given context.

lambda

So no engineer wants to be made to write programs in languages that are restricted so as to match a model of the lambda calculus, any more than any engineer wants to program a Turing machine.

You do realize that the name of this site is "Lambda the Ultimate", don't you?

unlambda

Lambda is often seen as the

Lambda is often seen as the center of PL research, which is ridiculous of course. The OO community really needs to speak up.

It spoke up already

It has spoken up long and loudly and successfully. We will still suffer from the damage through the next few decades.

This is always the thing

This is always the thing, right: the FP community wants to be the victim as well, since who doesn't like an underdog? But the truth is, PL academia has been synonymous with functional programming for the last 20 years, if not longer. Meanwhile, the programming languages that actually get used...for very good reasons mind you...are pooh-poohed as unfortunate accidents foisted on the ignorant masses.

The indoctrination of a new generation of programmers to prefer functional code will not end well: they'll discover that the realities of the problems they want to solve do not match the fantasies of the pure abstractions they were taught to prefer. And they'll be back to objects.

Is there value in FP? Sure. But there are countless ways that computation can occur, and fixating on just one narrow approach to computation is ridiculous.

Interesting that you think

Interesting that you think they'll go "back to objects". OO was introduced as the solution to all of procedural programming's woes, and we then found it too was not a panacea, but there was no exodus back to procedural programming. We stuck with OO and enhanced it with what else was needed (design patterns, type systems, etc.). I'm sure the same will happen here, and some of those functional enhancements will be object-like, but not all.

On going back to procedural

There was in fact an exodus back to procedural programming. Or at least there was for OS implementation, device drivers, cryptography, and a few other fields that need to make programs with very tightly controlled and known behavior.

Engineers do evaluate their tools and reach for whatever they think will make the present job easiest. It's the beginners who just use the one language or the one technique they know. One must be extraordinarily resistant to learning new tools to try to pound in screws with a hammer; often it seems to be a religious matter.

There are jobs for which I prefer functional programming, and jobs for which I prefer procedural. The lower the level of abstraction, or the more important it is at the call site to know exactly what the computer is NOT going to do to accomplish a given subroutine's task (or do implicitly because an operator has been overloaded or something), the better a fit procedural programming is.

There was in fact an exodus

There was in fact an exodus back to procedural programming. Or at least there was for OS implementation, device drivers, cryptography, and a few other fields that need to make programs with very tightly controlled and known behavior.

I don't think there was ever an exodus from procedural programming to OO for operating systems. Certainly there are OSes that expose OO interfaces, and they use internal "abstractions", but they don't make use of OO as found in C++, Java, etc.

All the UNIX variants and UNIX-inspired OSes pre-Mac OS X, Solaris, BSD, all the microkernels (Minix, Mach, L4, L3, QNX, VSTa), Linux, Amoeba, and all the embedded kernels I've ever seen use C, and some embedded kernels use Ada. L4ka uses a C++ subset without objects, since virtual dispatch is unnecessary in a microkernel. The only kernel I'm not very familiar with is the NT kernel, so perhaps it uses OO, and Newton OS and BeOS had large parts in C++, but I would hardly count this as an "exodus" to OO.

Linux has plenty of hand

Linux has plenty of hand-rolled vtables (jump tables) and a heavy emphasis on named entities (handles and such). It is actually fairly OO, just in the way that OO was common in C before it was named and ascribed to limited C++/Java abstractions. One of the reasons OO as a thing sprang up in the first place was as a way to describe the patterns that were seeping into C code anyway (there is a separate lineage in Simula/Smalltalk).

One of the things that impressed me about VisualWorks Smalltalk was that the OS was baked into the class library, and a lot of it resembled what I learned in OS class.

"OO is just a name for..."

Might or might not have been Don Hopkins who dismissed OO as "a name for a bunch of techniques good programmers already use all the time."

Perhaps the OO language fallacy is the implied assumption that everything should be done using those techniques and not others.

Perhaps the OO language

Perhaps the OO language fallacy is the implied assumption that everything should be done using those techniques and not others.

That is a fallacy fallacy. Only novices and a few ideological purists go to the extreme side of either OOP or FP to believe they have only one magical golden hammer that is best at banging any nail.

The languages in use...C#, Go, Scala, even C++ and Java to increasing degrees, are all very pragmatic about it.

fallacy^3 (re Perhaps the OO language)

That is a fallacy fallacy. Only novices and a few ideological purists go to the extreme side of either OOP or FP to believe they have only one magical golden hammer that is best at banging any nail.

The languages in use...C#, Go, Scala, even C++ and Java to increasing degrees, are all very pragmatic about it.

I'm not familiar enough with three of those to be sure, but C++ and Java share the characteristic that they want sprawling, multi-domain libraries to all have homogeneous interfaces based on an ad hoc handful of OO techniques.

"Hand-rolled" OO forsakes that homogeneity for complete flexibility.

I did say

I did say "increasingly"...both Java and C++ are getting some FP religion, if somewhat slowly. Otherwise, why did they bother adding lambdas to both languages?

I think OO haters want to think that OO people think that OO is the only way, so it is easier to hate on them. It is quite simple vilification. Just like I want to believe every FP enthusiast is avoiding state to an extreme where all their tools are command line batch commands (you know, so they are "pure"). The reality is always so much more nuanced.

The default stable state is balance. There is interesting research to be done on objects, and watching that community get taken over by FP people who publicly hate on objects is quite heartbreaking.

Is C++ really OO-only?

You're right about Java.

C++ is different: while you can find evidence for any claim about C++, the STL is focused more on "generic programming" than on OO, at least according to its own authors, and you won't find many virtual methods in there.

Multi-paradigm

IIRC, Stroustrup has claimed, for many years, that C++ is a "multi-paradigm" language. With an emphasis on OO and generic programming, to be sure. But not in any way restricted to a single paradigm.

Linux has plenty of hand

Linux has plenty of hand-rolled vtables (jump tables) and a heavy emphasis on named entities (handles and such). It is actually fairly OO, just in the way that OO was common in C before it was named and ascribed to limited C++/Java abstractions.

I agree modular monolithic kernels are necessarily organized this way, but this is OO in the same way that a car is a storage container: sure they share many of the same properties, but that's not at all what most people mean by storage space/OO, and when you're specifically looking for storage space/OO, it really isn't what you're looking for.

It depends on how you define

It depends on how you define OO. I have my own definition built around naming, encapsulation, and polymorphism...the nominal value (object) and abstractions and thinking that support it (in contrast to the pure value of FP). If most people define OO to mean...well...Java, then ya, you are right. But that is the power of something with a name, you can give it any meaning you want, whereas pure values are defined strictly by their structure.

I've been struggling with

I've been struggling with the question of OO myself lately. A major objective, going back better than 50 years, has been to allow effective language extension that just goes on and on, reaching ever higher levels of expression. Extensible languages were supposed to do that, but relied on macros and found out the hard way that macros are a lousy abstraction device. OO was supposed to do better, and in some ways it does, but it doesn't realize the grand vision of abstraction held by its most enthusiastic advocates, whose most extreme claims have been virtually indistinguishable from those of the extensible-language movement that preceded it. OO misses something primordial that Lisp has. Seems to me the question to ask is, why is OO less like a ball of mud than Lisp is, and what does that mud-like quality buy?

I think the pattern calculus

I think the pattern calculus repackages Lisp's power into a more usable form. OO is inherently limited to extensibility in a single direction.

I would argue that OO isn't

I would argue that OO isn't really about extensibility at all. It allows you to organize computation into interacting objects, where adding a feature is no easier than in procedural or functional programming. If you want extensibility, you really have to do the work and pay the elegance price; abstractions are more about managing complexity and communication.

You might be able to do something nice with virtual classes / family type polymorphism. However...even here I think you have to know a lot about your extensions before you can support them very well. Genericity helps out a lot, but in a way that is not unique to OO.

Disagree...

Objects are elements of mixed variance recursive types, which are (a) at the center of the theory of computability (eg, Scott's D-infinity model of the untyped lambda calculus), and (b) really, really hard to reason about. You can already see the problems arise in a (purely functional!) integer set interface:

interface IntSet {
   emptyset : IntSet;
   member : int -> bool;
   add : int -> IntSet;
   equal : IntSet -> bool
}

It's the last equal method that kills: we are passing an IntSet object as an argument, and so IntSets are defined to be things that process IntSets in a particular way.
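
To make the circularity concrete, here is a rough OCaml transcription with a naive list-backed implementation (the representation and names are invented for illustration; note that typing other at the self type 'a is already a concession, of the kind discussed further down the thread):

(* A naive attempt at the interface above. `equal` receives the other
   set only as an object: all it can do is call its methods, so it can
   check that every element it holds is in `other`, but it has no way
   to enumerate `other`'s elements to check the converse. *)
class int_set (init : int list) = object (self : 'a)
  val elems = init
  method member x = List.mem x elems
  method add x = {< elems = x :: elems >}
  method equal (other : 'a) =
    (* only a one-sided subset check is expressible here *)
    List.for_all (fun x -> other#member x) elems
end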

This kind of circularity is super hard to reason about effectively. The only techniques I know involve powerful rely-guarantee-style methods combined with fancy logical relations to manage it. While it's possible (and necessary) to use similar techniques to prove the correctness of things like nonblocking parallel algorithms, IMO it's awful to be forced to deploy them for something as simple as equality of purely functional integer sets.

Complicated programs are going to be a fight no matter what you do, but we really should be able to prove simple programs correct using simple proof techniques.

I sometimes joke that the key advantage of functional programming is that it lets us avoid higher-order functions. It's perfectly reasonable to write first-order functional programs that do nothing but pattern matching on data. But in imperative programs, every statement is a state transformer, and so every semicolon is an operator on state transformers -- which means that higher-order reasoning becomes unavoidable. OO programs take these difficulties and then square and cube them. :(
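
(A minimal sketch of that semicolon claim, in OCaml with an invented store type: statements denote store transformers, and ";" is then an operator one level up, composing transformers rather than combining values.)

type store = (string * int) list
type stmt = store -> store                      (* a statement's meaning *)

let assign x n : stmt = fun st -> (x, n) :: List.remove_assoc x st
let seq (s1 : stmt) (s2 : stmt) : stmt = fun st -> s2 (s1 st)   (* the ";" *)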

In practice no programmer has trouble reasoning about ==

If your theory makes it seem hard, then your theory is wrong.

Let me add that polymorphic functions like == have a meaning that's "context dependent", and as such they have a higher information density in the source code.

They make source more powerful, more abstract. They are definitely useful.

It worries me, not that we have this very limited form of context dependence, but that we don't have less limited forms of it!

I do think we somehow need

I do think we somehow need to improve our understanding of imperative programming. That said, though, there's something circular about justifying FP on the grounds that reasoning about imperative programs involves ubiquitous state transformers, in that the decision to describe it with state transformers is essentially a commitment to the FP way of thinking about it. If you commit to thinking about things in an FP way, then it's not surprising FP becomes the simplest way to do so.

In practice

What practice are you talking about? Languages like C++ or Java don't actually use the pure object encoding that Neel is assuming here.

polymorphic functions like ==

I'm pretty sure you didn't actually understand Neel's point. Look at the object interface specification Neel provided and explain when it's safe to call equal on the rhs argument inside the implementation of your object's equal method and what you're allowed to pass to that (possibly recursive) equal call.

Perhaps to do with the

Perhaps to do with the difference between human beings reasoning about programs and computers reasoning about programs?

I think it's that humans

I think it's that humans would naturally avoid the difficult-to-reason about cases. They would restrict to the case where equal doesn't recurse on equal. (And note: this probably isn't the best example interface to study, because it seems clear there is no reasonable implementation of 'equal' if you can only use 'member' to probe the other set.)

homo sap

I think it's that humans would naturally avoid the difficult-to-reason about cases.

There's a deep insight in there somewhere. Because of course folks on both sides of the imperative/FP question naturally avoid the difficult-to-reason-about cases, except that they use different ways to reason and therefore avoid different cases.

Yeah that definition of == is silly.

If there is any possibility that you call == on the wrong type, then you define == differently.

It's not that it's hard to analyse; it's that if there's any ambiguity then you would have defined == as
any -> bool or some set of types -> bool

Lua, for instance, makes == a bit less useful (but easier to reason about) in that it's automatically defined as: == is false if the types don't match, and a user-defined == will only be called if the types DO match. Lua's == has a hidden, unchangeable layer on it.

So, yes, I guess I was avoiding your actual point which was not that == is polymorphic, but that the type of its second variable, in your example, was badly chosen.

So you don't mind polymorphic functions in general then?

After two tries

I think you're right. Having pulled two different aspects out of his = relation and looking deeper at what he wrote, I see that I still have no idea what he thinks is hard about defining equality on sets.

And yes, he's wrong, there's nothing hard about it.

I could pull a third aspect out (that you don't allow parallel changes while testing), but I'd still be wrong. Does this have something to do with avoiding internal representations for data that rely on non-mutable members? Are we to the level of fetish where (for instance) hash tables are considered bad?

If a test for equality that changes no state is seen as hard by functional programming advocates, then they're doing it wrong.

There is a case that shared state is a danger point. If it's between different threads, then it's an emergency; it's so hard to analyze that it needs locks or phases or something. If it's within one thread then it needs some vigilance. If you CAN convert the code to a functional style, then you don't have a problem, which doesn't mean that you SHOULD convert it to a functional style. That's just a mental check, since the functional style is often a LOT longer and harder to read.

Just because there are things that a programmer should keep his eye on does not mean that those things are bad.

And yes, he's wrong, there's

And yes, he's wrong, there's nothing hard about it.

I'm afraid it is. It's a well-known fact in type theory that objects can't easily do binary methods, of which the sample being discussed is an example. Most typed OO languages these days don't use proper objects for this very reason, i.e. classes are a hybrid of abstract data types and objects, which makes binary methods tractable. See also Will Cook's papers covering ADTs vs. OOP.

I'm not too familiar with

I'm not too familiar with Lua, but I don't think it's an OOP language. I think I've seen you mention Smalltalk, though, which is pure OOP. His comment is addressing pure OOP, so think Smalltalk. You might check out William Cook's paper on objects vs. ADTs (see LtU discussion here) to brush up on the terminology.

So, yes, I guess I was avoiding your actual point which was not that == is polymorphic, but that the type of its second variable, in your example, was badly chosen.

The idea of objects is that all accesses to the object are through the interface. Many languages don't stick to this pure OOP approach and cheat, by letting you peek through to the internals of another object if it's in the same class, but that's not pure OOP. And this cheating loses you some of the advantages of OOP, because now different classes implementing the same interface aren't compatible.

So the problem isn't that the interface he's specified is wrong (it probably is -- try to implement equal -- but it could be fixed without changing the point), but that binary methods in pure OOP are tricky. I personally think binary methods in object interfaces are a bad idea.

I'm no longer a fan of data hiding anyway

I'd like to propose that addressing internal representations be textually offset, so that you can SEE which parts of your program are implementation dependent, but that the compiler not enforce anything.

For instance, in my own programming I've made up a standard where no methods are declared private, but every function that is not part of a public interface is prefixed with an underscore.

Data hiding

I'd like to be able to break into encapsulated data if I need to - particularly for print-debugging style code. But I also want it to be easy to audit such violations, forbid them on build machines, etc.

Lua is an ad-hoc mixture

Lua uses open access tables in the place of objects and prototypes, and methods are not overloaded on type... except for a few special ones like ==

It's sadly ad-hoc, but it was designed by a professor of computer science, and so none of it is very bad in practice. It's a high-wire act of seemingly bad ideas that actually work reasonably well together.

State Change Denial

If your theory makes it seem hard, then your theory is wrong.

This "argument" keeps popping up. That doesn't make it right. It's wishful thinking at best, but more often it reminds me of climate change denial. If you actually took the effort to understand a little bit of "the theory" (*), instead of just dismissing it, then you'd know that there is overwhelming indication that the complexity is inherent, not incidental.

In practice, no programmer has a clue how to reason about these things properly. Popular belief to the contrary (especially among programmers themselves) notwithstanding.

(*) In fact, there is no "the", the problem has been attacked from many angles by many people with many methods over many decades.

the = he used as an example was a test of equality, not a state change

Not sure why equality is a problem. I've never seen anyone have a problem with equality tests.

He DID define it as requiring the second type to match the first, so it's overloaded, and if the types can't be decided at compile time you have a runtime error.

Once again, if that's a problem then he should have defined the type signature differently.

Fair enough, I only picked the title

Fair enough, I only picked the title for the pun. The content of the post applies equally to stateful programming and object-oriented programming, and even more so to the combination thereof.

Ok... Finally processing what you're saying.

But in imperative programs, every statement is a state transformer, and so every semicolon is an operator on state transformers -- which means that higher-order reasoning becomes unavoidable. OO programs take these difficulties and then square and cube them. :(

If you put the word "potentially" in front of each of those phrases they'd stop being wrong.

Change it to: "in imperative programs, every statement could be a state transformer, and so every semicolon could be an operator on state transformers -- which means that higher-order reasoning occasionally becomes unavoidable."

OO, most of the time, just gives programmers objects to organize their APIs around. Data is grouped, and functions are grouped in a class, but there's little there that even needs vtables or messages.

Some of the time it sticks in implicit lookups. Nothing that you couldn't do in an imperative language anyway.

1) You can write in a functional style in an imperative language, or an OO language. If you do that, your objections drop out.

2) You can often verify that large parts of your program COULD be written in a functional style without changing the logic. In that case, you know that your program doesn't have state sharing problems but it can still be more compact and easier to read by NOT doing the transformation. The principle can be followed implicitly.

You can recognize that a run of statements doesn't cause any state changes. You can recognize that the final result of a procedure is safe even if some of its steps weren't purely functional.

Things are easier to analyze than you make it seem.

And your example is weird - you have to define equality in a functional language the same as in an OO one. That's not an OO problem. "passing an intset" to an intset is no different from "passing two intsets to a relation"

And this is a problem normal

And this is a problem normal programmers just don't have. They see an interface that takes itself...so what? It is hard...why? This is why theorists will continue to be ignored: they preach the evilness of self type recursion while the real world keeps using it to no ill effect.

If you give a real person the interface above, they can make sense of it much more easily than of a pure one that directly involves higher-order reasoning. There is something in our head that allows us to vaguely associate names with complex behavior and then just move on with our lives; we don't ask why, we don't ask for proof, it just is because it is said so. This in turn allows us to build big creaking systems that would be impossible if we wanted to know that everything was perfect.

Pure FP tries to correct this mistake by making it all explicit; it will throw that higher-orderness in your face because...truth! And then you'll never be able to build that big system, because you'll get stuck on that recursive equality method you need. 10 years later you'll submit a paper to ICFP about the real elegance of your pure solution, while in the meantime your competitor has already built a space colony on Pluto.

Hint

Hint: ADTs. Nothing higher-order about them, nothing complicated about them, nothing specifically functional about them, nothing much that we haven't known how to do with them since 1973, give or take.
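
As a sketch of that hint (OCaml, with an invented sorted-list representation): the single implementation sees the representation of both arguments, so equal is first-order and straightforward.

module IntSet : sig
  type t
  val empty : t
  val member : int -> t -> bool
  val add : int -> t -> t
  val equal : t -> t -> bool
end = struct
  (* representation: sorted, duplicate-free list, so it is canonical
     and structural comparison decides set equality *)
  type t = int list
  let empty = []
  let member x s = List.mem x s
  let rec add x = function
    | [] -> [x]
    | y :: _ as s when x < y -> x :: s
    | y :: _ as s when x = y -> s
    | y :: ys -> y :: add x ys
  let equal (a : t) (b : t) = a = b
end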

Yep, and binary methods in

Yep, and binary methods in OO languages have been settled since the '90s.

different types awkwardness

When two binary operator args have different types, the way this is handled is only partly settled, because it seems to require something sketchy happen (like peeking at the "actual type" of an arg to decide what to do). Double dispatch is not very extensible when it's basically a case statement, while generic dispatch isn't supported in some languages. To fully respect another argument as an ADT seems to require an awkward after-you-Alphonse dance, unless you itemize the grounding combinations explicitly.

(Ob disclosure: I have never been an academic, and almost always approach things from a practical rather than theoretic stance. If I sound theory-oriented now and then, it's induced by communicating -- whenever I explain an intuition from synthetic systems reasoning, it sounds analytical instead when that's not how I got there.)

I've recently implemented

I've recently implemented this in Type-less: the trick is that when you use a self type in an argument position, you basically give up subtyping for that use. The exact types will be known statically or parameterized with an open type variable, which is the same as with any type parameter used in the write position! Now, that is kind of a bummer, but no worse than in a functional language that doesn't do subtyping at all.

So we've actually known that for a while; why language designers don't apply it to production designs is probably a combination of convenience and the fact that no one outside of academia cares about the problem very much.

ADT specified dispatch

IMO, binary methods should always belong to an ADT. Objects are a pattern (~existential types) that's useful for letting values come equipped with their own interpretation of a set of operations, but it's a pattern that doesn't make any sense for binary methods. Any particular rule (this rule or any complicated multi-method dispatch rule) isn't going to do the right thing in general. The dispatch rule needs to be specified by an ADT defining the object abstraction. That's the only approach I see that makes sense in the general case.

I don't disagree. But when

I don't disagree. But when we are talking about binary methods, we don't necessarily mean the pattern matching that is typically associated with ADTs. For an equality method, add method, compare method, the types just need to be the same. Dispatch isn't even very complicated (definitely, multi dispatch isn't needed!). ADTs for more complex things lie between objects and simple values. I'm not sure where to put them in Type-less yet, if I want them at all.

Objects are great for identity and what goes along with that (encapsulation, state, virtual dispatch). There are cases where it is hard to tell if you want objects or values; e.g. the expression problem can be solved with virtual classes and virtual dispatch. I see meta expressions as objects, not values, especially since virtual dispatch is just so gosh darn useful when writing a type checker or code generator. There is some double dispatch going on, but nothing that is particularly ugly. Of course, I've written numerous type checkers and code generators over the years, so it is all simple and easy to me :)

Others might see huge case statements as a better solution, and they see them as ADT values. It's personally not my thing, but there is nothing really wrong with a 500+ line method in Typers.scala; well...at least all of it is in the same place!

Same type

But what if you want to allow comparisons or additions of values that don't have the same type? Sometimes it's possible to define how to combine two objects in terms of the object interface. In that case, you don't want to require that the objects have the same class (which I assume is what you mean by type here). The terminology starts getting confusing in a general setting when people mean different things by "type".

(Also, by ADT here, I meant abstract data type, not algebraic data type. So I didn't intend an implication of pattern matching as the dispatch mechanism. Also, I personally don't bundle identity and state in with objects. Just the existential/record business.)

I intend to investigate your newest publication in more detail soon.

We are totally on the margin

We are totally on the margin here. Sorry, I got confused with the other ADT :)

Comparing values of different types honestly doesn't come up very often. In one case, we have identity comparisons, but that is a quite simple pointer check. In another, we might be sorting paths, but if they aren't the same type, they have parents that are, and the check will never get to the differently-typed elements. That's all that I can think of in my own code. I can see how it might happen in theory, but then you use a raw type parameter, fixed types, and dynamic checks if the need arises. I don't see how Haskell or O'Caml does any better here; maybe there is a concrete example to talk about?

I use both objects and values in my code; C# supports both, though values can be a bit annoying (must write my own hash and equality methods for some reason). My criteria for whether something should be an object or a value revolve around whether it has identity or not: values lack it while objects get it for free. It is a deeper philosophical point that should probably be argued more when I write something more polished about it. Also note that in my recent work, APX, over a time-based computation, object identities change much less frequently than values, and value types change rarely if at all; you can bank on that in your optimization strategies (e.g. the slow/fast-path optimization I discuss in the essay).

Supertypes can be incidental & matching in wrong case a disaster

I'm not sure what level of control is appropriate.

I found C++'s conservative answer (refuse to compile if there's any ambiguity) a big thorn in my side when trying to create types that could be used algebraically.

I also found automatic conversions, say by constructor, a big pain in the side - I'm glad they added the "explicit" keyword.

Lately I've been using non-statically typed languages, and not worrying about multi-dispatch in them, though I know it's POSSIBLE.

...

1) What do you think about erasing the connection between subtyping and matching? What if variables only match on interface or precise type?

2) In a non-static language, is it good to have a "send-super" that disguises self as its supertype until the outer return, so that parent class methods run as if the object type was the supertype?
OR is the only answer the C++ one:
In C++ the solution is to have statically dispatched calls on the (implicit) "this" in the parent, so that the actual type doesn't count, and to call up the chain with static calls as well.

If a normal programmer...

If a normal programmer writes an equals method, we can basically take for granted that it's broken. For example, see the discussion in Bloch's Effective Java on how to write an equals method -- it's *twelve pages long*, and doesn't even start to discuss how inheritance and equality should interact.

The real solution is simply to not do that unless we have to. Occasionally we need the full power of processes or objects or whatever, but it's better to use simple methods whenever they are available, and reserve the big hammers for the big problems. The payoff is that it becomes possible to build big systems that don't creak.

For example, the CompCert and seL4 projects show that it's possible to build production compilers and operating systems that don't have any bugs at all. IMO, it's much more interesting to try and push full correctness to reactive systems than to build yet another broken, semi-working system. As an illustration of this point, with my collaborators I have a new draft on the type-theoretic basis for event-based programming. The implementation uses higher-order state in an essential way (with the expected complexity in proving the correctness of the implementation), but the interface is both pure and permits easily building abstractions over events.

"equals" is too important to new types to make impossible

If "equals" really needs subtle semantics (which I'm not sure I believe - if it does maybe the environment is demanding too much of it), then the ability to satisfy those semantics effortlessly is an important feature.

Object equality is trivial,

Object equality is trivial, it is just an identity check! Value equality should be trivial, but OO languages make it hard (even C# requires custom equals methods for structs, and IEquatable if you want to avoid boxing!). Then there are those that think custom equality methods are common and should also be easily supported, which is how we got into this mess for functionality that almost no one uses and is quite hypothetical.
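
(In OCaml terms, the distinction being drawn is just the two built-in comparisons; a hypothetical two-reference example:)

let r1 = ref 0
let r2 = ref 0
let _ = r1 == r2   (* identity check: false, two distinct objects *)
let _ = r1 = r2    (* value check: true, same contents right now *)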

Equality as an overloaded function

In C++ equality doesn't have a default.
This leaves equality open to being just an overloaded function instead of a method.

The only annoyance there is that it doesn't know that == represents something symmetric:
bool operator==(const TypeA &, const TypeB &);
has to be a different function than
bool operator==(const TypeB &, const TypeA &);

Also, the difference between AType, AType *, AType &, and AType const & is a pain in the butt.

The problem that HE'D complain about would be automatic casting from
AType const & to ATypesSuperclass const &
or
AType const * to ATypesSuperclass const *,
which would both be deadly.
You can avoid the autocasts by getting rid of the "const", which looks wrong because the arguments ARE NOT going to be changed by the function. It's just exploiting the quirk that if the compiler doesn't KNOW that you won't change the object, then it DOES know that it isn't safe to cast/match to the superclass.

I think, because of this, Visual C++ complains that const references are deprecated.

After using duck-typed languages, inheritance seems like nothing more than a convenience. But duck-typed languages don't usually have overloading on type.

I'm not so sure.

Object equality is trivial, it is just an identity check!

That only holds for certain cases, I would argue. What happens in a distributed system, for example? Object with id "x" on processor y should be equal to object with id "x" on processor z, surely?

Different Objects.

Generally, if you change the state of object X and the state of object Y does not change, they are different objects, multiple processors or not. Equality is about identity with objects. Copies of an object are not the same object, unlike values, where they are.

Not a state change

Having references in two places doesn't imply a state change. For example, with a copying GC, one might, in one process, hold two separate references to a single (potentially immutable) object which are not the same - one might be to the original object, another might be to a forwarding pointer. Suddenly, what was a simple pointer equality test becomes something with significantly more overhead. It's still an identity test, but it's no longer a trivial identity test.
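
A sketch of just this forwarding case (OCaml, names invented): the test is still an identity test, but it has to chase indirections before the pointer comparison.

type 'a handle =
  | Direct of 'a ref             (* the object itself *)
  | Forward of 'a handle ref     (* left behind by a copying collector *)

let rec resolve h = match h with
  | Direct r -> r
  | Forward f -> resolve !f

(* physical equality on the resolved targets *)
let same a b = resolve a == resolve b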

If you allow mutation of object state, things get far more complex again.

In that case, you are still

In that case, you are still doing an identity check, just a custom non-pointer-based one! And it is actually quite easy to implement that, though a bit of argument dispatch might be required if the language doesn't support remote identity directly. In general, object equality via identity does not really need to be a binary method, while value equality really must be (since structure is checked deeply, and you often don't compare against "any"). But we typically treat them through the same interface, which frankly might not be for the best (or at least, have binary equality and non-binary equality that includes a dynamic type check before calling binary equality).

Object equality is trivial,

Object equality is trivial, it is just an identity check!

Until you introduce transparent proxies, which any true OO language should support.

Still identity check, just

Still an identity check, just one that is more complex than a pointer comparison. It makes sense for the language to manage identity carefully if possible, e.g. by including direct transparent proxy support.

My point was that it was no

My point was that it was no longer trivial, not that it wasn't an identity check. Now that you're supporting transparent proxies, consider proxies that forward messages across process boundaries.

One efficient mechanism is to have transparent proxies encapsulate something like a certificate chain, listing all intermediate proxies and the final proxied object. Of course, equality of this sort breaks entirely when messages are routed through proxies outside of the language itself, where the runtime cannot maintain a proper certificate chain.

So a fully general equality must be a sort of double dispatch method that starts with one reference, traversing the proxy graph to the root carrying the other reference to compare. Once at the root, it must then perform the exact same process with the other reference, passing in the root. This is unbelievably expensive when proxies cross process boundaries.

I think this demonstrates how proper general equality is highly non-trivial beyond toy examples. It's arguable whether object identity should even be a primitive. Most capability systems restrict identity checks via a distinct capability for this reason.

Transparent Proxies

I would not allow access to two transparent proxies for the same object; they would be limited to one per node. Therefore any code doing an equality test will have the same transparent proxy on both sides of the equality if the object is the same.

Given object O in B, B

Given object O in process B, B proxies it to C, C proxies it to D, B proxies it to D. Now D has proxies D->C->B and D->B. How would you collapse this as you describe?

Forward the original link

I guess a lot depends on implementation issues. When 'C' forwards the proxy of 'B' to 'D', it could connect the new proxy directly to the source, so you have D -> B for both of them.

You could just have a rule:

(==) :: Proxy a -> Proxy b -> Boolean
x == y = (original x) == (original y)

But I admit this all sounds like implementation details, and does indeed make equality more complicated.

You could do that, but then

You could do that, but then you lose the full interposition powers of proxies. For instance, capability languages use proxies to revoke authority, ie. the proxy has a local bool that can be toggled as to whether messages are forwarded or a revocation error is thrown. Shortening the delegation chain the way you suggest breaks this property, because C must be able to intermediate (or the intermediation code itself must migrate from C to D, but how can D trust the code, and how can C trust D to honour the code?).

In short, I think "equality is trivial for trivial languages" accurately sums up the situation. Consequently, it may very well be that we want a trivial language just so we can keep equality trivial.

Should a proxy that can be

Should a proxy that can be revoked even be equal to the original, or equal to another proxy that could be revoked at a different instant? I'd say no.

I'd say yes, since identity

I'd say yes, since identity indicates aliasing, even if operations can fail through different proxies.

In my opinion a more

In my opinion a more sensible choice is that equality indicates observational equality. This would mean that different proxies to the same object are not equal.

Obama is still POTUS to his

Obama is still POTUS to his wife, to the American public, to Congress, to the Chinese president, but they all have different capabilities on him. He doesn't become a different person for whomever he is interacting with.

Bogus analogy

Your analogy is not even wrong.

1. A proxy may have different capabilities even when interacting with the same party.

2. Physical life doesn't necessarily make good programming abstractions (the OO fallacy).

Because proxies are crude

Because proxies are a crude mechanism for implementing party-oriented access? Why are you using proxies anyways?

Objects are linguistically inspired and therefore should follow the naturalism path as much as possible. Just like mathy functional abstractions should follow the math path.

"As much as possible"?

That is such a ridiculous notion that I don't even... So surely it would be "possible" for programming objects to emulate "natural" mass. Should they? Or are they more like subatomic particles that also have charge? But then, in quantum physics, particles don't necessarily have "identities" either. Which random pick of "naturalism" do you prefer? And if a proxy is like Obama, does it have to die?

Come on, wake up. Programming has nothing to do with the physical world. Modelling it on that is bonkers. Linguistic analogies are fine, but come after, not before, semantics.

Btw, I don't actually disagree that proxies are a crude mechanism -- as are objects in general ;). But either way, crude or not is beside the point of the discussion.

Whether you like it or not,

Whether you like it or not, OOP comes from the point of view that programming should be as close to human communication as possible. That is why you have the subject of an action, the N does V orientation of the language. That Chomsky-style linguistic acquisition device in your head took millions of years to evolve, and we should use it! OO is language inspired, so why make decisions that go against that?

Functional programming is dominated by math and formal logic; it is not meant to use our language facilities, which is why Haskell code has f, x, and y that only a mathematician could love. Math (and science) is also a relatively recent innovation compared to language; it all requires learning, and we don't have direct support for it.

Although I'd been told that

Although I'd been told that OOP started with a simulation language (SIMULA), I didn't really appreciate the significance of the point until as an undergrad (so, this would be in the mid-1980s) I took a course on simulation that used GASP IV (a simulation package using FORTRAN IV). Which drove home that object-oriented is a really excellent way to organize a simulation. That's how I learned to think of OOP as an example of the general phenomenon that somebody finds a good way to organize one particular kind of program and then tries to apply that organization to everything.

I don't think you'll get

I don't think you'll get many arguments that OO isn't great for simulation...Sketchpad and Simula started that way, anyways. But there is more controversy about whether that carries into more abstract programming tasks, which I think it does (those that involve state and behavior on state), while functional programming becomes much better for very abstract stateless programming tasks.

But this an age old PL debate.

Objects are bad for simulations

Objects encapsulate state, which is fundamentally a way of turning state into logical concurrency (recall Kay's remark that objects are small computers). By turning state into concurrency, you make programs robust to change by writing code which works under many scheduling policies.

However, this is bad for simulations, precisely because implementing simulations efficiently requires making the state explicit. You need a global view of the state to support operations like checkpointing the simulation, both for backup, and to support speculation/retry operations, as well as modifying the local step and grain sizes of the state updates based on global parameters.

For example, most modern games use an entity/component system rather than a SIMULA-style OO architecture, because ECSs essentially make the game state completely explicit in a database (which also makes it easier for implementors to modify data layouts for added efficiency).

You can totally encapsulate

Ah, this is my own research topic!

You can totally encapsulate the state of an object into a global array; almost every physics engine works like this! The interface is basically OO, the ID is basically an index into a simulation array (if on a GPU, this is definitely a texture for SIMT update; otherwise it could just be a global list of objects to be walked), and the simulation then applies time steps to these arrays, which are observable to the encoded behavior through the object interface. Glitch/the virtual time machine take this pattern to an extreme (checkpointing, rollback, etc...). Object abstractions do not work against this at all.
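
A sketch of that pattern, with invented names (it also illustrates the ECS point above): component state lives in explicit global arrays, so checkpointing is just copying them, while the object facade's only state is an index.

type entity = int

(* the explicit simulation state *)
let positions  : (float * float) array = Array.make 1024 (0.0, 0.0)
let velocities : (float * float) array = Array.make 1024 (0.0, 0.0)

(* time steps walk the arrays directly *)
let step dt =
  Array.iteri
    (fun e (x, y) ->
       let (vx, vy) = velocities.(e) in
       positions.(e) <- (x +. (dt *. vx), y +. (dt *. vy)))
    positions

(* checkpointing/rollback is copying the explicit state *)
let checkpoint () = (Array.copy positions, Array.copy velocities)

(* the OO interface on top: the object's only state is its index *)
class body (id : entity) = object
  method position = positions.(id)
  method set_velocity v = velocities.(id) <- v
end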

Entity component systems are just object systems for people who need objects but are uncomfortable calling them that. They even use the term object everywhere in the documentation, and don't really try to hide it ("Clojure's object system"...). And whatever you think about ECS, you could never call it functional.

Also, most modern games don't use ECS, but almost all of them do use some form of custom object system (with a bunch of patterns that include those used by ECS, but have been used in physics engines for a long time). This says nothing about how bad OOP is, just how Java/C++ objects alone are not good enough, and I totally agree with that!

Clojure's object system *is the host's*

Clojure goes out of its way not to hide the JVM's (or JavaScript's) object system. Reusing the host object system without abstracting it away is an explicitly stated design goal. Your remark seems confused about this.

We were talking about ECS

We were talking about ECS and Clojure's preferred object system, not Clojure's support for the legacy Java object system.

Clojure definitely supports the Java object system, but all the documentation points to a more preferred object system that isn't that, nor is it natively supported by the JVM (which isn't very unusual; see Box2D, heck even EJB). That is the interesting one for the purposes of our discussion; POJO is not.

Preferred?

I don't think ECS is preferred by any means. At most one or two people experimented with it, wrote some blog posts, and moved on to other techniques.

I'm not sure what documentation or "preferred object system" you're speaking of.

http://clojure.org/state

Still, totally unrelated to ECS

Re-read that page. Clojure's "object system" is the host's, which it encourages using by bifurcating your usage of objects into two classes: "identities" & "values". Both are objects, although "identities" are mutable and implement equality in terms of object identity, while "values" are immutable and implement structural equality. Different classes of identities provide differing mutation policies (sync vs async, coordinated vs uncoordinated, thread-local, etc) and different classes of values provide different data organizations (ordered, sorted, unique, associative, etc). It explicitly is *objects*; it's just not a style of programming *oriented* by those objects, but rather oriented by this particular classification of objects.

I did say ECS and clojure's

I did say ECS and Clojure's preferred object system; I did not equate the two (or maybe by accident?). They are not equivalent, just examples of different ways of doing objects, all of which involve and even revolve around identity to some degree. I'm not sure we are disagreeing here; well, maybe in what "oriented" means.

The linked article does make some digs at OOP by focusing on POJO flaws. Also, there is a good deal of linkage to FP that doesn't make sense, and is anyways unsubstantiated and not backed up in practice (processing identity-heavy data with map/reduce, for example).

But I really do think Clojure is on the right track with its philosophy.

In my opinion an even more

In my opinion an even more sensible choice is to not bake a notion of identity into objects.

Then what are objects? Take

Then what are objects? Take away their identity, and you also take away their state and behavior; they become essentially just functional values.

Objects can have identity

Objects can have identity without requiring it be directly observable. The various capability operating systems are good demonstrations.

That isn't the same as not

That isn't the same as not baking identity into objects, but just that equality might be optional.

Well, my personal preference

Well, my personal preference is that identity not be baked into objects at all and that state be separated from first-class context of objects. But in the above, I just meant what naasking assumed, that you shouldn't be able to compare objects for identity.

If you extract state from an

If you extract state from an object and use identity to access that state, you still have an object with state that is just encapsulated differently. If the object lacks even an identity field, then there is no way to associate that external state with the object.

I didn't follow this. I'm

I didn't follow this. I'm imagining 'identity' to be an element of some abstract set that supports comparison to other elements of the set. You could associate state to such an identity, but you don't need an identity to have state.

Value with identity = object

Identity's number one feature is that it is necessary for an object to have state: you need it! If a value has an identity, then this identity can be used as a key in a map to get and set things in that map; if a value has state, then it can either be aliased (and it has an identity!) or copying it is disallowed (it is neither a value nor an object). Object-ness is then wrapped up in identity vs. the anonymous values of equational reasoning.

And by state, I mean the mutable kind, of course.
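
(The map mechanism just mentioned, as a tiny OCaml sketch with invented names: an identity is enough to hang external mutable state off an otherwise plain value.)

module IdMap = Map.Make (Int)

let hotness : float IdMap.t ref = ref IdMap.empty
let set_hot id temp = hotness := IdMap.add id temp !hotness
let get_hot id = IdMap.find_opt id !hotness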

I agree with Rys'

I agree with Rys' assessment. The way I see state or mutable object interaction is as a (read/write) channel. While you'll need to have some way to name this channel, I wouldn't consider that an identity any more than I consider a variable name an identity for the value it holds.

But that just isn't state;

But that just isn't state; so our disagreement is in the nature of what state is. A channel is communication, not state.

Or let me put it this way: what is the benefit to be gained from understanding communication/observation and state to be the same thing? And those channels, if I never use them and stick to shared memory, am I just not doing state?

I'm confused. An object can

I'm confused. An object can maintain state internally but only updates it in response to communication. Viewed from the outside, an object isn't state. It's a target for communication.

All definitions for state

All definitions for state seem to be of the form "a particular configuration for a system at some point in time." An object has state, it isn't state. The way the object behaves depends on (and can update) its state, no communication is necessary for an object state update, unless you count a clock tick as communication! Time is of course necessary, just by the definition of state itself.

The point is that code using

The point is that code using an object doesn't need to refer to the identity of that object. All it needs is a channel of communication to the object. When a procedure is "passed" an object, that can just refer to there being a channel of communication open to an object. It doesn't require passing a "reference". This might seem like pedantry, but OO languages typically get things wrong for reasons stemming from these basics of the model. They let you allocate new objects in the ether and don't provide any way to manipulate a set of objects as a single value (cloning an entire object subgraph, for example).

I see what you are getting

I see what you are getting at. At some point, the channel must connect to the object, it must be grounded in an identity, otherwise no state will be accessed or observed! Identities are infectious: if a channel sinks at an object with an identity at one end, it can't really be a value; it has an identity even if that identity is just formed from the object it is connected to! Anyways, this is an interesting point: values shouldn't really refer to objects; I mean they can, but they become indirectly stateful at that point.

Meta-identity

At some point, the channel must connect to the object, it must be grounded in an identity

I agree there is a notion of identity in that sense, but I think it should be kept local and not implicitly reified. In the terminology I use, an object is a proper value, but only because I don't consider the object to contain its context. Example: suppose we have an object that depends on the time and temperature. The fact that this object depends on those two other objects being in its context would be part of its meta-level description. We could capture such an object as a first-class value, but to use it properly we have to install it in a context where time and temperature are part of the context. The agreement required between object and context serves as a sort of identity, in your sense, for 'time' and 'temperature'. But I wouldn't want that meta-level notion of identity leaking into the value layer without explicit instruction.

Well, the object is its

Well, the object is its identity. Its state is not a part of it per se, but embedded in the system the object exists in. How hot is X? The temperature is not a part of X, but we can measure the hotness of X in its environment. Likewise, position and velocity are properties of an object in a physical system; they are not embedded in the object. That is why I'm claiming that identity is the key of object-ness. You can't talk about the state of something without a way of naming it; otherwise you just have values coming in over the wire with no context.

Even Java has many object systems where identity is managed outside of a pointer value. I guess it's just too hard to come up with a perfect all-purpose object system.

I should have explained that

I should have explained that what I'm calling an object value corresponds to the state of an object at a particular moment, including its behavior plan that specifies how it will interact with an arbitrary context. It doesn't have an identity. In order to use the object, you have to install it in some context, giving it an identity in your sense. This seems to just be a difference in terminology.

Possibly. I'm fascinated

Possibly. I'm fascinated with objects at a more philosophical level, and I think the term "object" is often dragged through the mud to mean a POJO. But there is a lot of diversity in object systems out there, and in ways to manage state at a higher level (still, manipulating state through objects/values with identities seems to work best, the alternatives being actor-style anonymous communication and whatever they do in Haskell).

I can think of cases

I can think of cases where absolute object identity is undesirable. Such a system might work on anonymised data having no state which, taken together or separately, could be considered to constitute an "identity". Indeed, the only "identity" one might be able to assign is "it came from record x of dataset y", or "it (currently) lives at this address in memory".

All the notion of identity provides is the ability to search a collection by "identity" (as for any other item or revealed combination of object state), and, in certain cases, to decide if two objects are, in some way, "the same".

This latter concept of "sameness" is not in any way general - it's domain dependent. For example, in a banking system, 2 objects referring to my bank account are certainly "the same", even if one has internal state which is different to the other.

Sure, in many, if not most, systems, object identity is an important concept. But it's no different to any other piece of state, handling it is domain specific, and it's not a good idea to hard-wire any one specific way of handling identity into a language.

It is simply hard for

It is simply hard for something to have state without identity. Or perhaps it is better to say that no one could easily observe that thing's state!

I was going to say impossible, but Yampa and Antony Courtney's thesis go pretty far here. I guess at that point, we are arguing about what "something" is.

Surely "something" in this

Surely "something" in this context is "an object". And an object, at the lowest level as I understand it, is some way of treating a set of related pieces of data, together, in a coherent way.

Now, (and again, as I see it) the choice of which pieces of data to deal with is entirely domain-dependent. So while the physical thing we're modelling may (and almost certainly does) have an identity, that may not be relevant to our domain. It's probably easiest to visualise with deliberately anonymised data - My browsing history does have an identity - it's mine - but once anonymised it simply becomes "a sequence of websites accessed by one browser over a period of time". The only identity this can have must, by definition, be synthetic and meaningless - all it can do is disambiguate what was once my browsing history from what was once yours. But even that is entirely unnecessary, because the problem domain doesn't require you to ever do that.

I don't see the concept of "identity" as being absolutely necessary. Where it's needed, it's simply a piece (or set of pieces) of state that can be used to query and disambiguate objects. An object may even have multiple identities.

The concept of identity is easy to latch on, though - we're used to things having identities, or being owned by things that have identities, but those identities are arbitrary, even in real life.

Object Semantics.

The notion of an object comes from the real world. I put a thing in a box. Both the thing and the box are objects. If I look in the same box, I expect to see the thing; if I look in another box, I don't expect to see the thing.

A reference to an object is like having a lot of boxes all numbered, and you can say, put this in box 23, get whatever is in box 17 etc. Effectively the box (a concrete object) cannot be in two places at once, but its name is a value and can be.

If you clone a box (and its contents) and remove the contents from one, it does not affect the contents of its twin.

There are probably other "rules" that I have missed, but I would say anything that breaks these semantics is not an object. It is important to distinguish between names, which have value semantics, and the objects to which they refer. Copies of names are equal; copies of objects are not, because even if they currently contain the same value, if I change one the other does not change.

Without mutation, identity is not necessary, and you have value semantics, which is basically a functional record.

In the OOHaskell paper it is noted that an object is the fixed-point of a read-only record and the functions that produce updated versions of it. You can see how the notion of identity is intrinsic to this. Without identity you get a new copy of the record every time you change it, the fixed-point creates identity by collapsing the record and all its future states into a single thing which you can refer to by name.
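Roughly, in Haskell (a minimal sketch of the fixed-point idea, not the paper's actual encoding): a purely functional "object" is a record whose update methods return the next version of the record, tied together by recursion.

  -- A counter whose only update method returns the next version.
  data Counter = Counter { get :: Int, inc :: Counter }

  mkCounter :: Int -> Counter
  mkCounter n = Counter { get = n, inc = mkCounter (n + 1) }

Here inc (mkCounter 0) is simply a new value; nothing yet ties the successive versions together into one thing you can point at.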

Thinking about the fixed-point we can see two ways the common notion of an object can be widened. The common object is a record, and the functions that produce new versions of it. We might also consider the fixed-point of a record and some global generic functions that operate on any record as an object, and at the other end the fixed-point of a global data record with member specific functions as an object. I would guess that the FRP signal approach is like the latter, and if we took the fixed-point of a UI component and all the signals it sent and received you would have something like an object. Even with FRP there must be state somewhere, if we consider a toggle button. Signals received by components are like methods, signals sent are like dependency-injection.

I don't think recursion can

I don't think recursion can create identity on its own; even a Y combinator will give you a value that can be updated but can't be aliased or otherwise talked about from far away. It is a flip-flop, which is definitely "state", but by itself not addressable. I don't think you can add addressing on top of that while still being pure... it is an interesting thing to ponder. How does OOHaskell add naming?

The FRP approach really does avoid identity; it is pure in the purest sense, avoiding aliases and making everything a value, like a signal. It is quite amazing how that works with collections, and it provides a true contrast to the object approach.

Arrays

Consider an array of functional (read-only) records. Taking the fixed point of a function that operates on the Nth element creates a unique identity. We can reference any element by its index in the array.

OOHaskell uses monads for naming, ideally a different state monad for each object, but this is complex to implement (you need to model effects using parametric monads and HLists to allow composition of state monads), so in the paper we had everything in the IO monad, I think.
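To sketch the naming step in plain GHC Haskell (my own illustration, assuming nothing about the paper's encoding): allocating a mutable cell in the IO monad collapses all the future states behind a single referent, and the cell is the name.

  import Data.IORef

  -- The IORef is the "name"; all future states live behind it.
  newCounter :: Int -> IO (IO Int, IO ())   -- (read, increment)
  newCounter n = do
    r <- newIORef n
    pure (readIORef r, modifyIORef r (+ 1))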

Neat! I really need to take

Neat! I really need to take a look. A naming monad makes a lot of sense, not sure why I didn't think of that immediately.

Identity isn't state, it is

Identity isn't state, it is the object's name. Imagine if you took all the names away in your program...you would have something very pure and very functional, you could not get state without creating some kind of feedback loop (Y combinator?) and even then it would be something that is totally different from an object...a flip flop.
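For what it's worth, a tiny Haskell sketch of that "feedback loop" intuition: a stream defined in terms of itself holds a changing value, flip-flop style, without there being any address or name for it.

  -- Self-referential state with no address: just a value.
  toggle :: [Bool]
  toggle = False : map not toggle   -- False, True, False, True, ...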

I'm claiming that values basically become objects when they acquire unique names. That without these names, object thinking is not useful, and you are just not programming with objects. The flip side is that named values are not very pure and do not benefit much from the equational reasoning of FP, putting OOP and FP in very different design spaces (you can use them together, of course, but the designs for the objects of your program are quite different from the designs for the values of your program). "Everything is an object" is wrong, while "avoid objects at all costs" is impossible unless you are programming in a truly pure language like Haskell (e.g. doing pure FRP to manage state).

Identity is not the name.

I am not sure about this, but I don't think identity is the name. I think the name is a value that refers to an object. You can make copies of the name and it still refers to the same object.

However identity is the uniqueness of the object itself, it comes from the fixed-point, and is a self reference. You can always refer to yourself without knowing your name. In fact one object could have several different names, and this is not a problem.

Consider:

x = 0
y = x + 1
print x

If we take the fixed-point, such that y = x, then it will print "1". It is the thing referred to by x and y that is the object, not x and y themselves, which are just names for the object.

In a universe with only a single object, no names are necessary. You might call this a global object, but it still has identity, what you write to it you read back. Without identity the update would create a new version of the global object, but you would never see that new version because the version "referred" to in the program would be the original unmodified version.

That is a good point. When I

That is a good point. When I use the term identity, I'm implicitly including naming, but...if you were just doing say pure actors, self identity would be good enough.

this (state and identity) has come up before

I think Sean means changing state implies identity, because observing a change implies you had an idea of the thing whose state changed. Otherwise you would see new values without a reason to associate them with old ones.

But I think you can observe change with only very weak notions of identity. If you can read a channel, when you don't know who writes it, you can ask for the current value of variable X and still be unable to compare identities of whoever answers the query. Identity of a channel doesn't reveal identity of other things more steps removed. [Edit: in the above identity is pair (channel-id, 'X').]

A proxy acts more like a channel than the thing proxied, unless a protocol guarantees more. Extent of identity seems like a contractual arrangement you construct.

If you are reading things

If you are reading things from a channel, is that really state? State is like: I poked X, someone else can tell I poked X. The poked-ness is state that only makes sense in the context of X! Let's put it this way: state by definition applies to an identified something. State can't just exist on its own without context!

Since we're talking about

Since we're talking about objects, I'm assuming we're talking about an identity check. The implication of your position is that transparent proxies have their own identity, just like any other object. On the one hand, proxies could be stateful like the revocation pattern I described, so having an identity kind of makes sense. On the other hand, it contradicts the very purpose of a transparent proxy.

Ultimately, I think Matt's position is the most coherent: equality or identity checks for objects ought not be primitives. The absence of equality makes OO awkward for programming in the small where equality tests on primitives are plentiful, but I think that pain is meaningful: objects aren't naturally suited to reasoning and composition at this level.

ADTs and values are a better fit here, with objects being more suitable to reasoning in the large where equality and other binary methods don't really apply. For instance, I'm not sure it ought to be meaningful to compare two references to Amazon shopping carts.

Since you've been handed two such references, the authorizations for each reference are likely different, so you shouldn't even be able to conflate them or you risk introducing vulnerabilities. Really, all you ought to be able to do with such references is add and remove items, and check out the shopping cart.

Equality for objects is

Equality for objects is still trivial: it is an identity check and that's it! Now, whether the identity check is easy to implement is another issue; most of the time it really is, though.

The problem is that we often conflate object equality with value equality, which are actually very different things, and trying to handle both with the same core interface and abstractions leads to no one being happy.

Security issue

From an object-capability perspective, identity comparison is an operation like any other: one that is subject to permissions, and one that it may be important to intermediate for good and valid security reasons. The notion that an identity check is "built in" raises security implications that shouldn't be taken for granted.

There are lots of arguments for and against identity equality in programming languages, some of which (on both sides) are compelling. This is why the KeyKOS, EROS, and Coyotos systems defer the identity check to a runtime-implemented object called "discrim". An entity holding the discrim capability (in OO speak: the discrim object reference) can invoke Discrim.Compare() passing two other references and obtaining a boolean result. This leaves us with the design result that the capacity for identity comparison is preserved, but that capacity is not promiscuously available.
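A hypothetical sketch of the shape of that design (the Haskell names here are mine, not the KeyKOS/EROS API): identity comparison is an ordinary operation, but it is gated behind a capability value, so only holders of it can compare references.

  import Data.IORef

  newtype ObjRef = ObjRef (IORef ())   -- an opaque object reference
  data Discrim = Discrim               -- the capability; hide this constructor

  -- Only code that has been handed a Discrim can compare identities.
  compareRefs :: Discrim -> ObjRef -> ObjRef -> Bool
  compareRefs Discrim (ObjRef a) (ObjRef b) = a == b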

Discrim

Separating out Discrim into its own capability seems like a good step, but what you've described sounds like it partitions objects into two camps: those that can compare arbitrary other objects for equality and those that can't. This seems like a principle of least authority violation. The capability most objects will need is the ability to compare the identities of a small subset of objects.

True OO

I would say that transparent proxies come for free with true OO, because true OO doesn't let you compare objects for identity. I'm defining "true OO" here to require that objects are accessible only through their interfaces.

Indeed, and this is why

Indeed, and this is why "true OO" can't really type binary methods like equality, unless you restrict OO to some hybrid with ADTs.

Security and programming in the large vs. productivity

I'm sure there are environments where you want a language that maintains all kinds of security invariants, but I would never want to program in a restricted environment like that, not more than I have to.

I also am not a fan of "pure OO" with the mostly pointless restrictions.

I guess restrictions on what programmers can do may help when writing systems that are so big that no human can understand the whole thing.

Huge programs aren't my area of interest either, and I'm not sure that the safety is worth a loss of productivity that comes with languages that restrict access patterns and prevent abstraction.

It's clear that systems like Java that are designed for programming in the large also prevent programmers from coming up with readable notations; they prevent abstraction on purpose, partly so that there is nothing about the system that's not visible on the surface.

Large, secure systems are, perhaps, a specialized use case and the needs of that case shouldn't be forced on everyone else.

Purity vs. productivity

I agree with part of this, but also think you're setting up a false dichotomy between purity and productivity. My experience has been that most "impure" programming notions, introduced by programmers who "just want to get things done", are terrible and lead to problems down the road. If purity restricts "what programmers can do", then maybe that notion of purity wasn't that useful.

It's not up to the LANGUAGE to substitute for experience and

knowledge.

It takes experience, talent, insight, and, beyond mere experience, practice to be a competent programmer.

Maybe there are languages that are a Stradivarius and languages that are lock blox.

Backwards

You seem to think of purity as referring to a subset of some fuller language in which a language designer might choose to confine you. That's a rather backwards way of looking at things. The language designer starts with a blank slate. Usually when there's an impure part it's bolted on because the pure part was deficient in some way.

Re: It's not up to the

Re: It's not up to the LANGUAGE to substitute for experience

I somewhat disagree. Science and engineering are in large part about working within frameworks built from other people's knowledge and experience, without necessarily grasping all of the intricacies yourself. I don't see why this shouldn't be the case with programming languages too, ie. a good language will provide sane defaults for equality, concurrency, error handling, etc., garnered from good PL research. It should be hard to do the wrong thing and easy to do the right thing, which I think counts as the language substituting for experience to some degree.

Type?

I'm not sure what you mean by "type" in the above.

You can define a comparison operation in OOP, but it has to use the object interface of the comparand to decide on equality. So, e.g., you can decide that two lists are the same if they have the same elements. A primitive equality notion is needed at the bottom. I don't think there's a way to build a notion of unforgeable identity out of this, though. So you'd have to include that as a primitive if you want it.
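For instance, a hedged Haskell sketch of the elementwise case: equality is defined entirely through the structure's interface, grounded in whatever primitive comparison is supplied at the leaves.

  -- Structural equality, parameterized by a primitive leaf comparison.
  eqList :: (a -> a -> Bool) -> [a] -> [a] -> Bool
  eqList eq (x:xs) (y:ys) = eq x y && eqList eq xs ys
  eqList _  []     []     = True
  eqList _  _      _      = False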

I'm not sure what you mean

I'm not sure what you mean by "type" in the above.

I had hoped my reference to "binary method" would disambiguate what I meant! I meant simply that the type of equational reasoning most people naturally reach for to test equality, ie. an ADT-like binary method T->T->bool, simply can't exist for pure OO.

I don't think there's a way to build a notion of unforgeable identity out of this

There sort of is, but you need impurity of some sort, like a unique shared reference. If two objects share the same reference, they are identical, which is a property enforced by the object factory.

So an equality check becomes a protocol that writes a primitive value to the reference via one object and compares it to the value read from the other. The E language has done a lot of work on this in terms of sealer/unsealer pairs, branding, etc.
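A sketch of such a protocol with made-up names (this is not E's actual sealer/unsealer API): write a fresh nonce through one reference and read it back through the other; if it round-trips, the two ends share a cell.

  import Data.IORef

  -- 'nonce' is assumed fresh; the original contents are restored after.
  sameCell :: Eq a => a -> IORef a -> IORef a -> IO Bool
  sameCell nonce r1 r2 = do
    old <- readIORef r1
    writeIORef r1 nonce
    v <- readIORef r2
    writeIORef r1 old
    pure (v == nonce)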

Why not?

an ADT-like binary method T->T->bool, simply can't exist for pure OO.

Ok, so you're saying that this method doesn't exist, not just that it can't be typed? You can write a comparison of two lists by comparing them elementwise, right? Are you worried about how to identify primitive data? I agree you need some primitive operations.

transparent proxies

"Until you introduce transparent proxies, which any true OO language should support."

While I'm currently using a language that has transparent proxies and uses a trace-compiler to optimize the check somewhat, I'm not entirely convinced that allowing proxies everywhere in all contexts is worth the cost.

In a typed language, it's often pretty easy to explicitly make a type allow proxies without changing the code that uses the objects.

Logical variables are another example of optional (and in this case, possibly chained) indirection. Should they be allowed in 100% of contexts?
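A sketch of what opting a type into proxying might look like, under the assumption that the indirection is modeled explicitly: client code written against force works unchanged for plain values and for arbitrarily chained proxies, which also covers the logical-variable case.

  -- Explicit, possibly chained, indirection.
  data Boxed a = Plain a | Proxy (IO (Boxed a))

  force :: Boxed a -> IO a
  force (Plain x)  = pure x
  force (Proxy go) = go >>= force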

I don't see the world as

I don't see the world as black or white; functions and objects co-exist quite well, and each serves a purpose in a healthy hybrid paradigm. OO abstractions are great at dealing with names, not just static ones but dynamic ones as well, aka values with names, or "objects." FP is great at manipulating anonymous values that are completely identified by structure and are therefore amenable to equational reasoning.

The move to replace objects with functions either leads to poor objects under another name (entities) or to systems that are genuinely more difficult to use, even if they achieve a certain real elegance (monads). By taking another look at objects and correctly identifying what they are good at, we could build much better object abstractions than the crappy ones Java gave us.

Preventing errors automatically — separation logic

Thanks for your answer. Let me answer for one thing at a time.

Many error prevention tools fit your description to some extent. But separation logic doesn't — and it has been used (recently) to verify realistic C code with minimal changes. Somebody else wrote recently on LtU that a startup based on separation logic was bought by Facebook.

Yet separation logic was born out of theoretical work. And if you read the original papers (from 15 years ago), they too verify rather small examples. But those examples were chosen well enough that, once you handled them, people managed to handle much fancier ones.

What's also cool is that separation logic was just a refinement of a line of work that lasted 40 years. Lots of this work wasn't practical enough yet — but what was missing was more research. And separation logic built on this work.

Lambda as a core

More seriously, do you realize that often, once you can handle the lambda calculus, you can handle realistic languages based on it? If your "realistic programming language" is in fact OCaml, the essence of what you need to handle is probably a lambda calculus with references and a few other things. Sure, an engineer needs a lot of libraries on top of the core, but you needn't add support for each library function separately.
(This is the general idea, and details can get trickier, but this is how, for instance, compilers for many languages work.)

In other cases, the real tool will need to handle additional primitives. For instance, many simplified languages have only integers as primitives, but you also need floats, doubles... However, adding these is often just an exercise for the implementing engineers, not something the theorist needs to describe. (Again, there are exceptions.)
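To make the shape of this concrete, here is a hedged sketch of such a core (in Haskell, my own toy definition): a lambda calculus with references, where new primitives slot in as extra constants without disturbing the rest of the language.

  -- The core that an analysis or compiler actually has to understand.
  data Expr
    = Var String
    | Lam String Expr
    | App Expr Expr
    | NewRef Expr          -- allocate a reference
    | Get Expr             -- read a reference
    | Set Expr Expr        -- write a reference
    | IntLit Int           -- adding FloatLit etc. is routine engineering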

Research languages are exploratory

Also new languages out of academia are often exploratory. They deliberately leave useful tools out, they deliberately simplify down to an area of research and force programmers to use new and restricted methods. They turn users into ginnie pigs instead of supplying a full set of tools. So logic languages and functional languages leave out very useful imperative dialects - and make themselves hard to use. So experiments in typing leave out less typed dialects.

I believe this is an interesting and useful observation. I think the "experimental" nature is by design -- and probably a good thing. On the other hand, the "without supplying a full set of tools" aspect is a side-effect of the sheer difficulty of releasing the amount of tooling we have come to expect. There is a strong need for more work to make providing such tooling easier.

The "official" answer I was (somewhat) taught

Engineering a realistic language is a full-time job, also because you need libraries. (That's not just somebody's guess — the Socio-PLT research project produced initial data showing that language adoption depends on libraries.)

The stated goal of academic research is to produce knowledge — and since a researcher has finite time, they must often choose between producing research and producing production-quality software. In other words, if the Racket authors took more time to make Racket more practical, they'd have less time to make it cool.

Also, research environments are often not so good at producing usable software (there are important exceptions though). Usually, it's far better if people outside of academia do that job.

(The above is over-simplified, just tries to illustrate some basic issues. And it's not like you learn this answer in some PhD 101 course, so feel free to correct/expand).

Two points.

Usable programming languages often come from hobbyists, which implies that it's a matter of priorities, not absolute work.

Second point, programming is full of opportunities for clever shortcuts. If you need libraries, perhaps you can use preexisting ones. Maybe you can do automatic conversions. Maybe you can enlist help.

Programming languages come from all sorts of places

Usable programming languages often come from hobbyists.

As a percentage of the total number of languages generated by hobbyists, how many are actually "usable"?

In practice, programming languages come from all sorts of places: industry (Erlang, Java, Javascript), academia (Scheme, Scala, Haskell, Lua), and hobbyists (Ruby, Clojure, Perl). That said, academia is obviously focused on trying out new ideas rather than doing the same stuff over and over again. It's quicker to implement (even if you're using "clever shortcuts"), and easier to evaluate, a new idea if you just focus on the essence of the idea instead of on producing a "useful" programming language. But that demonstration of the essence of an idea can then serve as inspiration for other people who are developing "useful" languages (look at the way Pizza led to generics in Java).

Then again, many hobbyists produce languages that aren't "usable" because their focus is on learning about how to build a programming language rather than on making something for widespread use. I don't hear you complaining about them.

"Hobbyist" languages are built by communities

Yes, it's about priorities. To rephrase: production-quality engineering conflicts with the goal of producing new knowledge, and in academia you learn that early.

But it's also about absolute work. Usable community-driven languages come from big-enough communities, and building such a community takes work, plus some social component that isn't clearly understood. If, say, I simply enlisted help, I don't think I'd get much of a following.

Pre-existing or converted libraries can be of some (limited) help, because they won't fit the new language well if it is interesting. But they can be a starting point. You can see both aspects with, for instance, Scala: you can use Java libraries, but some are clumsy to use because they'd have simpler interfaces in Scala. (Plus, Scala pays a high cost for its interoperability.) Maybe there's room for improvement, but not so easily.

Usable programming languages often come from hobbyists.

Some usable programming languages have come from hobbyists, surely?

The majority of them, though, even the popular ones, are godawful.

Perl, whilst useful and fast, encourages code that's impossible to maintain (and I'll refrain from laughing at the 15-year disaster area that Perl 6 has become). Calling Perl "write-only code" is not a joke if you've ever been required to do support work on it.

Ruby is somewhat nice, and one of the reasons anonymous lambdas are hitting the mainstream at last, but it doesn't know if it's an OO or a functional language, and it's horribly inefficient. One of the Ruby projects I did required large chunks to be converted to C, and a custom Ruby runtime, in order to reach an acceptable level of performance.

And then there's PHP, the poster child of hobbyist languages.

To be honest

I came to LtU, and I made my hobby programming languages, because I am not satisfied with any language, not one.

I have a number of ideas I want to try... Oops got to go, phone. I'll edit in some details on languages later.

Cognitive dissonance

There is a massive cognitive dissonance in Josh Scholar's arguments here. He will claim at the same time that Racket is unusable because its GC is not good enough, and that "usable programming languages often come from hobbyists". However:

- Most hobbyist programming languages actually have runtime implementations that are equally or more naive than Racket's (or most existing "academia languages"); they should/could be claimed "unusable" for the exact same reason

- Languages with a historically naive runtime such as Ruby, Python, PHP, etc. actually see massive usage in industry (eg. web frontends). Sometimes people will restrict the meaning of "industry" to subdomains with critical resource bounds (embedded software, some video games...) but that is unfair.

- If you look at it closely enough, many of the languages in wide use today actually come from academia. This is the case for C, C++, and Lua, for example. There is no particular pride to be had from this (academia happens to have a culture of sharing results and letting people work on hobby projects that is relatively fertile, in terms of open-source software development in particular), but it means that broad claims about languages and academia go very wrong very quickly. More specific and informed criticisms are still relevant and interesting.

Point taken but

Perhaps I should focus on run-times, not languages:

Lua has a good jit because of a combination of a hobbyist and industry backers.

A lot of people from every place make solid languages because they can rely on the JVM or .net

There's the hobbyist version of .NET, Mono, which brings it to Linux and Android.

You used an interpreter-to-hotspot runtime, I have no idea if that's from industry or hobbyists. There's a similar framework for Ruby, I think.

I'm not content with these because of various features I want. There isn't a single jit/trace compiler that has all of the features I want to see, just as there's no language that does (though Racket comes close on both counts - thus my frustration with it).

Hobbyist?! Are you nuts?

Did you seriously just call Mono - one of the foundation blocks of one of the most widely deployed operating systems (eg, in smart phones) - a 'hobbyist' version?

I think maybe this word does not mean what you think it means, because there are thousands of pros whose entire job depends on Android, and they have to eat, sleep, and breathe Mono every day.

Ummmm

Android's not based on Mono (the open-source version of the .NET CLR), but rather on Dalvik (the open-source version of the JVM), at least for the moment.

Neither of them is originally "hobbyist", in any case. Mono came from Novell, Dalvik (and its replacement, ART) from Google.

Racket is already practical

Since some of the meta-LtU discussion started with people insulting Racket, it's somehow become an example of a not "production-quality" language implementation. This is not accurate.

Racket is used in production in real systems ranging from high-traffic web sites to test systems for embedded devices to AAA game development; it features a mature and well-tuned runtime and compiler, an extensive standard library, and a vibrant community. It's certainly a language that comes out of academia, in the same category as OCaml and Haskell, but that's far from impractical.

In particular, the Racket community has worked very hard to make usable software out of the box, something often not true even of significant open source projects, let alone academic software. For example, it builds on Windows with native tools, and runs on iOS and Android. The software, especially the IDE, is usable enough that it's part of the curriculum in many high schools and colleges around the country, run by teachers after just a short workshop.

The meta-LtU discussion doesn't really relate to Racket, and I'm going to stay out of it, but I wanted to correct the record on Racket's practicality.

Google can answer "do other people agree with me?"


https://news.ycombinator.com/item?id=8700811

They're complaining about the IDE being too slow, pausing, and using up too much memory. By the way, I disagree that it uses too much memory; I think they're just assuming that a program that pauses all the time is thrashing. Also, in my experience, disabling features doesn't help.

ANTSANTS 227 days ago

To expand a bit on "bloat," DrRacket does use a lot of memory for a text editor (don't have it installed ATM but I remember 100s of megabytes of memory in use in a typical light session of 1 file + 1 REPL), crashes somewhat frequently (a couple times in my test run, usually when trying out some of the crazier features hidden in the menus), and becomes less than responsive due to very frequent GC pauses (that are helpfully indicated by a little recycling icon in the corner of the window).
...
ANTSANTS 227 days ago
My Ivy Bridge i7 with 16 gigs of RAM is not exactly resource constrained. I'll take your word for it that you can disable features to get the resource usage down, but it's kinda sad that the default config can't run without stuttering on a machine like that, especially since the program is intended to be used in education where the average machine is significantly less powerful than that.

...
illicium 227 days ago
The GUI is very slow: scrolling a ~500 line file on a C2D 2009 MBP with DrRacket 6 is (was) painful.

...
Crito 227 days ago

I love racket, but DrRacket is just too slow and memory heavy for my machines.

My most powerful personal computer being a laptop with an 1.8GHz i5, and 4GB of ram...

I keep being accused of being too insulting, but I just can't hold back from pointing out that it says something about the developers if they're not aware of what people think of their system and are not eager to fix basic showstoppers like "most people find the IDE pauses too much to be usable for long."

IDE vs. runtime

You can also find people complaining about the bulkiness of Eclipse, Visual Studio, or any IDE under the sun (or LibreOffice, for that matter). I would be ready to believe that there are some performance issues in the DrRacket IDE, but that does not say much about the capability of Racket's runtime system. In particular, it is not the case that freezes in a GUI are always related to GC pauses; tons of other causes, such as insufficient multi-threading, are more probable explanations for a UI freeze.

It says something about the developers if they're not aware of what people think of their system and are not eager to fix basic show stoppers

I don't know much about the dynamics of the Racket ecosystem's development, but this criticism could be applied to other ecosystems I have worked on. The vast majority of people contributing to such a project are volunteering their time without getting much in return (this is a fact about academic software: it happens by side effect; apart from the few and far between research engineers, nobody is paid for it, and in fact everyone working on it has something they should rather be doing instead). It often happens that things that externally seem natural priorities for fixing are left as they are by the existing contributors, who focus on issues they are personally more interested in. (If I wanted to contribute to a language ecosystem, many places would be more exciting to me personally than the GUI code.)

If you want this to be fixed for a particular project, the best option is often to donate funding specifically for this particular issue, or to do it yourself and contribute patches.

Meta-opinion?

I would also be interested in your opinion on the meta-points. Would you like to see some technical or social changes in LtU? Do you think that some form of moderation would improve the website? What kind of submissions would you like to see more of?

LtU directions...

I would love to see LtU become a place to more generally discuss PL related things.

I would love to see more posts on concurrent programming languages, their theory, challenges, practice and implementations; to see more about other paradigms/ideas being considered for inclusion in a PL or more generally domain specific languages. For instance, are there any DSLs being considered for neural network programming? I speculate (but don't have time to do more) that reactive programming will fall out naturally from a stream programming paradigm -- so I would like to hear more about stream programming languages. What can be done at the language level (or tools level) to help with distributed program debugging?

To say it another way, I would love to see LtU become more of a `Pandora internet radio' of Programming Languages which exposes me to new PL music (which in turn inspires me to delve more deeply into some of it on my own)!

Good suggestions

Thanks for your post, it's very positive and refreshing. I think all of these things are welcome (and there's research on some); we "just" need somebody with relevant expertise picking and posting nice papers. Maybe I should spend less time participating in heated discussions, or even debating logics, and more time helping with this.

And I think that, on LtU, PL research sometimes comes off as more theoretical than it really is, just because posts are somewhat skewed. (Skimming the front page gives a different impression).

uniform access to concurrent interfaces

Stereotypes are a problem, like a sibling cast as the smart one (or the fast one, or the crazy one) when only slightly different from other kids in one family, compared to kids in other families. You can get relative positioning effects that make no sense in absolute terms. In computing, PL tech is the younger sibling of OS, who already staked out concurrency so aggressively that PL is ashamed to try. (Metaphor; work with me here.) Handling concurrency isn't a classic role of PL, so there's some resistance to getting processes into PL definitions.

I would love to see more posts on concurrent programming languages...

Yes, me too, but I'm also interested in explanations of features in a PL that replace, augment, or displace effects that would otherwise have been done by system calls, in the normal situation where a PL doesn't even try. In other words, I'm interested in the story that rationalizes effects in system behavior, laid out in use-case scenarios.

For example, whether or not sockets have an API that stinks, I want to be able to open connections to anything from a PL process, where talking to another local process gets a PL-domain socket of sorts that is supported directly by the PL. Having remote and local communication follow weirdly different rules is problem-inducing, because it forces architecture differences for no good reason, spinning out endless you-can't-get-there-from-here eddies in dataflow and control flow.

There are benefits that accrue to uniformity, when you want them, which get more important as scale and complexity increases. Structuring a large system as an army composed of quirky individualists is a funny way to plan a project campaign.

The kind of thing I have in mind

A hosted PL implementation has to rely on whatever the OS provides. But "one level up" it shouldn't matter. You can abstract out platform-specific peculiarities to a large extent (and avoid using platform-specific features). Issues at the service level are:
- streams or messages
- RPC or send/receive
- how do you do point-to-point (1-to-1), broadcast (1-to-many), multicast (many-to-many), and publish/subscribe modes of communication
- how do you address the remote end(s)
- how do you detect/handle when the medium fails or when the other end goes away
- how do you do error recovery
- how do you control communication (access, latency, bandwidth, priority)
- how do you monitor things you want to control or depend on
- can you characterize how the system behaves normally, under heavy load, under error conditions, under attack
etc. etc.

Given the way the Internet has evolved, all of these have been answered in different ways and multiple times, but there is no coherent whole (at least that is how I see it). To me, programming languages don't just provide a nice notation to save programming time and reduce errors; their formalism has helped us in other ways, particularly in being able to *model* systems (or build executable models), in building abstractions, and indirectly in algorithmic analysis.

Given present realities, people are building complex distributed systems and duct-taping them together with perl/PHP/Ruby/Go/C++/Java/Python/Erlang/shell scripts etc. A lot of different components work together, written in multiple languages (often not well documented or well tested). Upkeep of such systems is painful and expensive.

So I think this situation is calling out for some sort of formalism: a distributed programming-in-the-large language that is amenable to analysis and characterization. I think that at this level it can't be "algorithmic"; it has to be declarative or structural, but it also has to support abstractions and constraints and placement, etc. Then conceptually I could use something like Beazley's "SWIG" to wrap all the existing programs and use them with the new system while I am building new components.

That is the kind of research I'd like to see!

feel free to expand :-)

I like to hear reasoning about concurrency from folks even when it doesn't match my ideas. Otherwise how would I experience surprise? I'm slightly more interested in reasoning than specific answers, because exploring shape of the potential semantic space is the fun part, rather than any particular way of trying to cover it.

(One of my great disappointments in life is encountering so many people who suppose there's only one way and one answer, most of the time, and stop thinking as soon as a first solution is encountered, under a theory it outlines an entire space authoritatively. It's like first graders raptly accepting their teacher's received wisdom.)

So I find your post entertaining. Do you want to talk about your bullet points?

A hosted PL implementation has to rely on whatever the OS provides.

It's obvious a host OS for a PL is the bottleneck for several things (communicating with anything outside the local OS process), and throttles resources like CPU and memory, but the relationship can be folded in different ways using dependency injection, so control of some things can occur on the PL side even when OS supplied.

Simulating threads via fibers and multiplexing does not require any OS cooperation. But getting access to actual threads and more processes does require an OS api. However, virtually every OS with processes and threads has a way to spawn them and interact. Abstracting the interface involved is not hard, especially when an OS typically wants them to be as magic as possible in the first place, exposing few internal details.

Similarly, every OS with web access has a networking api of some kind, so getting messages and streams to remote entities can be assumed a solved problem, if you can find them at all in whatever name lookup system is visible. (There's a lot of asymmetry in entity visibility, so peer to peer architecture can require baroque hoop-jumping.)

Limits are usually imposed in a certain order, like playing a game of twister, starting with whatever part of a stack you decide is a stable given in your dev solution space. Assuming one OS, or a particular UI framework, or a browser interface, etc means you now have one foot or hand pinned down in a way making you revolve around it, perhaps becoming the sun in your solar system. If whatever you pin in place first is not very flexible, it can shape the way you see everything afterward.

I think usually folks want to squeeze a PL into whatever wiggle room is left over.

An example may help

Imagine that initially my "service" app runs on a single machine, as a single process. But then it is found to be very useful by all sorts of people, takes off like wildfire, and starts acquiring millions of users a month and providing more and more services. And some of these services require the cooperation of a large number of processes running on multiple machines. It is like growing a single-cell organism into an elephant. Ideally I'd like to be able to do this entire evolution in a single language, or failing that, two or three languages. Clearly people are building such websites and other distributed systems, but the evolution process is quite painful. If the same language(s) captured all the essential elements of this evolution, it might become easier, more tractable, etc., and we'd be able to evolve these distributed systems faster. Such large beasts have their own characteristic behavior and needs. Conversely, if we had such a language to build evolving graphs of computing nodes, we could start applying it to other problems previously unaffordable (think of how the Raspberry Pi or Arduino have opened up new apps!). Is this just a pipe dream? Possibly!

Throwing the baby out with the bathwater

I think this conflict between the academics and the anti-academics makes more sense on a tribal level than on a practical or technical one.

As others have mentioned, academics are in the business of fundamental analysis and of coming up with new algorithms/features/ideas. Constructing and supporting fully practical engineering tools is not really in their job description. You could argue that it ought to be, but IMHO that is a matter for academia to hash out for itself...

My main point: computer science *is* in the business of building new toys for language designers to play with. There's no guarantee that any given toy is useful for your particular purpose, but that's no excuse for petulantly labelling the whole pile as stupid, boring, and useless. I mean, it's no skin off my nose, but it's likely to leave your project less effective than it might be.

Theory is not the end-all and be-all, but if you try to build your language without recourse to it, you are likely to regret it later. Quite likely much later, when an existing user base makes fixing fundamental problems impossible.

True, it might be hard to sort your way through great masses of academic publications. But, that's one point where LtU shines: suggestions of good papers particularly relevant for a given purpose...

One final point: just because you like one particular toy very much doesn't mean you get to shove it in everyone's face. Also, pretension to mathematical rigor is not an adequate reason to engage... I'm constantly surprised how this kind of thing is catnip to academics and professed anti-academics alike!

People seem to see the other side bigger than it really is

There is a blatant incompatibility between the point of view of some, who are convinced that LtU is invaded by the filthy academia, and of others who on the contrary believe that LtU is lost to the climate^Wuniversity deniers who claim nothing of value is produced by publicly funded research¹. Of course, most have a more nuanced point of view; but I wanted to highlight the cognitive bias that we always seem to see more of the stuff that we dislike, to the point that two opposite groups can each be convinced that "the dark (other) side" is in the large majority.

¹: my description is humorous, but I have heard different people claim both sides.

My personal opinion is that we have a good mix of both academic and non-academic people, which provides diverse viewpoints and keeps both sides in shape. It is good that there is some amount of explicit skepticism about the usefulness of academia, and that we have this debate once in a while, because otherwise we would have no way to realize if it indeed became sterile -- and no one cared. And we need good references to what industry is doing, because there are a lot of (often poorly communicated) important ideas there. (Z-Bo is great at providing such references; other people, you need to step your game up!) On the other hand, as I expressed recently, I find that the form of academic discourse (in particular the importance of a thorough study of Related Work) fosters good references that are easy to discuss constructively. (This is one of the reasons why I personally post mostly research articles.)

To summarize, I'm happy that we have a diverse bunch of opinions, including some people with extreme views ("research is useless" or "there is nothing of value in mainstream industrial programming languages") that I strongly disagree with. But it is absolutely necessary that everyone be able to express his or her view in a respectful manner -- otherwise constructive discussion is stifled.

mainstream industrial programming languages

First, I am not an academic. However, I think you're being far too generous to mainstream languages and platforms. I've just had the "opportunity" to use Unreal Engine 4 (which means C++ and development on Windows) at work for the last two months, after not having touched C++ or Windows pretty much since the 90's.

C++ is shockingly, laughably fragile and error-prone, as is Windows 8.1 itself (it gets a blue screen of death at least twice a day), and so are its so-called "SDKs". The MS documentation is poor, and example code is nowhere to be found unless you copy and paste it out of the documentation. The code is so full of preprocessor boilerplate you cannot intelligibly reason about it.

Unreal has apparently created its own tracing garbage collector, in addition to having a plethora of "smart pointer" reference-counted GC. Of course the MS SDKs are COM and have their own reference-counting mechanism. Unreal uses C++ as an intermediate representation as well: it has a preprocessor (UnrealBuildTool) which generates additional C++ code with metadata to support reflection, serialization, the Blueprint scripting system, and the GC. It has created its own "module" system on top of C++ which limits the number of header files that are exposed to the C++ compiler. The build system itself actually consists of C# source files, oddly.

In spite of all that, build times are excruciatingly slow and error messages range from poor to absurd. Unlike good ol' gcc, Visual C++ doesn't indicate the chain of header files that led to an error, making tracking down errors another laughable exercise. Yes, I know most people on this list, whether in academia or not, are well aware of the "deficiencies" of C++, but I think it's good to have a firsthand reminder of what a ridiculous state "mainstream" programming is in.

C++ headers badly designed. Yes, you need to compose them.

You have to limit the number of header files that are exposed to the C++ compiler.

By design and specification, C++ has to parse a header file (even if its #ifdef guard is false) EVERY time it's included, and EVERY time anything that's included includes it. The way headers are used by human programmers, this means that build times grow exponentially with the number of header files.

When you hear people talking about C++ projects with tens of millions of lines of code that take days to compile? Well, those are mostly programs that would take an hour or two to compile, and the wasted days are spent parsing all the damn headers an exponential number of times each.

Sounds like an argument for modules

Other problems:

Boost has a couple orders of magnitude too many details, trying to compensate for a poor language, so writing a new library that can be embedded into Boost and work with it, and with the different combinations of Boost libraries and the standard libraries that came out of Boost, is way too hard for mere mortals.

Languages like Smalltalk and Ruby, which have you add default behavior to Object, sometimes produce libraries that don't combine.

A lot of related work

A lot of related work discussions are defensive: the author doesn't really grok (or care about) the work they are citing, and is only protecting themselves from having their paper torpedoed in committee. I've seen it happen and know how the game is played (PL is much better than HCI in this respect, but still).

Academia in theory is a great thing: scientists can work on hard problems, take big risks that lead to great advancements 5 or 10 years down the line. But academia as practiced often fails to live up to that promise: fads with puppy mills, citation circle jerking, bullies who feel it is their job to beat who they see as weirdos out of the community, and an unhealthy obsession with the past.

This is more of a problem in fields without clear breakthroughs. So ML is pretty safe right now (everyone is riding the deep learning gravy train, advancements come quickly, and there is a clear goal of reaching human intelligence), but PL has none of that ATM. We have no grand challenges or super-hot advancing technology.

Frankly, honest discussions about what is wrong with academia never end well because of pervasive deep insecurity about how it really works; offending people is easy.

I agree that there is way

I agree that there is way too much complaining about academia here (and elsewhere, even more so), and I'm saying this as an industrial programmer. I for one enjoy learning from everyone here, academics and practitioners alike. Many people here have opened me up to new perspectives on computing, some of which weren't "practical".

It's not to say that their criticism of academia is invalid, but I'm rather tired of people complaining about an article just because it is not immediately useful to them at work, or simply doesn't match their interests.

Mainstream influence of academic research

Academic research has already been quite influential: type systems, lambdas, pattern matching, etc.

Which academic research will have a big effect on mainstream programming?

I'll start the list:

  • Dependent types
  • Linear types (already happening to some extent in Rust)
  • FRP (already happening to some extent in JS libraries)

Dependent types

Dependent types don't give you any new methods or algorithms; they just let you call something that programmers (and even assembly coders) have been doing since the dawn of computing "a type".

Maybe it's a reason for theoreticians to be happy, when they can finally describe what people have been doing for 70 years, but there is nothing there to influence practice.

Not even wrong!


A machine code or a byte code can pretty much be

described as a dependent type (think of its format in memory), for instance. Did these things not exist before someone had that phrase?

The type of "these things"

The type of "these things" certainly didn't exist (or wasn't known), even if they did. The typing of existing abstractions is exactly the point.

But since they were ubiquitous before they had a name

formalizing them isn't going to change people's programming methods; it's just going to give an imprimatur of formalization to what everyone already does. That's why I said it isn't something exciting that's going to change what engineers do.

I'm just speculating because this isn't my area (and I haven't seen a reason to be excited enough about it to learn it) but my guesses are:

It might lead to languages that have better support for what people already do. It also changes "type" from being something simple that's computed with structural unification into something that might require arbitrary code... I almost approve of that, but as I said once, I consider "types" one of those limited languages that mathematicians love because it's so simple that they can make proofs about it. But simple is the opposite of expressive, so on the whole I prefer to avoid it. If we start making types into Turing machines, maybe we should just forget them and use the full power of programming in their place.

Here's a question, if the people concerned with types have a new toy, should that affect how programmers program or is this level of reasoning about programming not really the same interest?

Your argument seems to be

Your argument seems to be against types in general. Most mainstream programming is done in typed languages. Therefore improvements in type systems will have an impact, even if you don't like types.

Static programming

I think prevention of errors isn't the [biggest] reason to be interested in types. Rather, it's because they facilitate compiler understanding of code. This can enable programming models that aren't easily encoded into a general purpose operational model. A not-so-great example is that Haskell needs its type system to enable lazy evaluation in the presence of side effects. But I think that's just the tip of the iceberg.
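A small illustration of that point about Haskell: the type is what distinguishes a value the runtime is free to evaluate lazily, whenever and wherever, from a computation whose effects must be sequenced.

  pureCount :: String -> Int   -- pure: evaluable lazily, anywhere
  pureCount = length

  readCount :: IO Int          -- effectful: the IO type orders it
  readCount = fmap length getLine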

I'm not particularly excited about dependent types as a programming tool, though. I think current incarnations intertwine the role of logical assertions about the program with the meaning of the program in a counterproductive way. I do think some form of dependent types should be supported, because it's a convenient notation in many cases, but I don't see them ever becoming a cornerstone of mainstream programming.

Bug-free programming

I do hope, on the contrary, that the concept of writing both a program and its specification, in such a way that the compiler can check that the program implements the specification (that is, virtually bug-free programming), could become "mainstream" in a "few" years (say 20 years to be on the safe side). This will never be the case for all programs (in some cases we simply don't know what the specification is), and it may not involve dependent types as we know them (only refinement types?), and it very much depends on how usable we are able to make these technologies, but it seems to be a reasonable research goal -- given that a few exceptional individuals are able to do this today already.

(Of course all the other, not correctness related, advantages that we see to type systems would still be present in such a system, only stronger.)

Not contrary

I broadly agree. I would say programs and specifications aren't necessarily different -- a program is just a specification for which an operational semantics can be extracted. We should aim to write specifications in their most natural form, and sometimes that will just be a program. Tooling can only check consistency, not correctness, of specifications, so we should overspecify our programs with logical assertions and tests that are unlikely to be consistent with the kinds of errors we might have made in the program.

I'm just skeptical that dependent types are the interface we want for this. For one thing, I think whatever we do needs to be "gradual". Over-specification makes programs harder to change, almost by definition. Logical assertions and optimizations are often best added in later passes over the code.
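
As a small illustration of that last point, an assertion can be layered onto existing code in a later pass without touching its meaning. This is a sketch of mine (the mean function is made up), not anything from the post above:

import Control.Exception (assert)

-- 'mean' is unchanged except for the precondition check added later;
-- GHC can strip such asserts with the -fignore-asserts flag.
mean :: [Double] -> Double
mean xs = assert (not (null xs)) (sum xs / fromIntegral (length xs))

main :: IO ()
main = print (mean [1, 2, 3])   -- prints 2.0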

Also, my previous post lost a word during editing that altered its meaning somewhat. I've added it in brackets.

I don't think bug free

I don't think bug free programming is even in the top 5 of the most important things type systems do.

  • Types guide program design
  • Types do a basic level of sanity checking for a cheap cost
  • Types provide documentation and aid reading code
  • Types allow for better performance
  • Types enable type classes
  • Types enable better IDEs (instantaneous type error feedback, code completion, etc.)
  • Types are interfaces for module boundaries

Dependent types can be used to rule out all/most high level logic bugs, but I'm skeptical that this will be mainstream any time soon because:

  • Most people actually don't care about correctness. They care about getting 98% correct software fast and cheap, not about getting 100% correct software slow and expensive. Most businesses fail because their software doesn't do enough good stuff, not because it does too much bad stuff.
  • For most of the remaining software where correctness does matter, there isn't a sufficiently easy mathematical specification.
  • For the remaining software where correctness does matter, and where there is an easy mathematical specification, you can often generate the code from the specification instead. Think of generating a parser from a grammar, instead of writing a parser by hand and proving that it parses the given grammar.
  • For the remaining software where correctness does matter, and there is an easy specification, and the program can't be generated from the specification, it's usually cheaper and sufficient to test the program against the specification rather than prove it correct (see the sketch after this list).
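
A hedged sketch of that last option, in Haskell with the QuickCheck library (mySort and the property are my own made-up examples): the specification is an executable predicate, and the program is tested against it on random inputs rather than proven correct for all inputs.

import Test.QuickCheck

-- the program: insertion sort
mySort :: [Int] -> [Int]
mySort = foldr insert []
  where
    insert x [] = [x]
    insert x (y:ys)
      | x <= y    = x : y : ys
      | otherwise = y : insert x ys

-- the specification: the output is ordered and a permutation of the input
prop_sortMeetsSpec :: [Int] -> Bool
prop_sortMeetsSpec xs = ordered ys && sameElements xs ys
  where
    ys = mySort xs
    ordered zs = and (zipWith (<=) zs (drop 1 zs))
    sameElements as bs = all (\x -> count x as == count x bs) (as ++ bs)
    count x = length . filter (== x)

main :: IO ()
main = quickCheck prop_sortMeetsSpec   -- "+++ OK, passed 100 tests."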

Therefore I think proving programs correct won't be mainstream until we have such good automated theorem provers that the cost of proving is almost zero (the cost of writing the specification will still be there, so even then this won't happen for a lot of code).

Note that all of the points in my first list are about doing more good stuff in a given timeframe, rather than about doing less bad stuff. People will use software regardless of how buggy it is if it does enough good stuff. For every piece of software that failed because it was too buggy, there are 100 that failed because they were too late to market, didn't have the features users wanted, had a bad user interface, couldn't adapt to feedback quickly enough, were too expensive to develop, etc. So given that time and money are finite resources, trading off all those things for fewer bugs is usually not a good idea except for safety critical software like the software that controls a car's brakes.

Alternative to types

Types guide program design
...
Types provide documentation and aid reading code

I'm currently writing some complexish software in a language so untyped that it doesn't even have classes.

It's liberating - the code is so short.

But I find that to replace the documenting aspect of class interfaces, I have to write more documentation by hand so that I can come back to the code after a time away.

I have a tendency to write that sort of thing anyway, but now the difference is that it is necessary.

And you know what? It's just fine. All of the "training wheels" features of languages can be replaced by voluntary practices. The voluntary practices are more flexible and they work.

Of course the problem with bad programming may not be YOU; it might be your stupid coworkers.

You are absolutely on the

You are absolutely on the ball. BTW, are we going to talk about Bracha vs. Felleisen on LtU? It is tough getting through a one-hour video, but they seem to agree on the point that early bug detection is not the most important type system feature.

Also, worse is better embodies the philosophy that broken but useful and timely is better than perfect but useless or late.

Where I don't like "worse is better" is in PL design

the idea that you don't need powerful programming tools is where worse is better fails utterly.

I agree, but the point I'm

I agree, but the point I'm making is not about worse is better in general, but about bugs in particular. That applies even to programming languages. Would you rather have a well designed, complete programming language with a 98% correct implementation, where the remaining bugs can be fixed if it is successful, or a poorly designed or incomplete programming language with a 100% correct implementation?

Bracha vs. Felleisen

Link to what you're talking about?

I'm behind the wall right

I'm behind the wall right now, but they just had a debate at Curry On. The video is in my social feeds; I'll link when I get to work.

Yep. I was a bit

Yep. I was a bit disappointed no blood was drawn, they essentially agree on most things. I really learn more when truly opposing philosophies get into it. Maybe what I really want to see is Wadler vs. Bracha.

If you want blood you need

If you want blood you need Bracha vs Harper.

No no... if we go there,

No no... if we go there, Cook vs. Harper.

Well if it wasn't

Well if it wasn't informative then I'm glad I didn't watch it. I find it very hard to watch videos. The bandwidth is so low. I would watch Gilad Bracha vs. Conor McGregor, though.

It is still very interesting

It is still very interesting to watch. Don't get me wrong, you should really take a look.

Most businesses fail because

Most businesses fail because their software doesn't do enough good stuff, not because it does too much bad stuff.

I don't think that's necessarily true. People abandoned the original Mozilla browser because it hogged memory and rendered slowly, compared to the less featureful Firefox. Then they abandoned Firefox for some of the same reasons for the less featureful Chrome.

You could say that "is quick and uses few resources" is "good stuff", but that sort of metric means you're moving the goalposts in classifying "good", i.e. something that was once "good" cannot become "not-good" simply by the progress of time, all else being equal.

On the other hand, the Mozilla browser was always recognized as having been over-engineered and thus bloated and slow, i.e. "bad stuff". There was simply no better alternative that didn't have as bad or worse "bad stuff". Only when an alternative alleviated the over-engineered "bad stuff" did Mozilla's software really take off again.

Chrome

I think Chrome's advantage is that each page runs in a separate process; this allows it to use the operating system's access to hardware memory mapping - i.e. when you close a page the memory goes straight back to the system, without waiting for an optional compaction phase* in a garbage collector. The same fact also fixes Firefox's threading problems.

*if there even is one.

Nah, compaction is mandatory

Today's browser users expect to be able to keep tabs open for weeks. And 30 at the same time, too. Memory footprint and memory leaks are the number 1 problem for browser vendors and application developers these days, especially on mobile. (The web's inflationary feature creep doesn't help.)

Your product is initially

Your product is initially successful because it is novel, timely, and useful. But if you don't eventually pay back some of your technical debt, your competitors will begin to eat you alive.

Real life is quite nuanced: shipping on time, correctness, safety, security, and performance are all important to some degree. You have to get them all right enough, but you don't need to do any of them perfectly. The lines are all drawn in the sand.

You could say that "is quick

You could say that "is quick and uses few resources" is "good stuff", but that sort of metric means you're moving the goalposts in classifying "good", i.e. something that was once "good" cannot become "not-good" simply by the progress of time, all else being equal.

With "good" as a general statement, yes. On the other hand performance is usually not what you get from proving your software correct, on the contrary, it's something you trade off. Do you spend your time optimizing, or do you spend your time writing proofs? In some cases you may even need to pick a less efficient algorithm because it's easier to prove correct. So I don't think it is moving the goalposts, but rather an unfortunate choice of words (FWIW, I did already classify "Types allow for better performance" with "the good stuff").

Classic

These are all rather classic arguments and, if you look carefully enough (and optimistically enough), unconvincing ones. (Except the part about more automation being needed: yeah, everybody agrees there.) If you would like to discuss it in more detail, maybe you could create a separate forum topic?

art, craft or engineering

Depending on what program you're writing, programming is one of the three.

The needs are different, depending.

If you're making a medical device, sending someone to the moon or controlling million dollar machinery, proofs of correctness would be nice.

But even though people's lives are at stake, I gather Google's self-driving car has software written in whatever is fast enough - you know, C, C++, assembly, that sort of thing.

It is even worse than that

It is even worse than that for self-driving cars. Most of the software isn't written by programmers at all, and no one has any idea how it works, just how well it performs along with its error rate. DNNs and RNNs aren't much into black-and-white proofs of correctness, and there is no specification anyway, just a corpus.

Is it recognition in neural nets?

I think we had a thread here where I was arguing against self-driving cars.

I was alarmed by the attitudes of people on Slashdot: they're sort of technically aware people, yet the assumption that whatever a machine does, it does much better than a human was being pushed into areas where the opposite is still true.

It really is the future;

It really is the future; many of us are aware of all-too-fallible human capabilities. Computers are great at repetitive, monotonous tasks, and they can do them much more safely at that. It is only a matter of when now, not if.

It is fascinating to think that there are programs that we can write but not prove, but there are also programs that we cannot write but can generate through training. And these programs lack specifications to avoid bias; their "functionality" is discovered completely through training.

There is plenty of good

There is plenty of good influential work in academic research, but most of it seems to be noise published for the sake of publishing and making tenure cases. So you can surely spot the gems 10 or 20 years after the fact, but if you want to find the next gem today, it can be very difficult. That is why FRP is only now picking up after being invented in 1996 or so; the same goes for the tech going into Rust today. Academia just has very inefficient filters, topped off with the normal time it takes for invention to turn into innovation.

So if you want to ride the wave of useful research, it is better to look at papers 10+ years old that have somewhat stood the test of time. Having discussions about current work can be very frustrating.

Not a problem in practice?

While it would be presumptuous to claim they are "gems", I don't see any of the "noise" problem you describe with the research articles I referenced on LtU in the past few years. It's impossible to predict what will and will not have a big, lasting impact (we are not doing divination), but there is still a lot of work I find very interesting -- and of course I didn't post everything I liked, even when restricting myself to stuff I found appropriate topic-wise for LtU (that is, not too theoretical).

There are many more interesting references that have been provided by other people participating in LtU.

I do agree there are some negative incentives at play inside academia, but they haven't prevented the publication of much more quality work than we would have the time and space to discuss on LtU.

Those submissions are fine

Those submissions are fine, but look at the conversations. Many of them are empty; some just go off in random negative directions unrelated to the papers' content, inspired rather by their titles. We can argue that there just aren't enough people on LtU, but there are plenty of lurkers, and anyway, conversations about those papers aren't really happening anywhere else either; that is just the problem with publishing.

Not all of it is about negative incentives; really, it is the model that is broken. How do you get more than two people to read your paper? Keep in mind that just because someone has cited it doesn't mean they bothered to read more than your abstract and conclusion. The converse of that is: why are the papers I read so dry and seemingly devoid of anything I can actually use? One can imagine the author and the reader switching roles often, but the complaints are the same. I'm sure my tastes in publications are unique, but I don't think you'll find many people who think the current system is very effective in catering to their specific tastes.

There is just too much content out there to read anything but what you think will be gems, unless it caters very narrowly to your specific interests or you are doing duty on a committee. Likewise, as an author, go big (only write what you think could be gems) or go home, unless you need to play the "game of chairs"; in that case, write, baby, write!

More about LtU than academia

It wasn't my point in providing these links (and in fact I didn't even look at the discussions while collecting them), but you are sadly very right that the conversations on these submissions tend to suck (or be empty). Some works (Zélus and Mezzo, for example) certainly have a niche status that could justify why they aren't discussed much, and for a while I just assumed that I would eventually find some things to submit that would be of interest to a wider part of the speaking LtU participants (I had hopes for the Pycket work, for example), but I'm slowly coming to the conclusion that there is simply something broken with the discussing LtU community.

On the other hand, the number of lurkers should not be underestimated. There is a surprisingly large number of people that get the LtU feeds and read them with interest. Some of the submissions above (and many others of interest) have a small comment of the form "interesting, thanks": people can read an article with interest and be almost or completely silent about it. Having a comment to make, a question to ask or a discussion to start is not automatic -- I find that most people do not train themselves to be able to do this. Contributing to LtU has value in terms of audience and thus impact, irrespective of the on-site discussions. For a concrete example, the "theory of changes" submission led to a very interesting interaction with Bob Atkey, in the form of a blog post.

I have had many interesting and thoughtful discussions about other people's work, and thus do not buy your pessimism. Some places are extraordinarily good at fostering interesting discussions, such as some technical subreddits or, for example, the Homotopy Type Theory blog. LtU currently fails to be one of those places, but it could very well become one again (I think we would need higher moderation standards for this to happen).

I see. I've come to the

I see. I've come to the conclusion that the PL community needs improvement, at least in impact, relevance, and signal-to-noise ratio, and whatever happens on LtU is just a side show compared to the larger problems of stagnation and apathy. Maybe we need an old-fashioned revival!

And... there it goes. What does Homotopy Type Theory have to do with PL? There was a discussion over on pl-enthusiast about how PL is more than just PL now, applying some theories that incubated in PL to other fields (at least that sounds like the plan; not sure how the receiving fields feel about it). But you know, I just want to design languages; my focus is on the programming experience, and I grab or invent whatever technologies I need to support that. But there is a lot of work that is quite the opposite, that has nothing to do with programming experiences, and is like... we started out in PL, developed this tech, and now are staying in PL while we talk about how we use this tech almost completely outside of PL contexts.

Probably at the end of the day we are just interested in completely different things, which I guess is to be expected (if you put 20 random PL people in a room, they would all probably just kill each other).

Interesting

That's an interesting counter-point. How would you picture LtU in a world where (1) your assumption about the cause of the observed symptoms is correct and (2) the problems there are solved? I imagine a forum where academic researchers produce Bret-Victor-esque talks monthly (altogether), followed by intense discussion of which one seemed the most exciting of the lot. I don't really believe it would be reasonable to expect this, though.

HoTT is about writing mathematical proofs as an act of programming in a very specific system -- an extension of today's dependent type theories that is not yet fully defined and understood. (There are several subcommunities inside HoTT interested in different things, but I believe this is a common denominator.) I would not classify it as "programming language" research myself (it is type theory, but not all type theory is PL-oriented); I only provided this link as an example of a fruitful discussion forum inside (a neighbouring sub-community of) academia. People expect that the language design resulting from HoTT (for example higher inductive types) may have programming applications, but this is not studied much for now (see Homotopical Patch Theory, though) because the field has some more fundamental problems to solve. In any case, those applications would be at the level of dependently-typed languages for now, so certainly a niche area as long as we haven't figured out how to use automated theorem proving to make them really usable for non-experts.

I think in an ideal world we should be able to do interesting "outside PL" work (I would be careful with this "outside PL" qualifier, though; most actual applications, for example domain-specific languages for software-defined networking, are stuff that the industry would happily consider PL-related even though it does not touch the "programming experience" much) alongside more programming-centric work, without any of the topics or methodologies leaving the others in the dust.

I do believe that this pushing-out is happening to some extent, and this is a problem we should try to fix (you didn't seem particularly taken with my best idea to help so far...). But I don't believe (yet?) that this implies anything negative about the stuff that is better-accepted in PL academia today -- at least I personally fail to be uninterested in it. (I do, on the other hand, have little interest in some other programming-related computer science research, such as UML or "the internet of things"; but that is possibly because I don't know much about these fields.)

What does homotopy type theory have to do with PL?

If you are convinced that dependent types are important (which you are probably not, but bear with me), then homotopy type theory is important because it solves the issue of equality. Homotopy type theory is dependent types with equality done right: it has the right notion of equality for functions and for types, and it has user-defined equality on user-defined types.

It also does other things, like allowing mathematicians to do synthetic homotopy theory inside type theory, but from a PL perspective that is a side curiosity.

I've never seen dependent

I've never seen dependent types motivated outside of proof assistants (well, I did see a cool synthesis-oriented code completion demo once). I guess it could be argued that proof assistants are PLs (after all, Coq is a PL), and languages like Dafny definitely are (since they bring the power to the programmers), but a lot of that research is framed as pure verification/safety work that really doesn't feel like the same thing that I do.

Motivation for dependent types

Programming language design is for a large part about aesthetics. One of the biggest aspects of these aesthetics for me, which you may or may not agree with, is getting a lot out of a little. What I mean by this is that the number of rules or constructs in the system should be small, but the number of things you can do with them should be large.

In conventional languages you have two layers: the value layer and the type layer. You have a value like 3 and a type like Int. On the value layer you have functions: something that takes a value and returns another value. On the type layer you have generics: something that takes a type and returns another type (like List<Int>). Dependent types are based on the idea of unifying these two layers. Instead of having two separate layers you have one layer where types are simply values of type Type. Then generics are nothing more than functions that take a value of type Type and return another value of type Type: List(Int), just like sqrt(3.4).

There are no artificial limits. You can also write a function that takes an Int and returns a Type. The type of finite-length arrays is a function Array(T,n) that takes two arguments (the element type and the length) and returns a type. For a 3D game you could use Vector3D = Array(Float,3). Just like you can pass around sqrt as a higher-order function, you can also pass around List as a higher-order function. List, like sqrt, is just a value. sqrt has type Float -> Float, and List has type Type -> Type, but other than that there is no artificial distinction between them.
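
To make the Array(T,n) example a bit more concrete, here is a rough sketch in Haskell (my illustration, not the commenter's code; Vec and vhead are made-up names). Haskell only approximates dependent types -- the length has to live at the type level rather than being an ordinary run-time Int -- but it shows a type being built from both a type and a number:

{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Nat = Z | S Nat                  -- type-level natural numbers

data Vec (n :: Nat) a where           -- Vec plays the role of Array(T, n)
  Nil  :: Vec 'Z a
  Cons :: a -> Vec n a -> Vec ('S n) a

type Vector3D = Vec ('S ('S ('S 'Z))) Float   -- like Array(Float, 3)

-- Taking the head of an empty vector is a compile-time error, not a
-- run-time one: the signature only accepts non-empty vectors.
vhead :: Vec ('S n) a -> a
vhead (Cons x _) = x

v :: Vector3D
v = Cons 1.0 (Cons 2.0 (Cons 3.0 Nil))

main :: IO ()
main = print (vhead v)                -- prints 1.0

In a genuinely dependently-typed language the 3 would just be the ordinary integer 3; no type-level encoding of numbers would be needed.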

You can do this in a language like C# using reflection, but that's different. You are getting a run-time representation of a type, but this lives in a different layer than the types that appear in type signatures. You can't construct a type using reflection and then use that in a type signature, for example like so:

Type MyType = ... construct some type using reflection...;
MyType myValue = ...;

With dependent types you can.

That's the fundamental idea of dependent types. You get a big reduction in duplication (e.g. only one kind of function, rather than value-level functions and type-level functions, a.k.a. generics). It turns out that you can do mathematical proofs in such a system, but you can also do programming in it. Instead of having 20 extensions to a type system like Haskell's, each adding more power but more complexity, you get a system that is more powerful and simpler.

Maybe that sounds terribly complicated, I don't know, and dependent types are not exactly ready for prime-time yet, but I think this idea is very promising. If you think about your own interests regarding tooling, wouldn't it be nice if all those great tools you developed worked for types exactly as they do for values? Instead of a debugger for values and a type debugger for types, you have one debugger that happens to work for values of type Type.