Need to Talk

Someone has a need to talk.

the pond has gotten more crowded

One basic contributing cause is: There's too many fish in the pond of academia.


If academics are expected both to publish and to keep up with each other's papers, the burden on each academic is proportional to the number of academics (in related fields). This is fundamentally impractical. There is no way to keep up. Even filtering mechanisms fail: it becomes difficult to discuss and reach agreement on what are the 'must read' papers.

Consequently, most academic papers aren't even relevant to other academics, much less to industry.

There is a lot of crap that

There is a lot of crap that you have to know just to protect yourself in PC deliberations. The utility of most academic papers is practically zero.

But I thought we knew we

But I thought we knew we couldn't predict utility or impact so well, just if we learn something?
Or are you talking about papers that claim practical utility they don't have?

I mean, a lot of papers are

I mean, a lot of papers are written not to inform, not to share ideas, not to provide practical know how. They are just written to be a mark on someone's CV, and they aren't meant to be read by anyone beyond the PC judging them.

Community split?

Isn't that why academic communities specialize and split?
OTOH, I have read PL papers and thought "but isn't this already in $otherPLPaper", so this mustn't be working ideally.

but of course

(a) nothing ever works ideally, especially when it comes to humans.

(b) Conway's Law.

I would say that there are

I would say that there are too many crumbs being thrown into the pond for the wrong reasons, leading to fish that behave in perverse ways to get those crumbs (and they also reproduce for the wrong reasons).

re: There's too many fish in the pond of academia.

I think it's more like academia has become a site where some instructors fake having relevance to cling to a career and most researchers are mercenaries, pirating charitable or public funds for private gain. In any event, PLT discourse has become thoroughly detached from relevance. Endless discussions on LtU about topics such as "sapience" are probably a symptom.

PLT is More like Arcane Knowledge

The money was always in algorithms, libraries and frameworks anyway.

And with Scala as more or less the end-of-the-line in research, there's simply not a lot to be done anymore in PLT.

With advances in machine learning we're back to a focus on algorithms, I guess.

And with Scala as more or

And with Scala as more or less the end-of-the-line in research, there's simply not a lot to be done anymore in PLT.

Traditional directions in PLT research are pretty much at the end of the line. But that doesn't mean we can't reinvent our field, or have it subsumed by a new field that addresses how humans can use and form abstractions to build things by themselves (vs. machines doing the same).


Traditional directions in PLT research are pretty much at the end of the line. But that doesn't mean [my ideas are].

Make sure this sentiment covers my ideas, too. :P


Academics keep working on what they've been working on and pretend it's important because otherwise they have severe existential crises. As researchers, we are supposed to identify the next big thing, but we've lost our way into rehashing the same topics over and over again. So, it is not the case that the sentiment should cover "my" ideas, but that my ideas should not just be based on intellectual momentum, and should align with areas where we can do the most good.

I know it is a concept that is difficult to grasp. You can argue that I'm merely showboating my own research topics, but I just see the rest of the field as ignoring the huge opportunities right in front of their face.

Roy Levin (DEC SRC, MSR SVC) always used to first ask new/prospective researchers "what is the most important problem in your field", and then "why aren't you working on it?"

No accusations

I wasn't accusing you of showboating. I was just observing that you, like everyone else, are biased towards understanding the relevance of your own work.

I think the low quality of a randomly chosen CS paper is mostly just a consequence of Sturgeon's law and not a failing of CS in particular. I hear people complain about publish or perish. I'm not an academic, so I don't have strong opinions on the matter, but I have personally found much of the academic literature quite helpful in organizing my own thoughts on language design (I owe a debt of gratitude to those authors who've made their work freely available online).

Roy Levin (DEC SRC, MSR SVC) always used to first ask new/prospective researchers "what is the most important problem in your field", and then "why aren't you working on it?"

I never liked this question, or at least the implication that everyone should be working on the most important problem in the field. Why isn't everyone working on the most important problem in the world? He presumably understands why not, since he phrased it as he did, but it's the same answer to both questions.

Sorry for taking offense, it

Sorry for taking offense, it is just something I'm passionate about (and have gotten into vicious arguments about before). Everyone "believes" they are on the righteous path; if they are not true believers, then cognitive dissonance makes them believe just to keep their world views consistent. Besides Sturgeon's law, there is Gresham's law, where as a consequence of overvaluing one thing, you crowd out other more fruitful things. We have chosen a narrow set of topics to talk about, fund, and admit into the internal academic feedback loop.

I would separate out PL from CS in general...there are sub-fields where publications are not having such an adverse impact. In particular, fields that are relatively dynamic (improving the state of the art significantly every year) are quite immune to this (you can always find something useful to read at least).

I never liked this question, or at least the implication that everyone should be working on the most important problem in the field. Why isn't everyone working on the most important problem in the world? He presumably understands why not, since he phrased it as he did, but it's the same answer to both questions.

I think the point is that we should put a lot more consideration into why we are working on our topics...will success really move the needle? Taken literally, you are definitely correct. I brought this up because I feel it would be intellectually dishonest to not work on the area if I didn't think it was the right path toward needle moving.

Wait, you're not Matt Might?

Sorry if I ask, but you write "I'm not an academic", while I have assumed you were the same guy (and replied accordingly) for... forever?
And I've also always assumed you were an academic because of the research plans you describe and your familiarity with scientific literature.

If that was an error, I apologize for the confusion (even if invisible to most), but please take it as an attestation of esteem :-)

I Might not be

Nope, not me. Thanks for the compliment, though.


The original source of those questions is often misquoted. There was not any implication in the original that there was a unique most important problem:

Over on the other side of the dining hall was a chemistry table. I had worked with one of the fellows, Dave McCall; furthermore he was courting our secretary at the time. I went over and said, ``Do you mind if I join you?'' They can't say no, so I started eating with them for a while. And I started asking, ``What are the important problems of your field?'' And after a week or so, ``What important problems are you working on?'' And after some more time I came in one day and said, ``If what you are doing is not important, and if you don't think it is going to lead to something important, why are you at Bell Labs working on it?'' I wasn't welcomed after that; I had to find somebody else to eat with! That was in the spring.

In the fall, Dave McCall stopped me in the hall and said, ``Hamming, that remark of yours got underneath my skin. I thought about it all summer, i.e. what were the important problems in my field. I haven't changed my research,'' he says, ``but I think it was well worthwhile.'' And I said, ``Thank you Dave,'' and went on. I noticed a couple of months later he was made the head of the department. I noticed the other day he was a Member of the National Academy of Engineering. I noticed he has succeeded. I have never heard the names of any of the other fellows at that table mentioned in science and scientific circles. They were unable to ask themselves, ``What are the important problems in my field?''


The full quote is still an incredibly arrogant way to talk about research. The question itself is interesting, but "I have never heard the names of any of the other fellows", wow, wow.

It does sound that way. I

It does sound that way. I have wondered if he was completely serious, or if it was tongue-in-cheek. After watching the video I still can't decide. Either he has an extremely dry sense of humor, and perfect delivery, or just a low opinion of anyone who is not on their A-game.

It wouldn't be weird to have

It wouldn't be weird to have a low opinion of everyone who is intellectually dishonest, so...maybe they are just shunning the fakers, which we have plenty of in academia (the people that write papers for citations and CV building rather than for communicating).

Hamming vs. lame researchers

Either he has an extremely dry sense of humor, and perfect delivery, or just a low opinion of anyone who is not on their A-game.

I think it was partly his way of trying to combat laziness, complacency, obsequiousness, and timidity. Being a well-placed researcher, at least then, courted those vices.

Partly also it was an expression of a generational obsession with respectfully bucking the system -- flouting institutional rules in order to satisfy institutional goals better than otherwise would occur. "Why aren't you working on the most important thing?" has both personal and institutional answers.

The pragmatic military mentality that diffused into the private sector was of the strict authoritarian institution that thrived by inciting the breaking of its own rules, so long as the rule-breaking advanced the intention of the rules better than adherence.

On the personal level, such "good kind" of rule-breaking manifested as stubbornness, self-trust, loyalty, higher-purpose, calculated arrogance, etc.

You aren't working on what you think is important? What went wrong? Why aren't you fixing it?

p.s. Hamming

He had some visceral sense of what was important:


There is a reason a postgraduate qualification is called a doctorate of philosophy, "for the love of knowledge". Academic research should be done for the love of the subject, and to push the boundaries of human knowledge. Industry will develop what it needs if it makes commercial sense. The idea that research should be commercially relevant is not what universities are (were) for. Public funding should not be used to further the goals of commercial entities. It seems okay for companies to pay for research that furthers their commercial agenda, but not at the expense of furthering human knowledge.

There is a practical reason for this, research is not a straight line to a goal. If you aim at a goal you tend to severely limit discoveries. It is almost impossible to predict where the next breakthrough will come from. You need to follow your own enthusiasm and areas of interest, and keep looking at the open problems. Eventually a goal one step from some existing work becomes apparent, and you can then bring it together to solve a problem relatively quickly.

Research must be relevant.

Research must be relevant. Not commercially, but in ways of advancing our state of knowledge and having a chance of creating good outcomes in the future. Society must benefit from their funding in some way. This isn't about interpretive art!

There are plenty of examples of fields of research that just dried up and became obsolete, or were stuck too long in "normal science" phases without any revolutionary advances, and unreasonably resistant to paradigm shifts. This has nothing to do with commercial relevance, but relevance in general.

You can't plan relevance

The problem is we have no idea what is going to be relevant in the future. The greatest advances come not from planning, but serendipity where ideas from different fields that may appear irrelevant combine to produce unexpected results. See for example:
Mathematicians focusing on the goal and relevant research failed to solve the problem. What we need is more irrelevant research, combined with reviewing the open problems across many diverse fields.

Look at this from the

Look at this from the perspective of Kuhn: there is no point in doing research on Ptolemy's view of planetary motion when a better Copernican framework exists. Right now, we aren't exactly that wrong (after all, we are not discovering properties about the universe, our discipline is design based), but we are so set in our current frameworks that we refuse to consider alternatives. Most research builds on these frameworks, but they are leading us into dead ends...we are in crisis!

We need to reexamine our underlying assumptions and establish new paradigms, then we can go back to normal science. (tl;dr PL is due for a revolution)

Myth of the Objective

But there is no telling if Ptolemaic planetary motion might hold the key to some other problem, which if nobody studies it, nobody will have the tools to solve. This is what the article I posted was about, a real world example of a problem that resisted all direct attacks, and that held the key to many problems in different disciplines, including the travelling-salesman problem that had seen no improvements for decades. There has been a lot of work done on why goal oriented behaviour prevents innovation. Here's a good book on the subject

This suggests people need to each follow their own individual academic interests, neither following the existing frameworks (which I agree with you about), nor looking at what might be useful (which is your suggestion of a solution).

End of the line???

I appreciate your efforts to do different PL research, but I very much don't have sympathy for the claim that PL research is "at the end of a line". Especially not when FP is starting to become mainstream, and displacing C is starting to look at least *technically* possible (modulo, say, 5-10 years for language engineering and longer for the network effects to take place).

It's good that we start having, say, research such as Hanenberg's — they are among the more reliable we have, but I still find them only interesting early efforts. If I had a chance, I'd finance much more such research, but not at the expense of existing PL research.

And I'm not convinced that psychologists/social sciences/... yet have tools that work well enough for us to import (see the reproducibility crisis those disciplines, or even the life sciences, are experiencing).

Heck, for all I know, I'm not sure we'll understand human brains well enough before we can simulate one.

Also, I'm personally not so interested in what is intuitive to humans, and I'm much more interested in how we can teach humans mathematically/technically good programming models. Provocatively: current mathematical and CS education might have zero effectiveness, compared to what it might do potentially, so I'm not sure I'm so interested in designing PLs to work around such deficiencies in education.

Meanwhile, the current PL community is working on figuring out what they should be.

In particular, I'm a teaching assistant this semester with a course using How to Design Programs, and I believe it's both much better than traditional teaching, and that there is obvious space for incremental refinement of it, and for radical improvements of underlying mathematical curricula.

Again, this might still turn out to not work — so having different research would be valuable, and the current balance is unfortunate. But claiming so much more than that won't win my sympathy.



new directions for PLT

there's simply not a lot to be done anymore in PLT.

I doubt that. Try this:

Historic PLT work has been organized around a series of projects:

Project: Automate instruction-level optimization.
Milestone: FORTRAN
Business case: Raise programmer productivity.
Social impact: Enabled useful non-pro programmers (e.g. physicists).

Project: Lower the skill requirement for business programming.
Milestone: COBOL
Business case: Lower cost of programming; increase business users
Social impact: Incompatible implementations exposed need for rigor

Project: Rigorously standardize a general purpose programming language
Milestone: Algol
Business Case: Commodify hardware, software, and programmers.
Social impact: The invention of programming language theory as a subject of academic study.

The era and dynamic around Algol was profoundly influential and we're still getting over the hang-over.

There was a big push to bring "mathematical rigor and respectability" to programming. There was the recognition of functional style as an approach for that. Recursion was formally invented. Various approaches to mathematical semantics were invented. Parser generation was automated from high-level declarative specifications.

During that period, looking back at the literature, I think all the PLT practitioners saw themselves at the threshold of a new era. Forever after, every serious challenge of programming language design would be solved by mathematical deduction. Programs in general would become ever-more error free.

They were wrong, of course. By the 1970s they were out of gas.

Project: Design the division of labor for very large programs.
Milestones: Smalltalk, C++, Java, JavaScript
Business case: Treat software as a factory-style mass-produced commodity
Social impact: The de-professionalization of programming; dramatic reductions in average working programmer levels of skill and knowledge; bloat-ware.


Where does this leave PLT today? Judging by what I read on LtU, the fashion among PLT practitioners is to take the last two projects as the definition of the field:

Project: Rigorously standardize a general purpose programming language
Project: Design the division of labor for very large programs.

I claim, both of those projects are done. Both failed to achieve their goals. Both produced the negative result of proving their goals impossible. That is: We can rigorously specify and reason about programming languages with mathematical precision and in practice this is almost useless. We can organize the division of labor for very large programs and the outcome is that most big programs are unstable crap and the very few that are really robust are ultra-hyper-double-plus-expensive.

Meanwhile, the business world is going great guns using languages that are not rigorously specified, building big systems that have planned-for, very high, permanent defect rates.

Is there a future for PLT?

Needs new projects.

Some ideas:

a) Empower individual producers to the detriment of bloat-ware vendors.

b) Try to kill the parts of the software industry with the highest levels of employment.

c) Make art.

Killing the forest on the trees' behalf

You talk about killing off the software industry, as though you can somehow make an individual more powerful while making a group of individuals less so.

Suppose you're right. Then those individuals are godlike. They'll be more powerful than any government or nation or species, since governments and nations and species are groups. IMO, people that powerful will do all right without my charity.

Suppose you're wrong. Then if you succeed at killing off the dreams of the industry, you kill off the dreams of the individuals comprising it. What is an individual with a dream going to do now, if they can't share that dream with anyone without it being killed?

I've probably glossed over some nuance in the way you're using the term "industry" or "software industry." The way I'm looking at it, the groups that make academia tick are examples of industry too, so you're offering them a suicide mission. And if I can take what you're saying at face value, that doesn't seem to be what you have in mind.

Grace Hopper had no compunction about killing jobs

COBOL wasn't exactly meant as a friendly gesture to programmers, you know.

every new beginning

I don't think Thomas Lord is aiming to 'kill the forest' but rather to change it by enabling fewer people to do the same job, or making some jobs irrelevant (e.g. buggy whip manufacturing). Focusing on existing industries is just a good way to focus on existing needs - the idea of 'killing industry' is a way to preserve relevance without necessarily perpetuating the industry as it exists today. Of course, new industries will appear in whatever vacuum you create... and hence new targets.

It currently takes a team of hundreds of programmers and artists to create a top class video game. Whereas it only takes a few people (author, editors, publishers) to develop a top class book, or music, or other arts. What if we could make it so it takes only a few people - artists who can express interactive concepts with data or code - to develop top class games? (Why separate code and data anyway?)

Why is it so difficult for a scientist or artist to integrate tools? Why is it so difficult for users to move data produced in one application as input to another in real-time? Why is it so difficult for normal users to add features to applications to automate some painful task? Why do we separate our application API from our User Interfaces such that we need to write everything twice and hinder users from accessing the full power of our apps? Right now, every little change or integration of tools seems to require a professional programmer. There's a lot we could feasibly do to kill these jobs that shouldn't need to exist - killing parts of the 'software industry' because the tasks are simplified enough to get rolled into plain old 'computer literacy'.

OTOH, I strongly disagree with Thomas Lord's position that those 'division of labor' and 'rigorous PL' projects have been proven impossible. IMO, this is about as short-sighted as those who would have argued before 1903 that mechanical flight is impossible just because humans consistently failed at it for centuries before we succeeded. (Humanity hasn't even seriously worked at this newfangled CS and PL thing for even a full century yet.) There are many ideas, paradigms, and arrangements thereof that we haven't seriously tried, or that were promising but were abandoned for non-technical reasons (funding, inertia, hardware limits at the time, etc.), or that are still ongoing experiments (cf. the recent article on Bedrock which addresses both division of labor and rigor).

Why separate code and data

Why separate code and data anyway?

Because data endures long after code dies, so data is simply more valuable. Coupling it to code that will die before the data loses its value is thus a terrible idea.


Unnecessary coupling is bad regardless of whether you're talking of source or data. I think David's point was to ask where the line is between code and data. If an artist applies a series of tools and filters to arrive at an image, surely we want to keep that source input and not the final BMP. If the source input can be recorded in such a way that it's possible to vary parameters used along the way and obtain a new output image (which sounds good to me!), then it starts to look more like code and less like data.
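
To make the artist example concrete, here is a minimal sketch in Python of an edit history recorded as plain data and replayed with different parameters. Everything in it is an assumption made for illustration: the "image" is just a list of grayscale values, and the filter names `brighten` and `threshold`, the `FILTERS` table, and the `replay` function are hypothetical, not taken from any real tool.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for an image: grayscale pixel values in 0..255.
Image = List[int]


def brighten(img: Image, amount: int) -> Image:
    """Add `amount` to every pixel, clamping to the 0..255 range."""
    return [min(255, max(0, p + amount)) for p in img]


def threshold(img: Image, cutoff: int) -> Image:
    """Map each pixel to white (255) if it is >= cutoff, else black (0)."""
    return [255 if p >= cutoff else 0 for p in img]


# Illustrative registry of the tools the artist can apply.
FILTERS: Dict[str, Callable[..., Image]] = {
    "brighten": brighten,
    "threshold": threshold,
}

# The recorded history is ordinary data: (filter name, parameters) pairs.
Recipe = List[Tuple[str, Dict[str, int]]]


def replay(source: Image, recipe: Recipe) -> Image:
    """Re-run the recorded steps, in order, against the source input."""
    img = source
    for name, params in recipe:
        img = FILTERS[name](img, **params)
    return img


source = [10, 100, 200]
recipe: Recipe = [("brighten", {"amount": 50}), ("threshold", {"cutoff": 128})]
result = replay(source, recipe)

# Vary one parameter along the way and obtain a new output image.
varied: Recipe = [("brighten", {"amount": 0}), ("threshold", {"cutoff": 128})]
result_varied = replay(source, varied)
```

The `recipe` list can be stored and diffed like any other data, yet it only means something once `replay` interprets it, which is roughly where it starts to look like code.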

Data doesn't endure past the

Data doesn't endure past the code to access it. It wasn't so long ago that many documents and files used binary formats specific to their word processors, before XML became a popular thing. Nor, in many cases, does data endure long past the code to produce it, e.g. because the data goes out of date. Software bit rot affects code and data both.

For data to survive, it is better that it be easy to process (e.g. simple structure, easy parsing and processing) and largely disentangled from its context (avoiding cyclic dependencies and stateful coupling). For code to survive, it benefits from those same properties. Today, we artificially separate code and data, trying to address survivability problems for 'data' while we continue to create cyclic code with sophisticated parsers and processing that is coupled to runtime states.

Meanwhile, the artificial separation creates challenges for features like: transparent procedural generation, living documents with spreadsheet-like properties, creation of interactive artifacts like embedding tutorials in blog posts or tabletop gaming over a bulletin board, DVCS control over application state in an OS, etc.

Data doesn't endure past the

Data doesn't endure past the code to access it.

It does when the format is standardized, like XML, and it ignores a big point behind decoupling code and data: reimplementations for non-technical purposes. If data and code were inextricably coupled, an open source implementation of, say, a .doc Word document editor like LibreOffice wouldn't be possible.

The databases of government statistics, or the data generated by the LHC will long outlive the programs that were used to generate them, and new programs will be written to access them in different ways. Coupling the original code with the data simply adds no value.

Certainly there are circumstances when including code can add value, as Matt pointed out, but I'm skeptical that it should ever be the default.


Unnecessary coupling or entanglement seems plenty problematic even between 'code and code' or 'data and data'. So I don't see any strong reason to assume 'code and data' is a special case for decoupling purposes.

Standardization of XML or specific schema doesn't seem different from standardization of a PL. Arguments about proprietary vs. open source software seems little different from arguments about other proprietary IP based on copyrights and trademarks. Government statistics and LHC data would still be reusable in many applications if represented as a module of code, perhaps trivially exporting a typed list.

What I am really asking is perhaps better phrased as: "Why distinguish code and data?" The alleged utility of the distinction between code and data is not clear to me. What I do see clearly are the barriers we've raised in the attempt to distinguish them.
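
One way to picture "a module of code, perhaps trivially exporting a typed list" is the sketch below, written in Python. It is only an illustration under stated assumptions: the `Reading` type, the `READINGS` name, and the numbers are made-up placeholders, not real government statistics or LHC data.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Reading:
    """One row of a hypothetical statistics table."""
    year: int
    population_millions: float


# The "data" is an ordinary typed value exported by the module.
# These values are invented placeholders for illustration only.
READINGS: List[Reading] = [
    Reading(2000, 1.0),
    Reading(2010, 1.5),
    Reading(2020, 2.0),
]


def growth(readings: List[Reading]) -> float:
    """A consumer written later; it only depends on the exported list."""
    return readings[-1].population_millions - readings[0].population_millions


total_growth = growth(READINGS)
```

Whether this counts as code or data is exactly the question being asked: a consumer such as `growth` uses `READINGS` the same way it would use a parsed file.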

Both, and neither

It is a spectrum. When we try to put barriers along the spectrum we can get both good and bad outcomes. Good ones are that we use Markdown instead of binary Word2016 when writing our Great American Novel. Bad ones are that we never consider all the problems we create by seeing data as 'qualitatively' different from code.

And there's probably more than one spectrum axis, I'd hazard to guess.

One is liveness, as in: I printed it out and I don't need any code to "run" it! Vs. this source code can't really "do" anything until some "live" code "runs" it. (Humans suck at running code in their head.)

Another is human readability. My dead tree Great American Novel can be encoded as ASN.1 binary format(s) which are bad, whereas JSON is good.

There's probably other axes, but taking those 2 together, one can consider ASCII Scheme source code: I can print it out and read it and everything. But it doesn't do too much good (for anything other than a really lame program like hello world) unless we have something to run it. And we won't find all the bugs in it just by reading and pondering the code.

(And in the long run yes it would be great if our Great American Novels had the option of being more than just dead trees cf. all those eReaders that could be doing more, even if it is only something as simple as using hyperlinks to make choose your own adventures.)

Interpreting, Parsing, Rendering

Simplicity of processing (for implementing and understanding the algorithms) seems to me of greater concern than whatever category we humans might classify this process under, e.g. "running" it. Some PLs, like Forth or Scheme or Brainfuck, are simpler than many data languages. (Of course, this isn't the only important attribute.)

And you probably won't find all the bugs (grammar and spelling errors, inconsistencies, plot holes, etc.) in a novel you write, either. I'm not making any assumptions of correctness for code or data.

I like the spectrum analysis.

Standardization of XML or

Standardization of XML or specific schema doesn't seem different from standardization of a PL.

Except PLs get compiled to machine code which isn't so easily decompiled, unlike data formats like XML.

Government statistics and LHC data would still be reusable in many applications if represented as a module of code, perhaps trivially exporting a typed list.

Until you change to a different CPU instruction set for subsequent analyses. The original source might be lost so it can't be recompiled for the new arch, or a compiler for the new arch may not exist. Worrying about code along with data just magnifies the dependencies you need to juggle.

I think ultimately there's great value in simply utilizing a common data format like Cap'n Proto. Something that's expressive enough to model graphs, with bindings for any language you need. I hope future languages simply use it for native serialization.

Processed data

Compiling code is just one way to preprocess it. Sure, it isn't reversible. But as a complaint, that seems analogous to "but there are non-reversible functions on XML". Compiled code is directly related to partially evaluated data or cached computations. So another analogy is "but you can't recover source XML from my cached results". There are plenty of cases IRL where you're fed cached computations in difficult to process formats rather than source data, e.g. weather predictions in natural language.

Are you assuming code will be machine code? Conflating code and data doesn't imply favoring the worst possible language for either use case. Machine code is worse than Brainfuck by metrics like portability, predictability, and securability. Rather, I'd argue we should design code languages that have nice properties we associate with data and vice versa. Portability to different CPUs is not even a difficult property for code.

Cap'n Proto has some nice properties as a transport, but it still assumes that problematic barrier between code and data. A barrier that requires professional programmers to make two apps talk in even the simplest of ways.

Code can be very stable

I bet there is a lot of Cobol code on back-office systems that has been running since the 1960s. An FBP program designed by J. Paul Morrison has been in execution for at least 40 years through many generations of IBM mainframes. The Unix cal(1) program remained unchanged, except for i18n modifications, from the 7th Research Edition all the way to Solaris 10. And you can run any Research Unix program using simh, the PDP-11 simulator, on any mainstream system today.

Destructive/constructive spin

I don't think Thomas Lord is aiming to 'kill the forest' but rather to change it by enabling fewer people to do the same job, or making some jobs irrelevant (e.g. buggy whip manufacturing). Focusing on existing industries is just a good way to focus on existing needs - the idea of 'killing industry' is a way to preserve relevance without necessarily perpetuating the industry as it exists today. Of course, new industries will appear in whatever vacuum you create... and hence new targets.

I think it makes a difference when we say "killing industry" versus "meeting existing needs."

An explicit goal of killing the industry will sometimes be achieved not by trying to meet existing needs, but by developing all-new problems that sabotage and distract the people who had been trying to meet those needs.

On the other hand, an explicit goal of meeting existing needs will sometimes be achieved not by making industry leaner, but by seeking out all-new problems that the bloated industry can tackle. This meets an existing need, namely the need to find an effective use for existing resources, but it could go too far.

So the destructive and constructive goals are both somewhat negative if we apply enough rhetorical spin, but there are some extra details that break the tie.

The paths to the destructive goal have the side effect of severing communications and turning people against each other. The actual act of destroying the world would be a positive step toward the goal of killing industry, so if we set our minds to it, we could sever practically all collaboration. But we could also destroy the world by accident, so I don't find it important to invest in. (Besides, I have emotional scruples over the thought of turning people against each other.)

The paths to the constructive goal have the side effect of fostering more communication and collaboration. At worst, they can foster too much collaboration in unfortunate shapes, giving us monopolies and centralized control structures that resist innovation. But I think we are within reach of social and technological techniques to dissipate these structures across the whole population, and (trivially) the whole population is already a monopoly, so I'm optimistic that we can avoid being stagnant on this path.

In conclusion, I hope academia does not make an outright enemy of industry. Using the "killing" of industry as a metric to help judge the relevance of research is tempting and interesting, but actually reducing it to this metric would be dangerous.

twisted message

I certainly agree that "kill the industry" is a phrase ripe for misinterpretation, likely to make undesirable allies and unnecessary enemies. We need a motto that's more politically appealing for our insidious plot to destroy bloat and make industry leaner. :)

re I strongly disagree with Thomas Lord's position

I strongly disagree with Thomas Lord's position that those 'division of labor' and 'rigorous PL' projects have been proven impossible. IMO, this is about as short-sighted as those who would have argued before 1903 that mechanical flight is impossible just because humans consistently failed at it for centuries before we succeeded.

I am open to persuasion but your analogy is no good.

I claim research solved the technical problem of programming in the large, but in practice the economic use of this solution is mostly to build very bad, somewhat out of control systems.

I claim research solved the technical problem of mathematical semantics, but in practice the result is so unimportant that the most popular languages lack an articulated mathematical semantics.

An analogy would be if the Wrights had solved heavier-than-air flight, maybe even built modern jets, yet these turned out to be more of a novelty than a useful tool.

Partial solutions

You mentioned that robust programs are "ultra-hyper-double-plus-expensive". Practical robust programs? That's clearly your real 'heavier than air' flight goal. Mathematical PL semantics is insufficient but probably useful as a component, analogous to just determining a good shape for a wing in a wind chamber and doing some gliding.

Projects like 'rigorous semantics for PL' aren't whole solutions to larger goals, just small steps and experiments that seem likely to contribute to a whole solution. They certainly don't "produce the negative result of proving their goals impossible" as you've claimed.

I don't think we've even begun to touch all the challenges surrounding very large programs. It's still difficult to integrate, compose, and reason about remote services, for example. Nor have we experimented much with or reached any limits for division of labor and cooperative development. That's a whole field by itself, CSCW, and a relatively new one. AFAICT, all past PL projects and paradigms aimed at division of labor are based on untested hypotheses from language designers who don't even study cooperative work or emergent behavior of human systems.

The first planes and flights of the Wright brothers were a novelty. But, more relevantly to the analogy, they were the result of many years of research and experimentation developing components that would independently have appeared to be useless novelties and failures.

Some a, c, and Generics.

I don't see Scala as the end of PLT research; it's got many problems, and I find its type system to be an inelegant mashup.

I am interested in a, c, and a bit of: "Increase productivity, improving the generality of code by better facilitating generic programming". This involves changes to remove the necessity of boilerplate and repetition, and to allow obvious things to be implied.

"improve readability and maintainability of code" is another interesting one.

"Killing the software industry" seems a bad idea; support and contribute to open source instead. It's better to contribute to the solution :-) Perhaps you mean "make the industry unnecessary because open source is so good", but I think there will always be niches that people just don't want to write stuff for, meaning companies will need to charge for bespoke development, or niche products for industries that don't have enough people in them to support a community of part-time programmers.

Edit2: I think the following model is a good one. Companies keep their latest feature code to themselves and proprietary. However, once everyone in that industry has developed the code to do a particular thing, it no longer provides a competitive advantage, so the cost of each company independently maintaining their code becomes unnecessary, and they should get together to standardise an open source implementation, the maintenance cost of which they can all share.

Missed one

Here is another for your list (still ongoing):

Project: Design a standard method for inter-program communication.
Milestones: Unix pipelines, RPC, S-expr, XML, JSON ... (too many to list)
Business case: Allow programs to be used modularly to allow engineering of large-scale systems.
Social impact: Remove barriers of adoption and change the scale of granularity in the software industry.

This is another one that has never reached a conclusion. There are many "standards" for doing this, and none of them are either a) general enough to displace their rivals, or b) concrete enough to avoid becoming the dominant overhead in a project.

In some sense this is one project that will never be completed within PLT, as it is based on the hopeful aspiration that the following process converges to a fixed point (cf. xkcd 927):

1. Take a set of languages that we need to communicate between.
2. Take a "reasonable" set of programs within those languages.
3. Define an encoding for data used in those programs, and a protocol to allow communication of that data and the assumptions covering it.
4. Realise that step 3 became too verbose and complex to be useful.
5. Write a framework to try and allow the result of step 3.
6. Watch as the framework inevitably soaks up new use cases and complexity until it is redesigned as a language.
7. Do programs in the new language communicate cleanly with one another and their crusty legacy, or do we jump back to step 1?
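Step 3 and its fallout can be seen even at toy scale. The following is a minimal Python sketch, assuming JSON as the chosen encoding; the `record` and the `encode` helper are hypothetical, made up for illustration. It shows how an encoding silently drops assumptions about the data (tuple versus list, set versus list), which is exactly the gap that step 4 onwards tries to patch.

```python
import json

# Hypothetical record one program wants to send to another.
record = {"point": (1, 2), "tags": {"a", "b"}}

def encode(obj):
    """Step 3: pick an encoding. JSON has no tuples or sets, so we must
    decide how to map them; here sets become sorted lists."""
    def default(o):
        if isinstance(o, set):
            return sorted(o)
        raise TypeError(f"cannot encode {type(o).__name__}")
    return json.dumps(obj, default=default, sort_keys=True)

wire = encode(record)         # what actually crosses the program boundary
received = json.loads(wire)   # what the other program sees

print(wire)                   # {"point": [1, 2], "tags": ["a", "b"]}
print(received == record)     # False: the tuple and the set did not survive
```

Recovering the lost distinctions means shipping a schema or convention alongside the data, and that is where the framework of step 5 starts to grow.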

Sure we could point out that we're all going in circles retreading ground that has been trudged over for 40 years, or we could stop worrying and learn to love the bomb.

suggestion: reverse the polarity

I wonder whether the quality of academic research would be improved if the goal of researchers was to kill off the industry.

Hear me out.

Measurable results

It's easy to see objectively whether research is killing off industry. The main symptoms to watch for are:

  • big firms going out of business (or nearly so) because...
  • researchers have disrupted them with non-commercial substitutes
  • users have chosen to flock to those substitutes

Emphasis on collaboration

A world rich in computing but lacking a computing industry would have to rely on the loosely coupled cooperation of developers and users around the world. (Perhaps something vaguely like the way libre software projects work, only more successful.)

Mirroring that no-industry goal, academic rivalry would take on a complementary incentive for cooperation and collaboration among small research groups and users, scattered in time and space.

Multi-disciplinary cross-pollination

To win users away from the clutches of industry, researchers will have to understand users as they are; to better understand how computing fits into the larger world; to better communicate past industry and with the users directly.

A renewed emphasis on ethics

Consider what a colossal brain drain firms like Google and Facebook have become. And on the basis of what? Advertising!

Advertising, for goodness sake. Advertising and mass surveillance. Mass surveillance and mass profiling. Mass profiling and the ever more articulated manipulation of user attention, money, and social behavior.

And what does the mass of capital from advertising turn into? Where does the attention of the brain-drain itself wander? Naturally it wanders to law-breaking business models, military robotics, and cheerful collusion with oppressive states.

Brought to you mainly by the revenues from hawking loans, lawyers, and weight loss miracles.

Academics are supposed to bow to that?

The proposition Matt Welsh offers is that not only should the owners of a near monopoly on ads have unprecedented control over the direction of computing research, but additionally, academic researchers who don't join the mothership should concentrate on making their work "relevant" to those industrialists.

Well, what greater "relevance" can there be but to put those guys out of business by democratizing (little "d") and socializing research?

Racking up points

The current game of chairs only really values points accrued through publication. It is a form of measurement that worked well in a much smaller-scale academia. It is completely broken at the current scale. It relied heavily on controlling supply (people writing papers) indirectly through artificially tweaking demand (peer review standards at a small number of gateways). The problem is that this model of supply/demand is completely at odds with the actual supply/demand that controls academia: governments want more scientific "output" and insist on metrics that are easy to measure (number of publications) instead of metrics that are accurate (because they take years or decades to measure).

The only natural outcome of this imbalance is a race to the bottom: the vast majority of academics seem to believe that their colleagues should "uphold standards" while they personally maximise output in Minimum Publishable Units. It is a shame, but it is also guaranteed to happen when incentives do not align with desired outcomes. Academics are only attempting to act as rational actors in a broken system.

Things will continue to get worse for longer than people believe possible (much like the inflation of a bubble in any other marketable good), and then when it pops it will involve a very abrupt transition to another dominant metric / set of incentives. If anyone knew what that should be now then we would already be approaching the phase transition. What will prompt the sudden need for a change is a very simple trend that is already visible:

  • Software gets more complex over time.
  • Academia is supposed to investigate the new frontier of knowledge.
  • There is a fixed level of complexity that any individual researcher can manage.

The amount of complexity in interesting software systems is increasing monotonically. The amount of complexity that an individual researcher can manage is not. The number of authors per interesting publication will rise over time. The minimum size of team that is required to operate on the frontier will increase as will the specialisation of roles within it.

At some point the measurement system for output will have to expand beyond "scraped name onto paper for undisclosed reason" (e.g. contributed an idea, did some of the writing, knew one of the authors really well, sat through a series of meetings in silence, was paid in kind for previous co-authorships in a publishing circle...). A more interesting form of measurement would be explicit credit for: contributing ideas, doing actual (implementation) work, providing data, providing (non-specific) code, ongoing maintenance, experimental design, experimental work etc...

Programming languages?

This post is interesting, and it is mainly addressed to the systems community. Programming Languages (PL) and systems have much in common, but I would not say that their approach to the industry is the same. There are several different criticisms in the blog post, but I think that not all of them equally apply to the academic PL community.

- studying the wrong problem: this may be a PL issue too
- being misguided or delusional about applicability to the industry: I don't think this part of the criticism applies.

It would be quite unfair to say that PL research does not have an impact on the industry -- although PL research topics do have a long maturation period before any form of mainstream appropriation. The research on statically typed functional programming has produced many ideas (type inference, algebraic datatypes and pattern matching, generics, rich type systems) that most languages are adopting in some form or another. Lisp research is still going strong and continues bringing improved understanding of meta-programming (same could be said of Smalltalk research, with mirrors for introspection for example). GC technology is now mainstream. Some ideas, such as monads, have changed the way we think about programming.

I would also cite proof assistants as a family of tools that have had a disruptive effect on our vision of what could be done with software, and may have a significant lasting impact on (some part of) the industry in the future.

There would be an interesting comparison to be drawn with some of the problems mentioned above. Some systems ideas only make sense at the scale of gigantic companies and have to be tried in the wild to be evaluated. A PL analogy to this problem may be the "tooling/IDE" problem: there are some language things that only large teams of programmers seem to be able to pull off (an IDE for Java, a good JIT for Javascript).


It would be quite unfair to say that PL research does not have an impact on the industry

And also unfair to say that systems research doesn't have an impact on industry. It has, and Matt cites several examples. Nor, as Matt states, does research *have* to benefit industry or solve industry problems. But it's also true that a lot of research that claims to be industrially relevant actually addresses problems that are already solved in industry, not actually a problem in practice, or reaches a solution by making simplifying assumptions that render the solution unusable in industry conditions. I have no doubt that you could look through recent PL research conference proceedings and find examples for any of these categories.

A PL analogy to this problem

A PL analogy to this problem may be the "tooling/IDE" problem: there are some language things that only large teams of programmers seem to be able to pull off (an IDE for Java, a good JIT for Javascript).

This is not accurate. We just haven't thought about these problems very much (programming is syntax, semantics, and a textual compiler, nothing more) AND the techniques that are considered orthodox for doing these things are completely inadequate. There really is not much difference in complexity between writing IDE support and building a compiler...or even a good JIT. At the research level, there are no good reasons why these can't be pursued by grad students working individually or in small groups.

Industry is still reaping the fruits of 20 years ago, but what new stuff emerging today do you actually think will have a disruptive impact tomorrow? We have multiplied a lot (many more researchers than before, much bigger conference programs), but the fruit being picked gets higher and higher up the tree.