Research in Programming Languages

Interesting blog post by Crista Lopes. Here is some text from the bottom that struck a chord with me:

In order to do experimental design research AND be scientifically honest at the same time, one needs to let go of claims altogether. In that dreadful part of a topic proposal where the committee asks the student “what are your claims?” the student should probably answer “none of interest.” In experimental design research, one can have hopes or expectations about the effects of the system, and those must be clearly articulated, but very few certainties will likely come out of such type of work. And that’s ok! It’s very important to be honest. For example, it’s not ok to claim “my language produces bug-free programs” and then defend this with a deductive argument based on unproven assumptions; but it’s ok to state “I expect that my language produces programs with fewer bugs [but I don't have data to prove it].” TB-L’s proposal was really good at being honest.

We've talked a little about programming language design research before.

This is true in all fields of learning, though often forgotten

Here's Northrop Frye, scientific literary critic, on T.S. Eliot, literary literary critic (emphasis added):

Mr. Eliot's essay The Function of Criticism begins by laying down the principle that the existing monuments of literature form an ideal order among themselves, and are not simply collections of the writings of individuals. This is criticism, and very fundamental criticism. Much of this book attempts to annotate it. Its solidity is indicated by its consistency with a hundred other statements that could be collected from the better critics of all ages. There follows a rhetorical debate [...] maintained against Mr. Middleton Murry, who is spoken of approvingly because "he is aware that there are definite positions to be taken, and that now and then one must actually reject something and select something else." There are no definite positions to be taken in chemistry or philology, and if there are any to be taken in criticism, criticism is not a field of genuine learning. For in any field of genuine learning, the only sensible response to the challenge "stand" is Falstaff's "so I do, against my will." One's "definite position" is one's weakness, the source of one's liability to error and prejudice, and to gain adherents to a definite position is only to multiply one's weakness like an infection. -- Anatomy of Criticism (1957), "Polemical Introduction"

Maybe the context is too

Maybe the context is too little in this quote, but I have to disagree on two fronts. At the extreme, the above attitude could lead to a total loss of all certainty about knowledge of any kind. Should we, for example, consider the existence of atoms as a "position" to be taken, since it requires some level of belief (though the experimental evidence is now very strong)--and if so, would it not be one of those evil things which is a "source of liability and error and prejudice"? In my view, science has been repeatedly advanced by people who come upon an idea, a hypothesis, reflect upon it, test it, and then come to BELIEVE it with enough fervor and passion (tempered by mounting experimental evidence) that they convince others to believe it. After all, Einstein developed the theory of relativity years before an opportunity for it to be evaluated ever presented itself. In fact he developed the theory in almost total isolation from any experiment that could have confirmed it (the Michelson/Morley experiment had instead disproved the luminiferous ether theory). In my opinion, the only way such a person could dedicate themselves to developing a theory like this is to _believe_ that they are on the right track. That becomes a lens through which experimental evidence (contradictory as it so often is) is viewed, and a "position" in a sense. It's not necessarily a position in a permanent sense, but at the very least a "paradigm."

The extent to which positions are valid or even useful in Science is a function of how much is known about a field, how easy experiments are to perform within it, how backward-looking it is, and how falsifiable the predictions of its theories are. The more backward-looking the field, the more that is known, the easier the experiments, and the more falsifiable the predictions, the less useful positions are, since they can be rapidly validated or disproven. The more forward-looking, the less that is known, the more complex and ambiguous the experiments, and the less falsifiable the predictions, the more important a position, a paradigm, and a worldview become.

I might posit that this discussion is somewhat tangential, since programming language design, in my view, is only scientific in some narrow aspects.

Whether there is some

Whether there is some relativism in science is not relevant to the point Crista is making, which lies more in the distinction between science and design.

Do you believe with 100%

Do you believe with 100% certainty that there are atoms, as opposed to say 99.999%? There is a very important distinction between the two. With the former you will continue to believe in atoms regardless of any incontrovertible evidence to the contrary. I think this is what he is arguing against.

No, I don't. But it is my

No, I don't. But it is my "position" that there are, and it is going to take a lot of very strong evidence to convince me otherwise. When presented with new information about physics, or strange phenomena in chemistry, I interpret it with respect to my "position" that matter is made of atoms. I believe some things about PL design with much less certainty, yet I still approach some problems knowing what I know and with a certain "hunch" that might be called a "position".

Interesting bits and discussion

There are a lot of different aspects of this article that I found very interesting. There is indeed a discussion of the activity of "design" and its potential opposition to the "usual" empirical scientific methods. I can see why Sean finds it interesting and, indeed, this is a valuable question and discussion that he has already started.

The comments are also interesting, though it seems the discussion has attracted eccentric people who may be making some kind of snake-oil claims. The comment by Matthias Felleisen is a must-read.


In the discussion of the application of the scientific method (in the original blog post, not the comments), I found something else: the idea that maybe Computer Science is a radically different field that needs a *different* kind of scientific method.

it’s clear to me that software systems are something very, very special. Software revolutionized everything in unexpected ways, including the methods and practices that our esteemed colleagues in the “hard” sciences hold near and dear for a very long time. The evolution of information technology in the past 60 years has been _way_ off from what our colleagues thought they needed. Over and over again, software systems have been created that weren’t part of any scientific project, as such, and that ended up playing a central role in Science. Instead of trying to mimic our colleagues’ traditional practices, “computer scientists” ought to be showing the way to a new kind of science [..]. I dare to suggest that the something else is related to the design of things that have software in them. It should not be called Science. It is a bit like Engineering, but it’s not it either because we’re not dealing [just] with physical things. Technology doesn’t cut it either. It needs a new name, something that denotes “the design of things with software in them.” I will call it Design for short, even though that word is so abused that it has lost its meaning.

The way I understand it, the term "design" here is inappropriate, or at least it's not used in the sense that Sean intends, that is -- if I understand correctly -- taking inspiration from the methods of actual designers to evaluate the creative aspects of the discipline. The way I personally read the paragraph above, it is still thinking in terms of a "scientific method", but saying that maybe it needs to be a different method from what the other sciences use.

While that may be considered an admission of weakness -- I suppose there are fields where the research is simply not good enough to meet existing scientific standards, and I can imagine people in those areas burying their heads in the sand by claiming that they need to be evaluated by different standards -- I am tempted to give that thought-provoking idea a chance. It does not look absurd: there are already vastly different evaluation practices in different fields called "science". The activity of a mathematician (or theoretical physicist) is completely different from the empirical methods that physics and some other sciences promote. Some fields of computer science clearly draw more from the mathematical tradition than from the empirical/experimental one. I would have said that it's ok for different subfields to use different existing evaluation methods, but maybe there is a place for a new evaluation method that would be more appropriate for Computer Science. At least, it can't hurt to think about it. It seems clear that the use of computers did change the practice and evaluation techniques of other scientific fields -- for example through the ever-increasing importance of numerical simulation -- so the idea of new evaluation methods isn't absurd.

I also have a personal intuition that empirical methods of the kind "let's put dozens of grad students in a lab and see if we can get a productivity difference between those two tools" are not the right and only way to evaluate type systems. The article has an interesting remark about that:

I have seen even more research and informal articles about programming languages that claim benefits to human productivity without providing any evidence for it whatsoever, other than the authors’ or the community’s intuition, at best based on rational deductions from abstract beliefs that have never been empirically verified. Here is one that surprised me because I have the highest respect for the academic soundness of Haskell. Statements like this “Haskell programs have fewer bugs because Haskell is: pure [...], strongly typed [...], high-level [...], memory managed [...], modular [...] [...] There just isn’t any room for bugs!” are nothing but wishful thinking. Without the data to support this claim, this statement is deceptive; while it can be made informally in a blog post designed to evangelize the crowd, it definitely should not be made in the context of doctoral work unless that work provides solid evidence for such a strong statement.

I don't think that empirical data is needed to make meaningful statements about programming language semantics, and I don't feel that this subfield is fruitless in the absence of such "scientific methods". When you prove a theorem, you don't run controlled studies among your students to check that they are convinced of the proof with statistical significance -- though there certainly is a social process. Empirical methods would certainly be essential, and we need more of them, to make the link between the formal properties that are currently evaluated and the actual practice of programmers in the field. The Haskell propaganda cited above could be said to carelessly jump from the formal properties of the language -- which are well-established by current scientific practices, and which must be the "academic soundness" the paragraph refers to -- to practical advantages in the real world, which have been less rigorously evaluated, probably because that is not the central area of interest of the Haskell research community.
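To make the distinction concrete, here is a minimal sketch of my own (a hypothetical example, not taken from the article) of the kind of formal guarantee that current practice does establish: the type system forces the "value may be absent" case to be handled, so one specific class of mistake does not even compile.

```haskell
-- Minimal sketch: Map.lookup returns a Maybe, so the "missing key" case
-- cannot be silently ignored; the program will not type-check if we treat
-- the result as a plain String.
import qualified Data.Map as Map

type UserId = Int

users :: Map.Map UserId String
users = Map.fromList [(1, "alice"), (2, "bob")]

greet :: UserId -> String
greet uid =
  case Map.lookup uid users of
    Just name -> "hello, " ++ name
    Nothing   -> "unknown user"  -- omitting this branch is flagged with -Wall

main :: IO ()
main = mapM_ (putStrLn . greet) [1, 2, 3]
```

That kind of property is well-established by formal means; the jump from there to claims about overall bug counts or programmer productivity is exactly what would need empirical support.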

I'm going to disagree with

I'm going to disagree with Matthias's assessment. There is plenty of theory, science, and engineering going on in software that pretty much sticks to the principles of these activities as they are performed in other fields. But design is fundamentally different; let's call it "invention" if the word "design" has lost its meaning. Invention can be informed by science and theory and can involve engineering, but it is a completely different activity whose output cannot be measured in the same way as the other activities. But then, invention in other fields is pretty much the same; how do we evaluate a new kind of wheel or a better potato peeler? I completely reject the idea that we are somehow new and unique in this regard.

I hope we don't become like the HCI community, who are increasingly trying to cloak their design (invention) activities as science by throwing pointless user studies into their papers, just to satisfy PC requirements. There is definitely space for empirical methods in our field when science is actually being done; it's just that science isn't always what we are doing or even what we need to be doing. Perhaps the only reason we are unique is that there is a lot of pressure to invent, which might not be the case in other fields. However, I really doubt that this is the case; more likely, design activities in other fields just don't get published the way we expect our work to be.

Some supporting discussion

Some supporting discussion from 1998 by Rickert. Excerpts:

The key theme seems to be that inventions come from inventive individuals, not from any particular method, and certainly not from focus groups or marathon "ideation" sessions. I am not suggesting soliciting users' views or collaboration with colleagues is unimportant; however, they may be overrated as invention contributors. Nor am I suggesting that science and theory are not important, quite the contrary. In his seminal work, Creativity (1996), Mihaly Csikszentmihalyi stresses that inventive individuals must first master the "symbolic systems" of their fields. The symbolic system may be grammar and style, mathematics, music theory, engineering fundamentals or theories of human-computer interaction. Picasso first mastered conventional techniques before inventing his own revolutionary style. Likewise, mathematical breakthroughs generally require a very sophisticated understanding of fundamental principles. Inventions are new ways of doing things. If one is to create a new way, he or she has to be familiar with the old way, or the dominant paradigm.

Invention is a separate activity from science/theory, but science/theory definitely informs invention. Actually, this shouldn't be surprising to any of us.

Interesting notion, that invention precedes theory. Can this be true? The notion flies in the face of conventional wisdom, which holds that inventions are developed from methodical application of scientific theories. Many who have studied the history of invention and innovation have questioned the conventional wisdom. In his book, The Evolution of Technology (1988), George Basalla concludes that inventions take place in gradual steps of small improvements over previous inventions. Scientific theory, while often playing a significant role in the education of the inventor, plays a minor role in the actual invention process.

We invent things that we don't completely understand, where understanding comes later...

This seems to be particularly true in the field of human-computer interaction. Carroll, Kellogg, Rosson, as well as Barnard (Carroll, Kellogg and Rosson, 1991; Barnard, 1991) observed that innovations in the design of user interface artifacts have almost always preceded theory, rather than the other way around, e.g. the case of direct manipulation. In other words, designers design solutions to things they perceive as problems. Some of these attempted solutions (i.e. inventions) eventually become recognized as useful and/or usable, then Psychology steps in to explain them.

Basically, successful invention provides fodder to scientists who use the scientific method to determine why an invention is successful. But the activities have remained, and should probably remain, separate, given that the time required to understand an invention is probably greater than the time it takes for the invention to become useful (exceptions perhaps being pharmaceuticals and anything else that can kill you, like a collapsing bridge).

Edit: on second thought, I might be conflating design with invention.

A counterpoint by Chris Martens

LtU readers may be interested in this reply by Chris Martens.

But when I took a course on "human aspects of software development" to try to understand this frustration more scientifically, it felt like the emphasis was in all the wrong places---the field seemed to take Norman's message and pervert it into the idea that what can't be treated as an everyday thing just isn't worth studying; anything that isn't instantly discoverable or learnable in one lab study is an idea worth discarding. This means that their field doesn't appear to get much further than surface syntax and editing tools when it comes to understanding PLs. And I mean, is that okay? Is the rest of it just too "wide and deep" of a problem to understand on design terms? I don't think so, I say cautiously based on personal experience programming in different languages combined with the deep understanding I have through theoretical training---there are meaningful differences that have nothing to do with my editor or syntax, and we will probably not be able to use the same methodologies that we use to examine everyday things, but we shouldn't pretend that these differences have nothing to do with design.

[...]

Or, interpreting the influence of theory on design another way, what if we think about interfaces as language problems? What's the linguistic abstraction offered by a REPL? Or a drawing program with brushes and palettes? Or a musical instrument, or a toaster? Could we understand these tools better if we thought about them in terms of logically-informed languages?

Wrong tools for the wrong job

The essay uses the terminology design vs science, but the discussion of those terms sounds like a rehash of older arguments. How does this differ from experimentation vs exploration, qualitative methods vs quantitative methods, inductive approaches vs deductive approaches?

In some fields it is easy to ask questions of correlation: given artifact X, does it behave as if it has property Y? Physics simply sets X to the universe and Y to any testable property. This does not provide a complete description, as not all phenomena fall into this category, so the left-over pieces fall into theoretical physics and are treated by a completely different methodology that takes a more inductive approach to finding out what the implications of different theories are. Occasionally someone makes a new connection between the two fields and we learn something important.

The essay seems concerned with which side of the line PL research lies on, but it seems like an invalid question. Taking Sean's point from above about invention vs science, our problem as a field is that the situation is much worse than that. However hard it is to measure / quantify / test invention, that is only a subset of PL research. The main outputs of PL research are methods for invention; after all, any programming language is merely a particular search bias over the set of possible / potential programs. So we are not only concerned with trying to measure invention, but also with trying to measure the second-order effect of how well the invented language performs as an aid to inventing programs. It is no wonder that the field is a chaotic mess of qualitative and quantitative efforts to establish anything, and that on occasion people get confused and try to justify their research in the wrong context.

The question that I would raise is not whether we should push doctoral students into the line of fire (yes we should; it's an amazingly rich and interconnected field in CS), but rather: why are we not drowning in data to the same extent as other natural sciences that are forced to mix inductive and deductive approaches? If we consider how many millions of programs have been written in the past decades, there is a remarkable poverty of data in this field, considering the size of the benchmark and test suites that are commonly used.

Edit for clarification: some of my papers have used as few as one test program, so I am as guilty of this as anyone else, but I am interested in hearing ways to improve.

The essay seems concerned

The essay seems concerned with which side of the line PL research lies on, but it seems like an invalid question.

I'm beginning to think that there are just many different sides to the problem. Actually, many computer science fields have this problem, not just ours (systems, HCI, a variety of application-oriented fields), as do other non-CS fields. There is a deeper common meta-problem that we seem to be running into. Anyone who says it is "better over there" just doesn't understand the problems they are having "over there."

why are we not drowning in data to the same extent as other natural sciences that are forced to mix inductive and deductive approaches?

I like the idea of focusing more on data-driven methods, since we definitely have lots of data (existing programs). My first research project involved snarfing all the Java applets from the web for a simple analysis (many webmasters were very angry about that). I heard that Google was using more data-driven methods for Dart (looking at existing JavaScript programs). On the other hand, the data can be deceptive, as it's not a great indicator of what could be; we definitely have to be careful about what we can infer from it.
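To make "data-driven" concrete, here is a crude sketch of the shape such an analysis can take (the "corpus" directory and the choice of construct are hypothetical): walk a local collection of JavaScript files and count how many mention a feature of interest. A real study would parse the programs rather than match text, which is part of why the inferences are tricky.

```haskell
-- Crude corpus-measurement sketch: how many .js files in a (hypothetical)
-- "corpus" directory mention "eval"? Textual matching only; a real study
-- would parse the code, which is exactly where careful inference matters.
import System.Directory (listDirectory)
import System.FilePath ((</>), takeExtension)
import Data.List (isInfixOf)

countUsing :: FilePath -> String -> IO (Int, Int)
countUsing corpusDir needle = do
  entries <- listDirectory corpusDir
  let jsFiles = [corpusDir </> f | f <- entries, takeExtension f == ".js"]
  hits <- mapM (fmap (needle `isInfixOf`) . readFile) jsFiles
  return (length (filter id hits), length jsFiles)

main :: IO ()
main = do
  (hits, total) <- countUsing "corpus" "eval"  -- "corpus" is a made-up path
  putStrLn (show hits ++ " of " ++ show total ++ " files mention eval")
```

Even a toy like this shows the pitfall: a raw count tells you what programmers did write in one corpus, not what they could or should write, which is the "what could be" caveat above.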

Edit for clarification: some of my papers have used as few as one test program, so I am as guilty of this as anyone else, but I am interested in hearing ways to improve.

I think most of us are guilty of using inappropriate tests to validate our research (at least I am!); it's the only way we can get published! I have completely lost interest in the core PL conferences accordingly; the only one that seems interesting to me these days is Onward. I think the current debate is mainly about how we can improve on that (at least in academia).