PL Grand Challenges

The notes from a panel held at POPL 2009 on this topic are available here.
The panelists were Simon Peyton Jones, Kathryn McKinley, Xavier Leroy, Martin Rinard and Greg Morrisett.

Among the topics raised: effects, program verification, parallelism, visualization tools for understanding the behavior of parallel programs, secure software, and high assurance.

Not surprisingly, all these topics have been discussed here repeatedly in recent years...

Some comments

I'm not sure what the authors had in mind with "Grand Challenge" but I suppose I have something else in mind.

The emphasis of the panel is on pursuing current research topics to further levels of refinement; doing more of what "PL research" is currently perceived to be. This is fine as long as you really believe that atomic blocks, monads, end-to-end verification, etc., really do point to the distant future.

More importantly, the emphasis is on languages, rather than programming. Historically, PL research has leaned toward assuming that a program exists and asking what it means, whether it meets its intended purpose, and how it could efficiently be realized by a machine. This great disconnect between the programmer and the program -- the programming process -- is still largely ignored, modulo some variations in notation here and there.

In some sense, the (short) path from Lisp to Haskell acknowledges the programming gap, throws its hands up, and decides to deal with it by systematically taking power away from the programmer. "Whatever the programmer does, we at least are guaranteed properties X Y and Z." The research largely ignores the thinking process, the actions taken, and the decisions made by the programmer. The programmer is thus further disconnected from the machine, by increasingly being told what is and isn't okay to do; the definition of "okay" being written in stone by a set of philosophers. Certainly, there are many benefits. I'm just not so optimistic that this curve we're on leads somewhere interesting.

HCI set out to address the programming gap, but in the process got largely distracted by the supposedly larger-impact problem of how end users interact with machines.

Software engineering has also tried to address the gap, but its concern with "programming in the real world" has largely confined it to existing programming technology. It hasn't historically been a language design field.

And this, I believe, is where one of the grand challenges of programming lies.

We need new languages, and by this I mean "new ways to communicate with machines," not ASCII syntactic sugar for long-known models of computation. We need systems that drastically reduce the gap between the programmer's mind and the program. The programmer should be deeply immersed in the essence of programming: have a deep and immediate understanding of how this or that action shifts or impacts the current set of goals, and spend very little time understanding odd behaviors and most of the time constructing meaningful ones. Parallelization, verification, lack of side effects, etc., should be achieved naturally by construction, if needed, and to the desired extent. The machine should in turn build an understanding of the programmer(s) and adapt quickly. The two should collaborate, guiding each other. The programmer should work with exactly the needed information, zooming in and out, taking and modifying various slices and views of the system, and developing a deep understanding of the implications with the aid of the machine.

We need to merge the programmer with the program, the learning with the doing. Making language design choices in relative isolation and relying on education to propagate that knowledge to programmers is an extremely slow and inefficient cycle that will become unbearable, and *the bottleneck*, past a certain threshold in the technology growth curve. Students already learn Java, Python, and Perl on the internet instead of (or before) CS100. The Web browser is coming ever closer to the IDE.

A lot of what I'm saying here is not new. My overarching point is that I fear we might be missing the forest for the trees when we think about Grand Challenges. If our goal is to quickly build programs that the programmer understands and that behave as intended, then I suspect that we can all agree, at least to a decent extent, that what we're doing now, at the rate at which we're doing it, won't cut it.

This sounds much like the

This sounds much like the Alan Kay agenda going back to the '70s of the last century. My standard response to this is that programmers will continue inventing programming languages they feel comfortable with themselves. I'd rather consider this a minor challenge, though. I have little faith in collectivist approaches and socially engineered languages, just as one cannot consult a team of psychologists and anthropologists to create a piece of art or a movie that everyone likes. The same goes for the hope that language researchers make better language designers.

Notice that I find some aspects of the grand challenge rather amusing. Replacing millions of lines of ASCII code with millions of tiny interconnected cartoons, which are supposed to be closer to our ape minds, doesn't seem all that useful. It might make a good wallpaper, though. Other challenges of the past have become rather convenient goals in our perception: componentization and decoupling of interface and implementation, DSLs, lightweight frameworks, etc. Not that there isn't still room to improve.

Reminds me...

Shrimpx's comment reminds me of the time a recently graduated business Ph.D. gave a talk about cloud computing and business to the CS department. This guy was a Ruby hacker, and he remarked that everyone in the business grad school was hacking in one of Ruby, Python, or PHP as part of their dissertation work.

Bright non-programmers getting 'real work done'? Preposterous! But wait, hasn't that been the goal in some respects? So haven't we arrived, in some small way?

Not because of research

Ruby, Python, and PHP don't even have static type systems, which is where most research seems to be done these days.

The research question is...

Do Ruby, Python, and PHP bring anything to the table that Scheme can't already accomplish? The answer is probably no in terms of expressiveness, but perhaps yes in terms of learning curve or ease of use (but these things are less easily quantified).

less easily quantified?

What prevents learning curve or ease of use from being "easily quantified"?

Answer: humans

The human element. If you have a question that can be answered by writing some code and running it, that's easy. (If you're already a programmer.) If you have a question about how well humans learn X, you have to set up some cognitive experiments, get a bunch of volunteers, and run the experiments. And these sorts of things are really easy to get wrong, for example by selection bias or by observer bias.

I'm not saying it's impossible, but it's a fair bit of work. "Less easily quantified" seems entirely accurate.

"Less easily quantified"

"Less easily quantified" seems entirely accurate."

If that's all that was meant it's on the same level as it's less easy to walk 3 miles than 1 mile - oh well!

What Scheme lacks

Perl, Ruby, Python, and PHP have the pragmatics: rich libraries and integration with OS facilities.

A discussion of "what scheme

A discussion of "what scheme lacks" is in danger of being off-topic for this thread. I advise caution!

I think the Ruby, Python,

I think Ruby, Python, PHP, and Perl don't bring anything to the table except complexity. The basic parts of Scheme which these other languages incorporate are expressed with fewer, simpler, more uniform concepts in Scheme.

These other languages do bring bigger communities, standard libraries, community libraries, and more books with them. All of them have "benevolent dictators for life" and canonical distributions (at least for a long initial period).

The absence of language

The absence of language researchers leading the discourse about the language. So there was still something that could be left out, even of Scheme.

Semiotics

We need systems that drastically reduce the gap between the programmer's mind and the program.

I don't fully agree with your ASCII comment, but I do agree with the above. I would like to see some studies at the intersection of CS, natural language, psychology, and philosophy on PL design.

Current languages are selected Darwinistically. Studies like that might bring us better languages, or tools, sooner.

Current languages are selected Darwinistically?

Selected by whom?
Selected by non-programmers who want to become programmers?
Selected by non-programmers who want to create a product or service?
Selected by programmers to further their careers?
...

That's easy

Survival of the fittest means fit with respect to the total environment.

[Half answer, and selected by the total environment, so by all of the above.]

That's facile


I think you're both actually trying to make the same point

The fitness function isn't the one we want.

Ok, ok

My view is that there are a lot of great languages which evolve out of cross-breeding of 'good' features and an occasional new idea. And, as stated, language adoption is not very well understood.

I just don't see (hardly) any qualitative reasoning, or models, for what constitutes a good language, and I certainly haven't seen any backing quantitative empirical data either. Now, maybe this isn't a field where such an approach is warranted, but still, I would like to see some research -- just to know better what 'makes stuff tick.'

Change function

People adopt a new language if the perceived benefit of doing so outweighs the perceived pain of making the change. Effort and 'pain' are involved in:

  • educating oneself and/or employees in the new language
  • managing, learning, and possibly purchasing configurations and development environments for the new language's tools
  • hiring from a smaller pool of people fluent in the language
  • maintaining module or interface compatibility, dealing with FFIs and integration complexity
  • developing new idioms, patterns, coding standards, review procedures, and testing protocols for code written in the new language
  • etc.

Compared to that sort of pain, it is really difficult for a new language to gain traction... even if it is perceived to be better in almost all ways than another language. It hurts even more that, say, competing on optimization is extremely difficult if you haven't had a large developer community researching and implementing optimizations targeted at your language.

For people starting new projects, the change function has a much lower bar, but even then the bar is still fairly high. Languages that have the best chance of slipping in under the change function are DSLs and scripting languages, especially given that it is not difficult to make an application support a modular plugin-driven set of scripting languages.

As a language designer, the change function offers a very important caution regarding the dangers of premature success and the problems surrounding incremental improvements to a language. The more successful and widespread a language is, the harder it will be to modify without breaking existing code and then facing competition from previous versions of the language. A possible answer to this is to design an upgrade path (for syntax as well as semantics) directly into the language, such as: grammatically requiring a 'using mylang 1.0' at the top of each source file to allow multiple versions to integrate (an extensible language), marking features 'deprecated' for some time before dropping them, providing source-to-source translators for upgrading source code, etc. One can also prevent dependence on implementation details of prior versions of a language by ensuring the debug implementation pseudorandomly enforces non-determinism, hiding names that aren't for export, etc.
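
A minimal sketch of that versioned-header idea (hypothetical; 'mylang' and the per-version parser stand-ins below are made up for illustration, not a real system): each module declares the language version it was written against, and the front end dispatches to the matching grammar so old and new modules can coexist in one program.

    # Version-dispatching front end (Python used only for illustration).
    import re

    PARSERS = {
        "1.0": lambda src: ("ast-v1", src),   # stand-in for the 1.0 grammar
        "2.0": lambda src: ("ast-v2", src),   # stand-in for the 2.0 grammar
    }

    def parse_module(source):
        header = source.splitlines()[0]
        m = re.match(r"\s*using\s+mylang\s+(\d+\.\d+)", header)
        if not m:
            raise SyntaxError("missing 'using mylang <version>' header")
        version = m.group(1)
        if version not in PARSERS:
            raise SyntaxError("unsupported language version " + version)
        # Each module is parsed with the grammar it declares, so modules
        # written against different versions can coexist in one program.
        return PARSERS[version](source)

    print(parse_module("using mylang 1.0\nprint 'hello'"))

A deprecation pass or a source-to-source translator per version step would slot in at the same point in such a front end.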

Anyhow, the rule is: technical merit does not lead to widespread language adoption. The battle is between inertia and marketing (and smear campaigns like 'considered harmful'). Technical merit is, at best, a tool to be used in the marketing campaign - and even then it is only 'perceived' technical merit that actually matters (though perceived technical merit often has some basis in real merit).

That it is hard or impossible to objectively measure what 'technical merit' means pretty much ensures that such decisions will remain entirely in the hands of marketers and individual perception.

As devil's advocate

People adopt a new language if the perceived benefit of doing so outweighs the perceived pain of making the change.

That's actually more rapid than what would happen, since it's only weighing against the perceived pain, rather than the actual pain.

(Speaking as someone about to try to make a Fortune 500 company bet on a language that doesn't break the top 40 on the TIOBE index...)

Only perceived pain matters...

... but keep in mind that perception of pain rapidly grows more acute and accurate as one begins attempting the challenge. ^_^

And a language that looks 'new' will also have adoption risk (i.e. there is a fear that the 'perceived benefit' won't be all it is cracked up to be). That risk is also painful, quantifiable in terms of time and money potentially wasted.

Not entirely what I meant

That it is hard or impossible to objectively measure what 'technical merit' means pretty much ensures that such decisions will remain entirely in the hands of marketers and individual perception.

This is true, but it doesn't explain the complete lack of data on programming language use or design. [Or of models of the relation between signs/sign ordering and human interpretation of code.]

I mean, if it is possible to study, say, sentence analysis and synthesis in the human brain with respect to certain structures and beta and theta rhythms or response time, then certainly it should be possible to study the recognition of code, the recognition of code errors, or the synthesis of code.

Another example: we all think layout is important, and assume that to be true. But where are the numbers?

This is an adult [whoops, I meant mature ;-)] industry with a lot of bucks flowing through it. Some studies are to be expected at some point, no?

[Maybe I missed them, that may also be true.]

CS very young

I might consider computer science, software engineering, PLT, etc. to be 'mature' in (optimistically) about a century, when we're working with multi-layer languages where the top two layers involve trust, contracts, auditing, service matchmaking, etc., and above that goal-oriented programming that can automatically grab the correct services and tie them together into mashups.

We're still young enough to be picking low-hanging fruit (e.g. effect typing, message passing, resumable exceptions, transaction support, open distribution (including distributed transactions, undo, search), safety, security, scalability, performance, resilience and recovery from attack, graceful degradation under loss of bandwidth or processing power or denial of service) simply by rearranging a few layers in the language and OS, or by carefully deciding how we integrate 'orthogonal' features that penetrate layers of the language.

To call us a mature industry is, I think, tending towards self-delusion. We have a long way to go before aiming for incremental improvements in operational language properties, much less organizational language properties (code layout and syntax), will offer significant benefits relative to the larger structural changes.

If code organization and syntax are goals, perhaps consider wikifying code and IDEs for creating services in a cloud, allowing greater cross-project reuse of code. Consider supporting extensible attribute grammars that can add or remove rules used to parse a section of code. Consider enabling inversion of dependency, allowing multiple pages to contribute towards a named value/function/type rather than vice versa (export only), as in combining values into tables and enabling extensible datatypes and functions. Etc.

Only after we've solved these lower-hanging problems to the point that further improvements seem far out of reach should we bother considering expensive studies with dubious benefits, like researching theta rhythms during the recognition of code errors and such.

Boyish Enthusiasm

Mature development is steered by numbers. Now, whatever one's view is on how many numbers one needs, or what the role of those numbers should be, a total lack of numbers implies a (total) lack of steering.

So, I agree, this industry is immature.

Apples and Apples

To make use of numbers requires a formal model, and to construct and verify a useful formal model of how language properties will impact users would require a huge sample set of how language features, the presentation of those features (the syntax, types, and layers around the features), and level of expertise impact the users/programmers. Computer science is greatly hindered in this capacity by both its relative youth and the huge number of combinatorial factors that go into this.

Chances are, when we start 'steering by numbers' we'll either be comparing existing languages or seeking incremental improvements to a language, and for incremental improvements it will not be easy to compare experts... so it is more likely that any such experiments will favor features that help the beginner.

But a lack of numbers doesn't mean a total lack of steering. Humans are excellent at pattern recognition and prediction, even when it isn't formally supported by numbers, and so they can usefully drive some aspects of language development even without numbers. Economy is another driving force that naturally exists.

Besides... even if you had numbers, the ability to steer by them is useless if you don't know exactly where you want to be. And you'll tend to find local minima and maxima in whatever model you have, rather than global minima and maxima.

Do you mean five apples

or three apples?

The whole point of qualitative reasoning is that at some point it doesn't mean *** if it is not supported by quantitative data. Which is exactly where we are now.

In the absence of studies, it is easy to find reasons why they shouldn't be conducted.

I mean Apples vs. Oranges

The whole point of qualitative reasoning is that at some point it doesn't mean *** if it is not supported by quantitative data. Which is exactly where we are now.

I don't believe we are anywhere near the point where qualitative reasoning is failing us in the absence of quantitative data. How will knowing five apples vs. three apples help if we don't even know whether we want apples?

For now, and for some time into the future, achieving formal qualitative analysis for properties will move us much further along than making decisions based on the vague spaces of measured numbers.

There are a few things that we know we want: higher performance (time, memory, bandwidth, latency, energy), fewer shipped errors, quicker production, provable performance, resilience and graceful degradation, robustness and resistance to a variety of attacks, ability to modify a system non-invasively, mobility and accessibility, etc.

Of these, all but 'fewer shipped errors' and 'quicker production' already support formal qualitative analysis. The problem is that they are heavily influenced by achieving other non-functional goals. And historically it seems that 'quicker production' is mostly achieved by composing libraries that already, mostly, do what you want... at which point it is difficult to quantitatively compare languages that aren't themselves at equal levels of maturity.

In the absence of studies, it is easy to find reasons why they shouldn't be conducted.

In the presence of studies, it is also easy to find reasons why they shouldn't be conducted. Studies in mature sciences often cost millions of dollars. You shouldn't expect computer science to be any different.

Apples and Oranges

The fact that you can eat apples or oranges doesn't imply you need to restrict yourself to either kind; I like both. LtU is riddled with remarks which are often best substantiated with empirical studies.

For now, and for some time into the future, achieving formal qualitative analysis for properties will move us much further along than making decisions based on the vague spaces of measured numbers.

Not even wrong.

The fact that you can eat

The fact that you can eat apples or oranges doesn't imply you need to restrict yourself to either kind; I like both.

Ah, but would you trade one quality for another? And how about that mystery fruit you've not tried yet? Might that not also be worth trying?

The fact that you like both apples and oranges - a qualitative difference - suggests that any empirical study is going to help you very little in a design or business decision that forces you to pick one over the other.

LtU is riddled with remarks which are often best substantiated with empirical studies.

So you say. Show me the numbers!!!

Whatever

Ah, but would you trade one quality for another? And how about that mystery fruit you've not tried yet? Might that not also be worth trying?

I was pointing out that you were setting up a straw man; this is yet another.

The fact that you like both apples and oranges - a qualitative difference - suggests that any empirical study is going to help you very little in a design or business decision that forces you to pick one over the other.

Again... Do I need to remind you of 6-sigma or CMMi?

LtU is riddled with remarks which are often best substantiated with empirical studies.

So you say. Show me the numbers!!!

BS. But, ok, some examples:

LL(1) languages are more easily read than ..

Layout-sensitive languages are more readable than ..

Languages with lots of parentheses are less readable than ..

Or a lot of propositions claiming A is better suited for purpose X than B because of ..

You can quantify a lot of the above.

We all know the ten-lines-a-day statistic. Stuff like that is important, and yes, people do make business decisions on that basis.

And I would like to see some studies on, say, feature X gives a respondent an average increase of Y ms in recognizing syntactic/semantic violation Z.

[Will you please now cut the crap?]

Civility, please.

Civility, please.

Again... Do I need to remind

Again... Do I need to remind you of 6-sigma or CMMi?

I have studied both of these. I do not believe that either of them has anything to say regarding language design, or the impact of language features on error rates under given constraints for non-functional requirements. Further, both of them focus on just one metric... errors to errors, apples to apples, and the resulting secondary effects (such as rework). This may be extremely useful if that is your primary concern. OTOH, some people like oranges...

You can quantify a lot of the [below]:

  • LL(1) languages are more easily read than ..
  • Layout-sensitive languages are more readable than ..
  • Languages with lots of parentheses are less readable than ..

You say that the above can be 'quantified' and I really am NOT inclined to believe you. To make any claim about 'readability' of LL(1) languages as a whole is rarely reasonable due to the degree of variation within all possible LL(1) languages. I assure you, there already exist 'esoteric' programming languages that can be parsed easily by an LL(1) grammar but that are widely regarded as nigh unreadable. To merely point out a few such languages should be enough to dismantle such a claim without the massive cost and dubious results of a study that attempts to compare a shotgun group of LL(1) languages with a shotgun group of languages that cannot be parsed with LL(1) grammars.

Or a lot of propositions claiming A is better suited for purpose X than B because of ..

There are ways to formally qualify whether a language is suited toward a particular purpose or not. My favorite is to forbid appeals to completeness and foresight when discussing features of a model or language. Anti-completeness forbids proposing frameworks or language-within-language to describe features 'supported' by a model or language (i.e. you can't say: it's not impossible). Anti-foresight means that libraries shouldn't need to integrate unrelated code patterns 'just in case' X is needed by client code in the future. More generally, the two of these together forbid achieving features by use of design patterns then calling them features of the model or language. There are other such measuring sticks that can be tested logically, but this one has served me pretty well... and doesn't require any sort of controlled empirical study in its application.

What would a controlled experiment be able to test? Well, first you'd need to control for experience, so you could either start with a bunch of newbies or a bunch of experts with equal years. Then you need to control for outliers, since variation among programmers has long ago been proven quite extreme, so you need either very large groups (probabilistic distribution) or you need to categorize programmers ahead of time to make certain there are proper counts of gurus. Then you must control for nature of A and B, because each of those may have many different valid semantics and implementations... the easiest way to do this is to reduce the scope of the test. Then you must control for the presentation of A and B because the layer/environment/IDE in which these are embedded can potentially make a huge difference (unless you've already done earlier tests to establish they will not). And by the time you're done controlling for the relevant variables, you'll be a few million dollars shorter but ready to get the test running... at which point you'll need to do quite a bit of management to avoid corruption of the study.

More likely, people will take shortcuts and attempt to look at programs: how many projects succeeded or failed to do X using A vs. B? Unfortunately, without the above controls, such statistics are not particularly valuable to designers or Computer Science as a whole. Experience with A vs. B, the nature of the projects, and varying attempted and final support for what may be critical non-functional requirements, can all play into success and failure.

feature X gives a respondent an average increase of Y ms in recognizing syntactic/semantic violation Z

And this, I believe, helps identify the heart of my disagreement with your quantification goals. By use of the words 'average increase or decrease' you are assuming a 'relative' gain or improvement to some baseline, and you are further assuming both a context and a change for which syntactic/semantic violation Z may occur, and (finally) you may easily be ignoring both the time in milliseconds it takes to apply or not apply feature X and you can easily, in any such study, overlook other errors and properties of the system changed by the presence of feature X.

This sort of study is fine if your goal is to achieve incremental improvements to a baseline system or language and you've got money to burn to make it happen. But this sort of investment into a baseline system is very rarely going to apply in other contexts, other languages or models of computation or even other 'standard libraries' with different programming idioms, IDEs, cultures.

Such empirical analysis for incremental improvements from a baseline is most appropriate after baselines have really been established and you've moved into evolutionary development. At the moment, logical analysis of system properties seems to be a far more efficient expenditure of resources for achieving objective improvements to systems, especially given the continued revolutionary development of language and programming models.

Need not be that complex

feature X gives a respondent an average increase of Y ms in recognizing syntactic/semantic violation Z

And this, I believe, helps identify the heart of my disagreement with your quantification goals. By use of the words 'average increase or decrease' you are assuming a 'relative' gain or improvement to some baseline, and you are further assuming both a context and a change for which syntactic/semantic violation Z may occur, and (finally) you may easily be ignoring both the time in ...

Sentences are processed by the human brain. Syntactic and semantic violations are studied by neurolinguists because they reveal where, and how long, it takes for a sentence to be processed completely. [They do find significant differences between languages and sentence structures.]

Why not observe how long, for several languages, it takes to process simple expressions? I.e., one can start with small experiments: can you find a difference between human understanding of '* (+ 2 3) 8' and '(2+3) * 8', or of '4 * true' and '* 4 (< 3 2)'? Or between the processing of ThisStatementInCamelCase and that_other_statement_in_c? Or between the understanding of two pieces of code written according to two different code conventions, or with and without keywords?
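
As a rough illustration of how cheap such a first pass could be, here is a minimal sketch of a timed observe/respond trial (hypothetical Python; the stimuli are just the examples above, and the validity judgments are only for illustration):

    import random, time

    # Each stimulus pairs an expression with the judgment the subject should make.
    STIMULI = [
        ("* (+ 2 3) 8", True),
        ("(2+3) * 8",   True),
        ("4 * true",    False),   # treated here as a type violation
        ("* 4 (< 3 2)", False),   # likewise
    ]

    def run_trial(expr, expected):
        print(expr)
        start = time.monotonic()
        answer = input("valid? [y/n] ").strip().lower() == "y"
        return time.monotonic() - start, answer == expected

    def run_session():
        trials = STIMULI[:]
        random.shuffle(trials)          # avoid order effects
        for expr, expected in trials:
            secs, correct = run_trial(expr, expected)
            print("%-15s %6.0f ms  %s" % (expr, secs * 1000, "ok" if correct else "wrong"))

    run_session()

A real study would of course still need to control for experience, ordering, and environment, as discussed elsewhere in this thread.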

I would suggest making complex experiments a bit later.

I didn't claim it would be the road to world peace; I was arguing that I miss, or maybe missed, studies which provide actual data.

2 3 + 8 *

And here I thought, when you said 'feature X' you actually meant something more significant than a syntactic feature.

Consider modifying your question to: how fast do we recognize syntactic and semantic violations in the presence and absence of function passing and curried functions in each of prefix, postfix, and infix notations. Then try to weigh such results against the value of function passing and currying...

Or perhaps: how many milliseconds does it take to identify a race hazard or deadlock condition under various forms of concurrency control (locks, multi-locks with automatic reordering, locks with scoped destructors, rendezvous, STM, transactional process calculi, dataflow programming)?

I honestly cannot consider studies about whether a semicolon helps an expression be processed faster to be of any real significance to PL design, especially if not studied in combination with other utilities (like IDE support to match parens, color different expressions, faintly highlight subexpressions, etc.) that simply offer presentation layers for the language. Syntax is pretty trivial to language design... unless you're attempting to make it extensible.

Anyhow, I wouldn't mind studies that offered quality data, but it takes a lot of effort to control for variables in experience, education, environment, etc. It is better to operate in ignorance of bad data than to make decisions relying upon it. Further, as a PL designer I still couldn't make much use of the data unless they offered something of more significance, such as demonstrating very significant differences or such results as 'needs to consult implementations of Z two or more times to write Y correctly' vs. 'able to write Y correctly without going back to look at the implementation of Z' vs. 'able to write Y correctly without EVER looking at the implementation of Z'.

J D Gannon

Of course there are studies, starting with John Gannon's Ph.D. thesis (from 1975) "Language Design to Enhance Programming Reliability", which reports on experiments evaluating the relationship between various language features and error rates. I haven't followed the field, but it looks like you can get a foothold starting at John's DBLP. (Look mostly at early papers -- his interests moved on to testing and model checking later in his career.)

Weinberg

Weinberg's "The Psychology of Computer Programming" delved into the effect of languages a little. IIRC there are a couple of studies done on undergrads in the book. Oddly, this 'field' never took off -- or did it become folded into HCI? It seems like there would be lots of work remaining to be done in a field with such a title:

http://www.amazon.com/Psychology-Computer-Programming-Silver-Anniversary/dp/0932633420/ref=cm_cr-mr-title

Psychology and CS

For an industry which is, well, maybe not the driving force, but the foundation of a significant part of the world economy, and which is steered in large part by applied intellect alone, it would seem that studies of how best to apply that intellect are well within everybody's best interest.

Contrary to what has been proposed before, these studies are not even that expensive. I believe a Psychology PhD costs around 60% of a CS PhD in most countries.

If Microsoft can fund a large number of CS PhDs on work which has dubious direct pay-off, they surely could fund some psychology PhDs on work which might actually have some direct pay-off.

[Btw. Thanks for the reference. I really would like to see what his observations are on best programming language design.]

Agreed.

It's high time that CS enjoy the same level of advancement that psychology has delivered to older disciplines like math, science, and engineering. We also need to invest more in the philosophy of CS.

Ha.

Snarky, but funny.

We can only hope

It's high time that CS enjoy the same level of advancement that psychology has delivered to older disciplines like math, science

Yes, studies like this could be useful. Scientists shouldn't forget that they're human, with weird ego-driven biases and other ingrained perceptual distortions that get in the way of objectivity and rationality.

An especially pernicious version of this is the prejudice of the scientist who claims, essentially, "we don't need no steenking psychology or philosophy." It makes one wonder what they have to hide.

To be fair, some examination of history indicates that one reason for this prejudice is the blows that math and science took from people like Cantor, Gödel, and the quantum physics crew. We take it all mostly for granted now, but at the time each of those advances seemed like a major step backwards to a scientific community that thought it had been rapidly approaching determinism and certainty in its understanding of the universe.

An irrational dislike and reaction against any sort of introspection, whether psychological or philosophical, is one response that the scientific community had to these discoveries. People exhibiting this reaction now appear to have inherited it from Cantor and Gödel's peers. This adds yet another distortion to the arsenal of things which hamper our ability to think clearly.

Are you calling me yellah?

I realize sarcasm often comes across the tubes as mean-spirited, but my previous comment was light-hearted. I'm certainly open to whatever insights psychologists and philosophers are able to give, but my impression is that those insights will likely not be a major driver of technical progress in computer science. I think a little skepticism in the context of this thread is warranted.

Mellah yellah

I apologize, I was mostly talking to the straw man standing behind you.

I have to agree that those fields "will likely not be a major driver of technical progress" in CS, or PLs, any time soon. I was aiming at the separate question of reasons for that, beyond the most handy stereotype.

Rhetoric...

I'm not sure I see a valuable distinction between "having dubious direct pay-off" and "might actually have some direct pay-off"... Seems like there's a lot of room here for personal inclinations.

If there haven't been a lot of studies

You might stumble on some results rather fast, and they'll have a good chance of being worthwhile.

I suggest we study the

I suggest we study the impact of computing on ant populations, first.

.

[whatever]

Brushing off my rebuttal

Brushing off my rebuttal without refuting it is a form of proof by intimidation.

I agree that measuring the social aspects of languages and programs written in them might reveal interesting information about how to design good ones, including by exploring cognitive and psychological aspects. However, such blue-sky research is often a needle-in-a-haystack deal, and given how immature our field is (worker threads for JavaScript programmers? that's a horrible way to make a browser parallel), there is a steep opportunity cost. Reading the literature, paying attention to both hits and misses, makes this clear.

I would argue that Brad Myers, Miryung Kim, and many others are taking this user- and process-oriented approach to language and programming tool design, so if you really want to motivate examining this, I'd examine their results and decide whether it's worth going even more theoretical. I believe there is enough low-hanging fruit in their under-appreciated approaches that it isn't worth looking even further -- we can get more knowledge, and knowledge we need, in easier ways.

This is an argument for incremental science that monotonically increases our knowledge at a slow but steady rate: we can try to make big leaps (that's the way to get a Nobel Prize ;-)), but I'm skeptical of just throwing a Ph.D. student at it and getting a useful result relative to the cost.

I once browsed Weinberg's

I once browsed Weinberg's book at the UW Bookstore. I was thinking about buying it because the title/content was fascinating, but it's such a small book at ~$50... and I didn't get much out of it reading it for 15 minutes. But maybe it's something that needs to soak in?

Here is the problem: psychology and HCI are soft sciences, so when you focus on theory, you are basically in lala land where you can't validate your results correctly. With theoretical physics, algorithms, programming languages, you can produce something (proofs) and make predictions about reality--well, not so sure about theoretical PL, but definitely true with physics and algorithms.

As a result, psychology/HCI is almost completely based on studies, user testing, and statistics; i.e., it's all experimental. User studies are expensive! More expensive than benchmarking or proving something. Basically, you have to get people to agree to your study, get them to show up, carefully design your study so that it isn't wasted effort, etc. And you really have to know what you are comparing against.

If you have a new language idea, everyone is going to be skeptical; it's just human nature. To make your case, you've got to develop the idea almost completely. Could we maybe "paper prototype" PL ideas to do adequate user testing? Interesting thought...

For now, I agree with the parent's argument that cost/benefit is important. Such research has to be both effective and cost-effective, otherwise we are just minting another basket-weaving PhD. Hopefully they don't drink their own Kool-Aid...

Lala-land ain't that bad

Here is the problem: psychology and HCI are soft sciences, so when you focus on theory, you are basically in lala land where you can't validate your results correctly. With theoretical physics, algorithms, programming languages, you can produce something (proofs) and make predictions about reality--well, not so sure about theoretical PL, but definitely true with physics and algorithms.

I know lala-land: it is full of bright colours, fluffy animals, and tattooed girls with long braided hair stuffed with beads and great smiles.

Apart from that: there are studies out there where I would propose that some 'Greek-symbol'-ified paper is way further off in lala-land on its claims than a good study from the social sciences.

Another point: since in the natural sciences it is easier to derive 'hard' truths, I think people are often more inclined to state unjustifiable claims, since they often feel they are right by arguments of Greek-symbolism alone. (And they are actually not as trained as, OK, some people in the social sciences at recognizing unjustifiable claims.)

As a result, psychology/HCI is almost completely based on studies, user testing, and statistics; i.e., its all experimental. User studies are expensive! More expensive than benchmarking or prove something. Basically, you have to get people to agree to your study, get them to show up, you have to carefully design your study so that its not wasted effort, etc... And you really have to know what you are comparing against?

So they have a harder time qualifying/quantifying stuff. And, yes, the local minima statement is true; but at least with some studies one could discuss several local minima against each other. I didn't claim it should be easy.

If you have a new language idea, everyone is going to be skeptical, its just human nature. To make your case, you've got to develop the idea almost completely. Could we maybe "paper prototype" PL ideas to do adequate user testing? Interesting thought...

Except for that, the paper prototyping. I would hope that at some point some 'rules' or theories would be developed on the interplay between source code, semantics, and the human mind, such that that interplay can be optimized, even if it is just for local minima.

[Actually, it seems most people think of HCI while reading my comments. Uh, I was mostly thinking about psycholinguistics studies.]

[Another thing: the fact is that when you're designing a language you're off in lala-land by definition. So the idea is to get out of there.]

HCI is the closest to what

HCI is the closest to what you are talking about that's been done before, especially Dr. Greene's cognitive dimensions of PL design. But if you go back and read these papers, they are mostly common sense/experience-based, like design patterns. As with design patterns, there is never going to be a credible theory of PL ergonomics, if only because this is more trial-and-error engineering than a well-behaved hard science.

User studies are useful, but then we aren't talking theory anymore, this is experimental PL ergonomics, which could be a very useful field to study.

As for lala-land, the problem is that if you have no one who can check or validate your work, your work is essentially useless. There's a lovely write-up by David Patterson on this (when choosing a PhD topic), but right now I can't find it.

And neurolinguistics is the

And neurolinguistics is the closest to what I was proposing.

From wikipedia:

Neurolinguistics is the science concerned with the neural mechanisms that control the comprehension, production and abstract knowledge of language. As an interdisciplinary field, neurolinguistics involves methods and theory from fields such as neuroscience, linguistics, cognitive science, neurobiology, communication disorders, neuropsychology, and computer science.

Now, I would think that there should be some methodologies and studies there which may be interesting for PL design, since, in some sense, they are more the experts on language than computer scientists are.

Obtaining an understanding

Obtaining an understanding of PL ergonomics by studying brain chemistry is like gaining an understanding of politics by studying quantum mechanics. Ya...at some level you are right, but the leap in scale is so huge.

Checking the wiki page, neurolinguistics is also very much based on experiments. I can see it now: "Now, Mr. Programmer, we want you to write code while we cut into your head and measure your brain activity... is that OK?" OK, you could probably use a CAT scan or something. My hunch is that the process of constructing software is not dominated by the PL, so you might learn more about programming than about languages.

But if you can find a patron to fund this very pie-in-sky research...its worth a shot.

Understanding why we should obtain

As far as I know, most experiments in neurolinguistics are not like that, but are pretty simple timed observe/respond kind of experiments.

One statistic I liked from a study by Tim Sweeney is that in the code written for a game, the vast majority of bugs (60%? 70%?) were out-of-range indexing. To me that was a) a very stunning statistic and b) a pretty good argument for introducing iterators into a language.
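
To make the iterator point concrete, here is a tiny sketch (hypothetical Python, not the game code in question): the index-based loop invites exactly the off-by-one, out-of-range bug being counted, while the iterator-based loop cannot even express it.

    scores = [3, 1, 4, 1, 5]

    def sum_by_index(xs):
        total = 0
        for i in range(len(xs) + 1):   # off-by-one: indexes one past the end
            total += xs[i]
        return total

    def sum_by_iteration(xs):
        return sum(x for x in xs)      # no index, so no out-of-range access

    try:
        sum_by_index(scores)
    except IndexError:
        print("index-based version: out-of-range bug")

    print("iterator-based version:", sum_by_iteration(scores))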

I like Haskell's syntax, but I have also seen arguments (on Slashdot, that is) from programmers that it's too concise and seems like line noise. I think I've seen arguments that a sufficient number of keywords in a language would remedy that line noise. I also believe Haskell programs are very easy to write, but I have doubts about their readability by others.

Now, I think they (psycho-/neurolinguistics) have the tools to study and draw some decisive conclusions on, say, the above example. And hopefully, some interesting results (say like Sweeney's) would emerge.

At least knowing whether introducing specific keywords has a small, or large, significant impact on the readability of a language is already a result I would be interested in. (Or the readability of say, very abstractly stated programs vs very explicitly stated programs on practitioners.)

I do believe you can do without these studies, but, as presented by the Sweeney example, sometimes knowing the numbers really illuminates practice.

(Actually, I think the converse holds too. It might be interesting for neurolinguists to study human interpretation of terms in term-rewriting systems, since those would have the cleanest syntax/semantics of all the languages they study, and it would be pretty easy to set up experiments of increasing complexity. Like observing the response to the question: is KK(SK)I equal to a).., b).., or c)..?)
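
For what it's worth, generating and checking such combinator questions is mechanical. A minimal sketch of a reducer (hypothetical Python; the tuple encoding of application is just an assumption) that could produce the reference answer:

    # Terms: 'S', 'K', 'I', or a pair (f, x) meaning application, grouped to the left.
    def step(t):
        """One leftmost-outermost reduction step, or None if t is in normal form."""
        if isinstance(t, str):
            return None
        f, x = t
        if f == 'I':                                   # I x -> x
            return x
        if isinstance(f, tuple):
            g, y = f
            if g == 'K':                               # K y x -> y
                return y
            if isinstance(g, tuple) and g[0] == 'S':   # S a y x -> a x (y x)
                a = g[1]
                return ((a, x), (y, x))
        r = step(f)
        if r is not None:
            return (r, x)
        r = step(x)
        if r is not None:
            return (f, r)
        return None

    def normalize(t, limit=1000):
        for _ in range(limit):
            n = step(t)
            if n is None:
                return t
            t = n
        return t

    # K K (S K) I, with applications grouped to the left:
    term = ((('K', 'K'), ('S', 'K')), 'I')
    print(normalize(term))   # ('K', 'I'), i.e. the partial application K I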

This thread is probably too

This thread is probably too deep but....

"What can go wrong will go wrong" is a very old engineering saying that could almost be a law (well, it is Murphy's law!). You give someone a chance to mess something up, they will. I don't think you'll find any neurological explanation on why we screw so many things up! So take away direct indexing and allow access through iterators..and voila! Errors go away.

I would be very surprised if you could measure anything conclusive about brain activity between different programming languages. But who knows... it's an interesting idea, but as Leroy said, there are bigger fish to fry that have more potential for producing results.

Why is neurolinguistics

Why is neurolinguistics necessary for analyzing bug reports?

The latter is commonly practiced by language, software, and security researchers. It's something we like to write in the 'introduction', 'motivation', and 'evaluation' sections of papers -- or even make the theme of the entire paper. Some of us (though not all) do treat PL/SE as an experimental science, though, unfortunately, the turnaround time for good evaluations of anything that isn't program analysis is generally too slow for the expected pace of advancement.

There are lessons to be gleaned from natural language studies, but, even then, learning studies are probably more important, and, unfortunately, cog. sci. has only just started to get quantitative. Attention studies might help as well. There are a few researchers trying to measure workflow patterns now to see what's really going on -- I feel like you're ignoring their hard efforts (e.g., read ICSE-style conferences for the past decade; there's generally something every time).

Finally, perhaps try looking more at the negative space when reading books and papers -- why *didn't* somebody explore a seemingly obvious or otherwise viable avenue of thought? -- as well as understand what reasoning got them there -- e.g., repeatedly seeing the same problem, motivating them to solve it.

Multi-disciplinary development

Why is neurolinguistics necessary for analyzing bug reports?

It isn't, but they have experience studying the interaction between language and the brain.

Finally, perhaps try looking more at the negative space when reading books and papers -- why *didn't* somebody explore a seemingly obvious or otherwise viable avenue of thought? -- as well as understand what reasoning got them there -- e.g., repeatedly seeing the same problem, motivating them to solve it.

Because if you put EEs, computer scientists, and physicists on a multi-disciplinary development track in the same room and close the door, it takes on average 30 seconds before blood starts trickling from underneath it. [I don't expect it will be a lot better if you combine CS with cog. sci.]

[I think it is probably hard to find anyone in a neuroscience department to study PL processing, unless MS appears with a lot of bucks. They want to study natural languages; PL would probably be too pragmatic for them.]

... and what does the brain

... and what does the brain have to do with bug reports? The mind, perhaps, in some tenuous way, and in one that I'd rather examine with traditional HCI-style user studies (which you can view as similar to what many cog. sci. researchers are doing, in practice).

Perhaps I've been misunderstanding your use of terms. Perhaps you are unintentionally conflating programming languages with natural languages (we have hardwired support for natural languages, while higher-level symbolic processing is spread a bit elsewhere and is much, much less understood), or brains with minds (neuroscience vs. cog. science), etc. I almost became a neuroscience researcher rather than a web one for grad school, so perhaps I have a more cynical take on all this. If you really do mean hooking up sensor arrays to someone's head to see where neurotransmitters get fired in response to a missing semicolon error... we're still struggling to bridge cog. sci. and neuroscience, and we fail more often than we succeed.

As an FYI, modern cog. sci. is often tightly coupled with machine learning and statistical reasoning to explain phenomena of how we think -- they're both 'CS' people and share many conferences :) In terms of utility, however, while the models are getting fancy, the questions they're being used to answer are still _very_ fundamental.

Finally, there is investigation of PLs in various ways in the neuro community. Primarily, we can take this into account in brain-computer interfaces. Investigating how your spiny structures feel about PLs, however, is less pragmatic to this community than getting a paralyzed person to control a wheelchair, laptop, or an arm.

Maybe a little better than semicolons

If you really do mean hooking up sensor arrays to someone's head to see where neurotransmitters get fired in response to a missing semicolon error... we're still struggling to bridge cog. sci. and neuroscience, and we fail more often than we succeed.

I believe that a result showing that semicolons help the brain process the combination of expressions faster, or don't help, as shown by response time, would still be a significant result for PL design.

And yeah, I personally would like to understand a bit more about PL processing, even if I agree with you that they probably won't find a lot. (But no one knows until someone tries.)

[Maybe you should read my "Need not be that complex" post on this forum. I think you can do a lot of timed observe/respond kind of experiments on simple examples, and it might be that you can actually measure brain activity on syntactic/semantic violations, or even very roughly measure the 'amount of work' a brain has to do to distinguish different violations in several settings. I don't know, I am not a neurolinguist.]

[Btw, the difference between natural language processing and PL processing I find relevant only insofar as results from NLs carry over to PLs. And although I admit I made a mistake in presenting, say, a bunch of combinators and rules as a language, I believe linguistic researchers would see that as just an instance of math.]

[Btw, neurolinguistics is a different part of the neurosciences. Most in that field are probably more interested in how the Cherokee learn/use language than in wheelchairs.]

Check out Luca Cardelli's

Check out Luca Cardelli's current work :)

You do realize that there is

You do realize that there is a thread going on at the moment about Elephant, right?

Are natural languages

Are natural languages sufficiently well specified to use for programming? I don't think so. If they were, I would think that current programming language specifications would likely be much thinner and easier to write/read than they are.

The programming languages I've encountered that try to pull from natural languages are counter-intuitive disasters, in my opinion.

John Hughes's claim is a bit shocking

What is the evidence that functional programming delivers an order-of-magnitude improvement?
Compared to what: C, C++, Java, Ruby?

Turner's quote

Turner's quote probably dates from the mid-'80s, so the comparison is to the languages used at the time: pre-ANSI C, BASIC, Pascal, COBOL, etc. Now that mainstream languages keep absorbing features that were pioneered in the FP world (garbage collection, memory safety, closures, polymorphism, type inference, structural types, type classes, inductive and recursive programming, immutable and persistent data structures, continuations, futures, ...), the gap is shrinking, of course.

"Simon PJ: One very

"Simon PJ: One very understudied thing: Visualization tools for understanding behavior of parallel programs are sadly lacking. Good research topic."

I always wondered why there weren't more tools for this. I was surprised to find that a program called "showthread" was included with Critical Mass Modula-3; it could be called from inside a program and actively show you the threads (and their states, etc.) of the running program. There are also other similar tools for visualizing the heap and objects. Are there other languages that include tools like this? Modula-3 is the only one I've seen include them.

Spoonhower et al.'s space profiling visualization tool

On the topic of Simon's comment, check out this paper from ICFP 2008 by Spoonhower, Blelloch, Harper, and Gibbons. They have developed a space profiler for parallel functional programs based on cost semantics, which allows one to visualize the impact of different scheduling policies on space usage.

Re: visualisation tools

Java has pretty good support with debuggers, (heap) profilers and "remote management" consoles (like JConsole and VisualVM).

Indeed, I enjoyed (and

Indeed, I enjoyed (and continue to enjoy) Programming Language Pragmatics, 2nd edition. I also have "Advanced Programming Language Design" by Finkel, of whom Scott was a student (and which is structured very similarly to PLP).