## What data sets would tell you the most about what sort of programming language to design?

Let's say all data in the world is open to you, for the express purpose of understanding the habits of programmers, and you have the best crack team of statisticians in the world to help you analyze it.

Which subset of this data would you be most interested in, as a programming language designer or programming language theory researcher?

## Comment viewing options

### Simple optimisations

I'd want to see what takes programmers a relatively long time to implement and where they tend to introduce errors. That would be a good starting point, but it probably won't be enough if one was interested in really radical change.

### Refactoring

A somewhat vague suggestion: Patterns and metrics related to refactoring code could be interesting. How much and what kind of maintenance goes into various paradigms. Additionally, statistics related to integration between various libraries, languages and frameworks. Perhaps this could point out some snowballing effects?

Upgrade patterns, too, especially public APIs. This and refactoring would be my choices.

### Differences

I'd ask "What's the differences between different languages". A braindump:

Which types of behavior is most different between different languages? What takes more time and less time for developers? Which different things do they do more / less? Ie, more time spent designing, more time spent typing, more time spent refactoring, more time spent debugging, more / less deviation in time for "similarly estimated changes to program".

Which features are more / less used in different languages that has the same kind of features? Ie, both Ruby and Python have higher order functions relatively easily available - where are they used most? Both Ruby and Python have trivially defined classes - where are the class definitions used most? Is this culturally tied (where you'd expect to see large differences in multiple axis some subgroups that don't know the general culture of that language) or is it tied in with the cognitive load of using different features in each language (ie, does the syntax make a significant difference)?

### DSL by cluster

I'd use the magic power of the well-trained statisticians to organise all of the data according to the intended output of each programmer in the sample. Then I'd cluster this data to find similar tasks that lots of people spend lots of time trying to do. I reckon there would be 700 such clusters in the data and it would be nice to know what they are...

### Errors and rework

Figuring that language design has been dominated by "projects start with an empty editor window" thinking, I'd love to see frequencies of various error categories and fix time distributions. Similarly for frequencies of architectural rework and rework time distributions.

### Aha!

So this is the second reason I started this thread.

My attempt will be to (hopefully) show that part of what you want to know, is already freely available today. Fix time distributions are trickier, and really depend on the nature of validation and verification. For example, 44% of security products tested by ISCA Labs have security flaws. But what is really interesting is that security product certification continuously changes, so as you fix one problem one way, a new set of problems might emerge that no security products handle, so you lose your product certification. Are you measuring the fix time based on how long it takes for your product to be re-certified? Or are you simply measuring the fix time it takes to fix the specific problem ISCA reported to you the last time you went through their product testing program? So interpreting what a fix time distribution means is tricky, even if you have one (which ISCA sort of does).

WebDAV creator James Whitehead has a methodology for fix time that measures from the time the specific bug was introduced to the time the fix was introduced. See: How Long Did It Take To Fix Bugs?; they measure ArgoUML and PostgreSQL. They also aggregate bug fix times by looking at which files are likely the messiest, requiring the most rework.

Perhaps you might be interested in Common Weakness Enumeration. Some of these security weaknesses are also general programming errors, such as CWE-665: Improper Initialization.

### Question

The basic idea is to develop a score of how usable an API is and how well it prevents errors by simplying searching Google with the error message and some context information (stack trace hints, class name, type, etc.).

My translation of this idea would be to bake into the programming language unique error GUIDs as part of the error message, so that if users were to post error GUIDs on the Internet, and other programmers were to explain the nature of the error, then we could data mine, given an error GUID, how are people using this API most commonly to create invalid usage scenarios? The IDE could then also support, in the debugger, the ability to search online for human help with this error message. There could also be a default forum for dumping error messages, stratified by things like what language, language version number, library, type/class, function/method, etc. My basic issue with the stack trace + error message paradigm used today is that the error message alone usually does not provide detailed enough information, because sometimes errors require context of where they were called to understand the problem, especially when debugging live.

In this way, debugging would be assisted by human beings. Sort of in the spirit of Luis von Ahn's work on recaptcha and gwap.

To be clear, the programming language feature I am arguing for is error GUIDs, plus leveraging the supercomputing power of every human programmer that uses that programming language to more quickly debug problems.

### It's called cooperative bug

It's called cooperative bug finding -- I've long been a fan and variants are important for large companies (e.g., Adobe, Firefox, and Microsoft have variants, whether for feature tracking, bug fixing, or something else).

### Not quite

Seems like they are using a different methodology.

I was thinking about making it a feature of the language itself.

I also think their reporting mechanism is more computer-oriented than human-oriented, so I dislike it for use by average joe's who just want immediate help in the IDE for solving a problem.

The kinds of crashes I want to diagnose are related to third party vendors with poorly designed libraries. One example would be ASP.NET's SqlDataSource control and its lack of support of data independence:

<form id="form1" runat="server">
<asp:SqlDataSource
ID="SqlDataSource1"
runat="server"
ConnectionString="<%\$ ConnectionStrings:Northwind08 %>"
SelectCommand="my_sproc_example"
SelectCommandType="StoredProcedure">
<SelectParameters>
<asp:ControlParameter Name="Param1"
ControlId="Param1Filter"
PropertyName="SelectedValue"
ConvertEmptyStringToNull="True"
DbType="Int32"/>
</SelectParameters>
<FilterParameters>
<asp:ControlParameter Name="Param1"
ControlId="Param1Filter"
PropertyName="SelectedValue"
ConvertEmptyStringToNull="True"
DbType="Int32"/>
</FilterParameters>
</asp:SqlDataSource>
<asp:Label ID="Param1FilterCaption"
runat="server"
AssociatedControlID="Param1Filter"
Text="Filter by Param1" />
<asp:DropDownList
ID="Param1Filter"
runat="server"
AppendDataBoundItems="True"
AutoPostBack="True">
<asp:ListItem Selected="True" Value="">Any</asp:ListItem>
<asp:ListItem Value="100">Some</asp:ListItem>
<asp:ListItem Value="0">None</asp:ListItem>
</asp:DropDownList>
</form>


The scenario here is that by default, the dropdown list control has the Any option selected, which translates into an empty string. The SqlDataSource1 control then is wired to interpret the empty string as a null value. However, because we've simultaneously specified ConvertEmptyStringToNull="True" and DbType="Int32", the program crashes on a stored procedure parameter formatting exception. However, if the DbType can be implicitly converted (i.e., it is a varchar and a DropDownList.Value returns a string) then no exception occurs.

Simplying creating "profile trees" (what that project you linked seems to be focusing on) isn't enough. ASP.NET controls even have a BuildProfileTree configuration boolean option baked into the object model, but nobody uses it!, especially not as a way to link humans together into clusters so that one human can see another human solving potentially the same problem.

By the way, just for fun: So what is the solution to the above problem?

Depending on how much rework and how robust you want to make the solution:

1. You can add an architectural invariant disallowing developers from using SqlDataSource. That's IF you are the architect on the project and have control over what static analysis rules get used. You can then create your own SqlDataSource control that implements the same problem domain abstraction responsibility, except less prone to error. That's IF you have the time to do so and are willing to accept the fact that you will never have as much test coverage as millions of ASP.NET programmers testing the SqlDataSource control.

2. Add the following to the SqlDataSource1 XML element: OnSelecting="SqlDataSource1_Selecting"

In the code-behind file, add the following:

    protected void SqlDataSource1_Selecting(
object sender,
SqlDataSourceSelectingEventArgs e
)
{
if (e.Command.Parameters[0].Value == null)
{
e.Command.Parameters[0].Value = System.DBNull.Value;
}
}


Note the array index into 0 is due to the fact that there is a coupling between the order parameters are put into the SelectParameters collection and FilterParameters collection.

3. Generalize (2) so that SqlDataSource1_Selecting maps to a chain of selection handlers. Do not depend directly on array indexes for parameters. Generalize the problem so that whenever a value is null and the target value to convert to is not a string, the correct conversion logic is applied. Use Mono's Gendarme or FxCop to statically check that this special selection handler is always used whenever somebody sets both ConvertEmptyStringToNull and a conversion target type other than string. Further rules can potentially be applied to detect other configuration errors, but these rules might not be specifiable using Gendarme or FxCop and you might need a better tool like NCover and its Code Query Language (or simply roll your own using System.Reflection).

I've long been a fan and variants are important for large companies (e.g., Adobe, Firefox, and Microsoft have variants, whether for feature tracking, bug fixing, or something else).

How does Microsoft solve these problems for their own library? MSDN is vastly overrated, and a data mining of their community contributed content sections on MSDN would reveal that they basically get no community contributed content. When they do, their is no good mechanism for notifying others about community contributed content or where to look for it. Moreover, as more and more "SDK's" are published by Microsoft for .NET, such as Silverlight, documentation diverges and its not at all clear how Microsoft chooses to propagate community contributed content for platform-specific "SDK's".

This is my thoughts on the problem, driven by a real world example. By contrast, the Wisconsin project seems to be hoping to find real world examples, and they seem more interested in how they can apply statistics and machine learning theory to software engineering, than actually focusing on particular real world problems (this isn't a criticism so much as a fact) resulting from poor design methodology. Here is a direct quote from their website:

This is what we call statistical debugging: finding bugs in programs via automated statistical analysis instead of laborious manual inspection. Our strategies combine facets of traditional program analysis with innovative techniques from statistics and machine learning theory. We cannot say too much more at this point, as this is the facet of our research that is changing most rapidly: anything we post here will be obsolete by the time you read it. The curious machine learning researcher is advised to take a look at our published work for a more detailed treatment of the specific statistical modeling techniques we use.

To be honest, their more academic approach is fantastic. However, it also is more complex than necessary and still ultimately requires humans to actually solve problems. So my position is you should get humans involved early, as early as possible, as soon as they are aware they are having problems. Then we can incorporate ideas from places like George Fairbanks' Design Fragments Ph.D. thesis at Carnegie Melon to aid in patching bad designs and preventing client API users from using bad design configurations. This evolves an API (and even programming language) into a safe subset, avoiding the too clever by half nonsense nobody will ever use safely, anyway.

How would you propose we still do their machine learning approach ("statistical debugging") yet still allow extremely early problem solving by humans? In other words, why wait for the law of large numbers to kick-in? How do you balance that concern, as a statistician? Literate programming is about displaying data for humans so that it can be turned into problem-solving knowledge; literate programming is not for machines. I am more interested in crashes at development time, because when a programmer is using a library and some function call is causing a crash and he cannot figure out why, he loses valuable billable business hours trying to work around the problem. Moreover, these issues might be "showstopper" bugs that users never see but ultimately add to the cost of developing software.

Liblit's Ph.D. work seems more problem repair-mode oriented, whereas I am interested in problem avoidance-mode. One way to integrate the two could be if I could use Liblit's repository to determine if a crash had a stastistically improbable dump: the program crashed on a configuration never reported before. This would let the programmer know they are pioneering the wild yonder.

The next big idea would be to incorporate what I want to do into a type system!

### [format] Z-bo

Think you could reduce the width on some of those <pre>-formatted lines? I'm having to scroll horizontally to read every post here...

### Done

Sorry 'bout that. For what it's worth, I spent most of the effort in writing that post on a script to convert the < and > into HTML DTD XML Entities.

### Psychological experiments

I'd like to measure and correlate conciousness to correct code, and correlate brain activity to correct code design.

Then, I'd like to hook up a programmer to realtime brainwave-reading hardware that can provide the programmer with feedback on what code may contain errors, what code may be designed overly complexly, what code was rushed, etc.

### Aren't there tons of UIST

Aren't there tons of UIST and CHI papers already on this subject? Ok, these papers are mostly biased to end user programming, but their conclusions are applicable in general.

I'm confused by your question. What do you mean as "data"? All knowledge is available to us? All recorded history? Or just anything that has every been written down/saved (source code, documentation, design documents, etc...)? There is a big difference. "I could ask an all knowing being anything" would be very different from "assume we have access to all written records and could analyze them in very powerful ways."

In the latter case, I'd be interested to see how much duplication (poorly abstracted artifacts) in the current universe. And what get's duplicated the most, maybe that could inform language design. But I suspect that coming up with the right questions might be harder than acquiring the data and analysis technology.

### Good question

My best example would be my personal wishlist.

I wish I had access to data about how learning materials were used. In particular, sales figures of books, including breakdowns of who they were sold to.

e.g., if it was sold at a campus bookstore, then the book is targeted for a college course: what course? College bookstores do keep track of what course (maybe even section) the book is for. It would be even more interesting if I knew what parts of the book the course covered.

I have a basic thesis that a rising tide lifts all boats. I wonder how much better computer science would be if current mainstream authors I dislike (Savitch, Dale, Deitel&Deitel, etc.) simply stopped publishing their mediocre-at-best books. I really dislike books that poorly define PLT concepts and then give incorrect examples of them. Most of the 'bad' authors appear to have their roots in Fortran programming and try teaching structured programming in an object-oriented course. A good reaction to this has been Kim Bruce's more object-oriented approach, but as far as I am aware the total sales of Savitch, Deitel&Deitel and Dale books are probably an order of magnitude more than Bruce's. This is a shame, considering Bruce is a respected PLT researcher.

I've also always been curious about Norvig's claim that his textbook has 98% of the market share for introductory AI undergrad courses.

Now, today, this data might be impossible to get at. Erik Brynjolfsson at MIT is an economist (Really, statistician) who tried reverse engineering Amazon's sales ranking system to understand the sales history of books. I didn't really find his results that interesting, though...

But I am beginning to wonder if these e-Reader devices will totally invert how statistics are gathered and collected about what people read. I am wondering if people will have more control over how they share information about their habits. This could yield interesting collaborative results, such as insights into how people read books. I could also see that instead of using a forum discussion, future e-ink systems have rich hypermedia annotation capabilities allowing to ask other readers questions about material.

I've bought many books over the years (I don't know how many tech books I own, but my closet top shelf is tacked five feet high and five feet wide, and must be 400 books...), and the one odd thing I notice is how few books for sale have highlighting. This could merely be a fact that book sellers dont purchase marked books often. However, even when I buy a book with highlighting in it, what gets highlighted is weird. It's especially funny when I happen to coincidentally order a book from Microsoft's Redmond Library.

When I was a kid and we still had the paper card system for checking out books at the library, I always knew what the good books were to checkout because I'd see people a class ahead of me who read it and the best books always had the most paper card stamps. This was cool, because I could get the best books on mixed martial arts instruction and follow the pictures in those first. Likewise for drawing stuff.

O'Reilly does give hints on what emerging technologies are popular, but they are not that insightful; the whole purpose behind O'Reilly's existence is to force people into learning new technologies (creating new technical book markets).

### I would like to know which

I would like to know which portions of the program text have been looked at the most. I would assume that a slower movement of the eye muscles implies greater concentration per line.

This information is freely available today. I will dig up the reference, but the critical idea is that most programmers using imperative programming languages with mutable state tend to travel back up to declarations. Eye movement is very rarely left-to-right. It is mostly down-and-up.

The Relo project and MIT goes beyond just eye movement and tries to capture correlations in how programmers "touch" source code and possibly make edits to it. Vineet recently posted on LtU, mentioning he is attempting to commercialize these ideas. Relo can present to the programmer visualizations about their work habits as they program.

I do not know what studies say for applicative, side-effect free languages.

### Other code.

If we're really allowed to ask for any data, and for sophisticated analysis to be done on it, the data I'd ask for would be the millions of reams of licensable code that people have written, rendered as much as possible in a canonical form that is at least likely to be matching for functions in the same language that do the same thing, in searchable indexes by license category.

I have this recurring fantasy of a tool that, when invoked on a buffer, looks at what you've typed in the buffer so far, does some analysis to try to find a "canonical form" expressing those semantics insofar as possible, compares it to a vast codebase, and then interactively makes suggestions of other code to 'complete' what you're trying to do. For example if it found one of the thousand canonical forms of 'enqueue', it would say that it recognizes the function from queues (LIFO), stacks (FIFO), and lists (collection abstraction), ask which you're trying to implement, and if you pick one of those options, present you with a list of available code (different possible 'completions' with different sets of bells & whistles). You'd then select and insert the rest of that canonical unit, object, or module of code (the 'dequeue' function, the 'member' function, the 'is-queue' function, the 'insert-if-not-already-a-member' function, etc).

The idea is that, particularly with programming language constructions, finding exactly the thing you need in a usable library, downloading it, figuring out how to interface to it, etc, is often more of a pain than just rewriting a basic thing from scratch.

In order to make a search precise enough, you have to get down to the level of code, and not just code but code semantics. No search engine on earth, optimized to deal with ambiguous human languages and short queries, is prepared to accept and do anything useful with that kind of detail in a search. Moreover, even if it were, finding a useful query for that search, by specifying the semantics in some unambiguous language, would require the same semantic discipline and precise type of language as programming itself.

And if you're going to be writing code to specify the semantics you're searching for ... why not just start by writing it in a programming language? The best search query is a function (or two) of the very same code you'll need to have written anyway if your search fails. That way, if the search works, you save time and effort; if not, you don't lose more effort than just the simple act of invoking the tool after you've written the first method or two.

Sigh. This is not so much about language design as it is about making usable libraries effectively and effortlessly searchable. Of course the same data would be an excellent starting point for language design -- if you can identify things that people keep writing, over and over and over, you can identify the things that you need to provide in a standard, understandable, easy-to-interface set of standard libraries with your language.

### There's billions of lines of

There's billions of lines of open source code availble

### Stacks, Queues vs. ......

How many lines of codes are implementing RB trees or evaluating folds these days vs. banging line after line after line at some API or another to access an RDBMS, some (terminally unique) GUI library, a DBM persistent hash, encoding up some data for transmission over some TCP/IP based protocol and so on and so forth.

In other words, do we need some kind of "program" to come up with a new set of much larger abstractions, or large composites of smaller ones, in order to recognize these other by now "canonical" but much larger abstractions or "patterns" (don't want to open that can of worms) in code bases as you describe???

In other words, with almost any language tool I'm going to pick up today, I *already* have a reasonably nice collection class library, and access to, at the very least, likely *free* libraries of the most common algorithms (numerics, engineering, search, maybe graph algorithms, etc.).

So wouldn't I be mining a code database for the much more complex (in terms of number of annoying moving parts) API intensive chunks of code I'm likely to not want to reproduce for the umpteenth time.

Of course, this presumes 1) such API intensive code can be reasonably abstracted and 2) these abstractions will "trickle down" to the API's themselves in order to *have* any reusable code of this kind.

Scott

### I contend that yes, it's exactly what we need.

Work on "patterns" etc has been based on the personal perceptions of professionals so far. And much of our limitations in working with patterns, subsystems, or subassemblies, is an artifact of the languages we use and how those languages shape our thinking about semantics.

For example, although the stuff in the "Design Patterns" books was touted as recurring patterns that never quite recurred in forms close enough to reuse code, the authors were thinking in C++ or Java, and perceived a difficulty with reuse that doesn't necessarily exist in all languages. I found that most of the time they were implementable, in a directly reusable form, using Lisp macros.

This isn't saying I'm smarter than those guys; I'm not. But a lisp background gives me a different set of tools for thinking about semantics. Professionals are human beings. We human beings see mostly what we expect to see and what we are looking for. Data mining tools, and rigorous analysis, if we can figure out how to apply them to semantics, would ideally see what's there instead.

I think there are entire large-scale classes of software patterns and code subsystems or subassemblies that we haven't even noticed yet, because we are too busy dealing with the individual leaves and trees to have made a cogent map of the forest. No language we've developed or used so far gives us the means to think about them in ways that make the similarities expressible.

If you really want to revolutionize the way people program, you have to provide tools that work with semantics - with the meanings of programs - in nuanced ways. And right now we don't really understand programming semantics well enough to provide more than the most simplistic of such tools.

I think the first step in building our understanding of programming semantics is to collect millions of reams of code and start working hard to observe & document semantic similarities and differences between "similar" code. And, more to the point, once we get our hands around the technique of searching, observing, and documenting those semantic properties, we can develop tools to automate the search. That's exactly the kind of database that my "fantasy tool" mentioned above would require. And it's a decent starting point for it; it could be useful/worthwhile without more semantic knowledge than we have now, but it would dramatically reward every advance.

Ray

### Semantics-based Code Search / Service Search

There has been some work done to the effect you're seeking. One paper on the subject is Semantics-based Code Search [and website] by Steven Reiss.

I'm not so certain that's a practical way for development to 'scale', though, at least not by itself. It is almost certainly intractable for complex pre and post conditions, and especially those that involve side-effects or integration with external systems (actuators, sensors, UI).

I think it critical that we change how programs are pieced together to something fundamentally more scalable, more secure, more subject to runtime upgrade. Rather than searching for code that does XYZ, why not search for a capability (which could be provided by an external service)? If Google already provides a services that convert numbers to roman numerals and back, why not use it? That way it's maintained outside your application.

Capability security allows one to run untrusted code locally, which raises the performance ceiling much higher than it is for ambient authority systems: no need for context switches, far less need for serialization, and reduced duplication of process. Object graphs may be reduced during a compile, to eliminate costs of delayed binding. To the degree that services are stateless and contain no sensitive capabilities, Google could simply replicate chunks of the service over to your machine and have them compiled in by the linker; this would be subject to decisions regarding disruption tolerance and performance, and replicated components of a service could be easily shared among all applications on a host. Services designed for it would be subject to runtime upgrade, though a JIT might be needed to get good performance.

Using capabilities instead of source libraries allows them, importantly, to be specialized to particular devices, particular services. I.e. one might have a different capability for each webcam, for each file - including remote webcams, remote files. The developer has a database of capabilities to combine into new service, and can find more by poking around in public registries, or by creating services to be shared. The new services, in turn, export new capabilities (and may, with support from the IDE, be subject to live programming).

Since capabilities are specialized, they are much more interesting and much more widely varied. A library of capabilities would offer a developer a lot of real power, and truly 'useful' decisions. A choice of one capability over another can be a significant one... far more interesting than a choice between an AVL tree vs. RB tree.

We'll still want library code to support confinement, to control implementation details, to glue services together, to break large services into bite-size chunks, etc. But I'd favor seeing the use of source libraries marginalized in favor of capabilities to live, external services. We can't easily search in terms of preconditions and post-conditions on services, and that's why a human needs to get involved.

### You're THIS close

Work on "patterns" etc has been based on the personal perceptions of professionals so far. And much of our limitations in working with patterns, subsystems, or subassemblies, is an artifact of the languages we use and how those languages shape our thinking about semantics.

For example, although the stuff in the "Design Patterns" books was touted as recurring patterns that never quite recurred in forms close enough to reuse code, the authors were thinking in C++ or Java, and perceived a difficulty with reuse that doesn't necessarily exist in all languages. I found that most of the time they were implementable, in a directly reusable form, using Lisp macros.

This is Peter Norvig's argument, which mostly misunderstands how some of us view the GoF's seminal text.

First, as you say, it is not at all clear the GoF fully understand the significance of what they did. The primary clue is the descriptions of certain patterns, such as the GoF State Pattern. Ask any pair of OO practitioners and it is possible they will disagree on how to interpret this pattern. For example, this old usenet://comp.object thread Applying the State Pattern has two viewpoints: Charles Rapp's and Martin Vuille's.

Now put ALL THAT analysis into a Lisp macro, trivially. cap-talk and the UML Forum will all be interested to see how maintainable that solution is.

As you can infer from this discussion, I do not think highly of programmers who trivialize design patterns by arguing they are invisible implementation patterns. Arguing they can be made invisible completely misses the point behind model-driven architecture: They should NOT be invisible, as by doing this you are basically arguing for throwing away model artifacts and working just from code. This then puts all the pressure in the world on you to develop a programming language that is homomorphic to your models yet somehow hides a whole bunch of cruft your programming language will inevitably force you to provide to get the model to execute. On the other hand, design patterns probably also don't need to be implemented using static structure in the coding language. That seems to be abusing the language's production rules to encode a design pattern. Therefore, if you separate analysis, design and implementation, you will end up with a different factoring of your code.

<HotButton>
(One more side note: GoF describes State pattern as a synchronous flow, and does not describe the possibility of encoding an actor to asynchronously transition between roles. In this way, you are THIS close to realizing that the GoF probably did not fully achieve the potential significance of their work, but you throw away your realization by arguing for something worse [Lisp macros].)
</HotButton>

But a lisp background gives me a different set of tools for thinking about semantics.

Lists for thinking about semantics? I don't think Marvin Minsky convinced anybody other than fellow Lisp programmers when he argued Lisp was a good mental model of how the mind works. Nevermind the fact if I am going to be thinking about semantics, I want some more involved mathematical discipline than list processing.

### Design Patterns or Implementation Patterns?

There seem to be many touted 'design patterns' that are basically implementation patterns that cannot be effectively abstracted in a host language. One should not be so quick to dismiss this view, as held by Peter Norvig, Paul Graham, and many others.

Paul Graham says in Revenge of the Nerds:

In the OO world you hear a good deal about "patterns". I wonder if these patterns are not sometimes evidence of case (c), the human compiler, at work. When I see patterns in my programs, I consider it a sign of trouble. The shape of a program should reflect only the problem it needs to solve. Any other regularity in the code is a sign, to me at least, that I'm using abstractions that aren't powerful enough -- often that I'm generating by hand the expansions of some macro that I need to write.

And similarly, Peter Van Roy in Paradigms for Dummies notes:

• If we need to model several independent activities, then we will have to implement several execution stacks, a scheduler, and a mechanism for preempting execution from one activity to another. All this complexity is unnecessary if we add one concept to the language: concurrency.
• If we need to model updatable memory, that is, entities that remember and update their past, then we will have to add two arguments to all function calls relative to that entity. The arguments represent the input and output values of the memory. This is unwieldy and it is also not modular because the memory travels throughout the whole program. All this clumsiness is unnecessary if we add one concept to the language: named state.
• If we need to model error detection and correction, in which any function can detect an error at any time and transfer control to an error correction routine, then we need to add error codes to all function outputs and conditionals to test all function calls for returned error codes. All this complexity is unnecessary if we add one concept to the language: exceptions.

The common theme in these three scenarios (and many others!) is that we need to do pervasive (nonlocal) modifications of the program in order to handle a new concept.

Which is to say that there are pervasive, non-local patterns that could not be abstracted effectively without the new concept.

Haskell developers discovered a sort of one-size-fits-all pattern with Monads or the more general Arrows, allowing one to express and abstract a very wide variety of non-local implementation patterns. Lisp uses Macros - in combination with functions and state - for the same purpose.

Visitor pattern and Composite patterns are good examples of 'design patterns' unnecessary to many host languages.

And 'State Pattern', the one you raise as an example, doesn't seem to be a special exception. If one wants behavior to change based on history, one certainly needs the state concept. (More accurately, change in behavior based on history is the very definition of state.) But if 'State Pattern' is to mean more than 'State', it must refer to using state in a particular way - most examples describing some sort of mutable delegation to another object. But the need for this particular approach may be viewed as an implementation artifact of the host language and its type-system and the common tendency among OO languages to represent object behavior as second-class. If the host language is flexible enough to represent object behaviors as first-class values, one would not need extra objects - extra names - to represent state. Avoiding the extra names has many advantages, esp. wrgt. modifying behavior continuously or using many external resources to compute an object's behavior, as via an FRP expression.

There do seem to be 'design patterns' that aren't so focused on implementation. Strategy pattern, for example: it doesn't really matter how it's implemented, so long as a pluggable strategy is separated from the goal. OTOH, perhaps 'Strategy Pattern' only seems an exception because we don't have languages that provide better support for expressing programs in terms of goals supported by a library of strategies. I.e. the 'strategy pattern' can be viewed as the 'pervasive non-local pattern' of explicitly separating and late-binding a strategy in order to achieve a set of goals.

I think I'll stand with Peter Norvig et. al. on this issue. Design patterns, for the most part, are really implementation patterns due to weak abstraction of the host language. I'm interested in good counter-examples.

### Paul Graham is right

Peter Norvig is saying something different, however: He is almost implying that a good programmer can 'reverse engineer', as if the programmer were a tool, the design intention of the original author despite the fact that the design intention is made invisible in Lisp! Too many programmers, including the GoF, appear willing to accept Norvig's position as a Good Thing. Why should I bow so easily? At every level of abstraction, my intention is to maximize signal and minimize noise. Forget visbility! The mathematical thought process is about building a minimal model of a task.

Paul appears to be agreeing with me that OO programmers tend to use a language's production rules to encode a problem into some static structure. To use a class to represent every state in a finite state machine seems like such overkill, and also is obviously not modular and editing this configuration requires complicated rules (see Robert Cecil Martin's paper). That this is the GoF book's shortcoming was a major point of mine. Except I would probably today think in terms of term rewriting instead of macros.

If one wants behavior to change based on history, one certainly needs the state concept. (More accurately, change in behavior based on history is the very definition of state.) But if 'State Pattern' is to mean more than 'State', it must refer to using state in a particular way - most examples describing some sort of mutable delegation to another object.

That is a very powerful and big IF. You shouldn't presume that you need 'history' to change behavior. In a finite automaton, you need only an input and the current state to decide the next behavior. One state suffices. Computation is really about making decisions. At the model level, how well those decisions emulate the problem domain will reflect the maintainability of the solution. At the programming level, how well those decisions can be reasoned about efficiently and transformed into an implementation will reflect a set of non-local choices, including hardware the instructions execute on.

I will repeat that I think it is hurtful and unnecessary to think of the State pattern in terms of state, and that it is better to view it as a way to model migrating between roles, and the pattern should not dictate flow of control (e.g. asynchronous vs synchronous, lazy vs. eager). That Baumer, GoF and Reenskaug all appear to have independently come to the same conclusion suggests people do not study patterns but rather blindly assume they understand their properties based on name alone. This is an unwelcoming suggestion, since it implies most authors of patterns are programming with their pants down.

I don't think the Context object itself has to choose how to handle state transitions. It could be completely asynchronous. However, the key is that the Context object has an internal parameter hidden to clients. Clients are exposed to capabilities only via the shadow of that internal parameter (think of a base relation versus a derived relation formed by a dynamic context). Otherwise, they would be able to (e.g. using out-of-band communication) navigate to responsibilities they do not have the authority to invoke, violating encapsulation of state-process. That the Context object itself shouldn't be able to predict what will happen next simply allows for many implementations (e.g. FRP expressions handled by decoding the message into events).

I.e. the 'strategy pattern' can be viewed as the 'pervasive non-local pattern' of explicitly separating and late-binding a strategy in order to achieve a set of goals.

Good point. A Strategy pattern is a global choice.

### You shouldn't presume that

You shouldn't presume that you need 'history' to change behavior. In a finite automaton, you need only an input and the current state to decide the next behavior.

I did not mean to suggest maintenance of a full recorded history of inputs. If the history of inputs affects the current state of the FSM, which affects behavior of the FSM, that is stateful; the behavior is 'based on' its history. A different history may have led to different behavior.

If the relevant observable behavior of a given system is independent of its history - independent of the number, timing, and content of inputs - then that system is not stateful.

Computation is really about making decisions.

That's an interesting view.

I do not possess an answer to what "computation is really about", but my own unofficial view is that 'computation = communication + calculation'. Calculation seems to largely involve transformations on models. The models may express contingent expectations via their developers, of course, and models are also useful for planning - so computation is at least useful in making decisions. Communication covers real action and interaction with users, sensors, actuators, persistence, pluggable architecture, and even live programming and runtime upgrade and such.

### Calculation seems to largely

Calculation seems to largely involve transformations on models.

Transformation is a category, which is too broad and model is one which is too vague. Maybe I just fail too see why anything but Church-Turing is involved in the "essence" of computation? This might not hold when one approaches computation from certain metaphors or human psychology but those are just more or less useful approaches to create crutches and I suspect all programmers have their own ones which can be roughly divided into known languages and paradigms.

Transformation is a category, which is too broad and model is one which is too vague.

Are you suggesting that 'calculation' means something relatively narrow and precise?

If so, I'm all ears: provide me a narrow, precise definition.

I just fail too see why anything but Church-Turing is involved in the "essence" of computation?

Church and Turing describe particular mechanical approaches to calculation, and minimally cover communications. There still remain many questions about the validity of their theses when dealing with systems that interact with users and environment. Attempting to treat physical reality as just another Turing machine has always struck me as woefully inadequate.

Rather than the 'essence' of computation, I'd agree only to point at Church and Turing and say, "computation is at least that big".  And even that agreement is marginal. A non-terminating 'calculation' seems to me a contradiction in terms - an artifact of language in the same class as Russell's Paradox.

### re: non-terminating

it can be doing useful work even though it never terminates e.g. keep on finding that next digit of pi and doing the side effect of printing it out as things progress -- does that kind of thing become disqualified from being a 'computation'?

### Progress

Progress is a form of termination: that the 'next calculation', when requested, will always terminate.

But progress isn't a property of calculation alone; 'progress' requires communication to be useful, such as printing the digits where someone else can see them. The original Turing Machine was not interactive; it didn't allow anyone to see the tape before it was finished. "print out all digits of pi before showing the tape" does not represent progress even if each individual digit can be printed in finite time.

I did give thought to the possibility of mentioning progress, but since the above comment focused upon the calculation aspect of computation, I chose to elide mention of progress.

### Are you suggesting that

Are you suggesting that 'calculation' means something relatively narrow and precise?

There is no meaning of a term in itself. One can treat computation historically or psychologically or as a process of nature. But when our ancestors abstracted what is common to all of them they have formed the concept of algorithm and in order to make precise what an algorithm is, machine models have been suggested. The purpose of all this is to delimit the variability of meaning.

When we somehow add "modeling" to this, we shift the meaning of computation towards the intentionality of the programmer. There is something outside of the computation that shall be represented by the computation but it doesn't touch the in-itself of the computation. When we add "transformation" to it, it becomes indistingusihable from physical or mental processes and we ontologize nature as computation and information processing. The situation becomes slightly better when we combine "model" and "transformation". But now we have to consider the change of a model in the human mind as a computation as well.

There still remain many questions about the validity of their theses when dealing with systems that interact with users and environment

Physical machines interact with users and environments. Abstract machines or computations do not. They can be completely understood in terms of all possible input values and their responses. A realist can assert there is no such thing as a computation in the way understood by the idealist / platonist. On the other hand physical machines can be designed to express the ideal of an abstract machine most accurately but it can never enclose all of its possible inputs. So we relax this demand in terms of communication and interaction which opens the system and makes the computation actually useful.

A non-terminating 'calculation' seems to me a contradiction in terms - an artifact of language in the same class as Russell's Paradox.

Do we really have to modify the concept of computation in order to include termination and finiteness? I don't see the parallel to Russels paradox which doesn't claim that only finite sets are proper sets - Russels paradox in set theory can be easily avoided adding well foundedness or finite set inclusion which might still be relaxed but this is another discussion.

You keep reading 'computation' every place I wrote 'calculation'. It's quite irritating, especially after I went to the effort of explicitly asserting a position that calculation is only a component of computation. I reject most of what you concluded on the basis of equivocation. For example, I never said a thing against non-terminating computation, only against non-terminating calculation. Communication, and therefore computation, can reasonably be non-terminating.

And I didn't add 'modeling' anywhere. I said 'transformations on models'. The act of designing the model was outside anything I said.

 There are abstract machines that represent concurrency and communications internally (pi calculus, CSP, and actors model being examples). In such systems, you cannot map inputs to responses except in broad strokes. In any case, between Goedel's theorem and Rice's theorem we cannot completely understand (i.e. answer arbitrary predicates for) even simple, abstract, deterministic programs. The best we can do is decide which properties are most important to us, and develop the programs and language in a manner that allows us to assert these properties.

 Re: parallel to Russell's Paradox: Russell's paradox is an artifact of the language used to express the sets. I.e. it is a consequence of assuming that predicates expressed in grammatically correct English are inherently sensible. The 'parallel' is in assuming that calculations expressed in Turing Complete languages are inherently sensible. I would say that (lambda(x)(x x) lambda(x)(x x)) is no more sensible than 'the set of all sets that do not contain themselves'; that is, similar to how English can make statements that make sense grammatically but are ultimately meaningless, so can lambda calculus and other Turing Complete 'calculation' languages. You say that Russell's paradox can be avoided by switching to a more restricted language for describing sets - one with well-foundedness principles - or other mechanisms (such as verification of stratification). The same can be said of calculation languages. Insensible calculations including 'bottom' can be avoided by switching to total functional languages, or type analysis, or various other mechanisms. I'd say this is a fairly strong parallel.

And don't equate 'termination' (a property of calculations and computations) with 'finiteness' (a property of a model).

### Good counter-examples: how flawed do you want them?

For whatever value judgment you place on good...

I think I'll stand with Peter Norvig et. al. on this issue. Design patterns, for the most part, are really implementation patterns due to weak abstraction of the host language. I'm interested in good counter-examples.

The Factory Pattern in API Design: A Usability Evaluation by Brian Ellis, Jeffrey Stylos and Brad Myers

The usability of software APIs is an important and infrequently
researched topic. A user study comparing
the usability of the factory pattern and constructors in
API designs found highly significant results indicating
that factories are detrimental to API usability in several
varied situations. The results showed that users
require significantly more time (p = 0.005) to construct
an object with a factory than with a constructor
while performing both context-sensitive and contextfree
tasks. These results suggest that the use of factories
can and should be avoided in many cases where
other techniques, such as constructors or class clusters,

which builds upon:

Comparing API Design Choices with Usability Studies: A Case Study and Future Directions by Stylos, Steven Clarke, and Myers

There are more APIs than ever, and designing APIs that are usable by
their target audience is difficult. Work at Microsoft has demonstrated that running
controlled usability studies with participants from different personas and
analyzing the results of these studies using the cognitive dimensions framework
is effective at identifying and preventing usability problems in APIs. This paper
presents a generalization of that technique in the form of usability studies of
common API design choices. By studying a single design choice from multiple
perspectives, we can generalize our results to any API that might be affected by
the design choice. We show the feasibility of our approach with an initial study
on whether or not to require constructor parameters, and present our current
and planned work to study more API design choices.

See section 2.4 of that paper for the findings about constructor parameters; interestingly, the Create-Set-Call pattern they mention is exactly the pattern used by the WPF team internally for XAML Control Design: they refer to it as the Create-Set-Use pattern. What's weird is that this internal Microsoft pattern is basically not mentioned anywhere on MSDN. Create-Set-Use also blocks non-nullable reference types unless you are configuring all your objects with default values (e.g., through use of a Propagator), thus allowing invalid object states and preventing invariants that could be used, say, to optimize the program automagically.

I think these papers are very early research into API usability and as a consequence severely flawed, similar to Fagin's seminal study on code reviews. Hopefully, unlike Fagin's study, these studies will not stand for 30 years before someone says, "Hey, this experiment was severely flawed and has a ton of implicit assumptions".

For example, one scenario not covered by Stylos in the original paper was the idea you mentioned: thinking about providing the client user with a library of 'strategies'. After all, a default parameter is really just in essence a default strategy that exists merely so that the client API user can compile the program with minimal fuss. It therefore makes more sense to rethink this study, asking, Should default parameters be tightly coupled to interfaces (e.g., should MSIL stored default parameters directly)? Or should we use some form of external configuration (similar to the Config and Configuration modules in Maude) instead, forming a correspondance between a module instantiation's values and another module's signatures?

Another thing, realizing these are global choices, makes us wonder why we are delegating these global choices to the API designers and trusting their out-of-the-box recommendations. e.g. Windows Communication Foundation ships with some odd default values that can hinder scalability of services, and you have to know all the parameters to tune the system, anyway.

There are other issues here I am not touching, such as thread-level safety, object-level safety, method-level safety and statement-level safety of destructive assignment for trustworthy dynamic reconfiguration.