
In Praise of Scripting: Real Programming Pragmatism

Ronald Loui, In Praise of Scripting: Real Programming Pragmatism, IEEE Computer, vol. 41, no. 7, July 2008. [Openly accessible draft here]

The July IEEE Computer carries an article arguing for the use of scripting languages as first programming languages, and also arguing for a greater study of what the author calls "language pragmatics" (the original article is behind the IEEE paywall, but you can find a draft that has roughly the same content here). The argument for using scripting languages as educational languages can be summed up by Loui's abstract:

The author recommends that scripting, not Java, be taught first, asserting that students should learn to love their own possibilities before they learn to loathe other people's restrictions.

The bulk of the article is devoted to exploring this basic theme in more depth, and provides an interesting contrast to the arguments in favor of moving away from Java (and scripting languages) advanced in Computer Science Education: Where Are the Software Engineers of Tomorrow? (discussed earlier on LtU here).

Loui spends the latter part of the article arguing that, in addition to syntax and semantics, research on programming languages should include a formal study of language pragmatics. According to Loui, a formal study of pragmatics would address questions such as:

  • What is the average lifetime of a program written in language X for programmers of type Y, for a program of type Z?
  • What is the average time spent authoring versus debugging a program in language X for programmers of type Y, for a program of type Z?
  • What is the consumption of short-term memory when programming in language X for programmers of type Y, for a program of type Z?

Features of Common Lisp

A compelling description of the features that make CL the king of the Perl-Python-Ruby-PHP-Tcl-Lisp language ;)

Lisp is often promoted as a language preferable over others because it has certain features that are unique, well-integrated, or otherwise useful.

What follows is an attempt to highlight a selection of these features of standard Common Lisp, concisely, with appropriate illustrations.

Languages without operator precedence

(If this has already been discussed here, and I assume it probably has although I haven't been able to find such a discussion by search, please be so kind as to point me in the right direction.)

It seems like most languages come up against the operator precedence problem and take one of these two choices:

1. Make an operator precedence table. (C and many, many others)

pro: assigns a definite meaning to 2 + 3 * 7, namely 23 (although if you really like 35 or dislike the normal rules, you can have that interpretation too)
con: readers have to know the precedence table to mentally arrive at the same parse tree that the parser does

2. Forget the whole thing, and always make left-deep (or right-deep) ASTs. (Lisp, APL, many others)

pro: easy to remember
con: although this is much easier to remember, you still need to know that this is the rule when you come across it for the first time
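
To make option 2 concrete, here is a minimal Python sketch of the two "no precedence" readings of a flat expression. The function names and the token-list representation are assumptions made for illustration; they are not taken from any particular language implementation.

```python
import operator

# Binary operators supported by this sketch.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def eval_left_deep(tokens):
    """Evaluate [n, op, n, op, n, ...] strictly left to right."""
    result = tokens[0]
    for i in range(1, len(tokens), 2):
        result = OPS[tokens[i]](result, tokens[i + 1])
    return result

def eval_right_deep(tokens):
    """Evaluate [n, op, n, op, n, ...] strictly right to left."""
    if len(tokens) == 1:
        return tokens[0]
    return OPS[tokens[1]](tokens[0], eval_right_deep(tokens[2:]))

expr = [2, "+", 3, "*", 7]
print(eval_left_deep(expr))   # 35, i.e. (2 + 3) * 7
print(eval_right_deep(expr))  # 23, i.e. 2 + (3 * 7)
```

Either rule is easy to state, but the same text yields different values under each, which is why a reader still has to know which rule is in force.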

I'm wondering about the (less-explored?) third option, and would appreciate information about languages that have taken it, or papers exploring it, or just people's thoughts:

3. Forget the whole thing, and _always require explicit grouping_ (at least when properties of the operators in question can't be used to arrive at a "don't care" conclusion).

pro: easy to remember, and easy to mentally parse even if you are unfamiliar with the rules
pro: enforces the (common?) practice of using grouping to communicate intent
con: you have to write either 2 + (3 * 7) or (2 + 3) * 7 (although potentially a clever editor could offer you a choice of candidate interpretations when what you write is ambiguous)

Note that we can use the associativity of + to allow expressions like 2 + 3 + 7 because we are indifferent between (2 + 3) + 7 and 2 + (3 + 7). Similar observations apply to a number of other common operators.

Note also that this doesn't completely eliminate operator precedence, since we still need ( ) to bind more tightly than +, for example. This looks like a matter of degree more than an iron-clad rule, so I'm hoping to avoid stepping in a bees' nest here :). Some might argue to relax the rules even further and allow tight bindings for unary negation, which I think makes good sense, since -x + 2 is unlikely in practice to be confused with -(x + 2).

It's possible to write an LALR (and possibly simpler?) grammar that deals with all of this. I'm still trying to wrap my head around techniques for generating sensible errors from parsing failures, so I'm not sure of the extent to which it is possible to give intelligible error messages.
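
As a rough illustration of option 3, here is a small Python sketch. It uses a hand-written recursive-descent parser rather than an LALR grammar, and it covers only integers, the four arithmetic operators, parentheses, and tightly binding unary negation. All names (`GroupingError`, `parse`, `ASSOCIATIVE`) are hypothetical, and the error messages are only one guess at what "intelligible" might look like.

```python
import re

ASSOCIATIVE = {"+", "*"}        # a chain of one of these needs no grouping
BINARY = {"+", "-", "*", "/"}

class GroupingError(SyntaxError):
    """Raised when an expression relies on precedence the grammar lacks."""

def tokenize(text):
    tokens = re.findall(r"\d+|[-+*/()]", text)
    if "".join(tokens) != "".join(text.split()):
        raise GroupingError(f"unexpected character in {text!r}")
    return tokens

def parse(text):
    """Return a nested-tuple AST, or raise GroupingError."""
    tokens = tokenize(text)
    tree, pos = parse_expr(tokens, 0)
    if pos != len(tokens):
        raise GroupingError(f"unexpected token {tokens[pos]!r}")
    return tree

def parse_expr(tokens, pos):
    tree, pos = parse_operand(tokens, pos)
    first_op = None
    while pos < len(tokens) and tokens[pos] in BINARY:
        op = tokens[pos]
        if first_op is None:
            first_op = op
        elif op != first_op or op not in ASSOCIATIVE:
            # Mixed operators, or a repeated non-associative one.
            raise GroupingError(
                f"cannot chain {first_op!r} with {op!r} without parentheses")
        right, pos = parse_operand(tokens, pos + 1)
        tree = (op, tree, right)
    return tree, pos

def parse_operand(tokens, pos):
    if pos >= len(tokens):
        raise GroupingError("unexpected end of input")
    tok = tokens[pos]
    if tok == "-":                       # unary negation binds tightly
        inner, pos = parse_operand(tokens, pos + 1)
        return ("neg", inner), pos
    if tok == "(":
        tree, pos = parse_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise GroupingError("missing ')'")
        return tree, pos + 1
    if tok.isdigit():
        return int(tok), pos + 1
    raise GroupingError(f"unexpected token {tok!r}")

print(parse("2 + (3 * 7)"))   # ('+', 2, ('*', 3, 7))
print(parse("2 + 3 + 7"))     # ('+', ('+', 2, 3), 7)
```

In this sketch, `parse("2 + 3 * 7")` raises `GroupingError` instead of choosing an interpretation, and so does `parse("10 - 4 - 3")`, because subtraction is not associative. The rule is enforced in one place, inside the loop in `parse_expr`, which is also the natural place to attach a more helpful message.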

Is this idea hopeless? Misguided? Already in 17 languages I should have heard of? All of the above?

Error Messages in Dynamically Typed Languages

From wikipedia:

[Static typing] allows many errors to be caught early in the development cycle.

Statically typed languages have another, maybe even more important, advantage: clearer error messages.

Can you spot the error in this code:

def print_all(books):
  for book in books:
    print_book(books)

In a dynamically typed language you might get an error like "Undefined method `title` for Array in file print_book.xyz on line 2", if print_book is implemented like this:

def print_book(book):
  print(book.title)
  print('----------')
  print(book.summary)

A statically typed language would give you a more helpful error message: "print_book wants a Book but you passed an Array in file print_all.xyz on line 3".

Now suppose that you fixed print_all() but made a mistake in print_book. For example, books don't have summaries but descriptions. So the correct code is:

def print_book(book):
  print(book.title)
  print('----------')
  print(book.description)

In this case the error messages will be similar for statically and dynamically typed languages: "Undefined method `summary` for Book in print_book.xyz on line 4".

So the statically typed language 'knows' where the error is. In the first case it's print_all's fault: it passed the wrong value. In the second case it's print_book's fault. Dynamically typed languages don't know where the error is. They usually give you a stack trace and some information on what went wrong in the deepest call in the trace. This is annoying, especially if you are calling library functions not written by you.
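
The difference in blame can be reproduced in Python with a small sketch. The `Book` class, the `print_all_buggy` name, and the `print_book_checked` wrapper are assumptions made for illustration. The wrapper is just one possible way to move the report to the function boundary at run time; it is not a claim about how any particular language or tool behaves.

```python
from dataclasses import dataclass

@dataclass
class Book:
    title: str
    description: str

def print_book(book):
    print(book.title)
    print('----------')
    print(book.description)

def print_all_buggy(books):
    for book in books:
        print_book(books)   # the bug from the text: passes the whole list

def print_book_checked(book):
    # Hypothetical boundary check: reject a wrong argument on entry, so the
    # message describes the caller's mistake rather than a missing attribute.
    if not isinstance(book, Book):
        raise TypeError(
            f"print_book wants a Book but you passed a {type(book).__name__}")
    print_book(book)

books = [Book("Example title", "Example description")]

try:
    print_all_buggy(books)
except AttributeError as err:
    # Raised inside print_book, although the mistake is in print_all_buggy.
    print("unchecked:", err)

try:
    print_book_checked(books)
except TypeError as err:
    print("checked:", err)
```

The unchecked call fails deep inside print_book with a complaint about a missing attribute, while the checked call fails on entry with a message about the argument. The check still runs only when the code is executed, so unlike a static type checker it reports nothing until the faulty call actually happens.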

What I would like to know is

- Are there cases where statically typed languages are wrong about the location of the error? Does this happen in practice?
- Is it possible to guess where the error is in dynamically typed languages, and how?