## Sapir: Language, An Introduction to the Study of Speech

Thanks to the amazing Project Gutenberg, Edward Sapir's classic book on language is now available online.

True, it isn't about programming languages per se, but it is still an important work that some of you may want to check out.

## Zing (MSR)

Zing is a new software model checking project at Microsoft Research.

Software model checking is hard, but it is still a promising line of research.

Tools are generally influenced by the type of languages they model. On top of that, I am sure software model checking is going to influence language design and implementation.

## In the Spirit of C

(via Keith Devens)

In the Spirit of C, by Greg Colvin.

A somewhat biased and overenthusiastic overview of the evolution of C and its ilk.

I am sure LtU readers will find a lot they disagree with. I suggest starting with the quote from the ANSI C Rationale...

## Tunes create context like language

This article discusses the extension of the notion of context from linguistics to the domain of music. In language, the statistical regularity known as Zipf's law (which concerns the frequency of usage of different words) has been quantitatively related to the process of text generation. This connection is established by Simon's model, on the basis of a few assumptions regarding the accompanying creation of context. Here, it is shown that the statistics of note usage in musical compositions are compatible with the predictions of Simon's model. This result, which gives objective support to the conceptual likeness of context in language and music, is obtained through automatic analysis of the digital versions of several compositions. As a by-product, a quantitative measure of context definiteness is introduced and used to compare tonal and atonal works.
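For readers who have not met Simon's model, here is a minimal sketch of the generative process as it is usually described: with some small probability a brand-new token (word or note) is introduced, otherwise an earlier token is copied, so frequent tokens tend to become more frequent. The function names, the parameter `alpha`, and the chosen values are my own assumptions for illustration; this is not the code or the parameter fitting used in the article.

```python
import random
from collections import Counter


def simon_text(length, alpha, seed=0):
    """Generate a token sequence with a basic form of Simon's model.

    At each step a brand-new token is introduced with probability `alpha`;
    otherwise a token is copied from a uniformly random earlier position,
    so tokens are reused in proportion to how often they already occur.
    """
    rng = random.Random(seed)
    text = [0]        # start from a single token, labelled 0
    next_token = 1    # label to use for the next new token
    while len(text) < length:
        if rng.random() < alpha:
            text.append(next_token)
            next_token += 1
        else:
            text.append(rng.choice(text))
    return text


def rank_frequency(text):
    """Return token counts sorted from most to least frequent."""
    return [count for _, count in Counter(text).most_common()]


if __name__ == "__main__":
    freqs = rank_frequency(simon_text(length=20000, alpha=0.05))
    # Print the count at a few ranks to eyeball the heavy-tailed shape.
    for rank in (1, 10, 100):
        if rank <= len(freqs):
            print(rank, freqs[rank - 1])
```

Plotting the counts against rank on log-log axes should give a roughly straight line for small `alpha`, which is the Zipf-like behaviour the abstract refers to.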

Related Nature article.

From Gyan on Metafilter.

## Shorts

A couple of short items.

• ICLP'04 early registration is open. The accepted papers look interesting.
• Jeremy Zawodny wonders whether the Perl community is broken, and asks about the communities that formed around other scripting languages.

## Code Generation Network

A nice site dedicated to all things related to code generation.

This site includes a detailed list of code generators for various languages and platforms.

## Constraint-Based Type Inference for Guarded Algebraic Data Types

Constraint-Based Type Inference for Guarded Algebraic Data Types. Vincent Simonet and Francois Pottier.

Guarded algebraic data types subsume the concepts known in the literature as indexed types, guarded recursive datatype constructors, and first-class phantom types, and are closely related to inductive types. They have the distinguishing feature that, when typechecking a function defined by cases, every branch may be checked under different typing assumptions. This mechanism allows exploiting the presence of dynamic tests in the code to produce extra static type information.

We propose an extension of the constraint-based type system HM(X) with deep pattern matching, guarded algebraic data types, and polymorphic recursion. We prove that the type system is sound and that, provided recursive function definitions carry a type annotation, type inference may be reduced to constraint solving. Then, because solving arbitrary constraints is expensive, we further restrict the form of type annotations and prove that this allows producing so-called tractable constraints. Last, in the specific setting of equality, we explain how to solve tractable constraints.

To the best of our knowledge, this is the first generic and comprehensive account of type inference in the presence of guarded algebraic data types.

Seems rather interesting.

## Interactive Programming

By way of Joe Marshall in comp.lang.lisp:

Here's an anecdote I heard once about Minsky.  He was showing a
student how to use ITS to write a program.  ITS was an unusual
operating system in that the 'shell' was the DDT debugger.  You ran
programs by loading them into memory and jumping to the entry point.
But you can also just start writing assembly code directly into memory
from the DDT prompt.  Minsky started with the null program.
Obviously, it needs an entry point, so he defined a label for that.
He then told the debugger to jump to that label.  This immediately
raised an error of there being no code at the jump target.  So he
wrote a few lines of code and restarted the jump instruction.  This
time it succeeded and the first few instructions were executed.  When
the debugger again halted, he looked at the register contents and
wrote a few more lines.  Again proceeding from where he left off he
watched the program run the few more instructions.  He developed the
entire program by 'debugging' the null program.


## Light-Weight Instrumentation From Relational Queries Over Program Traces

Light-Weight Instrumentation From Relational Queries Over Program Traces. Simon Goldsmith, Robert O'Callahan and Alex Aiken.

Neel mentioned this paper when I complained on LtU1 that I hadn't seen anything interesting enough to post for a while.

It is indeed an interesting, well-written paper.

The authors show the design of Program Trace Query Language (PTQL), a declarative language for querying program behaviour. The PTQL compiler can then be used to instrument Java bytecode and insert the required data capture operations.

Quite a lot of effort went into making the instrumented code efficient enough to allow queries to be evaluated while the program is running.
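To give a feel for what "relational queries over program traces" means, here is a rough sketch in Python rather than Java. It is not PTQL syntax and not the paper's implementation: the `calls` table, the `traced` decorator, and the `open_stream`/`close_stream` functions are all hypothetical names I made up. It also logs every call and queries afterwards, whereas the paper compiles the query into targeted instrumentation.

```python
import functools
import sqlite3

# Hypothetical trace store: one row per completed call.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE calls ("
    "id INTEGER PRIMARY KEY, name TEXT, arg TEXT, "
    "start_seq INTEGER, end_seq INTEGER)"
)

_clock = 0  # logical clock used to order trace events


def _tick():
    global _clock
    _clock += 1
    return _clock


def traced(func):
    """Record each call to a one-argument function as a row in `calls`."""
    @functools.wraps(func)
    def wrapper(arg):
        start = _tick()
        result = func(arg)
        db.execute(
            "INSERT INTO calls (name, arg, start_seq, end_seq) "
            "VALUES (?, ?, ?, ?)",
            (func.__name__, arg, start, _tick()),
        )
        return result
    return wrapper


@traced
def open_stream(name):
    return name


@traced
def close_stream(name):
    return None


def unclosed_streams():
    """Names passed to open_stream with no later close_stream call."""
    rows = db.execute(
        "SELECT o.arg FROM calls o "
        "WHERE o.name = 'open_stream' AND NOT EXISTS ("
        "  SELECT 1 FROM calls c "
        "  WHERE c.name = 'close_stream' AND c.arg = o.arg "
        "  AND c.start_seq > o.end_seq) "
        "ORDER BY o.start_seq"
    )
    return [row[0] for row in rows]


open_stream("a.txt")
open_stream("b.txt")
close_stream("a.txt")
print(unclosed_streams())  # prints ['b.txt']
```

The point of the declarative form is that the question ("which streams were opened but never closed?") is stated once as a query, and deciding what to record is left to the tool.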

## Tim Bray: Languages Cost

Tim Bray writes about custom document schemas:

HTML isn't unusual. Documents are hard to design, and general frameworks for families of documents are even harder. The conventional wisdom back in the day was that to get yourself a good DTD designed, you were looking at several tens of thousands of dollars.

Then, once you've got your language designed, you start the hard work on the software. Frameworks like XSLT help, but no significant language comes without a significant cost in software design.

As I've often said here ("here" in the general sense, that is), XML vocabulary design is language design. Language design is hard. Hard things often cost.

However, Tim wants us to believe that one language is enough. I really hope he is wrong about that...