LtU Forum

Extensible Term Language 0.2.1

The Extensible Term Language is a high level meta-syntax language that that allows to define small and big languages that use blocks, expressions, operators, and statements as primary meta-syntax elements. The language definition is compiled to LL(1) grammar afterward.

ETL tries to find a new balance between syntax generality and extensibility. It is designed to allow creating DSLs and programming languages almost as extensible as Lisp on the syntax level (but macros are supposed to be implemented as tree rewriting rules), while retaining nice surface syntax (this example tries to be as close to Java as possible and this one to be somewhat close to dynamic functional languages). The parser also supports automatic error recovery.

Java implementation of the parser is available for download. The documentation is also available online on the project's web site.

Since the previous announce there were mainly usability changes in the grammar definition language and now it is much more compact. There was a lot of bug fixes. And finally there is a tutorial that demonstrates how to implement own DSL on using the AST parser.

ETL might be a nice tool for quick implementation of own DSL with nice surface syntax and for creating new experimental programming languages.

Call for Speakers: DSL DevCon

Are you interested in presenting a 45-minute talk on some Domain Specific Language (DSL) related topic? It doesn't matter which platform or OS you're targeting. It also doesn't matter whether you're an author, a vendor, a professional speaker or a developer in the trenches (in fact, I tend to be biased toward the latter). We're after interesting and unique applications of DSL technology and if you're doing good work in that area, then I need you to send me a session topic and 2-4 sentence abstract along with a little bit about yourself.

I'll be taking submissions 'til February 9th, 2009, but don't delay. Passion and a burning story to tell count twice as much as anything else.

And don't be shy about spreading this announcement around! I've got good coverage in the .NET and Windows communities, but don't know very many folks in the Java or Unix or hardcore modeling worlds, so if you're in that world, let those guys know! Thanks.

The DSL DevCon itself will be in Redmond, WA on the Microsoft campus April 16-17, 2009, right after the Lang.NET conference. Lang.NET will be focused on general-purpose languages, whereas the DSL DevCon will focus on domain-specific languages. The idea is that if you want to attend one or the other or both, that's totally fine. We'll have 2.5 days of Lang.NET on April 14-16 and then 1.5 days of DSL DevCon content. Oh, and the cost for both conferences is the same: $0.

We're only accepting 150 attendees to either conference. Every one of the five previous DevCons have sold out, so when we open registration, you'll want to be quick about getting your name on the list. Submit your DSL-related talk idea!

Lambda Calculus Notation

I am diving into the world of lambda calculi by reading a few of the books that I've seen mentioned here on LtU. The books sensibly start with "classic lambda calculus". Even though each book uses a different terminology and notation, there seems to be a common inconsistency in the definition verses the use of "term" (a.k.a "expression", "lambda-term", "lambda-expression"). Being new to the formalism, I imagine there is a good chance the inconsistency is only perceived on my part.

In Chris Hankin's Lambda Calculi: A guide for computer scientists (1994), a "λ-term" is defined on page 8 by the following

  Definition 2.2 (λ-terms)
  The class Λ of λ-terms is the least class specifying the following
    (1) x ∈ Λ, x is a variable
    (2) if M ∈ Λ, then (λxM) ∈ Λ
    (3) if M, N ∈ Λ then (MN) ∈ Λ

Page 9 continues with more notation...

  We will generally use the symbol ≡ to denote 
  syntactic equality between terms.

It seems to me that the above is a relatively sloppy statement. The use of "generally" leaves me guessing later and "terms" was not defined as a synonym for "λ-terms". I understand this book is a relaxed introduction so I am not really nitpicking these points.

The inconsistency comes on page 15 when substitution is introduced.

  x[x := N] ≡ N

Definition 2.2 did not allow for square brackets indicating substitution to be part of a λ-term; however, from what I understand the ≡ symbol is to have a λ-term on each side.

To resolve this inconsistency it seems to me that Definition 2.2 needs to be extended.

    (4) if x, M, N ∈ Λ then M[x := N] ∈ Λ

I have used Hankin's book as my example above but the other books I am reading repeat this inconsistency: Revesz (1988), Hindley and Seldin (1986), Barendregt (1984).

Certainly something basic like this would have been noticed in the past if it was incorrect. What is it that I am missing here?

Macro systems

When I first learned Scheme I had heard about how powerful the macro system was, and that the full power of the language was available during the macro system.

What I think is interesting was an assumption I had before I learned how the macro system actually works. I had assumed that one could call functions during macro expansion that were defined in the source program.

I am trying to find out whether such macro systems actually exist, where the AST is evaluated as needed while it is being transformed. One simple approach I can imagine, is that when a macro uses a function defined in the source, that function is evaluated/compiled as needed. If it contains macros that haven't been expanded into something meaningful, then the compiler simply fails.

I think this kind of system where macros are intertwined with a program interpreter, may be hard to reason about formally, but could be a very expressive tool.

Has anyone encountered such a macro system?

Weird computability problem relating to state + lambda calculus

Hi all,
I have a weird question that I've been trying to solve.

I've been reading about lambda calculus, specifically the different augmentations of lambda calculus. The ones that most interest me are the ones with some sort of state, like a store (mu). So my question is:

Given a term, t, in the (untyped) lambda calculus (or in a typed calculus which features full abstraction), is it decidable whether or not it ever modifies the state (updates some value in the store, etc).

My gut says that it's not decidable for all terms, but I want to PROVE it.

-Kevin

Compilation/method resolution with structural subtyping

Nominal types, such as classes and interfaces, give us names to "hang" other names on - methods, data member names, nested types and classes, etc. Interfaces/mixins/traits provide some challenges vs. the well known single inheritance "v-table" of virtual methods, but I found an interesting IBM paper on efficient execution of interface defined methods.

So my question is: What about structural types? The structural types used for (among other things) the "static duck typing" that Scala most recently has brought to general attention?

What is a means for a) correct and b) efficient execution of method resolution, data member access, etc.?

To make it interesting, consider two examples.

-- One structurally typed argument
fun print-name(thing : { .name(-> Str) }) =
  print(thing.name());

-- Arg is List of structurally typed things
fun print-names(things : [ { .name(-> Str) } ]) =
   for x <- things in 
       print(x.name());

I considering the first case and thought that the compiler could pass a hidden function parameter, or some indexes that provide a route to a method in a table of the parameter.

But then I considered the second case, and my (relatively speedy) scheme fell apart. In the second case, any given item in the list could (1) be some object whose class defines a virtual or final .name method, noting that final methods don't typically show up in vtable/RTTI type data; (2) it might be an object that implements a .name method required by an interface; (3) it might be an object who gets an actual .name method definition from a mixin/trait; (4) or it could be some object itself instantiated (via new) directly from some unnamed structural type.

I don't see a way to distinguish this mess of cases until runtime, particularly in the second example of a list of structurally typed objects, and how to compile such code for efficient execution.

I tried CiteSeerX but didn't turn up any papers directly addressing approaches to analysis and compilation.

Any pointers? Thanks VERY much in advance.

Scott

p.s. (I'm very interested in this question because I think an answer might also help me to efficiently compile a set of general, functional record operators and record type operators described in a paper by Cardelli.)

Scott

Looking for papers describing advanced language topics in terms of C programming

I'm interested in increasing my understanding of some of the more advanced programming language constructs I see discussed here and in other places. Specifically the ones that are common in functional programming (first class everything, continuations, closures, etc.). Now the best way to understand them is to actually use them in a language that implements them. But that requires a good investment in learning those languages (Haskell, Lisp, etc).

What I'm looking for are some papers that describe the various advanced topics in terms that a regular C programmer, like myself, can understand. In other words, pretend that you are extending the C language -- how would the syntax look like for the various constructs?

Here's an example explanation for first class functions:
In C, you can take the name of a function, and pass it as a parameter to another function, like what is used in the standard library qsort(). You can also assign those function names to variables and array elements (which is useful for creating state machines). However, you have to create those functions somewhere else in the program. This can make it difficult to read the code and figure out what is going on. In the case of qsort(), which requires (among other things) a pointer to a comparison function, that function is probably only used once. So it would make sense to define the function right there where it is used. This would require first class anonymous functions. If C supported this, this, then the syntax would probably look like this:

qsort (my_array, 25, sizeof(*char), lambda(const void *p1, const void *p2) { return strcmp(*(char * const *) p1, *(char * const *) p2); });

In this example, the keyword "lambda" creates a function on the fly, and returns the address of that function as it's value. You can also use the expression:
x = lambda(int arg1, int arg2) { /* contents of function */ };
And you can now call the function via: "x(arg1, arg2);".

So, if C added the "lambda" keyword with the semantics given above, then it would have full first class anonymous functions.

...

As you can see, I think I've got an understanding of first class functions. What I'd like to see are write ups similar to the above explaining other concepts such as closures, continuations, monads, or anything else that could be expressed in an imperative language such as C. Or, if anything doesn't fit in that context, an explanation of that also.

A couple of postings that I've found that come close are Joe Spolsky's article "Can your programming language do this?" (which gives examples in Javascript), and the paper "Functional programming for the rest of us" on defmacro.org.

In case you are wondering why I'm looking for this, first of all I kind of understand some of the concepts, but would like to further cement my understanding. And secondly, I'd like something I can send to other people I know that are primarily C (or similar language) programmers (since I don't understand the concepts enough to explain them). Also, I'd like to extend the language I'm developing to support as much as is relevant (yes, I know, why write another language -- I'm doing it mostly for my own education, and possibly to use as a resource for others to learn from).

Thanks.

CWE/SANS TOP 25 Most Dangerous Programming Errors

This article is making the rounds on the intarwebs.

"There appears to be broad agreement on the programming errors," says SANS Director, Mason Brown, "Now it is time to fix them. First we need to make sure every programmer knows how to write code that is free of the Top 25 errors, and then we need to make sure every programming team has processes in place to find, fix, or avoid these problems and has the tools needed to verify their code is as free of these errors as automated tools can verify."

Introducing Dawn - yet another new language

During my entire programming career (19 years) I have been interested in programming languages, how to improve and implement them. I am not formally trained in compiler theory, but have implemented a number of compilers/interpreters for a number of domain specific languages during this time. Now I am pregnant to bursting with a new language, which I need to discuss/present to someone, and I hope that LtU is the correct forum to do so (I can see from the posts that a lot of highly gifted people are lurking, so I hope to get thoughtful feedback).

The language is currently under development, so it is not fully specified. My hope was to submit papers to OOPSLA in march with a working prototype, alas it does not look like I can make it happen. Still I cannot go on for another year without releasing something about Dawn. Enough said about me and my motivations...

I will post a couple of times under this topic, with the various aspects of the language, to discuss it with the distinguished members of LtU.

History of Python

Guido van Rossum is recounting the history of Python as a series of blogposts. It's off to a good start.
http://python-history.blogspot.com/

XML feed