generating interpreters, IDEs, etc., from simple specifications?

I've been wondering about this for a while, and Google and the LtU archives didn't bring up anything:
Is it possible to generate an interpreter and an IDE for a language by providing a simple BNF-type specification?

My understanding is that BNF (or EBNF), as it is usually used by parser generators, is not strong enough for this task because:
1. The specification has to deviate from a simple, natural rendering of the language due to problems such as left recursion.
2. A CFG, by definition, can't do context-dependent analysis and knows nothing about typing rules; in other words, it says nothing about the semantics of a language.

Regarding the first problem, I found the following from CTM (p. 641, last paragraph) interesting:
"Relational programs can be written to parse highly ambiguous grammars. This is one of the most flexible ways to do parsing. It can parse grammars with absolutely no restriction on the form of the grammar. The disadvantage is that if the grammar is highly ambiguous, the parser can be extremely slow."

Doesn't this mean that we could describe a general grammar which is good not only for parsing, but also for extracting a simple, clean AST (let's say we accept the slow speed for the sake of building an easy language-prototyping system)?
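
To make that concrete, here is a rough sketch (in Haskell rather than CTM's relational Oz, purely for illustration; the toy grammar and all names are made up): a parser is treated as a relation from input to every possible (result, remaining-input) pair, so an ambiguous grammar simply yields several parse trees. One caveat: unlike the fully general search CTM describes, this naive version would still loop on left recursion, so the toy grammar avoids it.

    import Control.Applicative (Alternative (..))

    -- A parser as a relation: input -> all (result, rest-of-input) pairs.
    newtype Parser a = Parser { runParser :: String -> [(a, String)] }

    instance Functor Parser where
      fmap f (Parser p) = Parser (\s -> [ (f a, r) | (a, r) <- p s ])

    instance Applicative Parser where
      pure a = Parser (\s -> [(a, s)])
      Parser pf <*> Parser pa =
        Parser (\s -> [ (f a, r') | (f, r) <- pf s, (a, r') <- pa r ])

    instance Alternative Parser where
      empty = Parser (const [])
      Parser p <|> Parser q = Parser (\s -> p s ++ q s)  -- keep every alternative

    char :: Char -> Parser Char
    char c = Parser p
      where p (x:xs) | x == c = [(c, xs)]
            p _               = []

    -- Ambiguous (but not left-recursive) toy grammar:
    --   S ::= A S | epsilon      A ::= 'a' | 'a' 'a'
    data Tree = Leaf String | Node Tree Tree deriving Show

    atom :: Parser Tree
    atom = (Leaf "aa" <$ char 'a' <* char 'a') <|> (Leaf "a" <$ char 'a')

    sentence :: Parser Tree
    sentence = (Node <$> atom <*> sentence) <|> pure (Leaf "")

    -- All complete parses; parses "aaa" yields three distinct trees.
    parses :: String -> [Tree]
    parses s = [ t | (t, "") <- runParser sentence s ]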

Regarding the semantics of a language: why couldn't we use some sort of operational semantics, the way they are described in Pierce's TAPL? (Actually, PVR has an SOS in CTM... lol :) )

In other words, given the following:
--A simple logic language with a context, well integrated with an EBNF syntax (an 'extended' EBNF),
--A way to tell this 'system' where to find existing source code files (so the generated IDE can parse those 'libraries' to provide code-completion)
--Typical OS interface routines (to print to screen, read from keyboard, read/write HD, etc.)
What else is required? Would this scale to non-[OO, procedural, functional] languages?

(I understand that many existing languages are not easy to define using TAPL type rules...perhaps a system such as this, with all its benefits, will encourage non-expert language creators to put more thought into semantics)

Prolog, etc

Regarding the semantics of a language: why couldn't we use some sort of operational semantics, the way they are described in Pierce's TAPL? (Actually, PVR has an SOS in CTM... lol :) )

Yes, you can generate interpreters directly from a natural deduction-style operational semantics using something like Prolog. You can parse the surface syntax the same way too--it's all just term rewriting.
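
As a minimal sketch of what that looks like (in Haskell rather than Prolog, using TAPL's untyped arithmetic language as the example): each small-step evaluation rule becomes one clause of a step function, and the interpreter is nothing more than repeated rewriting to a normal form.

    -- TAPL's untyped arithmetic expressions.
    data Term
      = TTrue | TFalse | TIf Term Term Term
      | TZero | TSucc Term | TPred Term | TIsZero Term
      deriving (Show, Eq)

    isNumericVal :: Term -> Bool
    isNumericVal TZero     = True
    isNumericVal (TSucc t) = isNumericVal t
    isNumericVal _         = False

    -- One clause per inference rule of the small-step semantics.
    step :: Term -> Maybe Term
    step (TIf TTrue  t _) = Just t                          -- E-IfTrue
    step (TIf TFalse _ t) = Just t                          -- E-IfFalse
    step (TIf c t e)      = (\c' -> TIf c' t e) <$> step c  -- E-If
    step (TSucc t)        = TSucc <$> step t                -- E-Succ
    step (TPred TZero)    = Just TZero                      -- E-PredZero
    step (TPred (TSucc n))
      | isNumericVal n    = Just n                          -- E-PredSucc
    step (TPred t)        = TPred <$> step t                -- E-Pred
    step (TIsZero TZero)  = Just TTrue                      -- E-IsZeroZero
    step (TIsZero (TSucc n))
      | isNumericVal n    = Just TFalse                     -- E-IsZeroSucc
    step (TIsZero t)      = TIsZero <$> step t              -- E-IsZero
    step _                = Nothing                         -- normal form

    -- The interpreter is the reflexive-transitive closure of step.
    eval :: Term -> Term
    eval t = maybe t eval (step t)

In Prolog the same rules would typically be Horn clauses over a step/2 predicate, with the surface syntax parsed by a DCG in the same program.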

An alternative is a denotational semantics, which is pretty much a directly executable interpreter written in a purely functional style.
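
A toy illustration of that view (a sketch only, with all names invented here): the meaning function maps each syntactic form to a value in a semantic domain, here ordinary Haskell values, and that function is itself the interpreter.

    import qualified Data.Map as M

    data Expr = Lit Integer | Var String | Add Expr Expr
              | Lam String Expr | App Expr Expr

    data Val = VInt Integer | VFun (Val -> Val)

    type Env = M.Map String Val

    -- [[e]] : Env -> Val, one equation per syntactic form.
    denote :: Expr -> Env -> Val
    denote (Lit n)   _   = VInt n
    denote (Var x)   env = env M.! x
    denote (Add a b) env = case (denote a env, denote b env) of
                             (VInt m, VInt n) -> VInt (m + n)
                             _                -> error "type error in Add"
    denote (Lam x e) env = VFun (\v -> denote e (M.insert x v env))
    denote (App f a) env = case denote f env of
                             VFun g -> g (denote a env)
                             _      -> error "type error in App"

    -- e.g. denote (App (Lam "x" (Add (Var "x") (Lit 1))) (Lit 41)) M.empty
    --      evaluates to VInt 42.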

I understand that many existing languages are not easy to define using TAPL type rules

I assume you mean the natural deduction approach to specifying an operational semantics. That actually works rather well for most languages, including imperative ones.

Follow-up question

Naturally, the question arises, why isn't it mainstream then? Is it simply because of a disconnect between researchers and industry practitioners (and hobbyists) or is there a more fundamental reason?

Verbosity

Part of the problem is that precise specifications in these kinds of formalisms are often far more verbose than implementations (assuming a good implementation language). The "precise" part is important: a lot of the examples of operational semantics for languages you see around are not totally precise and unambiguous, leaving humans to interpret them and fill in the holes in the "obvious" way.

Some cases, like a type relation specified via natural deduction-style inference rules, are more concise than their implementation, but these tend to be the exception rather than the rule.
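
For what it's worth, here is a sketch of that kind of case (a checker for the simply typed lambda calculus, rule names as in TAPL), showing how directly the inference rules translate into an implementation; all names here are invented for the example.

    import qualified Data.Map as M

    data Ty   = TyBool | TyArr Ty Ty deriving (Show, Eq)
    data Term = Var String | Abs String Ty Term | App Term Term
              | TmTrue | TmFalse

    type Ctx = M.Map String Ty

    -- One clause per typing rule.
    typeOf :: Ctx -> Term -> Maybe Ty
    typeOf ctx (Var x)      = M.lookup x ctx                            -- T-Var
    typeOf ctx (Abs x ty t) = TyArr ty <$> typeOf (M.insert x ty ctx) t -- T-Abs
    typeOf ctx (App f a)    = do                                        -- T-App
      TyArr t1 t2 <- typeOf ctx f
      t1'         <- typeOf ctx a
      if t1 == t1' then Just t2 else Nothing
    typeOf _   TmTrue       = Just TyBool                               -- T-True
    typeOf _   TmFalse      = Just TyBool                               -- T-False

    -- e.g. typeOf M.empty (Abs "x" TyBool (Var "x")) == Just (TyArr TyBool TyBool)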

A few resources, maybe minus the BNF

I've seen a few things about getting nice IDE support for a DSL (good error reporting too, I think), probably with some help building the interpreter as well.

Some work from Microsoft under the name Intentional Programming has come up here before.

There's also the JetBrains Meta Programming System, which seems to have been mentioned only once on this site.

Meta Programming System

I read about MPS a while ago; it doesn't look like there have been any updates in over a year. From what I recall, MPS mixes syntax and semantics. I'm afraid I can't critique it more specifically since I don't have the necessary knowledge or experience. The system I envision would allow independent development of syntax, typing rules, etc.; in other words, something similar to TAPL, where Pierce develops a number of languages without ever defining a concrete syntax. I suppose the same would have to hold for someone who wants to experiment with different syntaxes while 'cut&pasting' the semantics.

LSE

You might want to take a look at DEC's Language Sensitive Editor.

Did you notice 3APL?

perhaps a system such as this, with all its benefits, will encourage non-expert language creators to put more thought into semantics

I wonder if you saw an earlier post on 3APL. The cognitive semantics the authors propose is constructed in a very flexible "triple" language system. The three languages are old, familiar languages implemented in Java: STRIPS rules, Prolog syntax rules, and an inference cycle. On this basic level 3APL is a language of predicates. Predicates are the starting point for semantics, and they are of course easily converted to XML, the current semantic favorite (i.e., the Semantic Web). What's not to like about this?

Check out TinkerType

Since you mentioned TaPL, you may be interested in TinkerType. TinkerType is the system used by Benjamin Pierce when writing Types and Programming Languages* to generate parsers, interpreters, and typecheckers for the languages presented in the book. You can even download the TinkerType system from his homepage and play around with it.

I find the approach fascinating, because the inference-rule-based way he defines the syntax, typing, and evaluation rules reminds me of the way one would model the semantics of the calculi in a categorical setting. I would love to see such a connection explored...

(*) At least, he alludes to a book draft in the linked paper, which I can only assume turned into TaPL.