LtU Forum

[ANN] YARD 1.0: C++ Template Metaprogramming Parsing Framework

Three years after the initial announcement here at LtU of the YARD parsing framework, I've finally posted version 1.0 of YARD. I figure despite being self-promotional (I figure it is a forum quality, if not quite meritorious of front-page status), this post should still be relevant to LtU readers because of its overlap with other LtU topics:

Of course many readers of LtU are language implementers and may find a new parsing framework useful or inspirational.

Below is the official announcement of YARD:

Version 1.0 of the YARD parsing framework for C++ is now available for download at http://sourceforge.net/project/showfiles.php?group_id=126822. The YARD project now also has a new tutorial, written with the help of Max Lybbert, that provides an introduction to language parsing and parsing expression grammars http://yard-parser.sourceforge.net/cgi-bin/index.cgi?dest=Documents&doc=tutorial.

The YARD framework uses a novel template meta-programming technique to construct efficient recursive-descent parsers at compile-time. YARD parsers are constructed as parsing expression grammars expressed using templates in a form resembling EBNF. Parsers constructed using YARD combine lexing and parsing phases, and can automatically generate abstract syntax trees, without requiring a separate code-generation phase.

The YARD framework has been under development for three years and has spawned other related projects (e.g. the Biscuit parsing library, http://p-stade.sourceforge.net/biscuit/index.html ). YARD has been used in commercial tools (e.g. http://www.tmg.de/admin.local/lib/antenna/linux/ReadMe.txt) and various open-source projects (e.g. http://www.cat-language.com).

The YARD library is public domain ( http://creativecommons.org/licenses/publicdomain/ ) but for those who require a release with a specific open-source license, can request one on the discussion group ( http://sourceforge.net/forum/forum.php?forum_id=432769 ).

Simply Easy! (An Implementation of a Dependently Typed Lambda Calculus)

Simply Easy! (An Implementation of a Dependently Typed Lambda Calculus)
Andres Löh, Conor McBride and Wouter Swierstra

We present an implementation in Haskell of a dependently-typed lambda calculus that can be used as the core of a programming language. We show that a dependently-typed lambda calculus is no more difficult to implement than other typed lambda calculi. In fact, our implementation is almost as easy as an implementation of the simply typed lambda calculus, which we emphasize by discussing the modifications necessary to go from one to the other. We explain how to add data types and write simple programs in the core language, and discuss the steps necessary to build a full-fledged programming language on top of our simple core.

The PIllars of Concurrency

Herb Sutter, The Pillars of Concurrency, Dr. Dobb's Journal, 2007-07-02

In his inaugural column, Herb makes the case that we must build a consistent mental model before talking about concurrency.

When I first read the abstract, I assumed that Sutter meant a mental model along the lines of the pi-calculus. But it turns out what he was talking about has more to do with outlining the fundamental things for which concurrency is used, what requirements those things imply, and what kind of solutions can be used to meet the requirements. Nothing startlingly new, although the overview Sutter provides is interesting.

From a language design perspective, it might be interesting to consider which of the "pillars" that Sutter identifies are supported by a given language, how well, and why the language developed that way. Many of the arguments about the "best" way to handle concurrency may simply boil down to differences in the application domains being considered.

Also interesting to note is that much of this column was inspired by work by David Callahan on concurrency support for Visual Studio, so the things that Sutter discusses provide some insight into what Microsoft is thinking about for future development products.

Why Events Are A Bad Idea (for high-concurrency servers)

Why Events Are A Bad Idea (for high-concurrency servers)
Rob von Behren, Jeremy Condit and Eric Brewer, 2003

Event-based programming has been highly touted in recent years as the best way to write highly concurrent applications. Having worked on several of these systems, we now believe this approach to be a mistake. Specifically, we believe that threads can achieve all of the strengths of events, including support for high concurrency, low overhead, and a simple concurrency model. Moreover, we argue that threads allow a simpler and more natural programming style.

The authors take the example of a web server to show that with compiler support for improved threads, they can achieve as good or better performance than event-based applications in high concurrency context.

Extensible Languages

I'm interested in some thoughts on issues related to syntactic extensions to programming languages. As mentioned Felix now has DSSL (Domain Specific Sublanguage) support. The Felix grammar is now loaded dynamically from the library at run time. This looks like:

  syntax dssl1 {
    nt := sym1 sym2 ..  =># "scheme code";
    ...
  }
  ..
  open syntax dssl1, dssl2;

Here every name used is taken as a nonterminal symbol. There is no error checking. A new GLR automaton is built where the open syntax statement is issued, and applied until the end of the containing scope; more precisely, the scope is determined directly by the grammar itself: state modifications to the parser effected in a reduction propagate to the end of the production parsing that nonterminal.

Attempting to parse with an undefined nonterminal raises an exception.

Several DSSLs, including the current set, may already define nonterminals required for some other DSSL. There is no way to specify such dependencies. The same DSSL can be loaded twice. No attempt is made to stop that, the grammar is just extended again. My syntax doesn't allow for handling merges: the parser engine takes the most recent rule and uses that, so an ambiguous parse of a nonterminal will be silently repaired.

Clearly, we'd like DSSLs to be modular and well typed, with the usual kinds of properties we expect from libraries and module constructs. How would you suggest to fix these problems to achieve those goals?

Whilst this question is aimed at programmers, I have one for the category theorists too. In a language like Ocaml, an interesting technique termed open recursion can be used to extend types and functions covariantly an in a modular fashion. An otherwise recursive construction is made non-recursive by replacing inner calls with a parameter to form an open version of it, and then a second construction defines the desired closure by applying the symbol to itself (yes I know that's a crude description). The parameter also allows covariant extension by composing several such open constructions, and then closing them with respect to the composition.

To me there is an obvious and unexplored extension to this technique to grammars. We make a production non-recursive by replacing a recursion by a parameter, so now we have a new notion: a nonterminal parametrised by another nonterminal, and we form a closed term by applying the nonterminal to itself. If we add new productions to the nonterminal grammar, we can form closure of that too, to obtain a new language.

An extension of this idea is to parameterise a whole DSSL (grammar package) with other DSSLs, much like an Ocaml functor, and again form a closed language by a self-application.

It seems to me a nonterminal is really a just a type, and alternate productions are really constructors, so the analogy with open recursion for types is strong, and it would seem the user actions similarly would need to be open recursive.

Is there any theoretical work on this idea that suggests how a statically typed grammar with open recursion would be typechecked? For the modular extension? Is there even a model of it?

To me this kind of thing is better than what I am doing now, which effectively adds productions to an existing nonterminal, making the old version unavailable. Although this is dynamically covariant, it is not pleasing because it is a mutation rather than being purely functional and combinatorial.

Cilk, OpenMP, or what?

For SMP programming at the C level, Cilk's model is sweet and clean. The OpenMP model is "industrial-strength", but complex and error-prone.

Anyone have experience with both? What, if anything, can be done (in parallel) in OpenMP but not in Cilk?

Cilk seems not to have have had much recent development (indeed, the link to the ICFP Programming contest they won is now dead). OpenMP on the other hand is now part of GCC. Is this a case of "doomed to success"?

Josh

The Evolution Of LINQ And Its Impact On The Design Of C#

MSDN Magazine, June 2007

"The language design team now had several prototypes to get feedback on. So we organized a usability study with many participants who had experience with both C# and SQL. The feedback was almost universally positive, but it was clear there was something missing. In particular, it was difficult for the developers to apply their knowledge of SQL because the syntax we thought was ideal didn’t map very well to their domain expertise."

Tail call experiment

This is probably not a new idea, it was one of those things I just felt like working through myself, but it ended up taking way too much time.

Two stacks for tail calls gives a different approach to tail call optimization than the ones I was able to find, and my hunch is that's because it doesn't improve that much. But I still think it's a neat idea, maybe someone else will too.

Question regarding relationship of propositional logic to category theory

I am by no means deeply educated in regards to category theory, but I wanted to see if my thinking is on the right track.

Can I construct a category consisting of propositions as its objects and inference rules as structure preserving morphisms between these objects? Does this category in some way offer an operational semantics for a top level predication that asserts this complex of propositions?

Asynchronous calls and error handling

I would like to hear your experience and opinion about error handling as related to asynchronous messaging.

If I read correctly, Erlang uses a separate thread for error handling, which supports custom handlers. I find it difficult to see how such a scheme can be used for error recovery in concurrent programs. Does the handler match the sender of the message that caused an error? How does that sender put things together in a coherent state after its handler is called?

I am thinking of a completely different approach for dodo which uses the traditional exception handling mechanism.

My idea is to delay the error reporting until the caller stops to receive a reply from the callee. The main problem with that is that the caller may never wait for a reply, and the error needs to be cleared before another message is sent (new session). Effectively the context changed and the previous error is no longer relevant.

To address that reuse of the same channel for a new session should be discouraged. That looks like a problem that does not need to exist, so I am hoping you can help me here.

Discussions I think may be relevant on LtU:
Error handling strategies
Erlang concurrency: why asynchronious messages?

XML feed