Eliminating left recursion

Does anyone have a reference to a correct algorithm which eliminates left recursion from a grammar? The one in Wikipedia is incorrect. Lots of youtube videos and lecture notes Google finds are also incorrect. All make the same mistake, they forget that a production A = N A X is left recursive if N is nullable. Requiring the grammar to be null free would be absurd since the algorithm itself inserts epsilon productions.
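
To make the pitfall concrete, here is a minimal sketch (representation and names invented, not taken from any textbook) of the nullability computation a correct algorithm has to do before it can even detect the left recursion in A = N A X:

    (* Sketch only: a grammar is a list of productions, each a nonterminal
       together with its right-hand side (a list of symbols). *)
    type symbol = T of string | NT of string
    type production = string * symbol list

    (* Fixed point computation of the nullable nonterminals. *)
    let nullable (g : production list) : string -> bool =
      let null = Hashtbl.create 17 in
      let is_null = function T _ -> false | NT n -> Hashtbl.mem null n in
      let changed = ref true in
      while !changed do
        changed := false;
        List.iter
          (fun (lhs, rhs) ->
            if not (Hashtbl.mem null lhs) && List.for_all is_null rhs then begin
              Hashtbl.add null lhs ();
              changed := true
            end)
          g
      done;
      fun n -> Hashtbl.mem null n

    (* A production  A = N1 ... Nk A X  is left recursive when every symbol
       before the recursive occurrence of A is nullable, not only when A is
       literally the first symbol. *)
    let is_left_recursive (g : production list) ((lhs, rhs) : production) : bool =
      let null = nullable g in
      let rec scan = function
        | [] -> false
        | NT n :: _ when n = lhs -> true
        | NT n :: rest when null n -> scan rest
        | _ -> false
      in
      scan rhs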

I also wish to extend the requirement to an attribute grammar of the form usually used in practice, with an extra symbol kind for AST construction, etc, which cannot be eliminated and yet recognises the empty string.

BTW: I'm looking for the basic algorithm. Lost my copy of the Dragon Book ;(

Pfff..

A long time ago I always did this by hand, but I've forgotten the algorithm. Calculate the first set, was it? I'll google and come back to you.

BTW, why are you interested in attribute grammars? They're good theory to know, but they have very limited applicability.

First hit

My first hit on google gives this, which I glanced at. He seems to know what he is doing and gives an algorithm.

Ah. That's half the answer. I guess you can eliminate epsilon, and then apply the standard eliminate-left-recursion algorithm, and then (or earlier) calculate the first set to see if it's LL(1).

Dragon Book

For a moment you had me thinking I had also lost my book, but I found it. The algorithm in the dragon book also assumes epsilon productions are removed from the input, but notes that they may be present in the output. Hopefully this hint (or someone else's comment) is enough for you to figure it out, because none of this is in my working memory at the moment.

Don't

I think I understand your interest but my advice is: Don't.

Attribute grammars, as I stated, have very limited applicability in industry. Apart from free-flow grammars, which often have horrible complexity, most attribute grammars will be directed. I.e., some attributes are analytic (they flow from top to bottom; they are input), and some attributes are synthetic (they flow from bottom to top; they are output).

You can simply encode these kinds of algorithms in a functional style, as functions on the grammar/AST. Pass the AST and all analytic attributes as arguments; the synthetic attributes become the output. You can also include a rewritten AST in the result. And, after that, it is easy to generalize that pattern and define abstract rewrite functions which also handle the information flow around productions.
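
A minimal sketch of that pattern (the AST and attribute types here are invented for illustration): the analytic attribute comes in as an argument, and the synthetic attribute and a possibly rewritten tree come out as the result.

    (* Sketch of the directed-attributes-as-functions pattern described above. *)
    type ast =
      | Var of string
      | Lam of string * ast
      | App of ast * ast

    type env  = string list   (* analytic (inherited) attribute: bound names *)
    type info = string list   (* synthetic attribute: free variables         *)

    (* One pass: inherited environment and tree in, synthesized attribute and
       (possibly rewritten) tree out. *)
    let rec free_vars (env : env) (t : ast) : info * ast =
      match t with
      | Var x -> ((if List.mem x env then [] else [x]), t)
      | Lam (x, body) ->
          let fv, body' = free_vars (x :: env) body in
          (fv, Lam (x, body'))
      | App (f, a) ->
          let fv1, f' = free_vars env f in
          let fv2, a' = free_vars env a in
          (fv1 @ fv2, App (f', a'))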

This decouples your calculation from the grammar specification but that is also good practice since it allows you to implement multiple passes. (There are/were compiler generators, CDL3, which can handle multipass information flow, but it turned out to be a bad idea. It's way too computationally intensive.)

Better still, once you write general transforms you can write them in such a manner that you only need to specify the rewrites you are interested in. (This is usually impossible in compiler generators based on attribute grammars; they want you to specify the complete grammar.) In the bootstrapped Hi compiler (defunct), I implemented trivial passes like this:

    def eta_transform: lambda_transform unit =
        [ u, lambda_abs pp e ->                      <- this handles an abstraction
            (u, e) = eta_transform u e;
            vv = lambda_free_vars (lambda_abs pp e);
            vv = list.map lambda_name vv;
            l  = lambda_abs (vv ++ pp) e;
            l  = list_to_apps (cons l vv);
            (u, l)
        | u, l -> lambda_child eta_transform u l ]   <- this handles all other cases

    def eta: lambda -> lambda =
        [ l -> (_, l) = eta_transform nop l; l ]     <- I simply ignore synthetic arguments, and pass a dummy analytic, here

And in the interpreter for an untyped language I am working on I implement passes like this:

class RewriteLet: public Rewrite {
public:
    AstPtr let(const AstPtr& a) {
        return rewrite(a);
    }

    //  F( (l = r; b) ) -> ( [ l -> F(b) ] F(r) )
    AstPtr rewrite_expr_let(const Position& p, const AstPtr& l, const AstPtr& r, const AstPtr& b) override {
        auto r0 = rewrite(r);
        auto b0 = rewrite(b);

        AstPtrs gg;
        gg.push_back(l);
        auto m = AstExprMatch(p, gg, AstEmpty().clone(), b0).clone();

        AstPtrs mm;
        mm.push_back(m);
        auto q =  AstExprBlock(p, mm).clone();

        return AstExprApplication(p, q, r0).clone();
    }
};

AstPtr pass_let(const AstPtr& a) {
    RewriteLet let;
    return let.let(a);
}

In the latter code, attribute flow is modeled by modifying the state in the rewrite class. You can't have it all, but this handles all cases well, sometimes better than I expected. (It could be solved with templates, but that would be overkill and explode the interpreter size.)

For every compiler writer I also suggest: make the grammar LL(k). It is easily implemented, people intuitively grasp the syntax and error messages, and it is blazingly fast. I have never really seen the additional benefits of other approaches. (The C syntax comes to mind.)

I think I see where you want to go but it doesn't pay off. It does pay off to be a bit smarter than theory provides. (Unless you're an academic and need to publish of course. But this all has been done before.)

Addendum

The above scheme doesn't solve all problems. For one, it fixes the types of the analytic and synthetic attributes. This is often not possible; for instance, attribute grammars over finite lattices are (I think) still used for natural language processing, and there you'll find that the number of attributes can vary wildly per production rule.

But it seems to work for (trivial?) compilers/interpreters.

(Actually a nice case for an untyped functional language. It doesn't care about types.)

intention

What I'm trying to do is define some "parser combinators". The problem is that, to be practical, they have to handle left recursion. Now, for a recogniser, this is trivial: we just apply the algorithms that remove indirect left recursion and then direct left recursion.

Now the problem with parsing is that you have to produce an AST, or at least a labelled parse tree. So a grammar on its own is useless without augmentation. The standard augmentation is to add a constructor for every production which labels the parse tree node.

The problem with that is if you refactor the grammar to eliminate left recursion, what happens to the constructors? So my method is like this: the user adds action codes to each production:

    A = B C {A_1/2} | E {A_2/1}

We use a shift/reduce parser and propagate a purely functional parser stack. Each terminal pushes an attribute onto the stack, each action code pops N symbols off the stack, and then pushes a single node with N children and the specified label onto the stack.
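
For concreteness, this is roughly what the stack discipline looks like (a sketch with invented names; the real machinery also threads the input position and handles ambiguity):

    (* A parse-tree node carries a label and its children; the parser stack is
       an immutable list, so "pushing" and "popping" are functional updates. *)
    type node  = Leaf of string | Node of string * node list
    type stack = node list

    (* A terminal pushes its attribute onto the stack ... *)
    let shift (tok : string) (s : stack) : stack = Leaf tok :: s

    (* ... and an action code {label/n} pops n entries and pushes one node
       with those n children and the given label. *)
    let reduce (label : string) (n : int) (s : stack) : stack =
      let rec take k children s =
        if k = 0 then (children, s)
        else match s with
          | x :: rest -> take (k - 1) (x :: children) rest
          | [] -> failwith "parse stack underflow"
      in
      let children, rest = take n [] s in
      Node (label, children) :: rest

So for A = B C {A_1/2}, once B and C have left their attributes on the stack, the action {A_1/2} replaces the top two entries with a single node labelled A_1.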

I have code that can parse any language generated by a non-left-recursive grammar this way; ambiguity isn't a problem, since the parser produces all possible parses. The machinery has a remarkable property: the parse tree returned is invariant under refactoring. This is because the control stack of the parser is decoupled from the parser stack.

The problem is that when we remove left recursion we can end up with a nullable symbol, namely an action code, in front of the fixed point in the production, and we cannot eliminate it because, even though it is a unit recogniser, it has an essential side effect on the parser stack.

I believe I have now solved this problem.

The goal here is to be able to execute any grammar and hence parse any context free language to produce any desired parse tree, at least with minimal sane restrictions on the grammar.

So there is no issue of "hand writing" a grammar. My production language system supports user-defined syntax already, and pretty much the whole language (other than a bootstrap core) is defined in the library. Anyone designing actual programming languages today is out of date: that's the client's job, not the compiler writer's. The system I am using is very good: it is a translator based on Dypgen in OCaml which is dynamically extensible and extremely fast; it is GLR+ and rebuilds the whole automaton on each modification, the automaton being an LALR(1) kernel used to predictively spawn only viable threads in the GLR model. I use Scheme for the action codes. I would also note that, unless you're programming in crap like C++, parsing performance is not an issue if it's reasonable, because you can cache the AST for an unmodified file, and programming language files can be manually kept reasonably short.

I would like to replace Dypgen as the first stage of bootstrapping, and in particular the replacement will not only have to parse the same grammar Dypgen can already handle, it also has to support extension, preferably at run time. A long term goal is to replace pattern matching with parsing. Anyhow, that's why "designing a grammar to be LL(k)" or whatever isn't an option.

Generalized Parser

Yah nice

Looks nice. But a 10 second parse on one hundred "a"-s?

Here's the github project,

Not sure where you saw that number. Here's the github project where he also discusses performance. He mentions an 18-second parse there for a highly ambiguous grammar on a 100-character input.

Current GLL implementation has a high constant factor overhead, although in principle it scales as O(n^3) if you're returning the first successful result from an ambiguous parse. Obviously if you enumerate all ambiguous parses, it's exponential.

Uh, I read it in the paper?

It is mentioned in the paper I think. I am not saying 10 seconds is bad, I just can't place it. Is that a showstopper or not? No idea.

Interesting

That's a nice paper, thanks for the reference. It's interesting that my technology solves some of the mentioned problems deeply. My actual parser uses coroutines, and so all the messing about to get a functional solution isn't necessary: for example, the "trampolining" thing is built in to the core of the language. I'm actually writing a parser to discover how effective coroutines are compared to traditional imperative or functional styles; parsing is hard enough to place real (rather than Mickey Mouse) demands on the system.

So far it looks good, but I've had to provide "role swapping" operators where fibrated (coroutine based) code has to swap to functional code (which is done, interestingly, using an imperative bridge). Parsing is a great thing to play with here: the idea is that alternatives are actively explored using fibres of control, and failure just causes the fibre to drop dead, that is, to not produce any result. In other words, unlike an option type in a spatial (functional) model, the temporal dual is quite different (and a bit hard to wrap the brain around after doing functional programming). But it works really well, eliminating a lot of housekeeping (no need to handle termination or failure).

A functional parser which explores all alternatives, and therefore has functional components which accept a set of starting points in the input and return a set of ending points, has a control-inverted dual which works by changing the spatial set into a sequence of transmissions: if you think "list of positions" in the input, then the control-inverted dual deals instead with a stream. The spatial cost is replaced by a temporal one in theory; in practice, of course, spatial structures take time to build, and temporal ones require space for connections.
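
For comparison, the spatial (functional) side of that duality can be sketched like this (representation invented): each component maps a set of starting positions to the set of ending positions it can reach; the control-inverted dual replaces each of these sets with a stream of transmissions.

    (* Set-of-positions recogniser combinators: a sketch of the functional
       model described above, with sets represented as sorted lists. *)
    type recogniser = string -> int list -> int list

    let tok (c : char) : recogniser =
      fun input starts ->
        List.filter_map
          (fun i ->
            if i < String.length input && input.[i] = c then Some (i + 1) else None)
          starts

    let seq (p : recogniser) (q : recogniser) : recogniser =
      fun input starts -> q input (p input starts)

    let alt (p : recogniser) (q : recogniser) : recogniser =
      fun input starts ->
        List.sort_uniq compare (p input starts @ q input starts)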

Yeah well

Technically you are somewhat correct that that's an attribute grammar. But well, wording. If you mention attribute grammars, people (well, I) start thinking about compiler generators and affix-directed parsing, none of which you want to implement, it seems.

It's a recognizer, nothing more.

And, ah, crap C++? Despite all its shortcomings, C++ routinely handles projects OCaml or Scheme can only dream about. Let's start a language war!

C++ crap

The flame was directed at the fact that C++ has a context sensitive parser, so "caching" parses is not really viable (despite some vain attempts at precompiled headers). That's what I meant: my argument is that if you have files which are independently parsable, then you only need to parse changed files, and the programmer can (subject to language design) still factor a large file into smaller subfiles. In other words, parsing performance is not as critical for a "decent" language as it is for languages like C and C++. [Incidentally, K&R C is context free; the stupid ANSI committee is responsible for screwing up the grammar (by adding typedefs).]

It takes 60 seconds or so for my system to bootstrap itself and then parse the Felix library (the parser actually parses the grammar and then builds the automaton from that). But this doesn't matter because the language design means it is only done once when you upgrade.

Ah, okay, great

I stand corrected.

Eliminate or deal with it?

I might have a solution for you if it is about top-down parsing. It is not about eliminating left recursion, it is about dealing with it, and it is possible. A couple of years ago I stumbled upon this paper: the seed growing algorithm, and it helped me big time. Although it is meant only for parsing PEG grammars, I successfully extrapolated it to CFG grammars (or a kind of them). See it in action for yourself at http://www.moonyweb.com/metafigure/playgroundMF.html, just look under the "simple math" example (or test it with your own grammars). This is just my conceptual site and I'm still working on details, but as for seed growing, it works like a charm, guaranteed.

Anyway, you are welcome to use any ideas seen on the preview site. Let us know how you're doing :)

[Edit]
Attribute grammars? Man, we work on the same task...

Problem

The problem I am trying to solve is basically this: I want the user to write a grammar in some form which can be converted to a parser at run time. So the parser isn't allowed to generate a text file which is then compiled, as yacc-like parser generators do.

The system has to use composable components, which means some kind of combinators should underly it. As it happens I already have a system which is extensible at compile time and which handles direct left recursion, namely one based on Dypgen in Ocaml, which allows my programming language to put its whole syntax in its own library (the hard coded parser is just a bootstrap that reads a grammar and then builds another parser). But I want to do this *at run time* not just at compile time. And it should be able to at least use a syntax for the grammar close to the current syntax, so it can also parse existing Felix code (so the system is the front end of a bootstrap).

Because I am using coroutines in my parser, a lot of the housekeeping required for a functional parser just evaporates without compromising purity, but the resulting indeterminism in temporal sequencing introduces new problems instead.

The core of the concept of coroutines is to decouple data flow from control flow. This is what makes both traditional imperative and functional programming too hard. Subroutines are just not enough; master/slave relations just don't cut it as a complete solution. Coroutines use peer-to-peer control relations, but retain definite data flow direction, by separating control and data flow. A coroutine doesn't accept an argument and return a result, it reads its input and writes its result: the I/O is not fixed to the start and end of its lifetime like with a function.

So the natural parsing model also decouples control flow from data flow. Where a functional parser handling all alternatives is forced to accept and return sets of positions (for a recogniser) and in addition parse trees (for a parser), coroutines just read and write these elements one at a time.

So basically the whole model lends itself to a shift/reduce parsing model where the parser stack is passed along with the position in the input string (the stack is of course an immutable list and we do functional updates to it not imperative ones).

The core problem is left recursion, and my solution is to observe that you can safely refactor the grammar *without* disturbing the resulting parse tree, if the parser "reduce" operations are encoded in the grammar itself. The reason is that refactoring in effect just plays with the control stack, the parser stack isn't touched.
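
To illustrate (using the action-code notation from above, with an invented Add label): the left-recursive rule

    E = E "+" T {Add/3} | T

can be refactored in the standard way to

    E  = T E'
    E' = "+" T {Add/3} E' | ε

and because {Add/3} still fires each time with the same three entries on the parser stack, the refactored grammar builds exactly the same left-nested tree as the original, even though the control structure is now right recursive.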

Maybe LC-parsing?

I will not pretend to know what I am talking about, it's too long ago, but have you looked at left-corner parsing? If I remember correctly it used to be popular as an alternative to Earley parsing. (Worse complexity in theory but better performance in practice. And a 'trivial' to implement algorithm.)

It's generalized parser with left recursion support

The code on the presented site isn't really a parser generator. It is a generalized parser function that takes a JSON grammar as input and outputs a JSON AST of the parsed text. I bootstrapped it with a JSON grammar that takes a string grammar as input and outputs a JSON grammar equivalent to the input. Further, I use the returned JSON grammar to parse any text to an AST. Basically, all parsing is done by the following function calls:

JSONRules = parser.parse (parser.bootStrap, grammarStringToParse);
AST = parser.parse (JSONRules, stringToParse);


I used some dirty tricks like injecting functions into the parser.bootStrap object to convert the JSON AST to JSON grammar rules on the fly. But the story behind the parser doesn't really matter, other than that it's not a parser generator but a generalized parser. What is important is that the left recursion solution I used can parse even indirect left recursions and null-prefixed left recursions, and it does it fast (in linear time for non-ambiguous grammars). Basically, it is a common-sense extension of the seed growing algorithm to the CFG grammar class. I recommend reading the seed growing paper; it is great stuff for inspiration on left recursion solutions. Implementing it took me some six days of further thinking and one day of coding.

[Edit]
But if you use a shift-reduce strategy, I don't know how useful this is... Anyway, my first implementation of the parser was shift-reduce, but it was too slow, especially when detecting parse errors. Soon I switched to the Earley algorithm, but that was only a little less slow and was resource hungry in some cases. The third, my own top-down version, is what you've seen above, but it suffers from some teething problems which will hopefully be outgrown in the final version I'm working on now. I'm adding infinite lookahead for better performance, but I have to update the whole algorithm accordingly. I'm hoping for linear parse time in the worst case for non-ambiguous grammars.

Transformation

This page is an expanded explanation of the symbol ordering described on the wiki page that you linked to in the summary. The case that you are having trouble with - indirect left recursion hidden behind an empty symbol - is mentioned slightly below equation (30) on that page. This corresponds to your original question about the need to remove empty productions from the algorithm's input.

The technique that you need is to push the empty symbols up through the grammar until they no longer land on the first symbol of a sentential form. It is explained by example here.
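
For example (an invented fragment, just to show the kind of step involved): in

    A = N A x | y
    N = z | ε

the left recursion in A is hidden behind the nullable N. Expanding N's alternatives in the leading position gives

    A = z A x | A x | y

which is directly left recursive, and the standard elimination algorithm then applies.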

At this point the algorithm in the Dragon book would be suitable for you to use. If your input really is user-supplied, as opposed to machine-generated in practice, then you will run into another problem. These transformations will silently fail on grammars that are ambiguous. If the lack of ambiguity is guaranteed in the input (as it sounds like in your problem description) then this is not an issue.

Priorities

Fwiw (probably not much, but then, two cents has much less purchasing power than it used to), I played with something broadly similar about thirty years ago; I was writing a toy compiler for a language whose expression syntax could be extended somewhat arbitrarily (essentially one could add a new CFG rule for calling a given function). Looking back at it, the parsing algorithm I devised is in itself mildly interesting, but what really strikes me about it is the choice of priorities that the algorithm implies. I'm a big believer in things being clear to the user (in this case, the user being a programmer). The grammar to be parsed is kept in the form of the syntax rules specified by the programmer, rather than transformed to eliminate left-recursion (or for any other purpose), because any syntax-error message should be instantly comprehensible to the programmer, and what the programmer specified is surely what will make most sense to the programmer. The parser had to have a very simple way of determining what to do from a given state looking at a given lexeme, because if there's a syntax error the parser needs to be able to explain clearly what the problem was. As the grammar table was constructed, one new syntax rule at a time, it was required to unambiguously determine what the parser must do from any given state given any given lexeme, because a grammatical ambiguity can be explained clearly at the time it's introduced into the grammar but would create an unexpected and confusing error if left until the middle of trying to parse an expression with actual parser-ambiguity in it.

The algorithm itself was a left-recursion-tolerant form of recursive descent. Each nonterminal in the grammar was represented by an (acyclic) state machine with transitions labeled by keywords or nonterminals. When in the starting state of a nonterminal symbol, the parser would never immediately left-recurse; but when in an accepting state for a nonterminal, if by looking at the next lexeme the parser realized that it should have left-recursed earlier, it would retroactively do so now, by taking the now-completed parse tree of the nonterminal as the first subexpression of a larger expression on that nonterminal. To keep the syntax unambiguous under this parsing algorithm, the two requirements were that  (1) if a keyword could start a given nonterminal, that keyword could not also be an alternative to that nonterminal from any state in the grammar; and  (2) if a keyword could follow a given nonterminal (that is, if there was any state entered by the nonterminal and exited by the keyword), that keyword could not also exit any accepting state of that nonterminal. This was all made much simpler to implement by the fact that there was only one nonterminal that could left-recurse or be extended by the programmer (namely, expression).
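
The core of that trick fits in a few lines. Here is a toy sketch (not the original program; token and tree types are invented) for the single rule expr = expr "+" primary | primary:

    (* Toy sketch of left-recursion-tolerant descent. *)
    type token = Num of int | Plus
    type tree  = Leaf of int | Add of tree * tree

    let parse_primary = function
      | Num n :: rest -> (Leaf n, rest)
      | _ -> failwith "expected a number"

    let parse_expr toks =
      (* Never left-recurse immediately: parse a primary first. *)
      let first, rest = parse_primary toks in
      (* In an accepting state, if the next lexeme says we should have
         left-recursed, take the completed tree as the left subexpression. *)
      let rec grow left = function
        | Plus :: rest ->
            let right, rest' = parse_primary rest in
            grow (Add (left, right)) rest'
        | rest -> (left, rest)
      in
      grow first rest

For instance, parse_expr [Num 1; Plus; Num 2; Plus; Num 3] builds the left-nested Add (Add (Leaf 1, Leaf 2), Leaf 3), just as the left-recursive rule specifies.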

What most impressed me about the program as a whole was the lucidity of the syntax-error messages. The parser maintained a record specifying the exact source-file start and end of the expression it was parsing, and the error-message generator would take a source-region parameter, producing an error message that would display about three lines of source leading up to the error with the specified region shaded/highlighted. The cause of the error always just leapt out from these messages, and it was a startling contrast with the difficulty of deciphering error messages in just about every compiler I'd ever used.

Ambiguity, GLL/Actors, Adaptability

The grammar to be parsed is kept in the form of the syntax rules specified by the programmer

Agreed fully. The formalism and the programmer need to meet halfway. Eliminating left-recursion is far on one side of that line.

required to unambiguously determine what the parser must do from any given state given any given lexeme because a grammatical ambiguity can be explained clearly at the time it's introduced into the grammar but would create an unexpected and confusing error if left until the middle of trying to parse an expression with actual parser-ambiguity in it.

In my experience, ambiguity is just not that big a problem and can actually be a feature.
I've recently been working on Ambiparse, a Clojure parsing library. It can return a parse forest, but also has tools for eliminating ambiguous parses explicitly, rather than forcing the grammar to be unambiguous. For example, you can filter sub-parses that have left-nesting in order to enforce right-associativity. Such a filter is expressed as a node directly in the grammar language.

Parse results are compared on the basis of the abstract syntax as per inline rewrite rules, not the concrete syntax trees. So if you rewrite away ambiguity, such as in this simple calculator example, then ambiguous parses are gracefully ignored. For example, without an associativity filter, you can parse `2 + 4 + 6` as `(2 + 4) + 6` or as `2 + (4 + 6)`; rewriting both by performing the arithmetic, you get 12 twice, and that's considered unambiguous.
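
A tiny sketch of that idea in general terms (this is not Ambiparse's API, just the principle, with invented types): the ambiguous parse yields several concrete trees, but results are compared after the rewrite, so parses that rewrite to the same value collapse into one.

    (* Generic sketch: compare parse results after rewriting, not on the
       concrete trees.  Types and the example are invented. *)
    type tree = Num of int | Plus of tree * tree

    let rec eval = function
      | Num n -> n
      | Plus (a, b) -> eval a + eval b

    (* "2 + 4 + 6" parses two ways, but both rewrite to 12, so after
       deduplication the result counts as unambiguous. *)
    let results =
      let parses = [ Plus (Plus (Num 2, Num 4), Num 6);
                     Plus (Num 2, Plus (Num 4, Num 6)) ] in
      List.sort_uniq compare (List.map eval parses)
    (* results = [12] *)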

In terms of error messages, if the parse (not the parser) is ambiguous, you simply get two results for the full parse (decorated with full source positions, of course). Alternatively, you can banish ambiguity via another kind of filter node. Instead of getting two successful parses, you'll get an error with two successful sub-parses in it.

I haven't added error recovery yet, but I've been quite pleased with the error reporting so far. An error is the set of expected or unexpected properties of the character, token, or sub-parses one past the set of rightmost successful sub-parses. Error recovery would simply extend to a map of positions to sets of errors.

The algorithm itself was a left-recursion-tolerant form of recursive descent.

I went with an actors-inspired interpretation of the GLL algorithm. Essentially each grammar production is a parser actor that spawns actors for nested non-terminals and listens for successful parses. Each subparse is just a message on a queue, which is drained until the actors reach quiescence and the queue is empty. While actors "tell" their parents about successful parses, if the network reaches quiescence and no parse is produced, you can "ask" the root actor for errors, and it will in turn ask its children.

I've posted a debug visualization of a simple parse.

a language whose expression syntax could be extended somewhat arbitrarily (essentially one could add a new CFG rule for calling a given function)

I've done this too! Your writings were an inspiration here, both for adaptable grammars and for first-class environments.

Each actor in the network has an identity that includes the pattern to be matched, as well as the environment in which to match (both the index into the source string at which to look and a set of extensions to the grammar). See the adaptive test for an example of adding and removing grammar rules with scoping.

That's not an ambiguity, it's a feature

In my experience, ambiguity is just not that big a problem and can actually be a feature.

Fascinating; priorities indeed. I believe you that you're finding ways to work with ambiguity, yet it just doesn't come naturally to me — for a PL. My recent focus has been how to maintain sapience in a world increasingly dominated by non-sapient information processing: corporate capitalist motives, government bureaucracies, algorithms, big-data statistical analyses, and so on. I see ambiguity, "grey areas", unstructured situations as the essence of where sapient minds thrive while non-sapient information processing ultimately does not apply (in the Sorcerer's Apprentice, is the problem that the apprentice loses control of the magical servant, or that the magical servant is inherently incapable of stepping outside its orders to judge when they stop making sense?), and if we eliminate those things for the sake of our technology we're liable to marginalize ourselves. But to me ambiguity doesn't belong in the formal grammar of a PL. Why not? Perhaps it's to do with designing technology to expand rather than contract human scope. The art seems to be not pitting technological rigidity against human flexibility; they need to not interfere with each other. The role of ambiguity in PL grammar is then a question of how one arranges the boundaries between the two. My preference, increasingly over many years, has been for an unambiguous grammar with chosen aspects of unfettered flexibility. That extensible-grammar parser thirty years ago was a milestone in the development of my thinking on this; I built flexible operator precedence into it, and was so turned off by how messy that got that I became a disbeliever in operator precedence and developed a habit of fully parenthesizing arithmetic expressions even in languages that don't require me to. It seems you may be pursuing a different balance point between the technological and human elements — which is rather reassuring, as I'm not a fan of putting all one's eggs in one basket.

(decorated with full source positions, of course)

Of course. :-)  It seems to me, broadly, we're both using (differently balanced) processing techniques that preserve various facets of the human element long enough for those elements to reemerge in the error message, so that the error message "speaks" to the human programmer; as opposed to processing techniques that wring the humanity out of the system state, producing an error message with no heart.

ambiguity == concurrency == constraint satisfaction

ambiguity, yet it just doesn't come naturally to me — for a PL

This was also true for me, until I spent a few months working on a deeply concurrent system. The more concurrent code I wrote, the more I came to realize that so many natural things are awkward to program because we lack decent language constructs for working with concurrency.

In the case of this parser, I think about it in terms of communicating sub-processes, but because the language does not have super cheap threads or first-class continuations and environments, I have to code it in something approximating a continuation-passing or even an "actor-passing" style. Similarly, because the language lacks advanced reflective capabilities, such as for local variables, I must reify state machines as data, so that I can generate visualizations of that data.

My thinking was also heavily influenced by Propagation Networks: A Flexible and Expressive Substrate for Computation. I came to view concurrency and logic programming as two sides of the same coin. A mental leap that requires a much longer rant than this one.

I see ambiguity, "grey areas", unstructured situations [...]

I think this is where our viewpoint differs. I don't view grammar ambiguity as a "gray area". It's quite black and white when multiple parses are produced, or when ambiguity is forcibly converted to an error during parsing, rather than during parser construction.

For me, the gray area is when you prematurely commit to some constraint, such as a total order of operator precedence. To play with your analogy, this is not unlike what happens when an organization spends years building products and businesses via inertia. Making an implicit commitment yields more gray than explicitly refusing to make a commitment. "Why are we doing this?" or "Why does this unrelated operator have higher precedence than that one?".

I believe careful modeling of ambiguity can be leveraged to aid your sapience goals.

developed a habit of fully parenthesizing arithmetic expressions even in languages that don't require me to.

I too developed this habit after programming in Lisps for a while. However, my target design is not at odds with this thinking. The goal is to have an abundance of intentional ambiguity, such that whenever two syntactic forms without obvious relative precedence are encountered, parentheses or other tools of disambiguation are forced upon the programmer by compiler errors. To this end, I intend to use a directed graph of precedence groups, rather than a total order or table of binding powers.
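
A sketch of what that could look like (groups and edges invented): relative precedence becomes a reachability question on the graph, and a pair of operators whose groups are unrelated is an error asking the programmer for parentheses.

    (* Precedence groups as a DAG: op a binds tighter than op b only if
       there is a path from a's group to b's group. *)
    let edges = [ ("mul", "add"); ("add", "cmp") ]

    let rec binds_tighter a b =
      List.exists (fun (x, y) -> x = a && (y = b || binds_tighter y b)) edges

    (* Deciding how to group  e1 op1 e2 op2 e3 . *)
    let resolve op1 op2 =
      if binds_tighter op1 op2 then `GroupLeft
      else if binds_tighter op2 op1 then `GroupRight
      else `Error "no relative precedence: add parentheses"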

Fully parenthesizing

developed a habit of fully parenthesizing arithmetic expressions even in languages that don't require me to.

Don't we all? It takes mental effort to keep track of precedences and, worse, someone can break a lot of code if he/she at some point decides to change the order.

In my next interpreter I just fix precedences with the assumption that most people will code around it anyway.

decent language

we lack decent language constructs for working with concurrency.

I'm deeply aware of this. I've heard of propagation networks (and possibly that particular dissertation), I've bookmarked the dissertation and hope to find copious free time to study it properly (some things, I think, aren't worth studying casually — benefit comes only from immersive grokking); but I'm not particularly compelled by claims that some particular model of computation provides a solution. Perhaps partly because that feels like an algorithmic approach to the problem, whereas the approach needed may be cognitive. Also, while our failure to program concurrency well is especially blatant, that may be just a symptom; I'm not convinced we know how to program anything well. I keep in mind the blatant problem of concurrency, as a possibly-important clue, but I also have my eye on declarative-versus-imperative.

I don't view grammar ambiguity as a "gray area".

I agree, formal grammar ambiguity is not a grey area. That's sort of the point, which just goes to show I've expressed myself poorly. I'll try again (articulating things can take a lot of trials). In designing for the interplay of technology and humans, I see a fundamental difference between applying technology to human flexibility, versus applying humans to technological rigidity. Applying technology to human flexibility is fine if done so that the technology is doing all the trying-to-cope (one could go into nuances, there). But asking humans to cope with the limitations of the technology is a waste of the human capacity for thought, and may also encourage humans to limit their thinking for the sake of the technology. One should present humans with simple constraints that afford a wide-open expanse of free-form expression. Formal ambiguity is a technological thing, discrete by nature; from a human perspective it's not all that relevant to processing human language, and humans shouldn't be asked to deal with it at all. I take formal ambiguity completely out of the equation, so that humans will exercise their flexibility in directions where the flexibility isn't technology-shaped. An alternative approach, perhaps yours, is to try to expand the degrees of freedom of the technology so that the humans will be less cramped by the shape. There is always some of that in any programming language, since the whole space of possible programs is ultimately defined by the technology; it's a matter of style in how one lines things up.

re: That's not an ambiguity, it's a feature

True, in my language I rely on ambiguous parsing to automatically construct sets in the following manner:

Set <= (
    Element <= SomeDefinition |    // an element is parsed here 

    TransformedElement <= @Set     // this rule starts by feeding the above element to its right 
                                   // side, adding its left side to the set. Then it acts recursively,
                                   // feeding its left side to its right side, gradually populating
                                   // the set
)  // the set inputs both elements as ambiguity, the first
   // element, then the second which inputs its first and its
   // second and so on, causing the set to grow up 

This way I can easily program a term rewriting system (of course, general transformations should be fed as data too). This is an example of using ambiguous parsing in programming languages. By the way, natural languages also rely on ambiguous parsing, although it is later decided which branch is really meant by the storyteller.

Left recursion example

Look, the above example is both a mutual left recursion example and an ambiguity example! Isn't it ambiguous itself, hehe?

Sometimes I even surprise myself :D

Reference

Another reference is the generally excellent Parsing Techniques: A Practical Guide by Grune & Jacobs (link is to the out-of-print first edition, on the author's website). Left-recursion elimination is covered in section 6.4 on page 128. Like the other references mentioned, it assumes that epsilon productions have been eliminated first.

Grune

Thanks, very nice!

It doesn't

The dragon book is easy to find online, try "dragon book pdf" in google and there are normally a couple of complete copies in the first few pages. I lost my own copy years ago and keep a pdf in a google drive for reference instead. If you can't find one email me (my name at gmail) and I'll send you a copy.

In the 2nd edition the bit that you want is on pg 184, but it does not cover the case that you ask about. Grammars are non-compositional, and the rule you mention doesn't work out of context. I can't think of a non-ambiguous example grammar to embed it in - and the left-recursion elimination algorithm only works on non-ambiguous grammars. Do you have a larger example with a definition of N that is non-ambiguous?