User loginNavigation |
A new implementation of recursive-descent parsing?It is well known that the classic way to model a recursive-descent parser is to write (or automatically produce) a set of mutually-recursive functions either by writing procedures or combining parser modules using a DSL (like in Boost::Spirit) or using templates/generics (for those languages that support such a concept). All the different types of implementations work in a similar manner: each procedure invokes one or more procedures. This approach has some drawbacks:
While trying to approach the issue of RD parsers from a different perspective, I think I have found an implementation which solves some of the problems mentioned above. I hope I am not repeating someone else's work (a brief search over the internet found no similar approaches). The idea is that instead of the input data travelling down the tree of parsers, each parser processes the current input and then leaves a continuation on a stack. The main parse engine pops the continuation off that stack and executes it; the new continuation puts another state on the stack etc. Thus an RD parser becomes a sort of state machine. Generally an RD parser consists of the following operations:
The traditional approach handles these operations in the following ways:
For example (in pseudo-Java): class sequence_parser implements parser { list parsers; iterator parse(iterator input) { foreach(p, parser) { input = p.parse(input); } return input; } } class choice_parser implements parser { list parsers; iterator parse(iterator input) { foreach(p, parser) { iterator output = p.parse(input); if (output != null) return output; } return null; } } class loop_parser implements parser { parser other; iterator parse(iterator input) { input = other.parse(input); return parse(input); } } The new approach handles these tasks differently. The approach uses the following context for parsing:
Parsing operations are handled like this:
For example: class sequence_parser implements parser { parser sequence_next; parser this_parser; iterator parse(context c) { c.sequence_stack.push(sequence_next); return this_parser.parse(c); } } class choice_parser implements parser { parser choice_next; parser this_parser; iterator parse(context c) { c.save(); c.choice_stack.push(choice_next); return this_parser.parse(c); } } class loop_parser implements parser { parser this_parser; iterator parse(iterator input) { c.sequence_stack.push(this); return this_parser.parse(c); } } The parsing algorithm then becomes a very simple loop: void parse(context c) { while (true) { parser = c.sequence_stack.pop(); if (parser != null) { iterator output = parser.parse(c); if (output == null) { parser = c.choice_stack.pop(); if (parser) { c.restore(); c.sequence_stack.push(parser); } else throw parse_exception(); } } else (if c.input_exhausted()) { return; else { throw parse_exception(); } } } The above does the following:
This approach solves some of the problems of the traditional approach in the following ways:
By Achilleas Margaritis at 2006-07-03 13:19 | LtU Forum | previous forum topic | next forum topic | other blogs | 24474 reads
|
Browse archives
Active forum topics |
Recent comments
22 weeks 6 days ago
22 weeks 6 days ago
22 weeks 6 days ago
45 weeks 19 hours ago
49 weeks 2 days ago
50 weeks 6 days ago
50 weeks 6 days ago
1 year 1 week ago
1 year 6 weeks ago
1 year 6 weeks ago