LtU Forum

Egel Language v0.1

Small notification: I made another language called Egel. It's an experimental toy language based on untyped eager combinator rewriting.

It's a beta release, things will change in the future and I'll likely break 'stuff' but feel free to download it or read the sources.

I have some ideas how to turn it into something useful in the future but nothing really concrete yet.

A refutation of Gödel's first incompleteness theorem

Contradictions in Gödel's First Incompleteness Theorem.

Notes for the second edition. I have edited this post a bit, sorry if it is against the etiquette. The original post can be found here. I have removed a couple of -periopheral- mistakes. However I want to also warn that the argument is a lot clearer when adapted to a modern rendition of Gödel's theorem, based on Turing machines, as suggested in i this comment. That adaptation is carried out in the comments that hang from the previous referenced comment, this an onwards.


This refers to Kurt Gödel's "On formally undecidable propositions of principia mathematica and related systems". The notation used here will be the same as that used by Gödel in that paper.

In that work, Gödel starts with a description of the formal system P, which, according to himself, "is essentially the system obtained by superimposing on the Peano axioms the logic of PM".

Then he goes on to define a map Phi, which is an arithmetization of system P. Phi is a one-to-one correspondence that assigns a natural number, not only to every basic sign in P, but also to every finite series of such signs.


There are alternative arithmetizations of system P. I will later delve on how many.

This is obvious from simply considering a different order in the basic signs when they are assigned numbers. For example, if we assign the number 13 to "(" and the number 11 to ")", we obtain a different Phi.

If we want Gödel's proof to be well founded, it should obviously be independent of which arithmetization is chosen to carry it out. The procedure should be correct for any valid Phi chosen. otherwise it would **not** apply to system P, but to system P **and** some particular Phi.

To take care of this, in Gödel's proof we may use a symbol for Phi that represents abstractly any possible valid choice of Phi, and that we can later substitute for a particular Phi when we want to actually get to numbers. This is so that we can show that substituting for any random Phi will produce the same result.

The common way to do this is to add an index i to Phi, coming from some set I with the same cardinality as the set of all possible valid Phi's, so we can establish a bijection among them - an index. Thus Phi becomes here Phi^i.


Later on, Gödel proceeds to spell out Phi, his Phi, which we might call Phi^0, with his correspondence of signs and numbers and his rules to combine them.

And then Gödel proceeds to define a number of metamathematical concepts about system P, that are arithmetizable with Phi^0, with 45 consecutive definitions, culminating with the definition of provable formula.

Definition of provable formula means, in this context, definition of a subset of the natural numbers, so that each number in this set corresponds biunivocally with a provable formula in P.

Let's now stop at his definition number (10):

  E(x) === R(11) * x * R(13)

Here Gödel defines "bracketing" of an expression x, and this is the first time Gödel makes use of Phi^0, since:

  Phi^0( '(' ) = 11

  Phi^0( ')' ) = 13

If we want to remain general, we may rather do:

   E^i(x) === R(Phi^i( '(' )) * x * R(Phi^i( ')' ))

Two little modifications are made in this definition. First, we substitute 11 and 13 for Phi^i acting on "(" and ")". 11 and 13 would be the case if we instantiate the definition with Phi^0.

And second, E inherits an index i; obviously, different Phi^i will define different E^i. And so do most definitions afterwards.

Since, for the moment, in the RHS of definitions from (10) onwards, we are not encoding in Phi^i the index i, which has sprouted on top of all defined symbols, we cease to have an actual number there (in the RHS); we now have an expresion that, given a particular Phi^i, will produce a number.

So far, none of this means that any of Gödel's 45 deffinitions are in any way inconsistent; we are just building a generalization of his argument.


There is something to be said of the propositions Gödel labels as (3) and (4), immediately after his 1-45 definitions. With them, he establishes that, in his own words, "every recursive relation [among natural numbers] is definable in the [just arithmetized] system P", i.e., with Phi^0.

So in the LHS of these two propositions we have a relation among natural numbers, and in the RHS we have a "number", constructed from Phi^0 and his 45 definitions. Between them, we have an arrow from LHS to RHS. It is not clear to me from the text what Gödel was meaning that arrow to be. But it clearly contains an implicit Phi^0.

If we make it explicit and generalized, we must add indexes to all the mathematical and metamathematical symbols he uses: All Bew, Sb, r, Z, u1... must be generalized with an index i.

Then, if we instantiate with some particular Phi^i, it must somehow be added in both sides: in the RHS to reduce the given expression to an actual number, and in the LHS to indicate that the arrow now goes from the relation in the LHS **and** the particular Phi^i chosen, to that actual number.

Obviously, if we want to produce valid statements about system P, we must use indexes, otherwise the resulting numbers are just talking about P and some chosen Phi^i, together.

Only after we have reached some statement about system P that we want to corroborate, should we instantiate some random Phi^i and see whether it still holds, irrespective of any particularity of that map.

These considerations still do not introduce contradiction in Gödel's reasoning.


So we need to keep the indexes in Gödel's proof. And having indexes provides much trouble in (8.1).

In (8.1), Gödel introduces a trick that plays a central role in his proof. He uses the arithmetization of a formula y to substitute free variables in that same formula, thereby creating a self reference within the resulting expression.

However, given all previous considerations, we must now have an index in y, we need y^i, and so, it ceases to be a number. But Z^i is some function that takes a natural number and produces its representation in Phi^i. It needs a number.

Therefore, to be able to do the trick of expressing y^i with itself within itself, we need to convert y^i to a number, and so, we must also encode the index i with our 45 definitions.

The question is that if we choose some Phi^i, and remove the indexes in the RHS to obtain a number, we should also add Phi^i to the LHS, for it is now the arithmetic relattion **plus** some arizmetization Phi^i which determine the number in the RHS, and this is not wanted.


But to encode the index, we ultimately need to encode the actual Phi^i. In (3) and (4), If in the RHS we are to have a number, in the LHS we need the actual Phi^i to determine that number. If we use a reference to the arithmetization as index, we'll also need the "reference map" providing the concrete arithmetizations that correspond to each index. Otherwise we won't be able to reach the number in the RHS.

Thus, if we want definitions 1-45 to serve for Gödel's proof, we need an arithmetization of Phi^i itself -with itself.

This may seem simple enough, since, after all, the Phi^i are just maps, But it leads to all sorts of problems.

Five one

Now, suppose that we can actually arithmetize any Phi^i with itself, and that we pick some random Phi^i, let's call it Phi^0: we can define Phi^0 with Phi^0, and we can use that definition to further define 10-45.

But since Phi^0 is just a random arithmetization of system P, if it suffices to arithmetize Phi^0, then it must also suffice to arithmetize any other Phi^i equally well. However, with Phi^0, we can only use the arithmetization of Phi^0 as index to build defns 10-45.

This means that, as arithmetizations of system P, the different Phi^i are not identical among them, because each one treats differently the arithmetization of itself from the arithmetization of other Phi^i.

Exactly identical arithmetical statements, such as definition (10) instatiated with some particular Phi^i, acquire different meaning and truth value when expressed in one or another Phi^i.

Among those statements, Gödel's theorem.

Five two [This argument does not hold if we have to consider recursively enumerable arithmetiazations.]

A further argument that shows inconsistency in Gödel's theorem comes from considering that if we are going to somehow encode the index with Phi^i, we should first consider what entropy must that index have, since it will correspond to the size of the numbers that we will need to encode them. And that entropy corresponds to the logarithm of the cardinality of I, i.e., of the number of valid Phi^i.

To get an idea about the magnitude of this entropy, it may suffice to think that variables have 2 degrees of freedom, both with countably many choices. Gödel very conveniently establishes a natural correspondence between the indexes of the variables and the indexes of primes and of their consecutive exponentiations, but in fact any correspondence between both (indexes of vars, and indexes of primes and exponents) should do. For example, we can clearly have a Phi^i that maps the first variable of the first order to the 1000th prime number exponentiated to the 29th power.

This gives us all permutations of pairs of independent natural numbers, and so, uncountably many choices for Phi^i; so I must have at least the same cardinality as the real line. Therefore y^i doesn't correspond to a natural number, since it needs more entropy than a natural number can contain, and cannot be fed into Z^i, terminating the proof.

Making a one-pass compiler by generating fexprs that generate code

I'm starting to write a simple compiler that transcompiles a simple scripting language for general game playing (at least for chess-like and a few other board games) into C which will be compiled in memory with the Tiny C library.

I noticed that there's a mismatch between the order in which parser generators trigger actions and the order in which tree nodes need to be visited in order that identifiers can be type checked and used to generate code.

Parser generators trigger actions from the bottom up. Ideally when you generate code, you visit nodes in the tree in whatever order gives you the type information before you see the identifiers used in an expression. Since fexprs let you control what order you visit parts of the inner expression, they're perfect for that.

So my parser is being written so that the parse generates an s-expression that contains fexprs that when run semantically checks the program and transcompiles in a single pass.

This also suggests a new version of Greenspun's 10th rule:
A sufficiently complex C++ program will contain an informally specified, buggy and incomplete implementation of John Shutt's Kernel language.

New PL forums:

Hello, LtU Community,

I've built a new forum for the programming languages community,!

The forums software is well thought out, fast, and accessible. The forums support Markdown, TeX math, syntax highlighting, @-mentions, email notifications, a moderation system, and more. All of the above is done without requiring any JavaScript. The engine is MIT licensed and development happens in the open.

Later this year I will be inviting more researchers and engineers to post and participate in PL forums.
Today, I am asking the LtU community for feedback on the software and perhaps participation!

The engineering community can benefit from the research that is often light years ahead of the current practices. The research community can in turn benefit from lessons the engineering community learned the hard way. I've built PL forums because I want a well-designed space where theory and practice cross-pollinate.

You might know me from software such as Twitter Bootstrap, or various libraries for Ruby on Rails such as i18n-tasks (static analysis) and order_query (keyset pagination).

Thanks, Gleb

Compiler IDE API

As I am working (slowly) on a compiler at the moment, I have started thinking about IDE integration. It occurs to me (as I am sure it has to everyone else) that a compiler should not be a command-line program, but a library that can be integrated with applications and IDEs. The command line interface for the compiler should be just another client of the library.

So my question is, in an ideal environment (starting from scratch) what should an API for a compiler (targeting IDE integration as well as other runtime uses) look like? Obviously some method to build the code, some methods for auto-completion, type checking, type inference.

Other thoughts are that all the above functions need to operate in a dynamic environment, where incomplete code fragments exist, without breaking the compiler, and hopefully still offering useful feedback.

Logic Production Systems (LPS)

A new mixed logic/imperative programming language by Bob Kowalski and Fariba Sadri:

“Logic-based Production System" is a new computer language that combines the characteristics of an imperative programming language with those of a declarative database and knowledge representation language. It is the result of over a decade of research led by Bob Kowalski and Fariba Sadri at Imperial College London.

The kernel of LPS consists of a database, together with reactive rules of the form if antecedent then consequent. The database changes destructively in response to actions and external events, according to a domain-specific causal theory. Computation consists in making the reactive rules true, by performing actions to make the consequent true whenever the antecedent becomes true. In addition, LPS includes Prolog-like logic programs both to recognise when antecedents become true and to generate plans of actions that make consequents true.

The page links to papers and slides that describe the language in more detail and there is an online playground to try it out.

Rust's language ergonomics initiative

Aaron Turon has a nice blog post about Rust's language ergonomics initiative. It contains a description of how the Rust language design team currently thinks about "implicit" features (those that do something without the user having to write anything explicitly at this point in the program), how to design such features or detect issues with existing features.

There are three dimensions of the reasoning footprint for implicitness:

Where are you allowed to elide implied information? Is there any heads-up that this might be happening?
What influence does the elided information have? Can it radically change program behavior or its types?
How much of do you have to know about the rest of the code to know what is being implied, i.e. how elided details will be filled in? Is there always a clear place to look?

Various existing or proposed Rust features are then discussed and critiqued along these evaluation axes. I found the blog post interesting on several levels and thought it could be a nice starting point for discussion.

One thing that is important to keep in mind is that, despite the generality of the claims made about programming language design, this text is mostly about a set of design tools that has been conceived to solve a specific category of ergonomics problem. LtU regulars will be quick to point out that the interaction with programming tools, for example, is not discussed or considered at all while it is a central aspect of ergonomics in many cases (including, I think, implicitness decision). I would personally recommend trying to stay within the approximate scope of the Rust discussion (generalizing to other languages and systems), rather than discussing all aspects of "Language Ergonomics" at once, which would result in discussions all over the place (but with less people interested in each sub-discussion).

A parting remark on the end paragraph which I foud interesting:

And, perhaps most importantly, we need empathy for each other. Transformative insights can be fragile; they can start out embedded in ideas that have lots of problems. If we’re too quick to shut down a line of thought based on those problems, we risk foreclosing on avenues to something better. We’ve got to have the patience to sit with ideas that are foreign and uncomfortable, and gain some new perspective from them.

Clearly Aaron Turon is getting at the tone of online discussions which can become heated, even for details of programming language feature design. But what I think of reading those lines is that such discussion are helped by one aspect of our work as language designers, which is to produce tests and criterions to evaluate language features (and, yes, shut down some proposals more quickly than others). I fully agree that, in a foreign land of creativity, one should not judge too quickly or too harshly. But I think that the development and wider adoption of these design tests would not only result in seeing the flaws in some proposals more quickly, it would also let people iterate more quickly on their own designs, applying the tests themselves to recognize the issues without having to go through public feedback (and potential harshness).

I see similarities with the notion that mechanized proof assistants made it much easier to teach how to write rigorous mathematical proofs to beginners, as they can get direct feedback (from the computer) on whether their proof attempts are correct, without having to ask for the explicit approval of another human (their teacher or their student colleagues), with the various inhibition effects that it created. Having this internal feedback is what the design principles, laws, and various meta-theorems that our community developed enable, and it is one of the side-benefits of formalism.

notes on a C-ish memory manager design

I am (re)designing a C runtime memory manager for hosting fibers, DSLs, and lightweight daemons. I started teaching my early-20s sons to program (in C++ and C), and the idea is to give them a library allowing progressive elaboration of tools written in Unix philosophy: i.e. simple and composeable processes. The idea is to use interfaces that allow embedding, so pipelines can run in threads at first, then later fibers within a thread, without code rewrites beyond automatic translation for CPS (continuation passing style) to run in fibers. The "app" would be a daemon hosting a user space operating system, with services like an http agent on a port for a browser-based UI. The idea is to give my sons an undergraduate CS education in terms of what happens here, and in stuff they write to execute inside.

My only aim in this post is to briefly describe its memory management, and the relation to programming language implementation, since a couple different (Smalltalk-ish) Lisp interpreters will run inside, as DSL examples, one simple and one fancy. The paragraph above was just context, since otherwise you might wonder "why in the world I want to do that?" about each detail. This is for your entertainment only, since I would find a helpful comment surprising. My current version of this sort of thing is much nicer than any I did before. I was trying to solve the problem of making everything refcounted, without something terribly awkward happening. (Embedded sub-processes can garbage collect as they like without refcounting, though; my fancy interpreter would use stop-and-copy.)

Each runtime instance uses interfaces defined in the library only. The world outside is abstracted as an environment, so all calls to the outside world go through an env instance. This helps later automatic code rewriting, so blocking calls can be handled as async calls to a worker thread pool to avoid blocking a fiber. (Such calls park instead of blocking.)

The idea is to have multiple independent runtimes in one host operating system process, so there is no global memory manager unless glue code for the environment does that (whatever mechanism gets memory from native interfaces). There is at least one instance of environment E in the OS process, which has function pointers for every external interface supported. For memory management purposes, it must have a memalign() function pointer, to allocate large blocks of aligned memory. (This can be simulated with malloc() by allocating very big blocks, then chopping off misaligned start and end parts, using those for other purposes.)

Each runtime instance has at least one vat, acting as the heap, which allocates only aligned memory blocks. For example, each thread would have its own dedicated vat, which is single-threaded, so no races exist between threads. (All dataflow between threads is copied from one vat to another, mediated by thread-safe queues in the runtime.) Each vat allocates only large uniform blocks from the env, perhaps several at once for efficiency, where every block is 1MiB in size, or 2^20 bytes, and aligned to a 1MiB address. (The standard block size could be another power of two instead of 1MiB, but I will speak as if it could never change.) Since there will be fragmentation, this design only makes sense when allocating a lot of total memory, on the order of gigabytes. So you would not want this runtime on a system with limited resources.

Each vat describes its address space as a book, a set of 1MiB pages, where each 2^20-aligned block of 2^20 bytes is called a bp for book page -- or just page for short when book is understood. The main purpose of each bp is usually sub-allocation of smaller aligned memory blocks to satisfy vat allocation requests, but a garbage-collected vat would use its book any way it liked (an address space with page granularity). A vat used for C style allocation would return blocks that are powers-of-two in size, each aligned to the same power-of-two bytes, as small as 16 bytes and as large as, say, 256K. You can cheaply test any address for membership in a vat by seeing if the aligned bp address is in the book hashmap of bp addresses. Each allocated aligned block is called a rod, short for refcounted, because each one is reference counted by metainfo at the start of each book page.

Each bp begins with a standard bp header, describing page format and detail common to the page, followed by as many rh rod headers as needed to describe each rod sub-block in the rest of the page. (A garbage collected vat would have only the bp header for each bp.) Each rod head is 16 bytes in size and 16-byte aligned, with a 32-bit refcount a 32-bit checksum, and several descriptive fields, including a 16-bit generation number and a pair of 16-bit ids characterizing format and optional allocation context (naming a backtrace for memory usage analysis). Starting from the address of a rod -- some block inside a bp -- getting its refcount touches two cache lines in the general case, one for the bp head and one for the rh. Except for very small rods, this is only one cache line worse than co-locating the refcount inside the block itself. And that would only happen when the refcount was actually needed. (In my model, refcount is NOT linear in aliases; it is only linear in official handles which other aliases can borrow when their scope is clearly inside the handle scope.)

The purpose of a checksum in the rh for each rod is dynamic support for immutable values. When a rod is frozen to immutability, so it cannot be changed again before deallocation, the checksum must still be unchanged when the last reference goes away. You might only audit this in debug mode, but you an ask a rod if it is immutable, and enforce absence of mutation by accident.

Now here's the interesting part: each bp header has a pointer to the parent vat, and to the env. So the cost of those pointers is amortized across all the rods in the same page. In effect, every object allocated from a vat is a subclass of rod, and every rod has a pointer to the allocating vat and to the runtime env interface at zero space cost. It becomes unnecessary to pass vat and env as separate function arguments, since they are always available from any object in a vat. You can also add a handle reference to any sub-object inside a rod, because you can find the associated rh for any address in the page that is part of the rod space. So you can hold references to sub-fields which keep the parent object alive, without any problem, as long as you avoid accidental fragmentation cost (keeping something larger alive only to use a small sub-field).

What about details related to programming languages?

My overall objective is to write tools converting content into different forms, including source code rewrite transformations. (I tell my sons that most of programming is turning inputs into different outputs.) But that is little affected much by memory management per se. Except the i/o model of embedded languages can stream refcounted immutable rods between agents in a process pipeline. That helps get operating system style features into programming language semantics, where a language can also have a process model.

I expect to write a garbage collected Lisp using a vat as the heap, which allocates new bp instances for stop-and-copy before releasing old instances. The uniform use of standard book pages makes re-use efficient without fragmentation across multiple vats.

For bootstrapping, I probably want a simpler and dumber Lisp first, just to perform operations on S-expressions used to express whatever high level policies are used by the system in configuration, or in manifests describing large scale transformations. This can be done with a primitive AST interpreter over a refcounted tree of nodes allocated by C-oriented code. Graph cycles would be pathological unless manually handled or automatically detected. But it's not worse than other C manual memory schemes.

One field in each rh with metainfo describing each rod is the ID of a schema descriptor: the plan of any struct inside the rod. Among other things, it would exactly describe the location of embedded handles in the rod, so reference releasing can be automated. The idea is to be able to release all resources when a lightweight process is killed, without requiring as much manual intervention by a user. A secondary purpose is to make releasing large graphs incremental and uniform, managed by a background rod releaser, which incrementally walks a graph using the rod metainfo as a guide.

A second purpose in putting a plan ID in the rh for each rod is generic debug printing support: I want to be able to print everything. But manually writing a print method for everything is a pain. I would rather write a standard metainfo description of everything, especially since this can be emitted from a source code analysis tool. Then a single generic printer util can process this standard format guide when walking graphs. (Note: to stop huge graphs from being printed, a field in a plan can be marked as best cited briefly, rather than printed in full, and this also helps prevent following cycles due to back pointers.)

Note I am not describing handles, which are like smart pointers, as they little affect memory management and PL topics. But each rod refcount comes from an alias in a handle, which associated metainfo with the rod pointer, including a copy of the plan ID and generation number, which must match when the ref is released. (That detects failures to keep refcounted objects alive as expected.) Refcounting is not manual. It occurs as a side effect of aliasing via handles. When auditing is enabled, each handle can hold the ID of a backtrace that added a reference, and this finds refcount leaks. On a 64-bit system, the size of a handle is twice the size of a pointer: the metainfo is pointer-sized in total. This requires a library provide standard collection classes that work with handles too, when ownership of objects is expressed as collection membership. Each handle includes a bitflag field which, among other things, allows a reference to be denoted readonly, even if the rod is mutable, for copy-on-write data structures like btrees that share structure. (This is useful for transient collections expressing "the original collection plus any patches that might occur" without altering the original collection.)

Controlling Reductions

Haskell has an option for user defined reductions. I have no idea how it works.

Felix also has two ways to do this.

The first way is complete but hard to use: the parser accepts user defined patterns in the form of grammar productions, and then the user writes Scheme code to manipulate the parse tree to produce any combination of a fixed set of AST terms they like.

The second way is simpler but typed, and looks roughly like this:

reduce strings 
  | (a:string,b:string) : a + b=> scat ([b,a]) ;
reduce scat2 
  | (x:list[string], y:string) : scat ([y, scat x]) => scat (Snoc (x,y)) ;

Its polymorphic though that's not seen in this example. The vertical bar indicates you can provide several alternatives which are tried in sequence. The algorithm applies each reduction, and each is applied top down. The above example doesn't work! The intent was to replace string concatenation using a left associative binary operator with a faster function scat, operating on a list. Clearly I needed the algo to work bottom up.

The contrast between these two methods is that the first method has a complete general purpose programming language available to perform the reductions, the pattern matching uses a fixed algorithm (the GLR+ parser). The second method recognises terms and performs the reductions using a fixed algorithm so is much weaker.

What I want is something less complex than having to hand write the whole reduction machinery in a general purpose language (I could do that, using dynamically compiled and loaded Ocaml modules but it's too hard for end users).

So I'm looking for a *compromise* which is reasonably simple, but also reasonably capable. I added alternatives for this reason and am thinking to add an adjective which specifies if a reduction should be applied top down or bottom up. A more complex option is to allow for nested reductions (so that if one reduction succeeds it triggers another one).

Any ideas?

CFL parsing, and another way to look at the CNF...

I'll implement it and try to break it when I get a chance, but it seems the below should exhibit O(|G| |U|2 n2 log n) worst case behavior (in time), where |G| is the total size of the grammar, |U| is the number of distinct nonterminals in the subset of unary productions U -> w in G, and n the input's length.

Or, what am I missing?

We consider only unambiguous CFGs in Chomsky Normal Form (CNF).

We devise a sort of "type system" over rewritings of the input as we process it from right to left, from end to beginning.

We'll look at the algorithm's work environment as being made of a left context, say Lc, a current symbol, say w (talking about a terminal) and a right context, say Rc.

The left context (Lc) is always a (length-decreasing*) prefix string w0...wi...w of terminals of the actual input (equal to the entire input s = w0...wn-1 initially, and empty at the end, when no more production type rewritings can be applied).

(* albeit, non-strictly)

The right context (Rc) is a mutating list of computed S-expressions storing what has been consumed and "typed" by reading the input from right to left, from end to beginning.

In this scheme, it is constructed so as to be, also, the recorded value of the rightmost-derivative of the parse tree, at the "point" Lc \ w, before Lc gets empty and reduction to the grammar start symbol (if possible) happens, eventually finalizing Rc as the single S-expr (S w0...wn-1) (or failing to do so).

Initially, Rc is empty, and Lc equals the entire input.

In the CNF grammar, we treat the two distinct types of productions thusly:

Productions of the form U -> w (aka rule R1)

Rule R1: U -> w  will be said to "type" w between Lc and Rc as... w: U ("the utterance of w is of type U between any Lc and Rc")

that is, whenever Lc ends with w


Rewriting via R1

Lc Rc ~> (Lc \ w) (U w) Rc
 new Lc> \------/ \------/ <new Rc

(takes a list of S-exprs as Rc and prepends the new S-expr (U w) to it, making a new Rc longer by one S-expr, while Lc gets shorter by one terminal, w)

Productions of the form U -> P R (aka rule R2)

Rule R2: U -> P R  will be said to "type" P followed by R in Rc as... P: R -> U ("the utterance of P applied to Rc typed as R promotes Rc to U")

that is, whenever head(Rc) = (P x) and head(tail(Rc)) = (R y)


Rewriting via R2

Lc Rc ~> Lc (U head(Rc) head(tail(Rc))) tail(tail(Rc))
            \----------------------------------------/ <new Rc

(takes a list of S-exprs as Rc and turns it into a new Rc shorter by one S-expr, Lc left unchanged)

Notice the one to one correspondence between mutually ambiguous grammar productions and mutually unsound production types, btw:

E.g., having, in the grammar, the mutually ambiguous

A -> BC
D -> BC

would yield, in this "type system", attempts at applying the mutually unsound (if only signature-wise)

B: C -> A
B: C -> D

Which would obviously be intrinsically problematic from either / both perspectives, and no matter what the input may be, as soon as said input will contain utterances of (a substring already reduced to) B, followed by (a substring already reduced to) C.


Now for a concrete example (full parse / derivation); consider the grammar in CNF (informal meaning given in parentheses):

(nonterminal-only RHS)

S  -> NP  VP  ("a noun phrase followed by a verb phrase reduces as a sentence")
VP -> Vt  NP  ("a transitive verb followed by a noun phrase reduces as a verb phrase")
NP -> Det N   ("a determinative followed by a noun reduces as a noun phrase")
N  -> Adj N   ("an adjective followed by a noun reduces as a noun")


(terminal-only RHS)

Det-> a       ("'a' reduces as a determinative")
Det-> the     ("'the' reduces as a determinative")
Adj-> young   ("'young' reduces as an adjective")
N  -> boy     ("'boy' reduces as a noun")
N  -> dragon  ("'dragon' reduces as a noun")
Vt -> saw     ("'saw' reduces as a transitive verb")

Which we'll use to "type" thru rewritings of S-exprs in Rc, as:

NP : VP -> S  ("the utterance of 'NP' applied to Rc typed as 'VP' promotes Rc to 'S'")
Vt : NP -> VP ("the utterance of 'Vt' applied to Rc typed as 'NP' promotes Rc to 'VP'")
Det: N  -> NP ("the utterance of 'Det' applied to Rc typed as 'N' promotes Rc to 'NP'")
Adj: N  -> N  ("the utterance of 'Adj' applied to Rc typed as 'N' promotes Rc to 'N'")


a     : Det   ("the utterance of 'a' is of type 'Det' between any Lc and Rc")  
the   : Det   ("the utterance of 'the' is of type 'Det' between any Lc and Rc")
young : Adj   ("the utterance of 'young' is of type 'Adj' between any Lc and Rc")
boy   : N     ("the utterance of 'boy' is of type 'N' between any Lc and Rc")
dragon: N     ("the utterance of 'dragon' is of type 'N' between any Lc and Rc")
saw   : Vt    ("the utterance of 'saw' is of type 'Vt' between any Lc and Rc")

After iteration 0 (init)

   (the          young          boy         saw         a         dragon)
Lc>\--------------------------------------------------------------------/ ( ) <initial Rc

(NB: Rc = empty list)

After iteration 1, via R1

   (the          young          boy         saw         a)        dragon
Lc>\-----------------------------------------------------/       (N dragon)1
                                                                 \--------/ <new Rc
(NB: Rc = list of one root S-expression)

After iteration 2, via R1

   (the          young          boy         saw)        a          dragon
Lc>\-------------------------------------------/       (Det a)1   (N dragon)2
                                                       \-------------------/ <new Rc

(NB: Rc = list of two root S-expressions)

After iteration 3, via R2

   (the          young          boy         saw)        a         dragon
Lc>\-------------------------------------------/   (NP (Det a)   (N dragon))1
                                                   \-----------------------/ <new Rc
(NB: Rc = list of one root S-expression)

After iteration 4, via R1

   (the          young          boy)        saw          a         dragon
Lc>\-------------------------------/       (Vt saw)1(NP (Det a)   (N dragon))2
                                           \--------------------------------/ <new Rc

(NB: Rc = list of two root S-expressions)

After iteration 5, via R2

   (the          young          boy)        saw         a         dragon
    the          young          boy    (VP (Vt saw)(NP (Det a)   (N dragon)))1
                                       \------------------------------------/ <new Rc

(NB: Rc = list of one root S-expression)

After iteration 6, via R1

   (the          young)         boy          saw         a         dragon
    the          young       (N boy)1   (VP (Vt saw)(NP (Det a)   (N dragon)))2
                             \-----------------------------------------------/ <new Rc

(NB: Rc = list of two root S-expressions)

After iteration 7, via R1

   (the)         young           boy          saw         a         dragon
    the         (Adj young)1  (N boy)2   (VP (Vt saw)(NP (Det a)   (N dragon)))3
                \-------------------------------------------------------------/ <new Rc

(NB: Rc = list of three root S-expressions)

After iteration 8, via R2

   (the)         young          boy          saw         a         dragon
    the      (N (Adj young)  (N boy))1  (VP (Vt saw)(NP (Det a)   (N dragon)))2
             \---------------------------------------------------------------/ <new Rc

(NB: Rc = list of two root S-expressions)

After iteration 9, via R1 (final Lc is empty)

    the           young          boy          saw         a         dragon
   (Det the)1 (N (Adj young)  (N boy))2  (VP (Vt saw)(NP (Det a)   (N dragon)))3
   \--------------------------------------------------------------------------/ <new Rc

(NB: Rc = list of three root S-expressions)

After iteration 10, via R2

    the          young          boy          saw         a         dragon
(NP(Det the) (N (Adj young)  (N boy)))1 (VP (Vt saw)(NP (Det a)   (N dragon)))2
\----------------------------------------------------------------------------/ <new Rc

(NB: Rc = list of two root S-expressions)

After iteration 11, via R2

       the          young          boy         saw         a         dragon
(S (NP(Det the) (N (Adj young)  (N boy))) (VP (Vt saw)(NP (Det a)   (N dragon))))1
\-------------------------------------------------------------------------------/ <new Rc

(NB: Rc = list of one root S-expression)

Success, with the expected parse binary tree:

       the          young          boy         saw         a         dragon
       Det          Adj            N           Vt          Det       N
                            N                                   NP
               NP                                     VP

Disclaimer edit:

this is just genuine amateur research from a practitioner who's never been in academia (that would be me). I've obviously posted it here on a friendly LtU only because I'm (still) wondering if I'm on to something / or what am I missing. Bear with me (and thank you in advance).

XML feed