LtU Forum

The question of the possibility of a simple formal foundation to the natural languages.

I think that it can safely be said that the natural languages can transport any formal structure; that we can communicate any mathematical structure using the natural language. That's the essence of metamathematics.

But then we are led to believe that the natural language has no proper formal structure. It is informal. In the sense that it is not possible to feed the Don Quijote to an algorithm that will be able to pinpoint any formal inconsistency implicit in the text (without recurring to any other text or knowledge). An example imaginary inconsistency would be if at one point Cervantes says that Don Quijote always likes lo love Dulcinea, and at another he says that he does not like to love her. The kind of inconsistencies that would destroy any formal structure if inadvertently injected in transit.

These 2 previous paragraphs seem a bit paradoxical to me, since, if there is no procedure to decide whether a natural text is inconsistent (or if there are only heuristic fallible procedures) metamathematics should be impossible, and mathematics could not have been born.

If we take a formal structure expressed in some mathematical formalism, and express it with the natural language, metamathematically, we need to be sure that there are no inconsistencies in any of the cases. In the former case, we can check, and there are algrithms that can check. In the later case, we can check. Can no algorithm generally check?

I know that there is a history to this, from Llull and Leibniz, reaching a summit with Frege, to be toppled by Russell, and followed by the neopositivists etc., and later experiments like the semantic web.

So my question here is: Is there some kind of proof or argument showing that the natural languages cannot be provided with a simple mathematical foundation? Some recognizable fundamental property of the natural language that is inherently inconsistent? (Are both questions the same?)

What I mean with a "simple" mathematical foundation is that it must consist on a core formal theory (that can then be suplemented with a number of ad-hoc rules that provide for shortcuts and phrases in the natural languages). A priori, it should be totally disprovided of semantics. Then we should be able to interpret extensions built on that foundation in natural language texts, taken as abstract structures of words; and then, by proxy, we might provide those extensions with the semantics of the natural language texts in which they are interpreted. (Perhaps we need some a priori semantics, for things like time [edited to add: or I? That would seem to lead us towards Gödel's pit, damm. However, I think that we can settle in principle for a descriptive language without person], that are deeply involved in the basic structure of the natural language?)

So, simple in the sense that Frege's proposal was simple, or that the semantic web is simple (my previous paragraph is meant to be interpreted in their intended use), but a huge neural network with crazy amounts of delicately balanced branches trained by all books ever published is not simple. Notwithstanding the problem that with the neural network solution we are dealing with an ungodly mixture of syntax and semantics.

I really don't know whether I am using the right terminology to phrase my question - or even whether there exists a terminological framework where it can be meaningful and exact. So apologies if I have taken a couple of poetic licences in trying to lay it out.

Question: do you have to climb the tower of interpreters?

You know what is interesting about meta-circular interpreters? The infinite recursion of eval-apply is always broken in the end by the application of some primitives. So in the end, no matter what your computational model is, you are still executing a stream of primitives in your "ground" language. And, usually, "lambda" is all you'll ever need as a primitive. But looking at so many toy interpreters you'd see that "lambda" is not quite so "primitive" and requires a lot of machinery to work.
So instead of providing some means of abstraction directly as a ground mechanism to build everything else around it, I wanted to do a simple thought exercise and to go from another direction: if we have a fixed set of primitives, without means of abstraction, what are the ways to compose/combine them?

Imagine we have a really old CPU without a support for a call instruction, but it has a conditional jump, ALU instructions, loads/stores. It's a simple exercise to define how would a call instruction would look like (in pseudo-assembler):

define CALL(target_addr):
load stack_pointer, tmp
store tmp, the-address-of-the-next-instruction-after-call //don't care now how we can get this
dec tmp
store stack_pointer, tmp
//skipped the frame pointer manipulation
branch target_addr

So, in general, our CPU IS capable of doing necessary steps to create an abstraction mechanism, but there is no way to "plug it in" by the CPU itself unless our preprocessor rewrites all "CALL" tokens into these primitives. Not good.
The only other way I could think of is to interpret another language, which has this CALL as primitive. I.e. the CPU itself becomes such a "preprocessor". But does that mean that the only way to escape deficiencies of your current computational model is to climb the tower of interpreters? (aka interpret another language that has that capability. My favorite examples: a Scheme interpreter with first-class macros in scheme by Matt Might, Kernel interpreter in Scheme by J.Shutt, 3-LISP by Brian Smith)

Are there any other ways, besides interpreting another language? Maybe there is a way to build such a reflexive virtual machine that can modify and extend itself? I haven't seen such yet.

I've recently stumbled upon Ian Piumarta's "Open,extensible composition models" VPRI Technical Report TR-2011-002 which describes a meta-circular evaluator, where instead of your usual big cond-statement in eval you select your evaluator based on the type of expression you are evaluating (same for applicators) and this mapping is available for the programmer. I've found this a very interesting read, maybe someone could comment on this paper, as I've surely missed a lot.

I have surely asked a lot of questions in a very unstructed form, but if anyone would maybe give some useful pointers (something to read on the topic, maybe) it would be greatly appreciated.

isomorƒ: an experimental structured editor for witing/deploying functional code

isomorƒ is attempting to bridge the divide between functional programming, serverless architecture, and cloud-based structured-editing with the grand hope of creating a simplified, beginner-friendly, but powerful IDE experience with easy microservice deployment.

The platform runs on a compact, pure, statically-typed functional AST with all the power in the IDE including a syntactic sugar layer, passively identified reuse ideas, exposte optimization, automatic versioning, and immediate cross-user sharing (all enabled by the guarantees of functional purity / referential transparency).

We currently have a prototype sandbox of the IDE available and a high-level vision at our blog.

We would love any feedback on the sandbox, the idea/implementation, academic/educational/commercial applications or anything else!


hobbes, Morgan Stanley OSS

Over the last few years, I have been developing hobbes -- a programming language, JIT compiler, and database system -- as part of my work for Morgan Stanley. It has become a critical piece of infrastructure in our low-latency, high-volume trading applications, and we have decided to release the source code to the public on github (currently can be built for recent Linux and macOS platforms):

The database system is a lightweight (self-contained, header-only) library used by applications to efficiently log structured (binary) data over shared memory to minimize application latency and reflect application type structure as accurately as possible in log files. We use this to record a very detailed image of application state over time, which we analyze/query out of band.

The JIT compiler can be used embedded in another application (e.g. to "hot patch" an application with very efficient, precisely typed intercepts) or as a standalone interactive interpreter (e.g.: to monitor and query application log data).

The programming language is a variant of Haskell (HM type inference, algebraic data types, qualified types / type classes) with some adjustments to help reduce boilerplate and derive very efficient machine code. For example, we use "structural" record, variant, and (iso-)recursive types to represent data as it's naturally represented in applications and can be deconstructed generically at compile time (e.g. a record can be printed if its first field can be printed and its rest can be printed, a variant can be printed if its first case can be printed and its rest cases can be printed, a recursive type can be printed if its one-step unrolling can be printed, etc).

We are actively using this on major projects, we will continue to develop this github project as we need new features, and we are interested in engaging others outside of the firm for their thoughts, feedback, and hopefully pull requests. :)

/join the #proglangdesign channel on Freenode

The #proglangdesign channel on Freenode hosts many discussions about programming language design, compilers, interesting algorithms, and anything else useful for language implementation. The channel is pretty active; say 'hello' if you join.

Many of the members are working on compilers, interpreters, whether they are just experimenting to learn, or advancing the state of the art.

Come join us!

More information can be found at our community website, which includes a list of our projects; submit a pull request to add your own.

Just entertainment: Click, click, click!

Just a silly website I thought you people might enjoy. Click, click, click!

Is Datalog negation(¬) similar to the built-in predicate (≠)?

I was reading "Principles of Database & Knowledge-Base Systems, Vol. 1" by Jeffrey D. Ullman. There is a chapter about Datalog negation and as I was seeing the problems of negation I kept thinking that using the predicate ≠ would solve those problems.


bachelor(X) :- male(X) & ¬married(X,Y).

would become:

bachelor(X) :- male(X) & married(Y,Z) & X ≠ Y.

but then I see the following:

p(X) :- r(X) & ¬q(X).
q(X) :- r(X) & ¬p(X).

The problem is this has 2 minimal models and if I'm not mistaken so does this:

p(X) :- r(X) & q(Y) & X ≠ Y.
q(X) :- r(X) & p(Y) & X ≠ Y.

Is there an equivalence between these 2 operators? If so, did I miss it or is it not mentioned that it's unsafe to use ≠ with recursion?

Please submit to LIVE! 2017 (SPLASH Vancouver)

LIVE 2017 aims to bring together people who are interested in live programming. Live programming systems abandon the traditional edit-compile-run cycle in favor of fluid user experiences that encourage powerful new ways of “thinking to code” and enable programmers to see and understand their program executions. Programming today requires much mental effort with broken stuttering feedback loops: programmers carefully plan their abstractions, simulating program execution in their heads; the computer is merely a receptacle for the resulting code with a means of executing that code. Live programming aims to create a tighter more fluid feedback loop between the programmer and computer, allowing the computer to augment more of the programming process by, for example, allowing programmers to progressively mine abstractions from concrete examples and providing continuous feedback about how their code will execute. Meanwhile, under the radar of the PL community at large, a nascent community has formed around the related idea of “live coding”—live audiovisual performances which use computers and algorithms as instruments and include live audiences in their programming experiences.

We encourage short research papers, position papers, web essays, tool demonstrations (as videos), and performance proposals in areas such as:

Recent work in REPLs, language environments , code playgrounds, and interactive notebooks.

  • Live visual programming.
  • Programming by example.
  • Programming tools for creative experiences and interactive audio visual performances.
  • Live programming as a learning aid.
  • Fluid debugging experiences
  • Language design in support of the above.

Submissions are due on August 1st and will go through HotCRP @ The workshop is open to various kinds of media: you can write a traditional short paper (PDF), a web essay with embedded videos, a narrated video, or whatever else you think can explain your work best! Content should be consumable in 30 minutes to an hour of casual reader’s time, which means around 5-10 pages for a paper, a video from 10-20 minutes (assuming the viewer would need to pause to contemplate), and essays of around a few pages length. Video and non-paper submissions can by listed as URLs (e.g. to a web page, file locker, or streaming site) in the submission’s abstract.

Any questions or trouble with submitting, please contact This CFP is hosted @

C runtime type info gimmick which supports scripting

I am experimenting with runtime type information in C, so that values in memory have types, rather like Lisp and Smalltalk values have types, and not just variables. This post is intended to be fun. (The idea is for you to enjoy this, and I don't want to argue. But I might respond to remarks with short dialogs in lieu of argument, since that would be fun.) The rest of this is explanation and some detail.

In an earlier post I described a kind of heap I call a vat. It has the effect of associating metainfo with every block of memory allocated, in a different position in memory. (There is a constant time arithmetic calculation that turns a block pointer into a metainfo pointer, and vice versa; so the main penalty in access is a cache line miss if this happens more often when lacking adjacency to its block.) A continuous memory block is a rod, while the fixed size metainfo describing it is a knob. A counted reference to rod (or any point within) is a hand. On a 64-bit platform, a knob is 128 bits, a hand is also 128 bits, and rods vary in size.

Some bits of a hand are a copy of bits in the knob, so a reference is only valid when it agrees with the current self description of an object, particularly the current 16-bit generation number, so dangling refs are seen. Of the bits copied, another 20 bits are the id of the object's plan, which describes the type, layout, and any dynamic behavior if you also supply methods for that type in hashmap entries. You can think of plan as morally identical to class in Smalltalk, but initially I was aiming for perfect format description on a field by field basis, with recursion grounded in primitive native types. I wanted to write centralized code to debug print anything based on the plan. But then I sort of noticed I would be able to write Smalltalk style code that works on the C objects, if you were willing to call a dynamic api to send messages (method selectors with parameters).

If you point at a subfield somewhere inside the middle of a rod, you can eventually figure out exactly what it is by drilling down in the rod's plan, after fetching the plan for the id in the rod's metainfo knob. But since this lookup is offset-based, it aliases anything with the same offset. That is, a struct named foobar that contains a foo and then a bar will have the same offset for foobar and foo. If you send a message, did you mean to foobar or to foo? I would start a search in method tables for the largest containing object, then work through nesting until I found a match.

At the moment I am writing metainfo by hand, but generating it from C declarations later is the idea. On 64-bit platforms a plan is 256 bits, and each field requires 128 bits, plus however long the string names get, and method tables if there are any. When a field is composed of bit fields, I use a plan that says all its fields are bitfields, and then each field describes bit sizes and shifts instead of byte sizes and offsets.

Even if most of an app is comprised of static code, I will want to write dynamic and interactive scripts to explore runtime state in tests, and to inspect things at runtime in ways I did not anticipate earlier. I would normally think of glomming on a scripting language to do this, on the side. But I can use C as the scripting language instead, if memory state is universally annotated this way with type metainfo.

What would be involved in moving logic beyond FOL?

Carl Hewitt has opined that first-order logic (FOL) should not be regarded as logic of choice for various people such as computer scientists (e.g., he says "First-order theories are entirely inadequate for Computer Science" in a reply to me in Mathematics self-proves its own Consistency (contra Gödel et. al.). He recommends instead his Inconsistency-Robust Logic for mathematicians, computer scientists and others.

What would it take to really change our perspective to be grounded in a logic other than FOL? I think that looking at John Corcoran's First days of a logic course provides the beginnings of an answer (the article treats Aristotelian logic rather than FOL): an alternative logic should provide a natural basis grounding all of the concepts that we expect a new student of logic to grasp, and a rival to the kind of traditional logic course based on first-order logic should be comparably good or better for building on.

What would an introductory logic course look like for, e.g., Inconsistency-Robust Logic? Would it start with Aristotelian logic or would it start with something else? What would having taken such a course be good for, as a prerequisite?

XML feed