Inform 7: A relational DSL for interactive fiction with natural language syntax

Inform 7 is a radical revision of Graham Nelson's Inform language for interactive fiction (such as Zork). Whereas Inform 6 and its predecessors were (IMO) very low-level languages with a C-like syntax, Inform 7 is a relational language based on natural language syntax and a semantics based on predicate logic. Nelson describes Inform 7 in his usual erudite style, in:

Graham Nelson. Natural Language, Semantic Analysis and Interactive Fiction. 2005.

The Inform 7 implementation comes with a slick graphical interface (currently available for Mac OS X and Windows), and adopts the metaphor of a book, as indeed do aspects of the language itself.

Well worth taking a look.

(Credit to Peter J. Wasilko, from whose forum post on Human Factors I cherry-picked this link.)

Also discussed here

Ah, my error.

It deserves a front-page mention, though, I think.

If LtU readers have known of I7 for a while, I suppose by now some of them have experimented with it and formed some opinions based on experience. What are they?

Experiences with I7

I experimented with a couple of mini-games in I7. I haven't written a large-scale game (Emily Short's musings may be more informative) but I made a few observations.

The good: The IDE is fantastic -- using it just gives you a fuzzy feeling. Few PL environments can claim to be "inspirational", but I think this qualifies. It is designed for learning -- integrating the docs and a dynamic index of valid terms.

Certain things are easier to do than others. Dynamic descriptions mostly pass the test for least-surprise, and the bracket syntax is very readable: "Mr Wickham looks speculatively at [list of women in the location]." Of course, complex descriptions will need to be decomposed into loops and conditional structures.

World-modeling is very natural, and it's nice how you can defer specifics until they're needed. For instance, if I say someone "wears a white Oxford button-down shirt", the language will create a wearable thing called "white Oxford button-down shirt" and place it on the character. If I have defined "shirt" as a kind of clothing, the created object will be a subclass of shirt instead of the generic thing. If I have also defined "white" as a color, the language will assign that attribute to the shirt instead of making it part of the object name. And so on.
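To make the deferral concrete, here is a toy sketch of that resolution order. It is not Inform 7's actual algorithm: the function `resolve_phrase` and its `kinds` and `colors` parameters are invented for illustration, and it only handles the single case described above.

```python
# Toy model (not Inform 7's real algorithm): resolve a noun phrase against
# whatever kinds and colours the author has declared so far.

def resolve_phrase(phrase, kinds=(), colors=()):
    """Return (name, kind, properties) for a newly implied object."""
    words = phrase.split()
    kind = "thing"            # generic fallback when no kind is known
    properties = {}
    # The last word names the kind, if the author has declared it.
    if words and words[-1] in kinds:
        kind = words[-1]
    # A declared colour becomes a property instead of part of the name.
    name_words = []
    for word in words:
        if word in colors and "color" not in properties:
            properties["color"] = word
        else:
            name_words.append(word)
    return " ".join(name_words), kind, properties


phrase = "white Oxford button-down shirt"
print(resolve_phrase(phrase))
print(resolve_phrase(phrase, kinds={"shirt"}))
print(resolve_phrase(phrase, kinds={"shirt"}, colors={"white"}))
```

Each extra declaration changes how the same phrase is read, which is the "defer specifics until they're needed" behaviour in miniature.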

What's also interesting about this is that identifiers of objects are used both in the source code and in the final game. If I say "some pants are here" vs "the pants are here" the game will remember the indefinite article and use it when describing the object. This is a unique requirement for I.F.

The ugly: Guess-the-syntax is unfortunately prevalent. There are just so many more valid English sentences than parsable I7 statements that you have to follow the recipes in the manual or ask someone why your statement isn't working.

There's also a lot of machinery behind the scenes that you have to learn -- actions, rulebooks, instead of/before/after rules, activities, and so on. These work wonderfully in the examples, but it's not always clear what construct to use to accomplish a given task. For example, to define a new verb, you usually have to declare an action, declare an alias for the action (which defines the syntax the player uses), and then declare rules for that action. It's not always clear what goes in the definition of an action vs. the rules applied to an action. So again, back to the cookbook.

Conclusion: I7 is a real no-kidding DSL, with no misconceptions about being anything else. It gives you the kitchen sink and more -- after reading the manual, I can't think of much else I'd want for writing games in the style of Infocom. The challenge is that the language is very big, and the English syntax can be a red herring at times. I wonder if this will improve as the language matures, or with experience.

Guess the syntax

Guess-the-syntax is unfortunately prevalent.

I hit that problem immediately. Does subset-of-natural-language in fact make you worse off than when using a language that bears little relation to a natural language? With the subset language you suffer from a lot of 'interference' from the full language as it leads you to make all kinds of incorrect guesses. I think the obvious prediction is that the subset-of-natural-language approach will make things easier for complete programming beginners, but after they've become more than beginners they'll find the syntax clumsy and want to use the 'real' Inform language. But I do think Inform 7 is a great experiment and I look forward to seeing what happens with it over the next few years.

Just shortly after I7 was

Just shortly after I7 was released, there came up a nice example of "guess the syntax" on usenet. In short, somebody had written

If the spacepod is not in the Landing Site, instead of going out from
the spacepod, say "There's no way I could leave the pod before we had
landed, unless I wanted to experience rapid decompression or
something."

which should have been

Instead of exiting when the player is in the spacepod and the spacepod
is not in the Landing Site, say "There's no way I could leave the pod
before we had landed, unless I wanted to experience rapid decompression
or something."

This could be rather confusing. To someone not familiar with Inform 7 programming, both sentences would mean the same thing. It would be interesting to know how non-programmers get along with Inform 7 (I think the beta testers all had some programming experience), given that the ultimate goal of IF programming should be to make story creation easier for non-programmers (writers usually aren't programmers, and programming geeks rarely write the best stories).

Other than that, I think it's really cool for a first step in that direction.

English

There are far more sinister examples. For example, "asking someone about the person in the presence of the person" is meant to fire when you ask about someone who's in the same room. Unfortunately, I7 interprets the second "the person" as "any person" and always fires. I think the lesson here is that every word counts in an utterance -- ignoring even a "the" can get you into trouble!
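The gap between the two readings can be sketched as a pair of predicates. This is only a toy model of the behaviour reported above, not of I7's matcher; the function names and the character "Elizabeth" are invented for illustration.

```python
# Toy model: two readings of
# "asking someone about the person in the presence of the person".

def fires_intended(asked_about, people_present):
    """Intended reading: the second "the person" is the one asked about."""
    return asked_about in people_present

def fires_as_parsed(asked_about, people_present):
    """Reported reading: the second "the person" matches any person present."""
    return len(people_present) > 0

# Hypothetical scene: only the person being questioned is in the room.
room = {"Elizabeth"}
print(fires_intended("Wickham", room))
print(fires_as_parsed("Wickham", room))
```

Because the person being questioned is always present, the second predicate is true whenever the conversation can happen at all, which matches the "always fires" symptom.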

I am excited to see the next steps too -- the first extension libraries are coming out, and it'll be interesting to see how powerful the language will get. Being lazy, I'd like a way to make goal-oriented NPCs without multiple pages of code.

Definite vs indefinite articles

For example, "asking someone about the person in the presence of the person" is meant to fire when you ask about someone who's in the same room. Unfortunately, I7 interprets the second "the person" as "any person" and always fires.

I'd consider that a bug in the language, myself. In English, the phrase "the person" always refers to a specific, known person. If you want to refer to any person, you would use "a person".

A month with I7

I also found (edit: new link) Andrew Plotkin's thread on rec.arts.int-fiction, which contains an apples-to-apples comparison with I6, and also the quote "I don't know the syntax yet."

A few comments

Neither the paper nor the manual seems to contain a complete account of the language or its syntax, and I7 is obviously only a sublanguage of English; so I suspect users will run into the problem where the obvious English does not parse, or does the wrong thing. Shades of AppleScript.

But who knows? IF programmers are all avid IF players, and IF players are not (usually) given a grammar and semantics for the story input. Of course, you need more expressiveness and control over a language which is used to write IF than you do in one used to play it. Will "guess the verb" be replaced with "guess the syntax"? Actually it occurs to me that not revealing the language definition might enthuse rather than discourage users — not only because a definition might be daunting, but also because trading techniques can turn into a kind of pastime. (Consider sites such as macosxhints.)

The paper is naive in some places, but on the whole engaging. Nelson spent a year reading up on NL topics, and it shows.

The paper seems to lack any account of dynamics. There is some discussion of how the IF world is constructed by finding a least model for the theory presented by program text, but no mention of how state is handled and how the world evolves.

The discussion of kinds was interesting. They are apparently something like types with single inheritance. He separates kinds into two disjoint sorts: objects (concrete) and values (abstract), and argues (as I recall — cannot find the bit now) that object hierarchies should be "open", while value hierarchies should be in a sense "closed" in order to support "case-based reasoning". This, together with the emphasis on the containment relation, reminded me a bit of algebra...

I am interested to hear what Ken and Marc think.

Some thoughts

I am interested to hear what Ken and Marc think.

I half looked at this when it first came around LtU, but read it more thoroughly this time around.

I quite enjoyed reading it and walked away with a few points.

On the plus side, I think this is an interesting case study in DSL design and domain-specific modelling, and he presents a thoughtful exploration of the issues and challenges bedevilling such undertakings. (People interested in the philosophies of mind and language will find some tidbits, too.)

On the down side, he seems to be equating the fairly focused domain of IF with the bigger notion of NLP (a fairly forgivable sin, since, as he observes, the state of knowledge about NL is pretty sad, in spite of all the effort that has gone into its study.)

I think his choice of "natural English" syntax is partly motivated by this over-generalization. It serves as an object lesson that just because the syntactic elements are familiar, confusion can arise when those elements resolve to quite a different semantic model.

A lot of the frustration that comes from playing with IF can be attributed to this mismatch: because you are using "natural language", it seems as though you should be able to do a bunch of things that you can't, and "guess the verb" is partly a hunt for the right syntactic element, and partly a hunt for the right semantic model of the IF world.

This also accounts for the shortcomings of the "Principle of Least Astonishment" in syntax design: it works great for reading code, but is lousy for writing it.

If I read a piece of code that calls a function called findMyDog(), I have a pretty good intuition that it will tell me the location of my dog. But because the set of semantic notions in my mind, of which this example borrows only a few, is so much richer than this, I'm slightly irritated when my memory fails me and I type instead locateRover() and get no joy.

If we are going to use English as the basis of a syntax (which most PLs do to some degree) I think it is actually safer to acknowledge that we are just borrowing it to construct a different, more formal language, and we will end up with less misunderstanding.

Impressions

I've still only played around with the implementation a little bit now, but I took a longer look at (the source code of) two of the sample games written in I7: Emily Short's Damnatio Memoriae and Nelson's own The Reliques of Tolti-Aph.

At first when I looked at the hypertextized version of DM I was reminded of Euclid's Elements — which, being the mathematics nerd that I am, I found quite exciting, even though I obviously knew I7 would be far more limited even than any theorem prover — but after a while I was able to see, mostly because of repetition, some patterns, and it all started to look like Prolog. Once that happened, I started to get annoyed at the verbosity, and the fact that I had to read the I7 text much more carefully (and linearly) than I would have to read Prolog, which, if you step back and squint, you can still make out the broad structure of.

I was reminded more than a bit of Dennis Merritt's Amzi! Interactive Fiction Toolkit for Prolog, but I get the impression that I7 does not really perform resolution and — though it may be only that Short has not yet mastered the language — it seems that I7 is lacking in some very basic abstraction facilities. For example, in

After reverse linking an inscribed thing to an inscribed thing: 
    if the noun mentions someone and the second noun is damning 
    begin; 
        now the second noun does not incriminate all the people
           incriminated by the second noun; 
        now the second noun incriminates all the people who are
           mentioned by the noun; 
    end if;  ...

(Note: "reverse linking" is a form of magic in the game world.) Short repeatedly refers to "the noun" and "the second noun". This refers to two items in an input by position, whereas I would rather name them and forget their position. (I would also prefer shorter names.) Perhaps you can do this with the parentheses construction I have seen "x (called y)", but this sort of repetition seems quite prevalent, even in Nelson's code.

Actually, reading this I started wondering if an improvement of the whole code-as-fiction idea might not be code-as-LaTeX. :) Instead of writing the above, you might write:

After reverse linking an inscribed thing $x$ to an inscribed thing $y$: ...

In general, in fact, I would prefer mathematical vulgate for difficult constructions, since it at least has the virtue of being clear, concise and reasonably precise. Taylor's Practical Mathematics has a few sections that show how the vulgate expresses proofs in predicate logic (things like "put x = 4; let s be a set; suppose x is a member of s; ..."). (Of course, I7 doesn't try to express proofs, but rather propositions.)

There also seems to be (on the surface) confusion of syntactic elements with their denotations. For example, the second noun is damning: it is not the noun which is damning but rather its denotation, which is some diegetic object that would incriminate the player character. Better might be: the second noun denotes something damning. This is not just a matter of naming, since code elsewhere clearly suggests that "damning" holds of the noun's denotation, not the noun: Definition: a thing is damning if it incriminates someone.

Actually, this adumbrates a much larger issue, namely that it is best to separate concerns of syntax from concerns of semantics. Ideally, we would like to describe the game world itself entirely in terms of denotations, in the same way that we separate the parsing phase of a compiler from its semantic analysis and code emitter phases. This does not even really entail giving up natural language syntax; after all, you can use LaTeX to talk about semantics.

Another thing I observed is that in practice there really are quite a lot of imperative constructions. I'm not sure how much this has to do with the language itself versus how much it is an inevitable consequence of the fact that the language has to talk about diegetic actions.

I found it a bit maddening that the language, lacking a definition, seemed so obscure to me. Usually I can look at a definition or summary of a nonacademic language and fairly quickly come to a conclusion about the expressiveness of the language. Here, I am forced to make suppositions and guesses, and it is hard to form a critique one way or the other as I can't get a bird's-eye view; what claims I do make are hard to support because I can't be concrete. In a way, the language's obscurity makes it resistant to critique (cf. "security through obscurity"). While I don't suggest this is intentional, I do think it can be an obstacle to improvement by constructive criticism. (And you cannot argue that I find the language difficult only because it's innovative: although the syntax is unusual, the semantic model is familiar, and it should be set out clearly enough that an expert can give it the once-over.)

After reading the article, I

After reading the article, I am fairly confident that no unification is being done. In the section on quantifiers, Nelson mentions that these are implemented by looping over the relevant collections. The system reminds me quite a bit of a SQL interface to a database. You have a collection of relations. In addition to querying (SELECTing from) the database, you can issue commands to perform updates on it.
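A rough sketch of that reading, applied to the two "now ..." lines of the reverse-linking rule quoted earlier. Everything here is an assumption made for illustration: relations are modelled as sets of pairs, the helper names `related_to` and `relink` are invented, the sample people and objects are made up, and the rule's "if" guard is left out.

```python
# Hypothetical model: a relation is a set of (left, right) pairs.
incriminates = {("letter", "Julia"), ("letter", "Agrippa"), ("ring", "Julia")}
mentions = {("scroll", "Clemens"), ("scroll", "Julia")}

def related_to(relation, left):
    """Query: everything `left` is related to (a SELECT, in SQL terms)."""
    return {r for (l, r) in relation if l == left}

def relink(noun, second_noun):
    """Update: mirror the two 'now ...' lines of the quoted rule."""
    # "now the second noun does not incriminate all the people
    #  incriminated by the second noun": loop and delete.
    for person in related_to(incriminates, second_noun):
        incriminates.discard((second_noun, person))
    # "now the second noun incriminates all the people who are
    #  mentioned by the noun": loop and insert.
    for person in related_to(mentions, noun):
        incriminates.add((second_noun, person))

relink("scroll", "letter")
print(sorted(incriminates))
```

The quantifier "all the people ..." becomes a plain loop over a query result, with no unification or backtracking involved.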

One difference from SQL is that Inform 7 is apparently smart enough to figure out (some kinds of?) transitive closure, though the limits of this facility are opaque to me.

Transitive closure, searching

One difference from SQL is that Inform 7 is apparently smart enough to figure out (some kinds of?) transitive closure, though the limits of this facility are opaque to me.

It can do transitive closures for relations when explicitly asked to do so; however, it will not make implicit inferences aside from those implied by the hard-coded 'kind' relation. My impression is that the main purpose of the 'kind' concept is to allow some flexibility without having to do general-purpose inferencing. So when you say something like "A brush is a kind of instrument. A brush has a color. The color is usually red. An old paint brush is a brush." I7 infers that an old paint brush has a color property, which is initially set to red. You could have achieved the same result in a Prolog-style system with a general inference engine by making brush, instrument and color predicates with inference rules "instrument(X) :- brush(X). color(X, 'red') :- brush(X). brush(X) :- old_paint_brush(X)."
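The same brush example can be sketched without a general inference engine, as a single-inheritance chain with property defaults. This is an illustrative model only; the tables `kind_of` and `defaults` and both function names are invented, and the instance is treated like a leaf kind for simplicity.

```python
# Hypothetical model of "kinds": single inheritance plus property defaults.
kind_of = {                      # child -> parent
    "old paint brush": "brush",  # an instance, treated here like a leaf kind
    "brush": "instrument",
    "instrument": "thing",
}
defaults = {"brush": {"color": "red"}}  # "The color is usually red."

def ancestors(name):
    """Walk the kind chain upward: the transitive closure of kind_of."""
    chain = []
    while name in kind_of:
        name = kind_of[name]
        chain.append(name)
    return chain

def property_of(name, prop):
    """Look up a property default on the nearest ancestor that sets one."""
    for kind in ancestors(name):
        if prop in defaults.get(kind, {}):
            return defaults[kind][prop]
    return None

print(ancestors("old paint brush"))
print(property_of("old paint brush", "color"))
```

The only "inference" is a walk up one fixed relation, which is much cheaper than resolution over arbitrary rules and may be why the kind relation gets special treatment.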

Aliases

You can alias objects with the 'called' syntax as you indicated -- but sometimes I find it's just easier to use 'noun' and 'second noun' rather than come up with meaningful names. The sample code gets quite creative:

Instead of kissing someone (called the blessed one):
    say "Smack!";
    now the player helps the blessed one.

Re: denotation -- it is a little weird that "noun" always denotes an object, but I think it is reasonable in the context of I.F. The language hides the parser (unless you escape to I6 code) so you will never typically reason about nouns, but only what they denote. That said, I7 does let you declare actions that operate on raw strings (handy for verbs like "say") and it gets a little confusing when you aren't sure if you're dealing with a word or the object it denotes.

I have the impression that the language semantics are obscure because of the many layers. Inform 7 compiles to Inform 6 code, which in turn compiles to the 25-year-old Z-Machine format. Both of the intermediate layers are DSLs unto themselves, and have their own set of peculiarities. A diagram of how various I7 concepts map onto each layer might be illuminating (but I'm sure Graham is sick of writing docs by now :) ).

"noun" and "second" (or what

"noun" and "second" (or what I7 now calls "second noun") are Inform jargon for the first and second objects given in a command. (A rival IF system, TADS, uses the terms "direct object" and "indirect object" which possibly make more sense).

I7 is an interesting beast, since as you point out it compiles to Inform 6. That is a curious language in its own right, but mostly an object-oriented C-like language with some quirks for easy declaration of single-instance, precreated objects, and it in turn compiles down to bytecode for the Infocom Z-machine. The design limitations of the Z-machine control a lot of the design of both the Inform 6 language and the standard Inform 6 world model, and Inform 7 mostly follows these conventions while adding its own layers to the language and the world model.

Although I7 does a lot of clever things with its object specifications, in some respects it is very old-school - many terms like 'the noun' and 'the second noun', for instance, are in fact simply good old global variables. As well as object and kind declarations, a large part of I7 coding is writing 'rules', but these rules compile down to fairly simple functions - which are written in imperative form.
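As a rough picture of what that means, here is a sketch in which the parser's results live in globals and a rule is just a function that reads them. It is not generated I6 code: the names `RULES`, `run_action` and `helped`, and the fallback message, are invented for illustration, and the rule mirrors the kissing example above.

```python
# Hypothetical sketch: parser results live in globals, rules are functions.
noun = None
second_noun = None
helped = set()   # stand-in for the "helps" relation in the kissing example

def instead_of_kissing():
    """Plays the role of 'Instead of kissing someone (called the blessed one)'."""
    blessed_one = noun          # "called" is modelled as a local alias
    helped.add(("player", blessed_one))
    return "Smack!"

RULES = {"kiss": instead_of_kissing}

def run_action(verb, first, second=None):
    """Set the globals the way a parser would, then call the matching rule."""
    global noun, second_noun
    noun, second_noun = first, second
    rule = RULES.get(verb)
    return rule() if rule else "Nothing happens."

print(run_action("kiss", "Mr Wickham"))
```

Seen this way, "(called the blessed one)" amounts to giving a global a local name, which is why positional terms like "the noun" remain so common in real source text.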

Did you look at the Phrasebook?

Hi Frank,

Most of the Documentation is embedded in the GUI. If you enter a small game and hit GO, you can then select the vertical "Index" tab in one of the panes. This will bring up a panel that has lists of Actions, Contents, Kinds, Phrasebook, Rules, Scenes, and World. The Phrasebook has all of the currently recognized grammar fragments in it with hyperlinks back to the manual. Likewise, the other tabs index various class instantiations with hyperlinks back to the source text that defined them.

I don't think there is a formal grammar yet, because the system is still in beta.