Semantic Distance: NLP Not a Resource Sink

Following the story on Mind Mappers and other RDF comments of late, I thought this NLP slide show (PDF) should get a story. Dr. Adrian Walker offers an interesting perspective in a friendly crayon-colored format, including a critique of RDF. Source site Internet Business Logic has other offerings including an online demo.

Constraint Programming

Constraint Programming

I will not quote this introduction/manifesto/historical overview, as every page of it is worth reading.

It is not only a nice introduction into a promising field, but also a demonstration of how language design issues can be (to some extent) separated from high-level fundamental intuitions.

It is also quite interesting to follow the historical lines of the paper, it reads like an epic!

Ah, and by the way, that's the same constraint programming that underlies Oz.

Mind Mappers

OS and web search vendors are merging desktop search into their offerings. Vendor solutions vaguely worry me. They seem too focused on the home PC and not on business needs, while needlessly bypassing RDF. There's also vendor lock, bad EULAs, privacy negligence, and lost boundaries between OS, applications, and data - proprietary black boxes tempting us into dependence.

That thinking led me to the open-source Mind Raider program. It's one of the few that makes RDF useful for normal people. It compares to Chandler but focuses less on email and calendars. As far as I know, Chandler doesn't expose RDF or even use it, necessarily. However the Mind Raider Big Picture shows similarity to Chandler's vision.

So why should this stuff matter to LtU. Well, compare formal organization between data that only computers inspect and data that people use daily. Many database systems exist to store data in the former category. Employee and customer address data serves little purpose beyond printing paychecks and shipping labels. A human will not care about values except that they not be empty. Granted that people do use databases to track sales figures and other aggregates. Still even those folks use data in the latter category: stray thoughts and reminders, sticky notes, social and business correlations, restaurant napkin sketches, collaborative data, recorded conversations, news clippings. A large cloud of miscellany doesn't rise to the level of application documents or the formality of enterprise systems.

Few systems exist to aggregate and organize that stuff. If your brain suffices, then good for you. The rest of us need a crutch. Some people use spreadsheets to store lists simply because there's little else available. I've used software which imitates sticky notes on screen. It leaves much to be desired. There are dozens of little programs for narrow data types - address books, internet bookmark apps, password managers, photo albums, etc. How do you tell the address book that the photo album has pictures of the guy, and that his web link lives in the bookmark manager? Right now, you don't. And programs never organize data just the way you want. Besides, exceptions to the common format always arise. So the problem is not just searching documents and email, nice as that is, but organizing human details in useful ways. Moleskin notebooks and Dictaphones have been around a long while. It's time for cool software.

Somehow RDF seems primed for the role, but it needs less abstract public relations. Raw RDF may not be the ideal presentation but still seems a likely candidate for the underlying data model. Each individual develops a personal ontology (aka "working style" if you will) over years of time. RDF can capture that, but it will take friendly programs like Mind Raider. What do you think?

An Operational Foundation for Delimited Continuations in the CPS Hierarchy

An Operational Foundation for Delimited Continuations in the CPS Hierarchy
We present an abstract machine and a reduction semantics for the lambda-calculus extended with control operators that give access to delimited continuations in the CPS hierarchy. The abstract machine is derived from an evaluator in continuation-passing style (CPS); the reduction semantics (i.e., a small-step operational semantics with an explicit representation of evaluation contexts) is constructed from the abstract machine; and the control operators are the shift and reset family. We also present new applications of delimited continuations in the CPS hierarchy: finding list prefixes and normalization by evaluation for a hierarchical language of units and products.
...or in other words - a view on the delimited continuations from another side (as compared to "A Monadic Framework for Subcontinuations" or Oleg's posts on Hakell list). I find it useful to learn about the same concept from different sources - and the delimited continuations are still promising to become an important concept.

Also, I suspect that defunctionalized approach might be more straightforward for people coming from imperative languages.

Module Mania: A Type-Safe, Separately Compiled, Extensible Interpreter

Module Mania: A Type-Safe, Separately Compiled, Extensible Interpreter

To illustrate the utility of a powerful modules language, this paper presents the embedded interpreter Lua-ML. The interpreter combines extensibility and separate compilation without compromising type safety. Its types are extended by applying a sum constructor to built-in types and to extensions, then tying a recursive knot using a two-level type; the sum constructor is written using an ML functor. The initial basis is extended by composing initialization functions from individual extensions, also using ML functors.

This is an excellent example of how the ML module language doesn't merely provide encapsulation but also strictly adds expressive power. It also demonstrates how a dynamic language (Lua) can be embedded in the statically-typed context of ML. Finally, it demonstrates that none of this need come at the expense of separate compilation or extensibility. Norman Ramsey's work is always highly recommended.

ClassicJava in PLT Redex

Classic Java

This collection is an implementation of (most of) ClassicJava, as defined
in "A Programmer's Reduction Semantics for Classes and Mixins," by Matthew
Flatt, Shriram Krishnamurthi, and Matthias Felleisen; in _Formal Syntax and
Semantics of Java_, Springer-Verlag LNCS 1523, pp. 241-269, 1999. A
tech-report version of the paper is also available at
< The implementation is
written in PLT Redex, also available through PLaneT. Please consult that
package's documentation for further details.

This might be interesting to folks curious about how to formalize a real language, or about how PLT Redex works in practice.

In the beginning was game semantics

In the beginning was game semantics

Giorgi Japaridze

[...] the story and philosophy of computability logic (CL) [...]
According to its philosophy, syntax — the study of axiomatizations or any other, deductive or nondeductive string-manipulation systems — exclusively owes its right on existence to semantics, and is thus secondary to it. CL believes that logic is meant to be the most basic, general-purpose formal tool potentially usable by intelligent agents in successfully navigating real life. And it is semantics that establishes that ultimate real-life meaning of logic. Syntax is important, yet it is so not in its own right but only as much as it serves a meaningful semantics, allowing us to realize the potential of that semantics in some systematic and perhaps convenient or efficient way.
the philosophy of CL relies on two beliefs that, together, present what can be considered an interactive version of the Church-Turing thesis:
Belief 1.
The concept of static games is an adequate formal counterpart of our intuition of ("pure", speed-independent) interactive computational problems.
Belief 2.
The concept of winnability is an adequate formal counterpart of our intuition of algorithmic solvability of such problems.
I already posted links to papers of Giorgi Japaridze several times, but most of them were pretty technical, and also that was before the latest expansion of LtU's readership.

In short, CL is about trying to generalize traditional (both classical and intuitionistic) logic beyond batch computation (well, I hope everybody knows why logic is relevant to computation, if not - look for Curry-Howard isomorphism, or CHI). There are several approaches to doing that, but Giorgi believes they go the wrong way by trying to build upon the syntax, while it's semantics that is primary.

If you believe that computation is more than calculating a function, and that logic is a good way to understand computation - then I recommend to read at least the introduction and the references' list of this paper (the paper is a single chapter for a book, would love to see the whole book).

Combining computational effects

While some researchers seek to generalize monads (to arrows), others try to narrow the focus to achieve a richer theory (and probably deeper understanding).

Combining computational effects: commutativity and sum

We begin to develop a unified account of modularity for computational effects. We use the notion of enriched Lawvere theory, together with its relationship with strong monads, to reformulate Moggi's paradigm for modelling computational effects; we emphasise the importance here of the operations that induce computational effects. Effects qua theories are then combined by appropriate bifunctors (on the category of theories). We give a theory of the commutative combination of effects, which in particular yields Moggi's side-effects monad transformer (an application is the combination of side-effects with nondeterminism). And we give a theory for the sum of computational effects, which in particular yields Moggi's exceptions monad transformer (an application is the combination of exceptions with other effects).
A longer version: Combining Effects: Sum and Tensor

An Operational Semantics for R5RS Scheme

Jacob Matthews and Robby Findler bring us An Operational Semantics for R5RS Scheme (also available in PostScript):

This paper presents an operational semantics for the core of Scheme. Our specification improves over the existing R5RS denotational specification in four ways:

  1. It is more complete, since it contains eval, quote, and dynamic-wind.
  2. It models multiple values in a way that does not require changes to unrelated parts of the language.
  3. It provides a more faithful model of Scheme's undefined order of evaluation.
  4. It is executable, because it is encoded as a program in PLT Redex, a domain-specific language for writing operational semantics.

The executable specification allows others to experiment with our specification and allows us to build a specification test suite, which improves our confidence that our system is a faithful model of Scheme. In addition to contributing a specification of Scheme, this paper presents several novel modeling techniques for Felleisen Hieb-style rewriting semantics that we discovered while developing our R5RS Scheme semantics. All are applicable to a wider range of problems than the specific uses we have for them, and the fact that they combine seamlessly in our full R5RS model shows that they scale to real languages.

This looks like excellent work, as we've come to expect from the PLT folk.

The PLT Redex tool was previously discussed on LtU. If you're interested in checking it out, this page provides links to "everything you need to run PLT Redex and executable versions of every reduction system in the paper".

To be presented at the 2005 Workshop on Scheme and Functional Programming, Tallinn, Estonia, 24 September, 2005.

Thanks to Jens Axel Søgaard for bringing this paper to my attention.

A Typed, Compositional Logic for a Stack-Based Abstract Machine

A Typed, Compositional Logic for a Stack-Based Abstract Machine. Nick Benton. MSR-TR-2005-84. June 2005.

We define a compositional program logic in the style of Floyd and Hoare for a simple, typed, stack-based abstract machine with unstructured control flow, global variables and mutually recursive procedure calls. Notable features of the logic include a careful treatment of auxiliary variables and quantification and the use of substructural typing to permit local, modular reasoning about program fragments. Semantic soundness is established using an interpretation of types and assertions defined by orthogonality with respect to sets of contexts.

Also related to PCC, TAL etc.

XML feed