Theory of a Declarative Language

I'm working on a declarative configuration language for an application. The language is mostly structured data (think JSON). It also has variables, so you can define variables in one part of it and use them in another. You can import definitions from another file and its contents is merged with the local data in a predictable way. And you can extend substructures by referencing other substructures (a simple form of data inheritance, a bit like prototype-based objects without methods). The only control flow-like construct is a sequence, no branching, loops or conditionals.

The language is developed on a make-do basis, adding features as new requirements pop up. But I would like to base this against a more disciplined approach. Is there a theory for such primitive languages? Literature?

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Primitive?

You say your language is 'primitive', but does that also mean restrictive? According to the principle of least power, we can know a lot more about a restricted, 'primitive' program than a general, Turing-complete program. You mention a lack of loops and branches, which brings to mind McBride's "Applicative Functors" (AKA "idioms"). These are a model of computation, similar to monads but less powerful. They can perform arbitrary effects in a sequence, but later effects cannot access the results of earlier effects.

For example, if we use applicative functors for stdio we can write to a log file with the guarantee that the lines will be written in chronological order. However, we cannot use inputted strings in our outputs, eg.:


> What is your name?
Chris
> Hello Chris!

We also cannot write an application which branches based on its input, eg.:


1. Do foo
2. Do bar
3. Quit
> Enter your choice:

This allows our logic, eg. branching, to be separate from our IO. It also gives a lot of freedom to the language implementation, for example this works well with call-by-need (eliminating unneccessary computation), would tolerate parallel/speculative evaluation (although a guaranteed partial ordering would be more efficient than the total ordering applicative functors guarantee) and could presumably work in a logical unification language like Prolog.

You mention variables and structures referencing each other. This makes me wonder

1) Can variables be redefined/mutated? If not, you may want to look at Single Static Assignment languages, and data-flow languages in general, for things like efficient evaluation order, re-ordering of instructions, etc.
2) Can structures reference each other/themselves recursively? For example, in ML and Scheme (probably others, but those are what I know) there is a distinction between regular "let" and recursive/mutually-recursive "let". These allow immutable, self-referential datastructures, eg. doubly-linked lists. Whether your language allows this or not, it may be interesting to see how, eg. Scheme implementations handle each type of "let", especially the assumptions they can and cannot make. Laziness and Prolog-style unification might also be well-trodden implementation paths for recursive data.

Great, thanks for the many

Great, thanks for the many pointers. As for your questions,

1) Redefined. This is the algorithmic idea: An "object" (a Json map with keys and - pot. structured - values) can have its own table of bindings. When evaluating the object, a file-global table of bindings is merged with the local table. Next, bindings of other objects which the local object says it "extends" are brought in. Those objects are evaluated in the same way recursively. In any case, if a new table brings in a binding for an already bound variable, the local value wins.

After all this extending has been done, the resulting table of bindings is used to replace references to the variables in other parts of the object. So resolving variable references is late, and the local object can inject its own value in a property that was actually inherited from a "parent" object.

2) No. Variables can reference each other, but only in a non-recursive way. There is also a cycle-check for objects' "extend" property, so they're not referencing each other indefinitely. I completely evaluate/expand the object before I use it in the application.

(1) sounds a lot like

(1) sounds a lot like dynamic scope (found in many non-Scheme LISPs, for example).

I'm also reminded of the Smarty template language for PHP which I used at my old job. It provides a kind of dynamic scope when one template embeds another, although this seemed to be rarely used compared to a more explicit (and convoluted) way of 'inheriting' parameters from 'parent' templates (cargo cult OO, IMHO). I got into the habit of using this dynamic scope heavily, treating each template file as a function. It lead to much less boilerplate and more modularity.

(1) sounds a lot like

(1) sounds a lot like dynamic scope (found in many non-Scheme LISPs, for example).

Definitely. How could I have missed this?!

I'm also reminded of the Smarty template language for PHP which I used at my old job.

Template systems - very good hint. Maybe there is even some literature about them, not just implementations.

It provides a kind of dynamic scope when one template embeds another, although this seemed to be rarely used compared to a more explicit (and convoluted) way of 'inheriting' parameters from 'parent' templates (cargo cult OO, IMHO).

What can I say ...

I got into the habit of using this dynamic scope heavily, treating each template file as a function. It lead to much less boilerplate and more modularity.

The dynamic scoping of variables is use throughout our configuration system, while establishing a relation between two objects is either explicit (said "extend" property) or implicit (when two objects have the same name). I will try to figure out how Smarty establishes these relations. (Maybe the system is just constructing a single big template eventually ...).

I will try to figure out

I will try to figure out how Smarty establishes these relations. (Maybe the system is just constructing a single big template eventually ...).

I certainly wouldn't hold Smarty up as an example of good language design ;)

I think my point was more about 'primitive' languages being more powerful than they might seem. In the case of Smarty, it doesn't claim to provide functions, but actually the way it embeds templates is pretty close. Using features as they're intended doesn't necessarily tell us everything that's possible.

Any decent module system

Any decent module system would be a good basis.

I'm not quite getting it,

I'm not quite getting it, can you elaborate?! How would module systems represent primitive languages? I only know module systems as a small part of a more complete language. Is there some literature about module systems alone?

Types and definitions are

Types and definitions are effectively simple data structures. Module systems cover references, simple parameterizations, and the construction of these data structures.

Typically, modules are used in the construction of a final 'definition' representing the application, but there is no need to go that far. One can stop at getting a set of useful types, for example, representing a configuration.

Though, using a scriptable PL (like Haskell, Lua, FORTH, Python, Ruby, or JavaScript) as a configuration language can be very nice. One can then have a simple configuration 'API'. This is most useful if you ever find yourself representing functions or conditionals in your configuration.

Ok, I see. Are you aware if

Ok, I see. Are you aware if anybody has written a paper or book chapter that covers what you sketch here? (I faintly remember that module systems came up here on LtU as well, but I don't dare enter 'module system' in the search ... Wait, I shouldn't be so faint-hearted, this looks promising).

I totally agree on the use of a proper PL for configuration purposes, for exactly the reasons you mention. Alas, the management decision was "JSON only". So now I'm looking down a road that might eventually wind up in a Json version of the Ant config XML application ...

If you're stuck with JSON,

If you're stuck with JSON, try Duncan Cragg's "What Not How" blog. He's created a number of rich compositional systems using JSON (NetMash, Object Network, Cyrus).

That looks really

That looks really interesting.

One thing that concerns me

One thing that concerns me about module systems is: Module systems have a nice story about how you can reference definitions in other files and how to bring them into the current, handle name conflicts in the process etc. But as far as I know they don't say anything about how - after bringing in the external definitions - types can then interact with each other, e.g. in the sense of one extending the other. This is usually left to the rest of the language. Is that right?

You do need to address how

You do need to address how to use the modules you import, whether it is some composition of types or definitions. Some languages couple this interaction with the module system (e.g. modules in OCaml or Newspeak).

I'll try to look into those

Thanks, I'll try to look into those module systems.

But as far as I know they

But as far as I know they don't say anything about how - after bringing in the external definitions - types can then interact with each other, e.g. in the sense of one extending the other.

Read up on mixin modules.

Cheers, will do. As I have

Cheers, will do. As I have zero experience with ML in general and the ML module system in particular, could you highlight a few features of MixML that are particularly relevant?!