Visual Programming Language Syntax Formalisms

What formalisms are available for specification of visual (i.e., graphical, two-dimensional, diagrammatic) programming languages? Which of these, if any, are popular?
If readers feel that few are popular because visual languages are of limited use, do they dispute the premise that visual syntax is a superset of character sequence syntax?


Is this homework by any

Is this homework by any chance?

Not Homework

Now that you mention it, I realize this is formatted a bit like a homework question. No, this is not homework; I am not enrolled in any school.
I am sincerely looking for answers to the first two questions. The third question anticipates what I estimate to be the normal reaction to them ('who cares'). It is an attempt to give a reason why visual language formalisms are relevant to LtU readers, and to steer any actual discussion that ensues (fingers crossed) towards the last idea, that visual syntax is a superset of character sequence syntax. I'm sure that idea is not completely novel, but I believe it touches on a useful perspective that deserves more attention.

Ok, then. I think you should

Ok, then.

I think you should search the archives for discussions of visual programming languages and 2D languages. Both terms appeared quite a bit in the first couple of years.

VPL syntax definition systems are a superset of character ...

My topic is not visual programming or two-dimensional programming languages in general, but rather the mainstream formalisms used to define such languages. In other words, I am looking for a commonly used system, akin to EBNF, that visual programming language developers generally use to specify the syntax of their languages. Through my research I have not been able to find any such thing. I have of course come upon numerous two-dimensional and visual programming languages, and many take approaches with some elements along the lines of EBNF, but as far as I can tell there is no standard, formal approach, and especially none that is popular, which I think is key.
VPL syntax definition systems are a superset of character sequence syntax definition systems. Of course I will search the archives for similar notions (and since reading your reply I have begun that search in vain) but I doubt that I will find this specific topic. Even if there were a similar topic available in the archives that wouldn't mean there was less reason to attempt to steer the discussion in this specific direction since I feel it is so important.
Anyway, just to emphasize again I think that my question may have been misunderstood because this is not a question asking for general help on VPLs, it is quite specific.

EBNF is (occasionally) good

EBNF is (occasionally) good for specification of syntax, not semantics.

Most 'visual' languages are actually variants of data flow ones. Take a look at Ptolemy II, a modeling environment for them: it provides a nice IDE / libraries / layering scheme for plugging in your own semantics. Peter Van Roy's book CTM might help you a little, but it mostly focuses on single-assignment variables, not really live variables. As for languages used in the wild, LabVIEW and Max's patcher system are dominant.

Beyond those, look at Kahn process networks, clock calculus, FRP, and some of Walid Taha's recent work on tying lambda calculus into it all. Some 'live' programming systems like SuperGlue and Subtext were formally specified as well.


The most common approach I'm aware of for specifying visual language syntax is metamodeling. The OMG defines UML syntax via a (somewhat self-referential) metamodel. Vanderbilt's Generic Modeling Environment allows new visual languages to be specified by constructing a model of the syntax using a mix of UML and OCL. DOME, originally developed at Honeywell, seems to use a similar approach (although a different notation). Granted, the metamodels focus more on abstract syntax than concrete syntax (although the AST definition can usually be annotated with information about concrete representations). But then it's not clear to me from your question which aspect you're most interested in.
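To make the metamodeling idea a little more concrete, here is a minimal sketch in Python. It is not GME, DOME or UML/OCL notation; the dictionary layout, the `max_inputs` field and the `violations` function are all hypothetical names invented for this illustration. The metamodel lists the node kinds a diagram may contain and which kinds may be connected, and a checker reports where a particular diagram breaks those rules. Note that, as said above, this covers abstract syntax only: nothing here says what the diagram looks like.

```python
# Hypothetical metamodel for a tiny dataflow notation (illustration only).
METAMODEL = {
    "kinds": {
        "Source": {"max_inputs": 0},
        "Filter": {"max_inputs": 1},
        "Sink": {"max_inputs": 1},
    },
    "allowed_edges": {
        ("Source", "Filter"),
        ("Filter", "Filter"),
        ("Filter", "Sink"),
        ("Source", "Sink"),
    },
}


def violations(nodes, edges, metamodel):
    """Check a model against the metamodel.

    nodes: dict mapping node id -> kind name
    edges: list of (source id, destination id) pairs
    Returns a list of human-readable problems (empty if the model conforms).
    """
    problems = []
    for node_id, kind in nodes.items():
        if kind not in metamodel["kinds"]:
            problems.append(f"{node_id}: unknown kind {kind}")

    incoming = {node_id: 0 for node_id in nodes}
    for src, dst in edges:
        if src not in nodes or dst not in nodes:
            problems.append(f"{src}->{dst}: dangling edge")
            continue
        pair = (nodes[src], nodes[dst])
        if pair not in metamodel["allowed_edges"]:
            problems.append(f"{src}->{dst}: {pair[0]} may not connect to {pair[1]}")
        incoming[dst] += 1

    # A constraint in the spirit of an OCL invariant: bounded fan-in per kind.
    for node_id, count in incoming.items():
        limit = metamodel["kinds"].get(nodes[node_id], {}).get("max_inputs")
        if limit is not None and count > limit:
            problems.append(f"{node_id}: {count} inputs, at most {limit} allowed")
    return problems


good_nodes = {"mic": "Source", "lp": "Filter", "out": "Sink"}
good_edges = [("mic", "lp"), ("lp", "out")]
bad_edges = [("mic", "lp"), ("out", "lp")]

print(violations(good_nodes, good_edges, METAMODEL))
print(violations(good_nodes, bad_edges, METAMODEL))
```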

Will look into

See my other comment below. I will look into these, but I am initially doubtful that they are really akin to EBNF. I think I've been down that path (metamodeling) before and felt it wasn't at a high enough level of abstraction and formalization for what I want. I will look again, though. Thank you.

There's no good ideas out there

Syntax in two dimensions is really tricky. I think the main 2d programming systems that get used for practical purposes are all data-flow systems (e.g. LabVIEW, Max/MSP, Pd, Slim, Houdini's graphical geometry notation, TouchDesigner, Shake and NXT-G (the Lego Mindstorms programming system)). All of these use the obvious boxes-and-lines notation. The first Lego Mindstorms system, no longer available, was not a data-flow language, and used a notation that looked a lot like pictures of ASTs. So in all cases, the syntax is trivial.

That's because (he avers, reading multiple minds simultaneously) the existing formalisms for dealing with 2d syntax are all completely unusable. There's been a lot of work on graph, tree and picture grammars over the last 35 or 40 years, but it's only a mild exaggeration to say that no two papers use the same formalism. And most of the results say that anything interesting is uncomputable. (If you're lucky the interesting things are only PSPACE-complete.) The notable exception is L-systems, which are widely used to model growth and morphology of plants, and have a rich theory and lots of interesting applications. But L-systems aren't really 2d grammars (i.e. they don't generate two-dimensional structures directly); they are (parallel) string grammars with geometric interpretation rules that convert the strings to 2d pictures. Even so, I believe the parsing problem for L-systems is Turing-complete in general.
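For readers who haven't met L-systems, the two-stage structure described above (parallel string rewriting first, geometric interpretation second) can be sketched in a few lines of Python. The function names `expand` and `interpret`, the 90-degree turn angle and the choice of the quadratic Koch curve as the example are assumptions made for this illustration, not anything taken from a particular L-system tool.

```python
import math


def expand(axiom, rules, steps):
    """Rewrite every symbol in parallel, `steps` times.

    Symbols without a rule are copied unchanged.
    """
    word = axiom
    for _ in range(steps):
        word = "".join(rules.get(symbol, symbol) for symbol in word)
    return word


def interpret(word, angle_deg=90.0, step=1.0):
    """Turtle interpretation: F draws a segment, + and - turn the heading."""
    x, y, heading = 0.0, 0.0, 0.0
    segments = []
    for symbol in word:
        if symbol == "F":
            nx = x + step * math.cos(math.radians(heading))
            ny = y + step * math.sin(math.radians(heading))
            segments.append(((x, y), (nx, ny)))
            x, y = nx, ny
        elif symbol == "+":
            heading += angle_deg
        elif symbol == "-":
            heading -= angle_deg
    return segments


# Quadratic Koch curve: a single production applied to every F at once.
KOCH_RULES = {"F": "F+F-F-F+F"}

word = expand("F", KOCH_RULES, 2)
segments = interpret(word)
print(len(word), "symbols,", len(segments), "line segments")
```

The grammar itself only ever sees the one-dimensional string; the picture exists only after `interpret` runs, which is the sense in which these are not really 2d grammars.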

None of these systems save programs as pictures. They all use the underlying graph or tree structure with some annotations indicating where to put them on the screen, bypassing the parsing problem. On the other hand, there's a fair number of hardware design systems whose native representation is geometric (circuit diagrams or even raw integrated circuit masks), and that do something a lot like parsing of the drawings to extract circuit graphs. (But again, hardware has dataflow semantics, so the extracted representations are again boxes-and-lines.)
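A rough sketch of that storage strategy, in Python: the file holds the graph, and screen positions ride along as annotations that carry no meaning for the program. The class names (`Node`, `Edge`, `Patch`), the JSON layout and the node kinds are invented for this example and do not correspond to the file format of any of the systems named above.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Node:
    id: str
    kind: str
    x: float  # layout annotation only
    y: float  # layout annotation only


@dataclass
class Edge:
    src: str
    dst: str


@dataclass
class Patch:
    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)

    def to_json(self):
        return json.dumps(asdict(self), sort_keys=True)

    @staticmethod
    def from_json(text):
        raw = json.loads(text)
        return Patch(
            [Node(**n) for n in raw["nodes"]],
            [Edge(**e) for e in raw["edges"]],
        )

    def structure(self):
        """The program proper: kinds and connections, positions dropped."""
        return (
            sorted((n.id, n.kind) for n in self.nodes),
            sorted((e.src, e.dst) for e in self.edges),
        )


patch = Patch(
    [Node("osc", "oscillator", 10.0, 20.0), Node("out", "speaker", 10.0, 80.0)],
    [Edge("osc", "out")],
)

# Loading is plain deserialization; no picture is ever parsed.
moved = Patch.from_json(patch.to_json())
moved.nodes[0].x = 300.0  # the user drags a box across the canvas
print(moved.structure() == patch.structure())
```

Dragging a box changes the saved file but not `structure()`, so the picture never has to be parsed to recover the program.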

Thanks | Computability | Motivation

Thanks very much for your answer. And thank you to everyone else for responding as well.

Re: storage, the usefulness of formal common syntax definitions is not necessarily dependent upon the final storage format.

Re: computability for parsing, the systems I envision would not necessarily parse visual programs in the traditional sense. I was thinking of using this non-existent syntax definition form as a starting point for brainstorming the implementation of an application that would leverage GUI functionality to reduce or eliminate the (initial) parsing step.

My motivation for starting this topic is the idea that 2d representations of formulas, tables, circuits, domain objects, data, spreadsheets, ontologies, entity models, UML, algorithms, etc. are necessary, and that a standard metasyntax notation able to specify the syntax of some or all of these things might bring some of the benefits that EBNF (or similar formalisms) brought to textual computer languages. I see the response on metamodeling, which I must now investigate further, but I have a feeling these are all variations on boxes and lines. That might be a good place to start, but I believe the scope and approach are different, and I think they rely heavily on natural language descriptions of diagram features in a diverse and somewhat ad hoc way.

I guess the idea is that I want my programming language to allow, for one example, simple embedded equations in a more natural, readable form, without having to serialize them to text, but there is no common formal approach to defining such syntax. Everything up to that point I could clearly specify with EBNF, for example. But as soon as I want some text above a bar and some below it (as a representation of a numerator and denominator, say), the entire formalism and its related tool support break down.
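To illustrate where the gap lies, here is a small Python sketch. The abstract syntax of a fraction is easy to write down as a tree, much as EBNF would describe it; the part EBNF has no vocabulary for is the layout rule that places one subtree above a bar and the other below it. The tuple encoding (`"text"`, `"frac"`) and the helper names are invented for this example, and character-cell rendering stands in for real graphics.

```python
def width(box):
    """Width of a box, where a box is a list of text lines."""
    return max(len(line) for line in box)


def center(box, w):
    """Pad every line of a box to width w, centering its content."""
    inner = width(box)
    left = (w - inner) // 2
    return [(" " * left + line.ljust(inner)).ljust(w) for line in box]


def render(expr):
    """Turn an abstract syntax tree into a 2d box of characters.

    expr is ("text", string) or ("frac", numerator_expr, denominator_expr).
    """
    tag = expr[0]
    if tag == "text":
        return [expr[1]]
    if tag == "frac":
        numerator, denominator = render(expr[1]), render(expr[2])
        w = max(width(numerator), width(denominator)) + 2
        # The layout rule: numerator above the bar, denominator below it.
        return center(numerator, w) + ["-" * w] + center(denominator, w)
    raise ValueError(f"unknown expression tag: {tag!r}")


simple = ("frac", ("text", "a+b"), ("text", "c"))
nested = ("frac", ("text", "1"), ("frac", ("text", "x"), ("text", "y+1")))

print("\n".join(render(simple)))
print()
print("\n".join(render(nested)))
```

This only goes from tree to picture. A metasyntax of the kind asked about would also have to state the vertical arrangement declaratively, so that tools could go the other way or offer structured editing.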

I think I get the picture...

I think I see where you're trying to get to now. It sounds like a worthwhile goal. Unfortunately, I'm afraid that based on what you're trying to do (which involves concrete as much as abstract syntax) that metamodels probably won't help you that much. It sounds like you're looking for something like a mix of a graph grammar (e.g. Hyperedge Replacement Grammars) and a layout language (e.g. TeX or CSS). I'm not sure if anything like that exists. I'm pretty sure there's nothing like it that's "mainstream" or "standard" in the way that EBNF has become. Might make a neat research project for some eager PhD student though...

Context-Sensitive Grammars?

Visual programming languages may be context-sensitive in which case they fall outside much of current programming language theory. See Chomsky Hierarchy for some details.

OTOH since everything you mentioned:

2d representations of formulas, tables, circuits, domain objects, data, spreadsheets, ontologies, entity models...

are frequently represented in context-free** languages I suggest that you look into such representations. In particular representations in the languages Lisp or Prolog may be sufficient to satisfy your needs.

[** Sorry, I had "context-sensitive" here before and have corrected it. I apologize for any confusion caused by my initial error.]

Context-free graph grammars

There's a long history of work on parsing graph and picture representations, and while the theory is not as well developed as for sequences of characters, a lot is known. In particular, many attempts have been made to represent the Chomsky hierarchy for graph grammars; cf. the survey (GR 1999).

(Scarpazzi 2004) gives an interesting parsing technique for a kind of picture grammar, based on tile rewriting, that can reasonably be called context-free. I don't think the technique is, as it stands, general enough to serve as a theoretical foundation for visual languages. But I think that picture grammars which can be parsed by applying a set of picture productions in reverse provide the most promising foundation for visual languages, if this result can be appropriately extended.

Giammarresi & Restivo (1999). Extending formal languages hierarchies to higher dimensions. In ACM Computing Surveys 31.
Scarpazzi (2004). A Parsing Technique for Tile Rewriting Picture Grammars. Tech. report. Politecnico di Milano, Dipartimento di Elettronica e Informazione, 2004.11.


You may be interested in Faust, which has both textual and block-diagram representations.

Faust Block Diagram Algebra

To give some details, the textual syntax of Faust is based on an algebra of five composition operations on block diagrams: sequential (:), parallel (,), split (<:), merge (:>) and recursive (~). For example, a one-pole filter (corresponding to the difference equation y(n) = x(n)+k*y(n-1)) can be written in Faust as + ~ *(k), using the recursive composition operator ~, which creates block diagrams with cycles.
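For those who don't read Faust, here is a rough Python rendering of what the recursive composition in + ~ *(k) computes. This is only a sketch for single-sample, single-output blocks, with helper names (`rec`, `add`, `mul`) made up for the occasion; it is not how Faust is implemented. The one-sample delay on the feedback path is written out explicitly, since that delay is what makes the cycle well defined.

```python
def rec(forward, feedback):
    """Sketch of `forward ~ feedback` for simple one-sample blocks.

    The previous output goes through `feedback` and becomes the first
    input of `forward`; the external input becomes the second.
    """
    def run(samples):
        y_prev = 0.0  # the delay line starts out at zero
        out = []
        for x in samples:
            y = forward(feedback(y_prev), x)
            out.append(y)
            y_prev = y
        return out
    return run


def add(a, b):
    """The `+` block: two inputs, one output."""
    return a + b


def mul(k):
    """The `*(k)` block: one input, one output."""
    return lambda v: k * v


# + ~ *(k) with k = 0.5, i.e. y(n) = x(n) + 0.5*y(n-1)
one_pole = rec(add, mul(0.5))

# Impulse response of the filter: each output is half the previous one.
print(one_pole([1.0, 0.0, 0.0, 0.0]))
```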

A description of the algebra can be found here :

Another formalism is Gheorghe Stefanescu's algebra of flownomials :


Thanks, this is exactly the type of thing I was asking about (although obviously it is no standard). The block diagrams in Faust are of the kind I had imagined, and Faust follows a semantic compilation approach that is probably useful for VPLs in general. Flownomials sound like they could be very useful as a tool for analyzing data flow languages.

But looking at that and taking a step back on the whole topic, I have started to think that rather than looking at academic formalisms, I may just want to go straight to devising a user interface that would allow users to enter new notations or diagram/graphic types by pulling from a set of primitives and/or relating them to previously defined notations, and then translating them to terms of some semantic representation or to code. Either that, or break graphical features and dimensions down into some set of functions that are composed (maybe visually represented with ordinary data flow or some variation of it, or something inspired by Eros) and use that as a base level. The goal is to make most manipulation and display use the representations that are closest or most natural/convenient, to be able to mix them, and then to generate (high- or low-level) code.



Visual languages exist for multiple programming paradigms

Visual languages are highly popular in computer music, where the users of these programming systems are often not formally trained programmers. There are two popular families. The first family consists of dataflow languages; examples are Max (or Max/MSP), Pure Data and the now-obsolete jMax. The second family implements functional programming or even multi-paradigm programming. Examples include the now-obsolete PatchWork and its successors OpenMusic and PWGL. All the systems of the second family basically provide a visual programming syntax for their underlying language, Common Lisp (e.g., OpenMusic provides visual counterparts for first-class functions, object-oriented programming, and even some support for Lisp's metaobject protocol).

Unfortunately, I cannot answer your more specific question about something like an EBNF for visual languages (I am interested in an answer to that too). Anyway, here are some pointers related to visual programming.

A survey
Boshernitsan, M. and M. Downes (2004). Visual Programming Languages: A Survey. Technical Report UCB/CSD-04-1368, Computer Science Division (EECS), University of California, Berkeley.

An example for another paradigm (logic programming)
Jaume Agustí, Jordi Puigsegur, Dave Robertson. A Visual Syntax for Logic and Logic Programming. Journal of Visual Languages and Computing, 1998. To appear.

Also, here is a bibliography on visual programming languages (which includes many papers on formal definition of visual programming languages)


Thanks very much. I checked out some of those links, and that is an excellent survey. But again, to clarify, I wasn't looking for a survey of VPLs, because I have done web research on my own.
I guess my interest is in ways to precisely denote or input various different types of representations. As Tom Duff pointed out, most of these systems use simple boxes-and-arrows formats, which, although proven extremely useful, are only one type of notation.

Visual Core Forms/Automata

I've been thinking about this problem over the past few months. Let's say we're going to draw a small computation "cell" in 2d. If we start from a Lisp standpoint, we know we need a half-dozen or so core forms that manifest as s-expressions. So we can create a single visual shape that's just an s-expression, tag it with the core form, and we're off and running. There are some papers from 15 years ago that do precisely that.

But it didn't get us very far. In 2D we have lots of preattentive visual attributes, like shape, hue, saturation, luminance, thickness, position and rotation. Can we distribute these across a somewhat more complex computation cell, such that the core forms are encoded in preattentive attributes?

Some of these attributes (like hue) are easily combined. If "type" is assigned to hue, we can visually encode a tuple type with a visual "combinator". With tuples this might be averaging the hues of the type components, or creating a gradient, or scaling up the result (to indicate that it is more "complex").
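As a tiny sketch of the "averaging the hues" variant, in Python: hue is an angle on the color wheel, so a plain arithmetic mean misbehaves around the 0/360 wrap, and a circular mean is one reasonable choice. The type-to-hue table and the function name `combine_hues` are made up for this illustration, and the circular mean is just one possible combinator among those mentioned above.

```python
import math

# Hypothetical assignment of base types to hues, in degrees on the color wheel.
TYPE_HUES = {"int": 0.0, "string": 120.0, "bool": 240.0}


def combine_hues(hues_deg):
    """Circular mean of hues, so that 350 and 10 average to about 0, not 180.

    Not meaningful when the hues cancel out (e.g. 0 and 180), since the
    summed vector is then close to zero and its direction is arbitrary.
    """
    sx = sum(math.cos(math.radians(h)) for h in hues_deg)
    sy = sum(math.sin(math.radians(h)) for h in hues_deg)
    return math.degrees(math.atan2(sy, sx)) % 360.0


def tuple_hue(component_types):
    """Hue for a tuple type, derived from the hues of its components."""
    return combine_hues([TYPE_HUES[t] for t in component_types])


print(round(tuple_hue(["int", "string"]), 1))
```

One drawback shows up immediately: a tuple of (int, string) and a tuple of (string, int) get the same hue, so ordering would need a second attribute such as a gradient direction.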

The exercise I've been considering is to move through a series of language forms ranging from simple to more complex, trying to create concise visual forms that combine elegantly for each. Last week I picked up "Design Concepts in Programming Languages", which evolves languages from simple to complex, and seems like a good candidate for anchoring the requirements to explore the visual design space.

I think we've just barely scratched the surface of what it means to create a program and to understand what one does. I have a hazy vision of a ZUI shining a light through a murky sea of self-organizing lambda prolog "cells"... :)