Formal semantics for working programmers

Since DSL (Domain Specific Language) is almost an everyday topic for many programmers, I wonder how should a working programmer proceed to define and acturally use a formal semantics for her DSL? I do not mean a group of equations in LaTeX, but an effective method to help to reason about the programs within reasonable constraints for common projects. (Or, is there even bigger advantage to be taken?) To be specific, I was wondering if coq and isabelle and nusmv alike are usable for common projects? Is the cost (exclude the cost of learning) too high? And why? Thank you!

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Too Heavyweight

My sense is that Coq and Isabelle are too heavyweight for defining and reasoning with a formal semantics for a DSL, but I might, of course, be mistaken about this. NuSMV seems like it would need considerable extension in order to be useful as a tool for programming language semantics work. Some other similar tools worth looking at might include Twelf, Maude, and MetaPRL. While all of these tools probably share Coq and Isabelle's "heavyweightness," they're all interesting insofar as they make the support of the definition and manipulation of programming language semantics an explicit goal. MetaPRL is particularly interesting because of its application to the Mojave project, in which a complete compiler, from parsing to code generation, is written entirely within the context of MetaPRL. Good stuff.

So what are some more lightweight alternatives? The one I find most interesting at the moment is PLT-Redex, which is a context-sensitive term rewriting system with a nice visual interface, based on DrScheme. Another system that seems very approachable and useful is Pierce and Levin's Tinkertype.

Thanks for the many pointers

You said "heavyweight". What does it mean? Besides cost of learning, isn't a versatile tool always a good thing? Or do you mean some technical things? Thank you again. I was only begin to learn coq and isabelle. As for now, I really appreciate people telling me the possible difficulties ahead, besides of cost of learning. My sole goal is to reason about programs.

Something can be versatile bu

Something can be versatile but incredibly slow to work with. I don't write machine code when a spreadsheet would do.

Over the Learning Curve

Certainly the learning curve for any of these tools is significant, particularly if you aren't already well-versed in whatever logic(s) underpin the tools. Beyond that, though, there tends not to be a royal road to programming language semantics: it never becomes easy; we're fortunate indeed when it becomes less hard.

So when I say "heavyweight," I'm thinking, in this case, about how long I feel it would reasonably take to formalize the definition of a DSL using one of these tools, and for any DSL other than one that was so trivial as to not be worth formalizing, I see such an effort requiring months—probably more than you would realistically have in a commercial setting, for better or for worse.

There is, of course, another approach, which is to express your DSL in terms of rewriting to another language that you can already reason about. This is the approach taken in Graydon Hoare's wonderful One-Day Compilers presentation, which essentially uses camlp4 (the O'Caml Pre-Processor and Pretty-Printer) and, of course, O'Caml, to do a source-to-source transformation from the DSL to C, and then compiles the C to generate the resulting program. The beauty of this approach is that thanks to camlp4's extensibility you can develop a DSL-to-X transformer, where "X" is anything you can write a custom printer similar to Hoare's Cquot for, and take advantage of all of O'Caml's semantics in doing so.

Lucidity: clearness of thought or style

My sole goal is to reason about programs.

This is another goal entirely than stated in your original post.

By the time someone knows how to formalize a domain, or a language, in an interactive theorem prover my guess is that the burden of actually doing so becomes too big in a setting where money is to be made with a DSL. It's not about proving, it's about choosing the right design and your ability to choose the design right. Even for rapid prototyping I would first go for implementing prototypes in, well, rapid prototyping languages like Lisp, Haskell, or Python. Especially since formal proofs of algorithmic properties of programs are not interesting in most industrial settings; and I don't see formal rigorous development becoming cost-effective any time soon.

But if you want to learn/reason about programs, well, yeah, you could give Coq or Isabelle a spin (but be prepared for the rather steep learning curve).

SMV is really for modelling s

SMV is really for modelling synchronous circuits. I tried to model a router information protocol (a simplified BGP), and every implementation of SMV crashed from being asked to do something so complex.