Multi-Artifact Languages

Lately I've been wondering about a trend in software development: We seem to be grappling not so much with algorithm specification, but rather with the organization of large systems developed by large teams. As such, documentation and testing become almost as relevant as the source code itself. I was wondering whether there were any good languages for writing (1) source code, (2) documentation, and (3) unit tests all in the same location, and in an elegant way?

I'm thinking something like this:

(define (factorial n)
  (returns "the mathematical factorial of n.")
  (if (> n 0)
    (* n (factorial (- n 1)))
    1)
  (such-that
    (and (= (factorial 0) 1)
         (= (factorial 5) 120))))

And it would translate into (1) a function that computed the factorial, (2) a unit test for that function, and (3) this documentation (or something similar):

Factorial (n): Returns the mathematical factorial of n.

Examples: Factorial (0) = 1 (passed), Factorial (5) = 120 (passed).
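For what it's worth, the idea can be roughly approximated in an existing language without macros. Below is a minimal Python sketch, not a real library: the names `artifact`, `run_examples`, and `render_doc` are made up for illustration. A decorator attaches the description and the examples to the function, and two small helpers turn that metadata into a unit test and into documentation.

```python
import inspect

def artifact(returns, examples):
    """Attach a description and example-based tests to a function.

    `artifact`, `returns`, and `examples` are hypothetical names for this sketch.
    """
    def wrap(fn):
        fn.returns = returns
        fn.examples = examples  # list of (argument tuple, expected value)
        return fn
    return wrap

def run_examples(fn):
    """Run every attached example; return a list of (args, expected, passed)."""
    return [(args, expected, fn(*args) == expected)
            for args, expected in fn.examples]

def render_doc(fn):
    """Render the documentation text, including the result of each example."""
    params = ", ".join(inspect.signature(fn).parameters)
    shown = []
    for args, expected, passed in run_examples(fn):
        call = ", ".join(repr(a) for a in args)
        status = "passed" if passed else "FAILED"
        shown.append(f"{fn.__name__} ({call}) = {expected!r} ({status})")
    return (f"{fn.__name__} ({params}): Returns {fn.returns}\n"
            f"Examples: {', '.join(shown)}.")

@artifact(returns="the mathematical factorial of n.",
          examples=[((0,), 1), ((5,), 120)])
def factorial(n):
    return n * factorial(n - 1) if n > 0 else 1

print(render_doc(factorial))
```

The function, its test, and its documentation all live in one place here, though the syntax is clunkier than a dedicated form like such-that would be.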

My initial thought is that it would have to be some sort of Lisp-type macro system, but that's just a guess. Any thoughts would be appreciated.

Python's doctest module

Python's doctest module does this. It won't infer examples, as in your post, but if you include them in the inline documentation it will automatically execute them.
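For anyone who hasn't seen it, here is roughly what the factorial example from the post might look like with doctest. The function body is just my guess at a reasonable implementation; the `>>>` lines in the docstring are both the documentation and the tests.

```python
import doctest

def factorial(n):
    """Return the mathematical factorial of n.

    >>> factorial(0)
    1
    >>> factorial(5)
    120
    """
    return n * factorial(n - 1) if n > 0 else 1

if __name__ == "__main__":
    # testmod() collects the examples from the docstrings in this module,
    # runs them, and reports any whose output differs from what is written.
    print(doctest.testmod())
```

The same examples can also be run without the `__main__` block by invoking `python -m doctest` on the file.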

I hadn't really looked at doctest up 'til now, but thanks to your post I think I will!

thoughts

requires and ensures clauses would be awesome.

(requires (>= n 0))
(ensures (>= (factorial n) 0))
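Something in this spirit can be sketched with plain Python decorators. This is only an illustration: `requires`, `ensures`, and `ContractError` are hypothetical helpers written for this comment, not part of any standard library.

```python
import functools

class ContractError(Exception):
    """Raised when a precondition or postcondition does not hold."""

def requires(predicate):
    """Check a precondition on the arguments before the call."""
    def wrap(fn):
        @functools.wraps(fn)
        def checked(*args):
            if not predicate(*args):
                raise ContractError(f"precondition of {fn.__name__} failed for {args}")
            return fn(*args)
        return checked
    return wrap

def ensures(predicate):
    """Check a postcondition on the result (and arguments) after the call."""
    def wrap(fn):
        @functools.wraps(fn)
        def checked(*args):
            result = fn(*args)
            if not predicate(result, *args):
                raise ContractError(f"postcondition of {fn.__name__} failed for {args}")
            return result
        return checked
    return wrap

@requires(lambda n: n >= 0)
@ensures(lambda result, n: result >= 0)
def factorial(n):
    return n * factorial(n - 1) if n > 0 else 1
```

With this in place, `factorial(5)` returns 120 as usual, while `factorial(-1)` raises `ContractError` instead of quietly returning 1.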

also, for object systems, invariants should be able to be declared.

example usage would be awesome too. the generated documentation could syntax-highlight these, or whatever.

(example '(defun (choose n r) (/ (factorial n) (* (factorial r) (factorial (- n r))))))

or something like that.

MISC, by Will Thimbleby,

MISC, by Will Thimbleby, keeps examples in the function metadata.

Eiffel

Eiffel combines source code with documentation with pre- and postconditions and invariants. The "short" tool, provided on most systems, will remove the method implementations and provide you with the framework of your classes: the class invariant, the calling signature of the methods along with the user-written documentation, and the assertions that logically define them. Assertions are inherited from superclasses and the short form will reflect that.

Eiffel doesn't provide a way to include unit tests quite as you asked for (though the fine-grained control over visibility it provides might let you just write them into the class body). However, assertions are checked at run time (checking can be turned off too) and will verify correct functionality in most cases where a unit test would, so you might not miss unit tests too much.

Since this is all built deeply into the language, it's quite elegant.
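To give a flavour of class invariants for readers who don't know Eiffel, here is a loose Python imitation. It is not how Eiffel implements them; the `checked` decorator, the `invariant` method convention, and the `Account` class are all assumptions made up for this sketch.

```python
import functools

class InvariantError(Exception):
    """Raised when an object's invariant does not hold after a method call."""

def checked(cls):
    """Class decorator: re-check self.invariant() after __init__ and every public method."""
    def guard(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            result = method(self, *args, **kwargs)
            if not self.invariant():
                raise InvariantError(
                    f"{cls.__name__} invariant broken by {method.__name__}")
            return result
        return wrapper

    for name, attr in list(vars(cls).items()):
        is_public = name == "__init__" or not name.startswith("_")
        if callable(attr) and is_public and name != "invariant":
            setattr(cls, name, guard(attr))
    return cls

@checked
class Account:
    def __init__(self, balance):
        self.balance = balance

    def invariant(self):
        # The property that must hold whenever control returns to a caller.
        return self.balance >= 0

    def withdraw(self, amount):
        self.balance -= amount
```

Unlike Eiffel, this sketch only detects the violation after the state has already changed, and nothing is inherited or shown in generated documentation automatically.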

Bertrand Meyer's book Object-Oriented Software Construction is thought-provoking and worth a read, even if you have no intention of writing any Eiffel.

Deployment

The thing that strikes me about large (typically web) projects is the abundance of XML configuration files. This also somewhat correlates to the abundance of configuration files in a typical *nix system. In many areas, defining where and how the components are to be run is quite complex nowadays.

abundance of xml configuration files

Are you suggesting that the configuration could be written directly into the application? I do not believe that the large web app administrators would be happy with changing the code when they want to change the configuration.

On the other hand, there is surely a better way to manage that complexity.

Clearly one wants to

Clearly one wants to separate configuration from application code (that's the whole point of the distinction), but surely XML is not the ultimate way to represent configuration information.

About XML

XML in itself is well suited to represent structured data, such as configuration options. It has several advantages:
- Standardised
- Widely used
- Human-readable
- Resilient (easy to correct)

However it does not look like we are using the right tools when dealing with it. There should be a middle way between relying on the vendor tools to edit the configuration and editing XML by hand in all its textual glory. Why can I not find a generic SQL-based front-end for XML files scattered around my filesystem like so many mini-databases? For all major platforms and with no additional cost, please.

XML

No, XML is definitely not the best way to represent this configuration information. Having just completed a (relatively small) Java project, which had config files for Spring, Hibernate, the J2EE web app, Ant, and log4j (and probably others), I don't see what benefit XML has brought. Spring in particular strikes me as an overengineered and underflexible scripting language with lots of XML overhead. Configuration for our application was spread over perhaps a dozen separate XML files in various dialects. Not to mention that the lack of flexibility also led to some hacks (e.g. a bean that exists purely to call methods on other beans from Spring) and to some custom code being written to wrap classes just to make them Spring-friendly. All of this could have been specified much more concisely and clearly using a scripting language (Lua, Tcl, Python, Scheme, anything), and in a single short file too. In particular, a good config language needs at least some mechanism for abstraction: cut-and-paste config is just as bad as cut-and-paste code.
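To illustrate the abstraction point, here is a small sketch of configuration written in a scripting language (Python in this case). Everything in it is invented for the example: the `data_source` helper, the setting names, and the hosts do not come from any real project.

```python
def data_source(name, host, pool_size=10):
    """Build the settings for one database connection (hypothetical schema)."""
    return {
        "name": name,
        "url": f"jdbc:postgresql://{host}/{name}",
        "pool_size": pool_size,
        "log_level": "WARN",
    }

# Three near-identical entries, written once as a function
# instead of being cut and pasted three times.
CONFIG = {
    "data_sources": [
        data_source("orders", "db1.example.com"),
        data_source("billing", "db1.example.com", pool_size=25),
        data_source("reports", "db2.example.com"),
    ],
}
```

Changing a shared default, such as the log level, is then a one-line edit rather than a search through several files.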

Obviously, what is needed is

Obviously, what is needed is a DSL that would generate the configuration files, etc. right?

DSLs

I'd rather the DSL replaced the configuration files, rather than generated them. Indeed, the XML files are DSLs, so adding another one only helps if it can subsume all the others. These various XML files are essentially doing exactly the same thing: instantiating and configuring some components, i.e. exactly what scripting languages were designed for. My feeling is that third-party libraries shouldn't read configuration files at all, but instead plug into the application's configuration mechanism. That way all the DSLs can be embedded in a single host language, and common definitions/config can be shared. Java's support for reflection and things like the javax.script API should make this fairly straightforward.

I do not believe that the

I do not believe that the large web app administrators would be happy with changing the code when they want to change the configuration.

Depends whether your language supports easy live upgrade. If so, there is little disincentive to using the same language, or some embedded DSL, for configuration too.

Tool support

If you use the same language, you get the benefit of your IDE and other tools supporting it too. This can be a really big win - in some cases (enterprise Java applications especially) the configuration is just as complex as the code, but lacks the tools we have with code to help manage that complexity.

Exactly. The interface could

Exactly. The interface could be declarative when it needs to be, but also support hairier relationships should they be needed.

Why not?

"Are you suggesting that the configuration could be written directly into the application?"

Actually, why not do exactly that? The xmonad folks do it and it seems to work well. The 'configuration' is just an ordinary code file that loads the application as a module and overrides default values as necessary. Then the application is recompiled and reloaded with a single keystroke. It's incredibly slick as long as you already know how to write (or at least read) Haskell. Assuming you do, a configuration can be extremely simple and intuitive, or it can be somewhat less simple and intuitive but leverage the full power of the host language.

Finally, the biggest benefit is that the application writer can spend that much less time parsing, marshalling, and error-checking the configuration (since it'll come to you type-checked and in a 'native' format).
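The same pattern can be sketched outside Haskell. The Python below is only an analogy, not xmonad's actual API: `Config`, `DEFAULT_CONFIG`, `run`, and the field names are hypothetical stand-ins for what an application might export.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Config:
    """Settings the (hypothetical) application exposes, with their defaults."""
    border_width: int = 1
    terminal: str = "xterm"
    workspaces: tuple = ("1", "2", "3")

DEFAULT_CONFIG = Config()

def run(config):
    """Stand-in for the application's entry point."""
    return f"starting with terminal={config.terminal}, border={config.border_width}"

# The user's "configuration file" is ordinary code:
# take the defaults and override only what differs.
my_config = replace(DEFAULT_CONFIG, terminal="urxvt", border_width=2)
print(run(my_config))
```

A misspelled field name in the `replace` call fails immediately with a `TypeError`, which is a much weaker guarantee than Haskell's type checker gives but shows the same idea: the application receives a native value and has no parsing code of its own.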

I did not find the

I did not find the documentation on the algorithms used by xmonad (does it simply shut off and come back instantly?)

That approach would certainly work when the modules of the application are sufficiently decoupled, which is unlikely unless the code has been refined for 11 years (age of the X11 protocol). Otherwise you will probably need to restart the whole application. I think it is well worth decoupling with a view to allowing such techniques, but it is not that easy (and traditionally that means one more configuration file for the module, ugh).

How about Dynamic