Literate Programming: Retrospect and Prospects

LP has been mentioned a number of times on LtU but never featured as a topic of discussion in its own right. On the face of it, it seems like an eminently sensible way to program. Why hasn't it taken the whole world by storm? Knuth puts forward Jon Bentley's observation as one possible answer: "a small percentage of the world's population is good at programming, and a small percentage is good at writing; apparently [Knuth is] asking everybody to be in both subsets."

To discuss this and other theories on their merits, a quick refresher on the basics of LP is in order. As usual, the relevant Wikipedia article is informative but bland. As Knuth pointed out, original sources are often best. Here are two good ones:

  1. Programming Pearls: Literate Programming by Jon Bentley and Don Knuth; CACM, Vol. 29, No. 5, May 1986. (A bootleg copy available here.)
  2. Programming Pearls: a Literate Program, by Jon Bentley, Don Knuth, and Doug McIlroy; CACM, Vol. 29, No. 6, June 1986. (Bootleg copies available here and here.)

The second paper is the more interesting of the two. It contains a literate program by Knuth and a review of the same by McIlroy:

Knuth has shown us here how to program intelligibly, but not wisely. I buy the discipline. I do not buy the result. He has fashioned a sort of industrial-strength Fabergé egg -- intricate, wonderfully worked, refined beyond all ordinary desires, a museum piece from the start.

I, too, buy the discipline for programming in the small but can't really see how CWEB-like systems can be adapted to and adopted by multi-hacker teams working on very large code bases written in a mixture of different languages. Ramsey's Literate Programming on a Team Project enumerates some of the problems.

Can LP be used for anything other than small-to-medium programs written by a single person in a single language?

what's next?

It's pretty clear that CWEB doesn't quite cut it. If you were to try to bring the "essence of WEB" to the unwashed masses of professional coders, in which direction would you take the original WEB? Porting it to yet another language doesn't seem like a big step forward. Really now, would we benefit from having SqueakWEB or somesuch? (What would that even look like? Source code in files?) Making it easier to port WEB to arbitrary languages seems like a good move. But Ramsey has already partially accomplished that with SPIDER -- the programming-language half of WEB is pluggable but the typesetting language is hardwired to be TeX. (Lout would be a more sensible choice these days, if you didn't care about Unicode.)

Will these and other questions be asked of and answered by Alex Plotnick? We'll find out tomorrow night.

Poor TeX!

I know TeX is not much loved as a programming language, but it (oops, Charles Stewart correctly points out that I mean LaTeX2e) is still very much the markup language of choice in many professional disciplines, with my own (mathematics) perhaps the most insistent on this choice. I think that means that this:

Lout would be a more sensible choice these days, if you didn't care about Unicode.

at least deserves some explanation. I'm not saying it's untrue—I've no experience with Lout (and quite a bit with TeX), so no basis on which to speak—just that I don't think that it's self evident.


The markup language usually chosen is not TeX but LaTeX2e: while the reference implementation of LaTeX2e is TeX 3, there are two main reasons to count LaTeX as not just being a purely macro-realised TeX dialect:

  1. Source-to-source transformations: publishers typically want to re-represent LaTeX code so that it conforms to their house style, to gather together several LaTeX articles into one LaTeX book, &c. Unrestricted use of TeX can subvert some assumptions about the structure of the document.
  2. Translations of LaTeX into other formats, e.g., SGML, for typesetting: these translations typically don't support all of the TeX language. E.g., the Handbook for the Digital Library of Mathematical Functions uses a Perl-based LaTeX-to-XML converter, LaTeXML, which simulates the execution of a subset of the TeX language.

Duly quibbled

You make a very good point, that one should not conflate LaTeX and TeX, any more than one should call Perl C because the Perl interpreter is written in C. Thanks!

I hadn't heard of LaTeXML before; I only knew about JadeTeX, which goes the other way (and therefore supports TeX more or less perfectly, and XML less perfectly). It does remind me of my one gripe about TeX/LaTeX, that, with all the bizarre powers it offers—like manipulating catcodes—that make it a less-than-ideal programming language, it doesn't offer what seems like the natural power to swap out the back-end. That is, it wants to output DVIs, or at least PDF or PS or some other sort of final-product document.

I have wished many many times to be able to have access to a TeX document just after it was compiled down to ‘primitives’ (by which I mean something a little less primitive than the actual primitives—I don't directly care about \kern, for example) and just before it started lumping the text into boxes to be laid out on a page; that is, to be able to use TeX as something like an m4-style macro expander. It seems to me that being able to do this would make both of the applications you mention, to which TeX is exactly not suited and even LaTeX is not perfectly suited, essentially trivial.

Why literate programs?

Axiom is a large (about 1M things of code) computer algebra program
written in common lisp. It is being rewritten into a literate style.

I was one of the original authors at IBM Research. The code was sold
to another company and was a commercial product for years. It was
withdrawn from the market and given to me to open source it.

I discovered that I was unable to understand programs I wrote 15 years
ago. I knew WHAT it did but I did not know WHY it did it or WHY it did
it that way.

The issue of literate programming is an issue of writing a program
that LIVES rather than writing a program that WORKS. In a commercial
setting you pay to train new people on programs but in an open source
setting there is no training. Beyond a certain point of complexity
you can only understand a program by talking to the author about why
they wrote it the way they did.

Axiom, for instance, contains algorithms that are backed by research
papers. However, the algorithm gets changed over time so it no longer
matches the research due to optimizations or other rewrites. These
changes can be important but very obscure. No amount of clever naming
and javadoc-API commenting will help.

So I searched for a long term solution where hundreds of authors can
communicate over dozens of years. Literate programming is that

Literate programming is not a good idea for programs that just need
to work rather than live. If the program is going to be rewritten or
thrown away in less than 5 years then don't bother with literate
programming. But if your program needs to live forever then you
really need literate code.

Thank you

Good to know somebody else feels my pain.

I've been encouraged to write about this. See:

My Re-thinking of Literate Programming
Flexible Order of Elaboration
Planned Obsolesence

Warning: this is very much brainstorming personal diary style writing.

Feedback, here or via e-mail, appreciated.

One thing to note is I don't typically share your pain about why I coded something some way. Instead, my concerns are data schema related. I want to be able to ask, "If I wiggle this in the front, what wiggles in the back?" For me, design and optimization tend to be orthogonal.

I plan to add a recommended reading list soon, as well as an assessment of current tools.

Revision control

It seems to me that a better answer to your kind of WHY question would be a well-maintained revision history of the code base, with properly explanatory commit messages and a toolset that makes it easy to browse the code's history.

Not exactly

What rcs system lets you input searchable rich hypermedia or searchable typeset commit messages?

Moreover, this is like comparing ACM Portal Search for "tree pattern matching" and "tree acceptance" and "tree parsing" to Loek Cleophas's compilation of tree algorithms into a coherent taxonomy with FireWOOD and ForestFIRE. Yeah, you could do things the hard way, but over the next 200 years, your way will have much more bit rot.

Commit Messages

Axiom, for instance, contains algorithms that are backed by research
papers. However, the algorithm gets changed over time so it no longer
matches the research due to optimizations or other rewrites. These
changes can be important but very obscure. No amount of clever naming
and javadoc-API commenting will help.

In my experience code repository histories with good commit messages often do a decent job of documenting the WHY behind small-to-medium sized code changes.


Didn't see that the Boston Lisp group had this as a discussion... I'll probably stop by if I can finish early today.

Can LP be used for anything other

Can LP be used for anything other than small-to-medium programs written by a single person in a single language?

How would you classify the GHC compiler & library source code, which includes about 148,000 lines of code+text in Literate Haskell files? Not all of that is heavily annotated, but many significant bits are. There's also another 197,000 lines of ordinary Haskell code, some of which has inline documentation (Haddock).

Literate Haskell is not traditional literate programming in some respects, but that may actually make it more practical. The fact that Haskell compilers can handle literate sources directly is an important feature.
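
To make "handle literate sources directly" concrete, here is a rough sketch of the bird-track convention, in which lines beginning with ">" are code and everything else is prose. It is written in Python purely for illustration; the function name unlit and the sample text are assumptions, and a real compiler does more (for instance, keeping line numbers aligned and supporting the \begin{code} style).

```python
def unlit(source: str) -> str:
    """Keep only bird-track code lines ("> ...") from a literate source.

    Simplified sketch: prose lines are dropped entirely rather than
    blanked, so line numbers are not preserved.
    """
    code_lines = []
    for line in source.splitlines():
        if line.startswith(">"):
            # Drop the marker and one optional following space.
            body = line[1:]
            code_lines.append(body[1:] if body.startswith(" ") else body)
    return "\n".join(code_lines)


# Hypothetical literate source: prose and code share one file.
example = """\
This paragraph is prose and is ignored by the compiler.

> double :: Int -> Int
> double x = x * 2

More prose.
"""

print(unlit(example))
```

The point is that the file the programmer edits and the file the compiler accepts are the same one; only a trivial, line-by-line filter sits between them.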

One traditional LP feature that Haskell rightfully rejects is that of allowing the LP system to define its own code abstractions, i.e. being able to name code fragments and reuse them by name. That feature of older LP systems seems to have just been a workaround for languages with limited abstraction mechanisms, and detracted from what the focus of those systems should have been.

In the classic systems like (C)WEB, this contributed to a problematic distance between the LP sources and the code which the compiler or interpreter saw. Any system that involves running some sort of tangle & weave operations as separate tasks so that the compiler sees something different than the programmer is ultimately doomed.

This seems like one of those areas that'll evolve very incrementally, possibly evolving from the current inline code documentation systems. It'd take a major breakthrough in terms of features and usability to overcome the problem that traditional literate programming has had to date: that it involves a paradigm shift in dealing with source code and even how to write code, but it hasn't seemed to offer the clear, convincing benefits, and lack of drawbacks, that would be needed to justify that shift.

Literate Haskell is not

Literate Haskell is not traditional literate programming in some respects...

I planned to mention the Haskell case, and ask why it is so much more successful. Thanks for saving me the need to raise the question. The traditional explanation (== the multiplication of two small probabilities) does seem to apply to the Haskell case, as well.

Haskell is more successful,

Haskell is more successful, IMHO, b/c of the tools MODEL. The compilation model for Haskell fits the natural ebb and flow of a diary.

In short, Haskell allows very rich flexible order of elaboration. Still not nirvana, though.

LP and aspect orientation

Don't forget that besides LP abstraction (aka chunking) you can also extend chunks (multiple definitions are flattened in sequence). This is a powerful tool to support separation of concerns, at least when features can be composed linearly.
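
For anyone who hasn't used chunk extension, here is a minimal sketch of the idea in Python, assuming a simplified noweb-like syntax ("<<name>>=" opens a code chunk, a lone "@" returns to prose). The names parse_chunks and tangle are made up for this example, and there is no cycle detection or error handling.

```python
import re
from collections import defaultdict

CHUNK_DEF = re.compile(r"^<<(.+)>>=\s*$")       # e.g. "<<main>>="
CHUNK_REF = re.compile(r"^(\s*)<<(.+)>>\s*$")   # e.g. "    <<steps>>"


def parse_chunks(source: str) -> dict:
    """Collect code chunks; repeated definitions of a name are appended in order."""
    chunks = defaultdict(list)
    current = None
    for line in source.splitlines():
        match = CHUNK_DEF.match(line)
        if match:
            current = match.group(1)
        elif line.strip() == "@":
            current = None  # back to prose
        elif current is not None:
            chunks[current].append(line)
    return dict(chunks)


def tangle(chunks: dict, name: str, indent: str = "") -> list:
    """Expand a chunk recursively, replacing references with their bodies."""
    out = []
    for line in chunks[name]:
        ref = CHUNK_REF.match(line)
        if ref:
            # Carry the reference's indentation into the expanded body.
            out.extend(tangle(chunks, ref.group(2), indent + ref.group(1)))
        else:
            out.append(indent + line)
    return out


# Hypothetical literate source: "steps" is defined twice, once per concern.
doc = """\
The program prints a report.
<<main>>=
def main():
    <<steps>>
@
First concern: loading.
<<steps>>=
data = load()
@
Second concern, added later: logging.
<<steps>>=
log(data)
@
"""

print("\n".join(tangle(parse_chunks(doc), "main")))
```

The two definitions of "steps" live next to the prose for their own concern, yet the tangled output places them one after the other inside main. That only composes cleanly when the concerns can be sequenced linearly, which is the caveat above.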

Any system that involves

Any system that involves running some sort of tangle & weave operations as separate tasks so that the compiler sees something different than the programmer is ultimately doomed.

More generally, any composition-based approach that uses Web/Tangle/Weave is ultimately doomed (a slight alteration of my original position; see the LtU thread What is the best literate programming tool/environment/research work? and in particular this comment). It is trying to solve problems at the wrong levels of abstraction, at every level of abstraction, thus creating Abstraction Inversions everywhere that are downright counterproductive and unnecessary.

Again, Haskell does well mainly because its programming model is one of the best for flexible order of elaboration.

The Great Wheel of Reincarnation must be turned, again - THIS TIME for programming environments. (A point not mentioned by Barbara Liskov at OOPSLA when she was dissing aspect-oriented and giving the audience a CS History 101 lesson).

LP and Algorithmic Code

I've always thought that LP is primarily good for heavily algorithmic code, e.g., just the sort of thing that TeX is full of. Code that implements complex algorithms is often horribly opaque, and that's *before* you start optimizing it. Then it gets worse. But lots of modern software isn't complicated in that way: it's just lots of simple things piled on top of each other. LP doesn't help that much with the simple things, and it's not really directed towards capturing the large architectural details.

What about visuals?

I think you can consider any real LP system (e.g., CWEB or NOWEB) as a command-line alternative to a graphical development environment: it trades obscurity of control commands for obscurity of toolkit options. Both give a visually (if subjectively) very pleasant way of looking at programs, in the sense that you can include any figures or formulas in the text and structure it in any way you like. Doesn't this help capture some architectural details?

Also, inversion of code and docs containment helps to abstract away some incidental syntactic clutter that you get with specific languages such as ML or Haskell or alleviate some idiosyncrasies of Scheme. That is, you can start with the document/program structure (using named chunks) and refine it all in the appendix with the actual code.

But I have to say that I do miss the power of a Turing-complete language at the LP chunking level...

LP could/should blossom in internet/wiki/Web2.0 world

I always thought that LP was a great idea much before its time. With the Internet, HTTP/HTML/XML, blogging, and the long tail of context on the WWW, we *now* have sufficient technology to host our code in rich content, maintaining context to make it properly "live". Even code which is capital equipment, destined to be used up and thrown away, could be (and sometimes is) "opened" onto wikis and blog/tutorials which intermix natural language explication with the code itself. This can and (rarely) does include multi-media explication.

We just don't develop and deliver code that way. Even open-source code, whose purpose is to attract users and support, is hidebound by the concept of an installer separate from the support web pages and documentation. Partly this is because coders are as rarely good movie makers as they are superb writers, but it is also because no one has taken the time to redesign LP for a modern context:

I want a "CodePress" which supports LP in multiple languages across distributed teams of developers in a way analogous to the way that Drupal and "WordPress" support the interchange we are engaged in this minute!

Have you been paying

Have you been paying attention to new open-source project hosting companies/websites like Gitorious and GitHub?

I'll probably write a blog post about this in the future as part of my LP diary, but the bottom line is that new DVCS hosted websites are completely changing the way software is done, and it will have dramatic effects throughout industry!

Here is one example: Microsoft just launched CodePlex Foundation and put Sam Ramji as its lead. CPF is now independent of MS and Ramji thus no longer works there as an open source evangelist. He has clearly done his job in that capacity, and now it is time to move on to something bigger. Yet CPF is behind the times in terms of how to charter a massive online collaborative code outsourcing system.

Punditry: I think this spells the death of the concept of a SDK as we know it, and permanently changes what qualities we use as an industry to define 'platform'.

Adoption issues

Naively, I think that literate programming will suffer (and does suffer) from adoption issues more than anything. Not to put too fine a point on it, but a number of programmers like nigh unreadable code. They like the 'magic' of it, especially starting out. Even experienced programmers, I find, have an unhealthy obsession with terseness. Even if a good language is made to support the concept, the adoption roadblocks will make it hard for the language to reach critical mass.

Choose your literacy

I think discussions of the wonders of literate programming tend to vastly underestimate how hard it is to practice in your typical industrial project environment, even if you are a decent writer and coder and value clear code.

Just as a new idea late in an essay or blog post can call for a substantial rewrite of earlier material, relatively small changes made to the code can require a pretty major rewrite of your literate text.

When you are making dozens of changes daily (bug fixes and features), the text that made perfect sense this morning could be lying gibberish by quitting time.

On top of that, unless your literate code is a valued deliverable as it can be in an academic context, the cost of the upkeep isn't going to be worth it.

This is why agile approaches to documentation focus on clarity and comprehensibility of the code itself and of its supporting tests: the pay-offs are similar but more closely aligned with the deliverables.

On a happy PL note, this is actually a problem that PL expressiveness features can help you with: offering better abstractions, such as rich type systems, helps you to make what the code means and what the code says come closer together, without having to resort to separate literature.

Code is just Data

Software development processes need to be more than just a neat homoiconicity trick.

Data has no value in and of itself; put it into a context, and you get information. The programmer should be a knowledge worker, not a punch card operator.

Big idea in software architecture right now: Put your codebase to work to help you solve problems, rather than just a developer taking swags at solving problems in your codebase.

Code is Language

I'm not sure if the content of your message was intended to respond to the content of mine or not, but the sentiment expressed by your title

Code is just Data

strikes me as an odd one for a PL/PLT enthusiast.

A big part of the appeal of this subject area for me is exactly because code is a very elegant language that at its best is rigorous enough to be understood by a machine, but clear enough to be understood by a person.

A competent writer needs to master the structure of their ideas to be able to express them clearly, and also needs to master the structures and idioms of their language to express those ideas effectively.

I think of a competent programmer as just a competent writer in a specialized domain using a specialized type of language.

Relying on non-code prose to express the clarity of your programming is like relying on being able to wave your hands to clarify things when you are giving an oral presentation: it may help on occasion, but the effort might be better spent making the main content stronger.

For some reason I am

For some reason I am reminded of the discussion of pseudo-code...

Pseudo-prose's equally evil twin

I am reminded of the discussion of pseudo-code...

That was lurking in the back of my mind too... ;-)

I should've said: I agree

I should've said:

I agree that automated testing is important, and strongly believe in doing BDD or TDD or contracts-first where it makes sense.

At the Boston Lisp meeting, somebody actually asked CLWEB's author, Alex Plotnick, about including tests as part of the literate program's WEB. Alex said he chose not to do this directly, but there was a separate output stream for testing the code.

Another good point Alex made was that he found he didn't write long, monolithic subroutines like Knuth did, because he felt Common Lisp provided him with more code-oriented ways to chunk his program. Alex then directly compared Common Lisp's modularization facilities, and noted he used fewer WEB named sections than Knuth, because his programming language eradicated many uses Knuth had for named sections. In turn, Alex said he was also then able to use named sections in different ways than Knuth had.

Relying on non-code prose to express the clarity of your programming is like relying on being able to wave your hands to clarify things when you are giving an oral presentation: it may help on occasion, but the effort might be better spent making the main content stronger.

And relying on humans to read your code, when a machine can do it for you as you program, is premature quality gate optimization. Just as a code review doesn't really need a human element, literate programs don't necessarily need an uninterpretable English prose element. Natural language researchers have been studying the Bible and other classical works forever, discovering new forms of how phrases in one verse of prose can make an overt reference back to another verse of prose. Sometimes, these references aren't even in the same "book" or "chapter". For this reason, linked commenting is very useful.

Automated tests, especially done in a BDD-style, tend to serve as a form of linked comments. Except the linking is done in build.xml or hoisted into a Continuous Integration server. (Before Alex's talks, Daniel Herring gave a talk that basically said, among other things, yeah, great build systems are still a research problem.)

Now, you might argue that in most cases this is overkill. You're right. Ted Nelson was simply wrong about transclusions being that necessary, as the design of the world wide web has proven. On the other hand, the failure of the semantic web suggests things like transclusions are very useful, as when you only have one-directional links you often end up inventing isolated subsystems like RDF and OWL to relate documents to one another.

Relying on non-code prose [...] may help on occasion, but the effort might be better spent making the main content stronger.

I agree. David hit a major pain point of most programmers being afraid to use good abstractions due to trying to address deep composition of orthogonal concerns in very non-orthogonal, tightly coupled ways.

Going back to premature quality gate optimization...

Code is just Data

strikes me as an odd one for a PL/PLT enthusiast.

Brooks said you can't make a baby in 1 month with 9 women. He may not be a medical doctor, but I believe him. He also said you should eliminate communication between people as much as possible, and suggested you make a team of 7 members. One of these members he called a Language Lawyer, responsible for peer review of code and other duties. Well, if your Language Lawyer is a computer with an antenna for detail as ferociously acute as Herb Sutter's or Andrei Alexandrescu's, then you can pair program with this Language Lawyer and also have very fine control over when you ask for advice and when you listen to advice. You can't get this with humans. Moreover, a computer can paint a flourish on your 30" widescreen monitor, and bring up associated documentation for you on your 20" vertically positioned side monitor. Herb Sutter would instead maybe point a bulbous finger at some line of code and say something. Moreover, his message might not be the same thing every time, so you've got way more to interpret. Finally, because computers' brains can be copied easily, you can share Herb's Laws with every programmer in your office.

I am a usability freak first and foremost, then a theory enthusiast.



Oblique references to LP

Apart from my thread on LP earlier this year, there have been some past discussions of LP; they are just not easy to search for:

Wouter van Oortmerssen's Abstractionless Programming.

Also, Living it up with a Live Programming Language mentions a project for advanced Literate Haskell: Vital/Pivotal - good examples of Haskell's tools model making literacy easier. Compare this to the Stylesheets approach of Fortress!

LP perspective

Literate programming ultimately is a testament to the limitations of language expression. We should be seeking a way to be rid of 'cold comments' and 'dead documentation' entirely. This means:

  • Support clean syntactic expression of 'intent'. This requires adapting the syntax of the language to the domain(s) most immediately relevant to the developer. That in turn suggests use of a rich extensible parser, supporting embedded DSLs and domain-specific syntactic 'tweaks'. Graphical manipulations of code should also be supported where appropriate (i.e. a figure or image might reflect code). Support drill-down in IDE to teach programmers the underlying 'meaning' of any syntax, and let the DSL itself teach the domain to programmers.
  • Allow rich and meaningful 'annotations'. To be rich, comments should possess deep structure of their own. To be meaningful, the annotations must be kept in the AST and one must be able to (from within the language) extend or transform language post-parse processing pipelines to leverage these annotations (i.e. to utilize optimization hints, to make assertions where possible, etc.). Annotations could include display markup considerations (including interactive elements such as hovertext or suggesting IDE represent code as graph).
  • Provide zero-button continuous testing as one edits, with simple visual display of which unit and integration tests are breaking or repairing in remote code elements as one edits, along with a continuous stream of parser, type-system, compiler, etc. warnings and errors related to the user edits. This encourages exploratory programming and rapidly teaches relationships between code without need to document the relationships.
  • Develop a programming environment for widespread collaboration across many projects, such that refactoring and integration testing can readily occur across projects and users, etc. This extends comprehension of how people are using library code, and allows one to refactor APIs and DSLs with fewer backwards compatibility concerns (i.e. you go in and tweak everyone's use of the library, allowing you to develop the API or DSL itself without quite so much big design up front). Encourage this further by supporting abstraction-smashing optimizations and effective dead-code elimination and good security abstractions (too many people reinvent the wheel or resist deep abstractions due to orthogonal concerns, such as performance, persistence, security).

I'd like to see a language where 'cold comments' and other forms of 'dead documentation' are strongly discouraged, where explanations can be automatically justified (or at least subject to automated analysis for apparent contradictions with one another, the tests, the code they explain), where the editing tools and language design reduce need for commenting the 'why' by automating programmer-education of the relationships between code.

Intent is not enough

Code can't explain who should be contacted if a bug is found, or that this code is due to be superseded in 2010Q2, or point out how it needs to be adapted if moved to another environment, or tell you the practical limitations of the algorithm it implements.

Yes, you could do all these things in annotations, but they'd have to be an open-ended set, just like natural language.

Code can't explain who

Code can't explain who should be contacted if a bug is found, or that this code is due to be superseded in 2010Q2

Just use Intentional Programming Studio and your weirdest code snippet becomes transparent to the reader - maybe even to yourself.

Leo addresses this problem.

Leo - Literate Editing with Outlines was inspired by literate programming and works well in this space. It allows you to write code in an outline, and then, if the comment mechanisms threaten to overwhelm the source code, you can use the fact that it works not on trees but directed acyclic graphs to link the source code into a parallel RST document.


I used Leo for organizing a major rewrite (9? years ago). Literate programming was 'ok', but the major revelation for me was the usefulness of outline editors.

Since then I've been complaining about having to put code (and documentation) in "files".

What about other languages?

How would you translate inline documentation approaches like literate programming into other languages? Seems a tad too monolithic. This is why Mono separates documentation from source code.

Hypermedia "Cards" metaphor

Mono's approach is more easily extensible to the "Cards" metaphor.

Until recently, the Javadoc/NDoc/Sandcastle approach was more practical. IDEs are now good enough that they should be able to easily load associated "Cards" when looking at a particular piece of source code.

Most peer code review tools do something like "Cards" anyway, so the Mono approach just supports superior integration of concerns.

Things MSDN does, like displaying the language syntax in the help file, can also be done with "cards" metaphor by just writing a "Card" that takes a CodeDOM Walker and Language as parameters, since for interfaces the mappings are usually very simple.

[Edit: forgot to mention that you do want code and comments integrated (but not necessarily tightly coupled to the same file). Somebody did a research study around the time javadoc was built, showing that the locality of documentation with respect to code was a MAJOR factor in detecting avoidable flaws. Unfortunately, I can't find this research paper.]

Is it still literate programming?

Is it still literate programming?

Sort of

If you're using an IDE and you can bring up the right pieces of information, then it is still jointly connected and should be browsable by any hypermedia-aware system -- such as the IDE's browser. [Edit: Knuth also prophesied this would happen!]

Mono's tool designs are very unusual to most programmers. Mono just places a really high investment in tools compared to most open source projects. This is mainly due, I think, to how C++ caused the Ximian folks to melt down mentally at times, and they basically said, "Never again." Also, great tools (and licenses) make it easier to scale up a project to thousands of contributors.

What I call the "Knuth Model of Literate programming" views things differently. First, Knuth does things very monolithically. He thinks your prose should all go in one file with your code, and you do things like write alternating weave and tangle escape sequences to make the LP environment "Web/Tangle/Weave"-aware.

The real downside to Mono, when I looked into it, was you had no way to certify or whatever a set of documentation really matched a set of source code. In other words, you had no way to say, "this documentation is English prose for this set of design philosophies, unit tests and integration tests".

I thought that …

The real downside to Mono, when I looked into it, was you had no way to certify or whatever a set of documentation really matched a set of source code.

I thought that the usual LP programming methodology also offered no such certification. Have I misunderstood it?

You are correct - I know of

You are correct - I know of no component or module packaging solution that carries with the component or module a resource that describes all the QA measures used with the code. Proof assistants and automated theorem provers have a concept of "proof-carrying code", but there is no tight integration of concerns for non-heavyweight formal methods like program proofs. I'd say "quality gates" are the best metaphor.

However, that doesn't change the principal movement of the early 2000s, the "opinionated software" view that code should be test-driven, which later morphed into the even more opinionated Behavior-Driven Development view that tests should be literate specifications that describe use cases and scenarios as "user stories".

A lot of the points I've mentioned (and David Barbour also mentioned this) kind of take a step back and look at what people actually want - as well as what would attract the non-traditional, non-stereotypical programmer who might prefer a language like Field or Subtext or Processing.

My views are very much a hybrid of Ted Nelson and Donald Knuth, and I try to capture a lot of Ted Nelson's forward-thinking about technology that he did in Computer Lib/Dream Machines.

What do all these activities, if practiced, have in common? They help communicate what happens (or should happen) when code executes.

I hope this doesn't sound ideological. The whole point is to say multiple communities exist, each with a piece of the puzzle.

Reimagining Literate Programming

OOPSLA '09 paper: Reimagining Literate Programming

This paper features a few references I've not seen before, which is nice. However, I don't understand how they are "reimagining" anything apart from what Knuth had already imagined. -- For literate programming, you need a data model. Knuth imagined one over two decades ago. This "reimagining" paper does not seem to alter that model. It still uses Knuth-style chunks and named sections, but unifies the syntax so that the end-user must master only one language: G-Expressions.

Ginger’s simplifying assumptions that unify a single syntax used for code, documentation and literate glue also simplify the actual implementation. Since G-expressions implement every aspect of the literate program, we can simply manipulate these hierarchical data structures to generate a set of G-expressions that generate code or a set of G-expressions that generate documentation.

Basically, this is a paper about a PL with first-class support for LP, which is neat and cool, but the execution model is still web/tangle/weave.

Also, the formatting of this paper is insane. The long code examples would best be pushed into an appendix -- OOPSLA's required two column layout is not friendly to LP demonstration and is more concerned with tight page count!

Seems to me, we simply need

Seems to me, we simply need a type class-like abstraction for comments. This way we can overload comments for each language to which the comments need to be translated, and the compiler can know exactly which comments are available or missing, and a smart code editor can use that information to provide traditional code views with embedded comments.

Also, the formatting of this

Also, the formatting of this paper is insane. The long code examples would best be pushed into an appendix -- OOPSLA's required two column layout is not friendly to LP demonstration and is more concerned with tight page count!

I wouldn't blame OOPSLA (and I actually much prefer 2 column papers). Perhaps this is a weakness of LP systems? 2-column papers are just one viewing medium: as a developer, I use many when looking at code.


Maybe it is a shortcoming of LP systems.

Systems do have weaknesses. Some systems are based on merely good ideas. Some systems are based on great ideas.

closing thoughts

Re: comment #51863…

Alex Plotnick's CLWEB turned out to be really, really nice for what it is.¹ For everything else, it's lacking a feature whose absence will likely be a showstopper for most Lispers — it doesn't work under SLIME. A lot of said Lispers would be utterly fascinated though by the sickness of the trick by which Plotnick chose to parse his literate *.clw sources. He hacks the daylights out of *readtable* and uses the Lisp reader to do the parsing for him.

Of course, being able to browbeat the Lisp reader into parsing .clw files does not make them valid Lisp source code. You still have to tangle .clw into .lisp, with the resulting output being unsuitable, by design, for comprehension by humans.

Re: comment #51865…

By your own admission, Literate Haskell is anything but. I'm not sure then what to make of your intimation that GHC furnished a counterexample to my assertion that LP was unlikely to work for programming in the large, trivial counterexamples be damned. Literate Haskell goes against the "essence of WEB". (A quick account of its transgressions can be found in Glenn's paper².) Stated briefly, if I feel like moving a chunk of code out of a where clause into a literate section of its own, I should be able to do so. This is my inviolable right as a Literate Programmer. I, the author, dictate the order of exposition — not the syntactic restrictions of my programming language.

Let's refer to the section mechanism by Plotnick's term "Parameterless (macro)Expansion With Splicing", abbreviated to PEWS for convenience. Now, PEWS is to LP what eggs are to an omelette. Whether you like it with mushrooms — the omelette, not LP — is your personal choice. You can't, however, remove an essential ingredient and continue to call the dish by its traditional name.

It's abundantly clear that PEWS is not a good feature for any sane programming language. As such, it must remain extralinguistic. It can only exist in a WEB — not in the target programming language.

On the other hand, single-parameter macros introduced by Knuth in the original WEB are clearly an inessential hack whose sole purpose in life was to work around perceived limitations of Pascal as a systems programming language.

What else defines the "essence of WEB" then? Flat organization of sections is as essential to true LP as the flat namespace is to a true WikiWikiWeb. Sure, attempts have been made to bring to LP the hierarchical subdivision into chapters, sections, subsections and suchlike. So, too, wikis have been implemented that allow one to carve out subnamespaces. Clear as day, both of these abominations run contrary to the spirit of their respective progenitors. (By the way, I don't think it's a coincidence that Ward Cunningham was an early LP adopter.³)

Another feature that seems essential is the visually polished nature of the woven output. TeX seems like overkill, yet something as lame as ReST is not good enough.

Yet another feature that may seem indispensable at first blush is linearization of exposition. To quote Plotnick again, LP offers a royal road to comprehension through "the King's algorithm: 'Begin at the beginning and go on till you come to the end: then stop.'" Knuth says that programs should be literary works. A moment's reflection shows this proposition to be laughable.

First of all, novels usually have a single author. Can you think of any that don't? The Goncourt brothers? Ильф и Петров? Who else? Large programs, on the other hand, are written and, more importantly, regularly rewritten by groups of programmers over a period of many years. Second, you don't have to take the royal road. You can start reading a literate program in the middle, then work your way back to the beginning or "go on till you come to the end." You can jump around by chasing references. This is not something you would normally do with a novel. Thus, literate programs are not novels. In fact, as Ramsey pointed out, they are more like car reference manuals. Sure, a reference manual is like a novel to the extent that it consists of a linear sequence of pages but, unlike a novel, much of the manual's linear order is incidental. Not much clarity is lost if you swap a couple of chapters.

Same with literate programs. In a 600-page literate program (which clearly falls into the small-to-medium range), there are a great many places where Section N+1 has absolutely no immediate logical connection to Section N. In essence then, a literate program should provide a partial order — not a total one. In this too, it is very similar to a wiki. A wiki can be read in any order that the reader finds convenient. If the wiki's diameter (the greatest distance between any two pages) is small, then landing on the wrong page is not a big deal. You will quickly navigate to the right spot, if the intra-wiki cross-references are any good.
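The "distance between pages" idea can be made concrete on a toy link graph. What follows is only a minimal sketch in Python: the four-page wiki, its page names and the function names are all hypothetical, links are treated as one-directional, and every page is assumed to be reachable from every other. It uses breadth-first search to report both the greatest and the average number of clicks between two pages.

```python
from collections import deque

# Hypothetical link graph (an assumption for illustration): page -> pages it links to.
WIKI = {
    "Home": ["Parser", "Index"],
    "Parser": ["Lexer", "Home"],
    "Lexer": ["Home"],
    "Index": ["Home", "Lexer"],
}


def distances_from(graph, start):
    """Breadth-first search: number of clicks from start to each reachable page."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in graph[page]:
            if target not in dist:
                dist[target] = dist[page] + 1
                queue.append(target)
    return dist


def diameter_and_average(graph):
    """Greatest and mean shortest-path distance over ordered pairs of distinct pages.

    Unreachable pairs are simply skipped, so the result is only meaningful
    when every page can be reached from every other page.
    """
    lengths = [d
               for page in graph
               for other, d in distances_from(graph, page).items()
               if other != page]
    return max(lengths), sum(lengths) / len(lengths)


if __name__ == "__main__":
    print(diameter_and_average(WIKI))
```

With a small value for either measure, a reader who lands on the wrong section is only a couple of cross-references away from the right one.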

Finally, the last feature is the close proximity of code to comments (or vice versa). This one seems to be of the essence.

Now, let me summarize briefly the issues that are preventing LP from being usable for programming in the large.

  • Interoperability with languages that insist on splitting their code into a multitude of files the way Java does.

  • Multilingual development. A pretty mundane example (come to think of it, the French mondain seems closer to the mark here) is Java + JSP + Springified XML files. A more esoteric example is SBCL which, if memory serves, is implemented in a mixture of C and Lisp. Show me an LP environment that supports bilingual or trilingual development.

    I find it ironic that CWEB itself is also an example of bilingualism, and an illiterate one at that. I don't doubt for a second that CWEB's cwebmac.tex⁴ is very readable to a moderately competent TeXnician, but it sure would be nice to see it documented in the LP style just for the sake of LP's credibility.

  • Thanks for bringing up Haddock. It reminds me that LP has nothing to say about the distinction between documenting private implementation details and public API.

These things keep LP a niche methodology. Was it Van Wyk who observed over 20 years ago that most literate programs were written by authors of LP environments — people who ported WEB to other languages? I suspect that this remains true to this day.

I have to tell you though that if Plotnick is any indication, authors of LP environments are a pretty enthusiastic bunch. Plotnick showed up in a T-shirt bearing a likeness of Donald Knuth. He referred to the latter as "the master" and dropped phrases like "as Knuth prophesied twenty years ago...". Kool-Aid was not served (but dinner was provided by the finest employer of Lisp hackers in the Boston area).

Basically, the problem with LP is that it hasn't been properly productized. It's as if people kept on porting Cunningham's original bare-bones WikiWiki from C to Python to Java to PHP to OCaml and describing it as advancing the state of the art. No one has put in the effort to write a feature-rich environment that would be to Knuth's original WEB what MediaWiki is to Cunningham's highly influential but, let's face it, barely usable wiki.

… which brings me to the following definition.

The essence of LP is a partially ordered, well typeset wiki where the linking mechanism is PEWS and where comments and code are kept in close proximity to one another.

Re: comment #51864

Although Lout doesn't give you as much control over the finer typographic details, it has a much simpler and saner data model⁵ than TeX. I wouldn't use it for typesetting mathematics for a journal publication, but I'll take it any day over TeX for technical docs. It's basically a functional language that feels more pleasant than the imperative, assembly-like language of TeX.

Re: comment #51878

I'm looking forward to finding out how your rewrite goes. Overhauling "1M code things" is quite a bit of work. Would you mind posting a progress update on this page in a year or so? I wish you the best of luck. Personally, I would try very hard to convert the whole thing into a giant, syntactically correct literate program with a single section, then gradually begin to factor chunks out of it while keeping the tangled output essentially unchanged at all times. Once it becomes fully literate, only then would I attempt to start making functional changes to the code. But that's me. I wonder what strategy you are going to pursue.

Thanks to everyone who commented.


  1. What it is, is a vehicle for writing Plotnick's Ph.D. dissertation.
  2. A Literate Programming Tool for Concurrent Clean by Glenn Strong, May 15, 2001.
  3. Google Scholar: Ward Cunningham "literate programming".
  4. CWEB source code.
  5. Google Scholar: Lout design.

Would you clarify those

Would you clarify those closing thoughts?

Another feature that seems essential is the visually polished nature of the woven output. TeX seems like overkill, yet something as lame as ReST is not good enough.

Are you referring to Representational State Transfer? If so, then how could you compare this to TeX? It is an architectural style, and not even close to TeX. However, the idea of separating resources from their representation is a very good idea.

What makes sense is to compare the world wide web to Project Xanadu or Chimera or something else, and then compare TeX (a very static, monolithic hypermedia type) to a new hypermedia type.

ReST == ReStructured Text.

ReST == ReStructured Text.

Debunking the named literate chunk myth

By your own admission, Literate Haskell is anything but.

What I wrote was that "Literate Haskell is not traditional literate programming in some respects, but that may actually make it more practical." My own experience working with literate tools[*] is that Literate Haskell is one of the more practical systems, and that it more easily produces more useful and maintainable results than WEB-style tools, partly because it has a more streamlined workflow and a more direct relationship between executable code and literate code.

I'm not sure then what to make of your intimation that GHC furnished a counterexample to my assertion that LP was unlikely to work for programming in the large, trivial counterexamples be damned.

I think we're talking at cross-purposes to some extent, and need to distinguish clearly between the vision of LP, as opposed to what most existing, rather inflexible tools actually achieve.

The idea of being able to associate rich documentation with a codebase and transcend the linear limitations imposed by file-based program source representations is an admirable goal, and I look forward to systems that can do this credibly. Repository-based languages like Smalltalk have demonstrated the possibilities. However, outside of single-language sandboxes, there are also many practical barriers to achieving this goal. Largely as a result of those barriers, existing LP tools tend to be rather flawed.

Re the objections to Literate Haskell:

Literate Haskell goes against the "essence of WEB". (A quick account of its transgressions can be found in Glenn's paper.)

Two of the three objections in Glenn's paper are trivial and have long since been resolved: the lack of "automatic cross reference of functions and types" and "no prettyprinting facilities," both of which are now ably handled by tools like Haddock, HsColour, etc. These were never serious conceptual objections. Some things in this area could still be improved today, but it doesn't require any conceptual breakthroughs.

The other objection of Glenn's is the central one whose status we disagree about: Literate Haskell has "no explicit naming of code chunks." But I consider that its main advantage over traditional LP tools, since it allows it to do away with the tangle & weave layer, and removes one of the main sources of disconnect between the literate program and the underlying source. However, I'm making this criticism in the context of current literate tools, and in the context of programming languages that have sufficient abstraction power to not require extralinguistic aid in this area. In this context, the Literate Haskell approach is preferable to allowing the LP tool to dabble inexpertly in the creation of abstractions.

You gave the following manifesto for the underlying requirement:

Stated briefly, if I feel like moving a chunk of code out of a where clause into a literate section of its own, I should be able to do so. This is my inviolable right as a Literate Programmer. I, the author, dictate the order of exposition — not the syntactic restrictions of my programming language.

However, nothing in a functional language stops you from naming a chunk of code in a 'where' clause (or its equivalent) and moving it somewhere else. There are two scenarios, though. The simplest one is where the chunk has no implicit dependencies on its context, i.e. doesn't close over any variables defined in enclosing functions. In this case, the chunk can simply be named and extracted. I don't see any good justification for using an LP tool, rather than the language, to do this. There's no "syntactic restriction" here.

The other scenario is where the chunk does close over variables from its context. In this case, to move the chunk, you either need to manually lambda-lift it, adding the necessary arguments to eliminate the implicit dependencies, or else (particularly in Haskell) use an abstraction like monads or arrows to provide a carrier for the implicit context.

The traditional LP approach to this just doesn't care about context. I don't think I need to expound on the problems with this, since as you point out: "It's abundantly clear that PEWS is not a good feature for any sane programming language." You suggest that because of this, it makes sense to introduce the feature extralinguistically. But to me, that just compounds the insanity you rightly refer to.

Every time you introduce a PEWS that has implicit dependencies on the context it's been extracted from, you're effectively writing a function which, for reasons that are unclear to me, isn't being expressed in the underlying program as a function. This creates a situation in which the reader of the literate program is likely to be left wondering about the meaning of the free variables that are present in that chunk of code, not to mention the potential for error this creates in the ongoing maintenance of the program, particularly in the invoking function where the presence of the dependency is completely invisible. In contrast, turning the dependencies into explicit arguments allows them to be individually documented with an API documentation tool (like Haddock), improving the literateness of the program, and eliminating the semantic danger posed by invisible dependencies.
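The contrast between an implicit and an explicit dependency can be sketched in a few lines. This is a hypothetical illustration in Python rather than Haskell, and none of the names come from the discussion above: the first version keeps a helper that closes over a variable from its enclosing function, and the second is the manually lambda-lifted form, where the former free variable becomes an ordinary argument that can be documented and moved freely.

```python
def totals_with_closure(prices_cents, tax_percent):
    """Helper stays nested: it depends on tax_percent implicitly."""
    def with_tax(price_cents):
        # tax_percent is a free variable here, supplied by the enclosing scope.
        return price_cents * (100 + tax_percent) // 100

    return [with_tax(p) for p in prices_cents]


def with_tax_lifted(tax_percent, price_cents):
    """Lambda-lifted helper: the dependency is now an explicit argument.

    tax_percent -- tax rate as a whole-number percentage
    price_cents -- net price in cents
    """
    return price_cents * (100 + tax_percent) // 100


def totals_lifted(prices_cents, tax_percent):
    """Caller passes the context explicitly, so the dependency is visible here too."""
    return [with_tax_lifted(tax_percent, p) for p in prices_cents]
```

Both versions compute the same result. Only the lifted helper can be relocated in the source without dragging an invisible context along with it.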

It seems to me that what we're really dealing with here is a legacy of the time Before Steele (B.S.) when the expensive procedure call "myth" prevailed. (For many languages of the time, it wasn't a myth, of course.) For languages where writing long functions was de rigueur, it's easy to see benefits in being able to break up and rearrange functions for documentation purposes. But in modern functional languages, functions tend to be quite short - often just one or two lines, and when longer, often broken up into natural parts e.g. as pattern match cases which each look like a separate function. Compilers are very good at inlining and otherwise removing overhead introduced by this functional abstraction. I think you'd be hard-pressed to find examples of well-written Haskell, ML, or Scheme code that can't be well documented, literately, without needing to resort to LP quasi-functions. Code that's been written in the FORTRAN style might need some refactoring, but that's not a bad thing, and the last thing you really want is to waste effort on refactoring your program at the level of documentation while leaving the underlying program as a pile of un-abstracted spaghetti.

In conclusion, if you want practical, literate programming today, the Literate Haskell approach is a serious contender. However, I'm not claiming that it's a step forward towards the vision of transcending linearity; it's just that it's more practical than the traditional LP approach to trying to fake that transcendence.

I agree that things like wiki technology, hypertextual approaches in general, integration with API documentation, and exploitation of (distributed) revision control are all promising directions for real ultimate literate programming power. In that context, there's a strong case for being able to click on some name or icon in a piece of code and have it expand in situ into the code represented by the abstraction. But short of that, trying to fake such rich relationships with a non-integrated multiphase toolchain manipulating static bodies of text is an idea whose time doesn't deserve to come.

[*] I have some public examples of the R5RS semantics and the SPJ/Eber/Seward financial contracts language. I've also used LP on numerous occasions to document algorithmic code for the benefit of domain experts who aren't primarily programmers. I'm hoping to put some small examples of that up on the web in the next few months(/years.)


Re: comment-52052:

Alex Plotnick's CLWEB turned out to be really, really nice for what it is.

A new kid on the literate Lisp block: LP/Lisp by Roy Turner.

On an entirely unrelated note, here's an amusing speculation from

One speculation for the reason behind Knuth's pushing of LP is that according to Stanford's intellectual property policy, Stanford would have owned all of Knuth's code, but not his published writing. So the answer's simple: make the code part of the document, and while you're at it, be sure to minimize the appearance and importance of the code itself.

Axiom's literate rewrite

The work is still in progress, as you might expect.

Learning is happening and several interesting system-wide changes have
been added. As with any large project, you eventually discover certain
interesting changes to organization.

First, we've added a general bibliography document to collect all of
the references over the 20 volumes.

Second, there is a slow-moving change to remove noweb and add latex
style chunk names. Thus the noweb definition
<<definition of chunk>>=
becomes
\begin{chunk}{definition of chunk}

and the use of the chunk becomes
\getchunk{definition of chunk}

This makes the whole document just pure latex.
You no longer need a weave function.
The tangle function can now be done at (read) time so lisp can
treat the literate document as lisp source code.
Thus, noweb is being phased out as the rewrite happens.

It is trivial in lisp to read and store document chunks in a hash
table so multi-pass noweb extractions become single pass.
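
The hash-table idea can be sketched roughly as follows. This is not Axiom's code and it is written in Python rather than Lisp. It assumes that a chunk is closed by \end{chunk} (the usual LaTeX convention, not stated above), that each \getchunk{...} sits on a line of its own, and that chunks do not refer to themselves. The names read_chunks, tangle and SAMPLE are made up for the illustration.

```python
import re

BEGIN = re.compile(r"\\begin\{chunk\}\{(.+?)\}")
END = r"\end{chunk}"  # assumed closing delimiter
USE = re.compile(r"^(\s*)\\getchunk\{(.+?)\}\s*$")


def read_chunks(text):
    """Single pass over the document: store every chunk body in a dict."""
    chunks, name = {}, None
    for line in text.splitlines():
        begin = BEGIN.match(line.strip())
        if name is None and begin:
            name = begin.group(1)
            chunks.setdefault(name, [])  # repeated names are concatenated
        elif name is not None and line.strip() == END:
            name = None
        elif name is not None:
            chunks[name].append(line)
        # lines outside any chunk are prose and are ignored here
    return chunks


def tangle(chunks, name, indent=""):
    """Expand one chunk, splicing referenced chunks in place (no cycle check)."""
    out = []
    for line in chunks[name]:
        use = USE.match(line)
        if use:
            out.extend(tangle(chunks, use.group(2), indent + use.group(1)))
        else:
            out.append(indent + line)
    return out


SAMPLE = "\n".join([
    "Prose explaining the program.",
    r"\begin{chunk}{main}",
    "def main():",
    r"    \getchunk{greet}",
    r"\end{chunk}",
    "More prose.",
    r"\begin{chunk}{greet}",
    'print("hello")',
    r"\end{chunk}",
])

if __name__ == "__main__":
    print("\n".join(tangle(read_chunks(SAMPLE), "main")))
```

Because the table is complete after one read, a chunk may be used before the point in the document where it is defined.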

As more of the system is embedded into literate books, more of the
"make" functionality is being written in lisp, the language of the
implementation. Thus, not only is noweb going away, so is "make".

The system organization is much cleaner. Literate programming forces
you to "place" your code into the overall organization in a logical
way. You don't just add a file in some random directory hierarchy,
you add it "where it belongs". The index gives hyperlinked cross
reference information that includes not only where variables and
functions are defined but also where they are used. The combined
table of contents gives an automatically generated overview of the
whole system structure.

Literate programming also forces certain documentation standards.
Help files and test files are kept with the code and automatically
extracted. Eventually it will be an error if these do not exist.

Literate programming is a "large system" discipline where "large"
is either many lines of code or intended to live a very long time.
Individual programmers will only find it a burden since they don't
have the need to communicate with other humans. However, big systems
and long-lived systems like Firefox, Linux, MySQL, could gain a lot
by having the code laid out in book form. New people joining the
project can understand the organization, the reasoning, and the
low-level hacks and performance tweaks. Imagine all of Linux in
literate form where you could read about device drivers, their
timing constraints, their data structures, their API, etc. A lot of
that wisdom sits in the heads of those who have done it. I'd rather
just grab Volume 42: Linux Device Drivers and read how to do it.

I think that big commercial vendors of large pieces of software
(e.g. Mathematica and Maple in computer algebra) would be wise to
transition to literate programming. IBM, Google and Microsoft could
use the technology. Imagine the grief that must be happening at Oracle
because they now have to decode the Sun software without the original
authors. I'd bet they would love to be able to give the "Sun books"
to the Oracle employees.

In fact, literate programming opens up a whole new discipline.
There is a need for "Editor-in-chief" on the software teams, that is,
someone who knows how to enforce "readable prose" rather than jargon.

Ultimately, the key issue is quality. Literate programming is a tool
to improve the quality of long-lived software.

progress update on Axiom’s literate rewrite

Note to self: Comment-62437 was apparently in response to the question I asked in comment-52052:

Re: comment #51878

I'm looking forward to finding out how your rewrite goes. Overhauling “1M code things” is quite a bit of work. Would you mind posting a progress update on this page in a year or so?

It's been a year or so. I'm duly impressed with Tim Daly for keeping track of such things.


ReST ≠ REST. Subtle difference in capitalization.


Interesting perspective you provide to the LP debate. I actually didn't get to stick around for the end of Alex's talk (and came late to Dan's talk due to having a hard time figuring out where Room 34-XXXX was). I wish I could come to more Boston Lisp meetings (this was my first since moving here in April 2008), but they are usually on nights I do community service.

At some point I need to write an explanation of how various IDEs tackle basic issues that form a foundation for writing code well. For example, good continuous error messages. Where do you place the handlers in the overall system? I think this basically echoes your point about productizing LP, although I'm not sold on all of your conclusions. But there really is no guide for people who want to productize programming languages and their associated environments. There is no open explanation of what problems to expect, and ways to solve them. Most of the state of the art relies on a fair amount of heuristics.

I am guessing that you feel Lout is better than the G-expressions in the paper I posted above?

XP/Agile/etc. about Excessive Documentation

I'll play the devil's advocate here (though I actually tend to agree here) and answer the question on why literate programming has not really caught on: developers are lazy, and they hate writing superfluous documentation, and furthermore - that's not necessarily a bad thing. From my understanding of Extreme Programming and others of the so-called "Agile methods", they tend to discourage people from writing a lot of superfluous documentation, and instead encourage people to make their code as self-documenting as possible, using refactoring, giving meaningful names for variables, functions, classes and other identifiers, and by writing automated tests (unit tests, system tests, etc.) that serve as self-documenting code, executable specifications, and (naturally) provide confidence that the code is doing what it's told.

Personally I've seen or heard of cases where even in-line comments became out-of-sync with the code as it evolved, to say nothing of documentation maintained somewhere else. But for example the classic extract method refactoring (or extract subroutine/function in general) can be used to give a label and assign a meaning to a certain portion of a long stretch of code, or you can change an identifier name to make it more obvious and you'll have better confidence that it will remain this way.
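
As a small illustration of the extract-function idea, here is a hypothetical before/after pair in Python. The order data and all names are invented for the sketch. In the first version a comment explains a condition; in the second, the condition has been extracted into a named function, so the label travels with the code instead of sitting beside it.

```python
def summarize_before(orders):
    total = 0
    for order in orders:
        # skip cancelled orders and add up what is left
        if order["status"] != "cancelled":
            total += order["amount"]
    return f"{total} due"


def is_billable(order):
    """Extracted function: the name now says what the comment used to say."""
    return order["status"] != "cancelled"


def summarize_after(orders):
    total = sum(order["amount"] for order in orders if is_billable(order))
    return f"{total} due"
```

If the billing rule changes later, the edit happens inside is_billable, and a name that no longer fits is harder to overlook than a stale comment.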

While XP tends to use Java as its demonstration language, I am not a Java advocate, and have used those advocated-by-XP practices in Perl 5, C and other languages (and Java using very early and primitive-by-today's standards Java IDEs) which don't or didn't have any of the automated refactorings, code generation stuff, or test-driven-development introspection. Once you get past the standard Java-specific "Agile development" buzz, it still has a lot of good universal advice to give.

Trying to maintain code in sync with documentation is not much better than trying to maintain two distinct codebases in two different languages in sync. At a previous workplace of mine, they decided to market a few Flash-based gadgets and I was instructed to translate a PHP and Flash 9 program (with many PHPisms) to Perl, because as it turned out the marketing department decided that for compatibility it needed to run on all of PHP, ASP and PERL™. And they expected them to be maintained into the future. And we didn't have something like Fog Creek's Wasabi which compiles a common codebase into several target languages for the web.

I'm not trying to discourage people here from writing documentation like external API documentation, user manuals, functional specs, coding style documents, etc. However, going to excessive extremes of documenting will slow things down considerably, will likely cause a large rift between the documentation and the code, and may confuse or even mislead programmers who try to depend on the documentation. I prefer to make sure my code is readable, properly factored out, with meaningful identifiers and self-documenting.

But naturally, I'll be happy to hear people play the devil's advocate to this opinion of mine.



Some Questions For Literate Programmers

This is a question for people who have used LP on large systems or for extending large systems.

1. Do you always add new text to the bottom of your WEB/NW file or do you go into the file and sometimes change things in the middle of the file?

2. Suppose you have developed a large system and then a new requirement comes along that invalidates some of the design decisions. Do you override the previous definitions by adding new ones to the bottom? Or do you go back into the old ones and rewrite them?

3. How does the WEB/NW file itself evolve? Do you start writing at the beginning and write all the way to the end? Or do you bounce around in the file, elaborating things, or editing them?

Literate thinking

I'm a little puzzled by your questions. It seems you are thinking
of literate programming as a form of "documentation" (which it is)
rather than a form of "communication" (which it is). If you are
writing a literate program, you are trying to communicate to another
human separated in space and time.

Many people make the mistake of thinking that literate programming
is just "better documentation". This mindset is easy to spot, just
look for suggestions about "better variable names", "refactoring",
"javadoc", "self-documenting code". Clearly this misses the point.
This view sees literate programming as another programming "tool"
or another programming "style".

Step away from the machine. Literate programming has nothing to do
with tools or style. It has very little to do with programming.

One of the hard transitions to literate programming is "literate
thinking". Suppose you were writing a novel. One of the primary things
you need to do is make sure that everything that happens in the book
is "motivated", that is, it is introduced because it is needed.

Imagine explaining your software by motivating the need for it, then
motivating the design issues, then motivating the organization, then
motivating the implementation details. Natural divisions are organized
book-by-book and chapter-by-chapter.

The documentation "evolves". If you were writing a novel where the
character had to do something uncommon (recite Shakespeare from
memory) then you need to go adjust the prior chapters to hint why
the character might be able to do that. Thus, you tend to "bounce
around", trying to connect the parts into a coherent whole.

Parts that are "invalidated" need to be rewritten. In fact, that is
a good way to know that the underlying software needs to be rewritten.
Other parts of the book (and the underlying software) that use the
rewritten parts need to be reworked. Often this doesn't happen because
the programmer has no idea that someone else depends on his work,
especially in a large project.

Large programs and long-lived programs (IRS tax codes, MS Windows,
Mathematica, Space Shuttle software) written by many people who are
spread out in space and time really need this level of documentation.

Writing your diary can take any form. Writing for a large audience
requires much more discipline. It is not about the tools (pencils?
javadoc?) or the style (first-person? good variable names?), it is
about the audience.

literate programming and code/doc synchronisation

shameless link to my own related work: syncweb tool