Code Reading

LtU readers know that I am long time advocate of code reading. As I've argued here in the past, reading great code is the best way to acquire good programming skills. It's also a pleasure to read good code. Yes - reading code can be fun.

It turns out that I am not alone (though my conception of a code reading workshop is perhaps somewhat different than the things discussed there).

Anyway, this is a chance to continue one of my pet memes. Many of the pieces of great code I've read over the years come from language processing tools (e.g., compilers, meta-programming systems etc.) I don't think this is a coincidence.

Now's your chance to tell us your favorite examples.

The rules: The code must be beautiful and it must be programming language related (and no, being written in a programming language isn't enough).

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

'beautiful' in the sense of an iron maiden, perhaps

At a glance, this blast from the past doesn't look compilable on modern stuff, but it is instructive from the standpoint of looking at the problem sideways:
Duff's Device


send(to, from, count)
register short *to, *from;
register count;
{
register n=(count+7)/8;
switch(count%8){
case 0: do{ *to = *from++;
case 7: *to = *from++;
case 6: *to = *from++;
case 5: *to = *from++;
case 4: *to = *from++;
case 3: *to = *from++;
case 2: *to = *from++;
case 1: *to = *from++;
}while(--n>0);
}
}

What was I reading to write that?

I knew enough to write that bit of code because I had read the source code for Dennis Ritchie's PDP-11 C Compiler, specifically the version shipped with Fifth Edition UNIX. It's a remarkable program -- it compiles a fairly sophisticated language, producing decent code, all in a 64K byte footprint.

Most of the syntax analysis is done by recursive descent, except for expressions, which are parsed by using precedence tables with what we called the Railway Siding algorithm when I was in school. Code generation is done by a table-driven tree-pattern matcher.

The whole compiler runs in two passes (uhh, actually seven if you count one for the preprocessor, one for the optional freestanding peep-hole optimizer and three for the assembler), but the organization is essentially single pass -- it was split in two for space reasons. The first pass does syntax analysis & typechecking and generates code for everything but expressions. (And maybe switch statements? It's been 30 years since I read the code.) Its output is an assembly-language code/data stream with embedded expression trees to be processed in the second pass.

If the machine it compiled for were still of contemporary interest, I'd recommend it to anyone looking for a good introduction to the practicalities of software design and implementation.

Presumably to keep the compiler's size managable, case labels are treated syntactically as ordinary goto labels with a number attached. There was a stack of switch statements, and whenever a case was encountered, a label went in the output and was recorded in the stack (with an error if the stack was empty, i.e. if no switch was active.) Since case labels weren't syntactically bound to switch statements, they could occur anywhere in the switch's body, including, for example, in the middle of a nested do-while statement.

Availability

Dennis Ritchie's PDP-11 C Compiler

Is this available online somewhere?

Old C Compilers

At the bottom of this page there is source for a 1972 version.

www.cs.bell-labs.com/who/dmr/primevalC.html

Oh mama...

Oh mama...

Sorry

Sorry - I forgot the warning.

It's great stuff. I love code

It's great stuff. I love code like that. I really do.

Assembly

It's great stuff. I love code like that. I really do.

Yeah, I spent last evening reading through all the code and it was a lot of fun. It really is portable (or maybe not-so-portable) assembly code. You gotta love this gem:

block(n, op, t, d, p1,p2,p3)
int p1[],p2[],p3[]; {
	extern space[], error, exit, ossiz, ospace[];
	auto p[], ap[];

	p = space;
	ap = &op;
	n =+ 3;
	if(space+n >= ospace+ossiz) {
		error("Expression overflow");
		exit(1);
	}
	while(n--)
		*space++ = *ap++;
	return(p);
}

That's just plain evil (in a good way). I did a double-take, followed by an "aha" after I'd asked myself, "Why the hell isn't he referencing any of the parameters aside from n and op?" :)

That's just plain evil (in a

That's just plain evil (in a good way). I did a double-take, followed by an "aha" after I'd asked myself, "Why the hell isn't he referencing any of the parameters aside from n and op?" :)

If I'm reading this correctly, it won't work if parameters are passed in registers, right?

But they aren't (except as an optimization)

You're allowed to take the address of a parameter, so they'd better appear in memory. Modern compilers that transmit parameters in registers (even Dennis's PDP-11 compiler did this eventually, if I remember) must drop them in memory as soon as you take their addresses. (For this idiom to work, you'd have to flush them all if you took the address of any of them.)

Remember that when this was written, there was exactly one C compiler on Earth, so depending on details of its behavior was a venial, rather than a mortal, sin. More to the point, <stdarg.h> hadn't been invented yet -- it was developed in response to ad hoc ways (like this) of dealing with variable argument lists.

The version I meant

The version of the compiler that I remember was on the Fifth Edition UNIX tape, which I believe is archived as a tar file here.

The compiler is in the directory ./usr/c

Unix V[234567]

There is a set of browsable archives of ancient Unix source code, including something that looks like the V5 C compiler.

Here's another one

The point here is to do some code *in between* iterations (i.e., interleave). This is a simple case which is just printing a delimiter between entries in a list:

void print_joined (const char *first, const char *delim,
                   const char *last, const char *def,
                   char **list) {
  switch (*list == NULL) {
    for (; *list; ++list) {
      printf("%s", delim);
      if (0) {
  case 0:
        printf("%s", first);
      }
      printf("%s", *list);
    }
    printf("%s", last);
    break;
  default:
    printf("%s", def);
    break;
  }
}

And here's the macro version, inspired by protothreads:

#define FOREACH(test, inc) \
  switch (!(test)) { \
    for (; test; inc)

#define BETWEEN 
#define FIRST if (0) { case 0:
#define EACH }
#define LAST }
#define OR_ELSE break; default:
#define DONE }

      
void print_joined (const char *first, const char *delim,
                   const char *last, const char *def,
                   char **list) {

  FOREACH(*list, ++list) {
    BETWEEN {
        printf("%s", delim);
    }
    FIRST {
        printf("%s", first);
    }
    EACH {
        printf("%s", *list);
    }
    LAST {
        printf("%s", last);
    }
    OR_ELSE {
        printf("%s", def);
    }
    DONE
}

switch tricks still work

That kind of thing still works. case labels still seem to be equivalent to goto labels. You can usefully nest ifs inside switches, for example, along the lines of:

switch(mode) {
case M_1:
  if(whatever) {
    //
  } else {
case M_2:
    //
  }
  break;
}

I don't remember exactly why I had to do this (I vaguely recall the if being a late insertion into a very large switch due to some messy logic) but the correct code was produced.

protothreads

Interleaving case labels is also the key trick that makes protothreads work.

"Protothreads are extremely lightweight stackless threads designed for severely memory constrained systems, such as small embedded systems or wireless sensor network nodes."

Duff's device v. Safe Languages

It is curious that Duff's device doesn't work in any safe language I can think of. I don't care about the device per se, but given modern optimizers can equivalently compact code be generated in safe languages today?

Haskell Standard Prelude

Always seem to learn something new everytime I dig into the prelude.

The Compiler in Chapter 24 of The Art of Prolog

In the Second Edition of The Art of Prolog : Advanced Programming Techniques, Leon Sterlin and Ehud Shapiro present a complier for a simplified Pascal subset that is firmly grounded in the declarative programming paradigm.

The code is compact enough to wrap your mind around and is presented with a cogent discussion that provides a refreshing compliment to the standard imperative C-based compiler texts.

RABBIT

Guy Steele's Master Thesis (a scheme compiler written in scheme, though an old dialect). Not only is it useful to anyone wanting to know more about compilers or functional programming, it is highly documented because the code and the thesis are interleaved which makes it a little bit magical in my opinion. It's available online.

Guy Lewis Steele, Jr.. "RABBIT: A Compiler for SCHEME". Masters Thesis. MIT AI Lab. AI Lab Technical Report AITR-474. May 1978.

compiler and VM

The OCaml compiler, virtual machine, and C FFI are good readings/studying for thoses interested in such topics.

Many of the pieces of great c

Many of the pieces of great code I've read over the years come from language processing tools (e.g., compilers, meta-programming systems etc.) I don't think this is a coincidence.

Is anyone here who likes reading business software?

Sometimes I like to read certain algorithms related to computer algebra and number theory. But it is mathematics shining through them and according to Ehuds rules they are not part of the game.

according to Ehud's rules the

according to Ehud's rules they are not part of the game.

Not this time. Sorry...

No matter, Ehud. Rules are ne

No matter, Ehud. Rules are necessary.

After the "original papers" where already cited, I'm not ashamed to reference another evergreen, which might be a source of joy for some and annoyance for others: the quine page.

One might dispute whether the befunge quines are beautiful or not. The Java quines obviously aren't very amazing.

Kay

reading business code

I suppose that there could be really great business code out there,however given that business data is often extremely dirty the code itself, while perhaps heroic in coping with the data will not be likely to be 'beautiful'.

LiSP

Great topic!

I enjoyed reading "Lisp in Small Pieces" by Christian
Queinnec. The book features code and explanations for 11 interpreters and 2 compilers! Among them a Scheme to C compiler.
The book is very well written and there is no cheationg: *all* the code is there.

My first programming course a

My first programming course at the university of twente was a course in functional programming but i never really caught on until i read (and transliterated some concepts from) s.p.j's tutorial on implementing a functional language. The book is compilable (in miranda) :) in the sense that it is a literate program, as well as a tutorial explaining how to build that program. Filled with all sorts of nice examples of rec. definitions and the power of ADT's. Doesn't shy away from explaining a larger principle behind something if it leads to a better understanding (such as fold). In the end it totals up to 4 interpreters for a lazy ml like language (template instantiator, g-machine, tim, parallel g-machine and some utility stuff like a lamba lifter etc). A very nice read in my humble opinion.

http://research.microsoft.com/Users/simonpj/Papers/pj%2Dlester%2Dbook/

beautiful code

There's always the meta-circular evaluator.

Snippets vs. "Real Programs"

I was thinking more about larger chunks of code: the kind you find in real systems.

Real Programs

Code in real programs looks like that (we are in the fun department, aren't we?).

Is it impossible to write bea

Is it impossible to write beautiful code in Java or is just real hard? ;-)

Benday dots

Java and C# are just very, very low-density programming languages. It's like a Roy Lichtenstein painting — if you stand too close, all you see are the dots. To see any beauty other than the beauty of the dots themselves, you have to stand further back and look at it from a distance. Further... further... just a bit more... are you seeing it yet? Try squinting...

Good way of putting it

Which is why I wonder whether it's true to say that Java and the like might provide examples of beautiful design and software engineering, where as code in "higher density languages" is where you'll find examples of beautiful "porgramming".

But, really, what we need are more examples.

I think we need to call in Darius and Luke...

PAIP

I can't think of any Java code I'd like to read in bed, either.

My favorite example for this topic is Paradigms of AI Programming, a big fat book packed with superbly crafted code and enlightening text, including (obLtU) both DSLs and some bigger language implementations. A great work of 20th century literature. :-)

I almost didn't buy it, as it looked to be by and large about things I already knew; but what's inspiring is the sustained clarity in code as communication, that's at the same time production-quality code for execution. It doesn't jump out at you with fancy tricks; it's more like reading an unobtrusively excellent prose stylist. Luke used to joke that no matter how proud you were of your code, I would send it back much better. In this case, I'd often have an idea, then after some working-out I'd invariably see that the author knew just what he was doing. Multiply by 900+ pages on a wide variety of interesting topics, and you start to see why this is my paradigm of literate code. (I did find one new bug. Norvig said it had been over a year since the last bug report and he'd started to hope there weren't any more.)

At the other end of the length scale, I've been enjoying Kragen Sitaker's hacks lately, e.g. this object calculus evaluator and this bit of text processing. Well above the usual standard for hacks -- nicely factored, organized, and documented, teaching about both their subjects and intelligent use of Python.

Beauty of the dots?

So you need a tool like this to find the beauty in those programs?

Visual Code Smells

If you move to far away the picture itself becomes pixel. Is there anything you can still see about your code from a far distance? Design pattern, class diagrams, component diagrams... ? I've once recreated an UML diagram from a C++ program and hang it up as a wallpaper in the office. Some colleagues mentioned it to be beautiful. But I'm not really sure about deceptive trapping? Is UML the right technique to detect code smells visually?

Kay

That hurts

Damn, Anton. You nearly had me laughing out of my chair here.

Java: All the beauty of C++, and none of the flexibility.

Irony?

What's funny is that Anton's link above, which presumably once showed a painting, now displays an error with the following stack trace:

java.sql.SQLException: [Macromedia][SQLServer JDBC Driver][SQLServer]Invalid object name 'ArtistWorks'.
at macromedia.jdbc.base.BaseExceptions.createException(Unknown Source)
at macromedia.jdbc.base.BaseExceptions.getException(Unknown Source)
at macromedia.jdbc.sqlserver.tds.TDSRequest.processErrorToken(Unknown Source)
at macromedia.jdbc.sqlserver.tds.TDSRequest.processReplyToken(Unknown Source)
at macromedia.jdbc.sqlserver.tds.TDSRPCRequest.processReplyToken(Unknown Source)
at macromedia.jdbc.sqlserver.tds.TDSRequest.processReply(Unknown Source)
at macromedia.jdbc.sqlserver.SQLServerImplStatement.getNextResultType(Unknown Source)
at macromedia.jdbc.base.BaseStatement.commonTransitionToState(Unknown Source)
at macromedia.jdbc.base.BaseStatement.postImplExecute(Unknown Source)
at macromedia.jdbc.base.BasePreparedStatement.postImplExecute(Unknown Source)
at macromedia.jdbc.base.BaseStatement.commonExecute(Unknown Source)
at macromedia.jdbc.base.BaseStatement.executeInternal(Unknown Source)
at macromedia.jdbc.base.BasePreparedStatement.execute(Unknown Source)
at macromedia.jdbc.base.BasePreparedStatementPoolable.execute(Unknown Source)
at coldfusion.sql.Executive.executeQuery(Executive.java:1202)
at coldfusion.sql.Executive.executeQuery(Executive.java:1008)
at coldfusion.sql.Executive.executeQuery(Executive.java:939)
at coldfusion.sql.SqlImpl.execute(SqlImpl.java:325)
at coldfusion.tagext.sql.QueryTag.executeQuery(QueryTag.java:831)
at coldfusion.tagext.sql.QueryTag.doEndTag(QueryTag.java:521)
at cfQRY2d22ecfm556017016.runPage(D:\inetpub\NGA\InternationalPrints\Application\Includes\QRY-2.cfm:27)
at coldfusion.runtime.CfJspPage.invoke(CfJspPage.java:192)
at coldfusion.tagext.lang.IncludeTag.doStartTag(IncludeTag.java:366)
at coldfusion.runtime.CfJspPage._emptyTag(CfJspPage.java:2644)
at cfDetail2ecfm51378331.runPage(D:\inetpub\NGA\InternationalPrints\Tyler\Detail.cfm:3)
at coldfusion.runtime.CfJspPage.invoke(CfJspPage.java:192)
at coldfusion.tagext.lang.IncludeTag.doStartTag(IncludeTag.java:366)
at coldfusion.filter.CfincludeFilter.invoke(CfincludeFilter.java:65)
at coldfusion.filter.ApplicationFilter.invoke(ApplicationFilter.java:279)
at coldfusion.filter.RequestMonitorFilter.invoke(RequestMonitorFilter.java:48)
at coldfusion.filter.MonitoringFilter.invoke(MonitoringFilter.java:40)
at coldfusion.filter.PathFilter.invoke(PathFilter.java:86)
at coldfusion.filter.ExceptionFilter.invoke(ExceptionFilter.java:70)
at coldfusion.filter.ClientScopePersistenceFilter.invoke(ClientScopePersistenceFilter.java:28)
at coldfusion.filter.BrowserFilter.invoke(BrowserFilter.java:38)
at coldfusion.filter.NoCacheFilter.invoke(NoCacheFilter.java:46)
at coldfusion.filter.GlobalsFilter.invoke(GlobalsFilter.java:38)
at coldfusion.filter.DatasourceFilter.invoke(DatasourceFilter.java:22)
at coldfusion.CfmServlet.service(CfmServlet.java:175)
at coldfusion.bootstrap.BootstrapServlet.service(BootstrapServlet.java:89)
at jrun.servlet.FilterChain.doFilter(FilterChain.java:86)
at coldfusion.monitor.event.MonitoringServletFilter.doFilter(MonitoringServletFilter.java:42)
at coldfusion.bootstrap.BootstrapFilter.doFilter(BootstrapFilter.java:46)
at jrun.servlet.FilterChain.doFilter(FilterChain.java:94)
at jrun.servlet.FilterChain.service(FilterChain.java:101)
at jrun.servlet.ServletInvoker.invoke(ServletInvoker.java:106)
at jrun.servlet.JRunInvokerChain.invokeNext(JRunInvokerChain.java:42)
at jrun.servlet.JRunRequestDispatcher.invoke(JRunRequestDispatcher.java:284)
at jrun.servlet.ServletEngineService.dispatch(ServletEngineService.java:543)
at jrun.servlet.jrpp.JRunProxyService.invokeRunnable(JRunProxyService.java:203)
at jrunx.scheduler.ThreadPool$ThreadThrottle.invokeRunnable(ThreadPool.java:428)
at jrunx.scheduler.WorkerThread.run(WorkerThread.java:66)

On the plus side...

...Java doesn't have TCO, so at least the stack trace is easy to read.

[sarcasm is a much easier device than is irony].

Hit by the same exception...

I had to look it up in wikipedia. Apparently the technical term is Benday Dots.

LtU - serving PLT and culture since 2000

Is non-beautiful code boilerplate-ridden code?

Significant whitespace is one of the trademarks of beautiful code for me. Curly braces bother me (too much vertical garbage). I'm sure some people find "some" Java code beautiful.

I guess I could go on a rant about the psychology of computer programming, and how it's very under-appreciated, but I'll leave that for another thread.

Knowing how to format code

Knowing how to format code is another untaught art that separates the master from the apprentice...

It's much more than formatting

I really wish the psychology of computer programming was more part of PLT. Yes type/category theory is important. But it is curious that many scientists(Physicists, Chemists, etc..) really like Python.

Why? It's Pseudocode.

We need to balance density with readibility, and we need studies on this.

Check-out IOCCC entries

Some entries in the International Obfuscated C Code Contest are certainly worth reading. You will find compilers and interpreters written in less than 2K of C source code.

  • 1996 august: subset of C compiler and byte code interpreter
  • 1991 dds: Basic compiler, heavily compressed
  • 1990 dds: Basic interpreter and byte code interpreter
  • 1989 jar.2: lisp interpreter, compressed

You are certainly not alone advocating code reading;
In my book titled Code Reading: The Open Source Perspective I argue that only by reading great code can we learn to write good programs. A sequel to appear on March 2006 looks at code quality from the same angle.

Your book looks interesting.

Your book looks interesting.

I also stress the notion of code reading tools and discuss several of the things you mention (e.g., regular expressions).

what is the meaning of beautiful

This is actually something that rankles. In Diomidis' case the obfuscated C code is 'beautiful', but I don't think most of us would agree with that example.

I think the concept of beautiful code relates to a couple of bits of common wisdom in CS, the first is the old chestnut about "Any programming language that doesn't change your ideas about programming isn't worth learning", the other has to do with the feeling among many that brevity relates to beauty.

If brevity of code relates to beauty then this means that there are a number of programming languages that could never produce beautiful code.

I am not completely averse to this idea, but I will note some problems with the assumption that there is a one-to-one relation between brevity and beauty:

1. Some languages are less verbose than others, are we going to argue that J, K or Perl are the most inherently beautiful languages.

2. In human languages brevity can be a component of beauty, dependent on the language. I would say that one of the elements of beauty for latin is brevity. For English however, while brevity may be the soul of wit there is no essential connection from it to beauty.

My thinking is as follows:

The precept "Any programming language that doesn't change your ideas about programming isn't worth learning" is at a higher level than the idea of beauty or the appreciation of brevity, dealing as it does with the very worth of the language in which beauty could be expressed.

Beautiful code is code that is worthwhile to read.

Any programming language that aids in the production of beautiful code must be one that is worthwhile to learn, and is thus one that changes your ideas of programming.

Beautiful code is the actual code that changes your idea of programming in the language that possesses this capacity to change your ideas.

Obviously when we discuss changing ideas of programming we don't mean changing the idea that programming is fun to the idea that programming is a tedious exercise in pointlessness. But I will not be slamming Java at this point.

:)

Be inclusive

Personally, I like to study well designed systems as well as small and cool hacks. I don't think it must be one or the other. The aesthetic pleasure from each is different as well the things you can learn from them.

yes

But do you feel that there is a concept of beauty in programming that these fulfill, or do you study them for other qualities aside from the concept of beauty?

there is a concept of beauty

there is a concept of beauty in programming

Of course!

But I don't think I can make it explicit. I can give examples, which reflect my taste. Seeing them you will come to grasp what I mean by beutiful programming, a qaulity that like other kinds of beauty cannot be explained in words.

an unofficial aesthetic theory?

Well I am somewhat leery of the whole concept that there is not a possibility of ever defining an aesthetic concept apart from the level of individual tastes but anyway I think that if you give some examples of what you find beautiful you would be likely to be aware of a reason that you find them beautiful, and that such reasoning when applied across a number of examples comprises an unofficial aesthetic theory as formulated by Ehud. Have you noticed such a theory in formulation.

And if we go with it far enough will there be seen to be a general consensus Lambda theory of what is beautiful code and what is the most beautiful language (I'm expecting some Lisp variant here)

A useful definition?

The definition of beauty can be (and has been) debated endlessly, but definitions are, in my opinion, good only insofar as they are useful towards some end.

In that light, I like this definition: a system is beautiful when nothing can be either added or taken away without changing some essential part of the system's nature. That is, everything you need is there, and nothing is there that you don't need.

Of course, "need" changes wildy with the author, audience, and users, so a piece of code that may be beautiful in one context might be completely inadequate--or completely overengineered--in another. Thus, measuring the beauty of a piece of code, even if it's just the "measurement" of one's aesthetic reaction, means understanding what a piece of code is meant to be. In the curious case of the IOCCC, being beautiful also means being extremely ugly: a horribly fascinating state of affairs.

I like to imagine beautiful (or elegant) systems as those that occupy points of local maxima on some ability/cost ("bang-for-buck") curve. Moving around on this curve involves tradeoffs between what you want and how much you're paying for it, but the most impressively beautiful systems are those that give you a lot for very little, like LISP's eval, or Hoare's in-place quicksort (as opposed to "Hello World", for example, which occupies a local maximum but does very little).

Of course, all of this is from someone who's never crafted a really beautiful piece of code, so take it as you will :)

PietPiet

Pixels to characters, I bet that Piet interpreter in Piet would look beautiful! :-)

sweet

piet hello world is now my msn messenger display picture

My comment

To me a beautiful program is one that finds just the right words to say what it means. Darius Bacon's programs are all great examples of this. He has the unusual talent of writing good idomatic code in a lot of very different languages. I would pick awklisp as a fun example -- representing conses as integers and the car and cdr functions as dictionaries is to me very unexpected and totally awkish.

Peter Norvig writes beautiful code. Darius had this to say about his book: "ParadigmsOfArtificialIntelligenceProgramming I keep coming back to; it's practically a comfort book to me. It deserves attention outside the Lisp world."

Since people here appreciate obfuscation I will take this opportunity to share my losing entry to the inaugural Obfuscated Erlang Contest. My idea was to take the mind-bending idea from Ken Thompson's Turing Award lecture and implement it as straightforwardly as possible in Erlang. The application was a program which prints a number in the fibonacci sequence -- the next number each time you recompile it. The trick is that the module is its own parse transform which is like a Lisp macro calling itself (or rather its own previous version). The code:

-module(fib).
-author('luke@synap.se').

-export([run/0, parse_transform/2]).
-compile({parse_transform,?MODULE}).	% remove_to_bootstrap
-import(lists, [reverse/1, map/2, mapfoldr/3]).

run() ->
    io:format("~p~n", [a() + b()]),
    halt().

a() -> 0.
b() -> 1.

parse_transform(Ts, _Opts) ->
    map({erl_syntax, revert},
	tree(mapfoldtree(fun({integer,L,_}, [I,J|A]) ->
				 {{integer,L,I+J}, [J|A]};
			    (T,A) ->
				 {T,A}
			 end, [round(0.0),a(),b()], Ts))).

mapfoldtree(F, A0, T0) ->
    mapfoldr(fun(T1, A1) -> erl_syntax_lib:mapfold(F, A1, T1) end, A0, T0).

tree({Tree,_Acc}) -> Tree.
To run it you need the GNUmakefile.

Martin Björklund won the contest by exploiting features of Erlang that nobody else knew existed :-). As an aside, Martin and other previous Bluetail founders have a brand-new Erlang startup that is hiring: tail-f. This is sure to be the second-coolest Erlang startup of the present era :-)

Where are the EOPC results?

Where are the results of the Erlang obfuscated programming competition?

Regards,
Marc

how to teach writing readable code?

You may be interested in this exercise.

Why is nobody citing Knuth?

It is interesting that nobody seemed to cite Knuth yet, although he used that peculiar literate programming style to combine code and documentation for TeX and Metafont development.

Is it because Knuth is just using some Pascal like language which is not appealing to the LtU readers?

Regards,
Marc

Knuth

TeX: The Program looks pretty opaque if you haven't studied TeX already. I imagine Metafont poses a similar hurdle, and tAoCP asks for dedication in another way, by being in archaic assembly language. (I coded up the MIX machine before finding I lacked the energy to actually use it.) So it's not the Pascal -- if you know C, Pascal's not a problem.

The first half of CWEB is well-written in a close-to-the-machine kind of way, but not especially interesting, and I never got around to the rest. (Where he made up a kind of DSL for syntax-directed translation and then converted the code by hand into Pascal/C.)

Literate Programming has some admirable and clever shorter examples, like the hash-trie one. Excellent book.

A problem with some of this

A problem with some of this discussion is that it emphasises the notion that 'beautiful' code consists of well defined and self-contained algorithms. In many cases a 'beautiful' solution is written as small fragments. The 'beauty' is in the way they mesh together when actually used. Trying to force this kind of problem into a compact well-defined N-line algorithm is just another way of creating pain between the ears.

To take an example that pleased me: in implementing the lambda calculus in the lmn metalanguage of the language machine, I had to decide how to deal with currying and variable scope. This meant deciding how to design rules to represent functions so that they could exploit the dynamic scope of the language machine to preserve the lexical scope of the lambda calculus while coping with partial application.

So here's a function

reverse3 = \a \b \c c b a;
and its translation into lmn
 _reverse3 y :Va y :Vb y :Vc <- r f :_reverse3 :{ Va Vb Vc } :{ Vc Vb Va } ;

The translation reads: "_reverse3" followed by "y" and a variable binding "Va" ... is replaced by "r" "f" with a triplet of bindable values - the name of the function, its arguments, and its body (I should have used "arg" for "y", but as Wittgentstein never said, the meaning of a symbol is its use in the grammar).

So far, nothing to deal with currying, which manifests itself as arguments not provided. This is dealt with by rules in the run-time environment - first the rules for arguments actually provided:

 - t :A                    <- y :{ w :A }; // get a "t", treat it as a "y" packaged as a "w"
 w :A                      <- y :{ w :A }; // preserve "w" packagings
 b :A                      <- y :{ b :A }; // preserve bracket packagings
And then the rules for arguments not provided:
 -                         <- y :{ } curry ;
 curry                     <- y :{ } curry ;

The first is a something-from-nothing rule: it means 'you can always have an empty "y" followed by "curry"' (interesting restaurant). The second deals with successive missing arguments. The effect is that a function with some missing arguments is transformed to a function with some empty arguments followed by "curry".

The rule which matches functions for evaluation deals with possible currying by requiring "yq" (the "m" "-" on the right-side ties the rule to a context where the goal symbol is "m" but delivers something else - in effect the results of matching "yq"):

f :F :X :B yq              <- m - ;
If the function was curried, "curry" is present and you get the incomplete form of the function. If all arguments were present you get the complete applicable form of the function (another something-from-nothing rule):
 curry                     <- yq h :F :X;
 -                         <- yq g :F :X :B;

Each function continues to carry with it a bag of its arguments (the variable "X"). Actual arguments in the bag are prepackaged, missing arguments are empty. So substituting the sequence 'F X' will try to apply the function again with any arguments it already has - any arguments that were missing are empty, so it gets another chance to acquire them. And the bag also preserves its lexical scope even for arguments that are missing - they exist but are empty, so there is no risk of capturing variables that belong somewhere else.

Apologies if this was too long - but I think it does make my point and at the same give some flavour of the way you can use the unrestricted recognition rules of the language machine.

Concur - consider OS code

A problem with some of this discussion is that it emphasises the notion that 'beautiful' code consists of well defined and self-contained algorithms.

I agree. In kernel codes, beauty consists in part of identifying the right smallish constraint set for maintaining the integrity of the kernel. The implementation of these as code is generally not localized, but rather consists of the right calls made from many locations throughout the system. There are other kinds of constraint-based programs for which this is true, but in my experience it is most commonly true in critical control systems (a class of which, I would argue, well-engineered kernels are members).

So for kernels, there is a level of beauty (sometimes) in the code qua code, but there is another level of beauty in discovering constraints that are not only minimal, but are chosen in such a way that there placement in the code is suitably obvious. In a well-written kernel, ugly code is almost invariably dictated by hardware constraints or hardware bugs. Or BIOS legacy (boot sectors, anyone?), which is merely a pass-through hardware constraint.

Once you become accustomed to the style and rhythm of these things, one finds in reading the UNIX v7 code (or, I am told, the EROS code, though I'm blind to it myself -- too close) is that a missing or misplaced check screams out to the reader that something is wrong.

CLIPS expert system shell

The best piece of C code that I've had the pleasure to study is the source for the CLIPS expert system shell. It's been around forever and is rock-solid. Additionally, as a programming language interpreter, it shows the implementation of numerous language paradigms.

http://clipsrules.sourceforge.net/

-m