Lambda the Ultimate

Reading, Writing, and Code
started 12/15/2003; 7:30:28 AM - last post 3/18/2004; 12:57:04 AM
Ehud Lamm - Reading, Writing, and Code
12/15/2003; 7:30:28 AM (reads: 10861, responses: 45)
Reading, Writing, and Code
Another ACM Queue piece, this time on code readability.

The author spends a fair fraction of the essay discussing various programming languages (and even manages to mention Ada), making this on topic for LtU.

Personally, I think reading code, and writing readable code, is much more than just a language issue. It is firstly about mastering programming techniques, and about developing a sense of style. It is about understanding micro-decisions (like naming), module design, and finally about understanding software architecture and system design.

These skills can only be developed with practice, and require years of experience. But that's not enough.

Reading code, and learning to appreciate what great code looks like, is essential.

That's why I like to conduct code reading workshops. Not all programmers have day-to-day access to excellent code. Not everyone is going to look for good examples on the net, and beginners are not always able to distinguish good code from bad.

You cannot become a great author without reading. Nor can you become a great programmer without reading great code.

One final thought. Read the great code you can find, even if it is not written in the language you currently use. The experience will make you a better programmer, whatever the language you use.


Posted to Software-Eng by Ehud Lamm on 12/15/03; 7:33:09 AM

Ehud Lamm - Re: Reading, Writing, and Code
12/15/2003; 7:44:41 AM (reads: 1657, responses: 0)
This posting being off the home page, allow me to be a bit more blunt.

If anyone is interested in a code reading workshop for a team, I am available (especially if you are in Israel, or when I visit your country). Workshops can be based on material you provide (i.e., code walks for subtle parts of your company's framework) or chosen by me per your specification. A representative from the client needs to coordinate the session(s) and help tailor the workshop content.

Hands-on workshops that include exercises in code maintenance can also be arranged.

For more details, if you are interested, contact me by email.

Darius Bacon - Re: Reading, Writing, and Code
12/15/2003; 8:20:13 AM (reads: 1648, responses: 1)
I'd be interested in an informal workshop in the Los Angeles area. Anyone else?

Ehud Lamm - Re: Reading, Writing, and Code
12/15/2003; 8:34:54 AM (reads: 1658, responses: 0)
Maybe next time I'm in the States we can set something up.

Todd Coram - Re: Reading, Writing, and Code
12/15/2003; 11:43:10 AM (reads: 1588, responses: 3)
The article does nicely mention how deeply hierarchical OO can be hard to read. I'd take it even further:

I find almost any program that uses lots of classes (shallow or deep) difficult to read.

This is especially the case for languages that allow you to arbitrarily override method definitions. When I am perusing a derived class, I need to have a "mental image" of the parent class in order to see which methods "shine through" to the derived class and which are overridden. This can be maddening (especially when dealing with multiple inheritance or deep hierarchies).

I think that Eiffel requires you to specify that you are overriding a function by using a "redefine" keyword. That could certainly help. Also, ISE's Eiffel IDE allows you to "flatten" the structure of a class so you can see all of the inherited attributes/functions in one place. But that is an IDE feature...

A common Java problem: Given a dozen directories full of dozens (or hundreds!) of class/interface definitions (each in its own file), where do you start reading and how do you navigate?

Ehud Lamm - Re: Reading, Writing, and Code
12/15/2003; 11:48:04 AM (reads: 1597, responses: 1)
how deeply hierarchical OO can be hard to read.

I find that the best way to explain this (obvious) phenomenon to people is to make them realize that inheritance introduces coupling. People know coupling is problematic. Now let them draw their own conclusions about inheritance.

where do you start reading and how do you navigate?

That's a good question. Each programmer has his own tricks and tips. What are yours (tool suggestions are acceptable)?

Dominic Fox - Re: Reading, Writing, and Code
12/15/2003; 1:19:23 PM (reads: 1568, responses: 1)

For Java, my answer would be to go look at the automatically generated documentation (Javadoc or whatever). Same for .Net, in fact, or any of the Python standard libraries. If you can browse the code at an interface level, with comments on each of the methods and hyperlinks to the docs for other classes that are referenced in their signatures, then you can start to build up a picture of the code that way. Then you jump in to specific modules to look at the implementation.

Ehud Lamm - Re: Reading, Writing, and Code
12/15/2003; 2:58:34 PM (reads: 1552, responses: 0)
automatically generated documentation

These are useful (and semi language related), but not enough. If the system is large, you need to know where to start looking, even if you are only reading the documented interfaces - there are simply too many of them.

Dominic Fox - Re: Reading, Writing, and Code
12/15/2003; 3:17:31 PM (reads: 1538, responses: 0)

Sure - but then the problems when things get large are problems outside of OO as well, I think. At that point I'd be looking for some human-generated documentation for orientation. One of the (desirable) consequences of modularised code, regardless of whether OO methods are involved, is that you can't infer the rest of the program universe just by looking at one piece of it (I'm thinking of Adams' Total Perspective Vortex here, obviously). So if you want to know where to start, something other than the bit of code you're looking at right now has to tell you.

I think that Javadoc et al do a reasonable job of making the connections between local clusters of classes visible, so they go some way towards remedying the fragmentation caused by breaking the code up into separate classes in separate files in the first place. But they won't give you the big, big picture - just information about how the pieces of a given module or library fit together.

Personally I prefer to be able to put multiple class definitions together in the same file, as you can in Python and C#. But you could equally well put multiple one-file-per-class class definitions together in the same folder; and there's no reason why an IDE shouldn't be able to knit the contents of a single folder together into one continuous text field for reading and editing.

Dave Herman - Re: Reading, Writing, and Code
12/15/2003; 8:52:40 PM (reads: 1475, responses: 1)
That's a good question. Each programmer has his own tricks and tips. What are yours (tool suggestions are acceptable)?

It's low-tech, but I usually start with something along the lines of

find . -name '*.java' -print | xargs fgrep 'main('

to find the entry point(s), and then start doing a kind of depth-first search (usually just down the happy-path at first) through the code, via a combination of Javadoc and code browsing. It's been a while since my Java days, but I think I remember this being a productive approach.
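
A variant of the command above, as a minimal sketch rather than anything the poster wrote: it assumes a POSIX shell with find, grep, and sort, and the function name find_entry_points is hypothetical. It swaps xargs for find's "-exec ... {} +" so that paths containing spaces survive, matches only "static void main(" declarations rather than any line containing "main(", and prints file names instead of matching lines.

```shell
# Hypothetical helper: list Java files that appear to declare a main method.
find_entry_points() {
    # $1: root directory to search (defaults to the current directory)
    find "${1:-.}" -type f -name '*.java' \
        -exec grep -l -E 'static[[:space:]]+void[[:space:]]+main[[:space:]]*\(' {} + |
        sort
}
```

Calling find_entry_points src would print one candidate file per line; the depth-first reading described above can then start from whichever of those looks like the real entry point.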

Todd Coram - Re: Reading, Writing, and Code
12/16/2003; 5:57:09 AM (reads: 1394, responses: 0)
For Java, my answer would be to go look at the automatically generated documentation (Javadoc or whatever).

But, I don't trust documentation--it's not the code. It's (at best) what "cliff notes" are to literature... Plus, documentation may (horrors!) not be updated when the code changes.

Someone once told me (Jim Coplien?): Trust the terrain, not the map.

andrew cooke - Re: Reading, Writing, and Code
12/16/2003; 7:29:56 AM (reads: 1394, responses: 0)
enterprise architect (much cheaper - i guess i should say "lower price" to american readers - than rational rose, but pretty much equivalent in functionality) will read in java code and generate class diagrams. it helps (me) to understand complex inheritance trees (you learn largely during the process of dragging things around until the diagram looks pretty!).

andrew cooke - Re: Reading, Writing, and Code
12/16/2003; 7:37:39 AM (reads: 1383, responses: 0)
for c programs, there's ctag (have i got the name right?) in unix.

i once used this plus a perl script to replace every occurrence of an identifier with a link to its definition. that helped browse code (i was starting at a company with a lot of undocumented c code). i'm sure there are tools that do the same these days (maybe doxygen does this? http://www.stack.nl/~dimitri/doxygen/index.html)
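
To make the identifier-to-definition idea concrete, here is a minimal sketch of looking a name up in a tags file. It assumes the usual tab-separated layout written by the ctags family of tools (identifier, file, then an ex search address), typically generated with something like "ctags -R ." in Exuberant or Universal Ctags. The function name lookup_tag is hypothetical, and this is not the perl script mentioned above.

```shell
# Hypothetical helper: print "file: address" for every definition of a name.
lookup_tag() {
    # $1: identifier to look up, $2: tags file (defaults to ./tags)
    awk -F '\t' -v name="$1" '
        /^!_TAG_/  { next }                  # skip metadata header lines
        $1 == name { print $2 ": " $3 }      # field 2 = file, field 3 = ex address
    ' "${2:-tags}"
}
```

A link generator along the lines described would run a lookup like this for each identifier it meets and emit a hyperlink to the reported file.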

Paul Snively - Re: Reading, Writing, and Code
12/16/2003; 7:54:57 AM (reads: 1363, responses: 1)
andrew cooke: i once used this plus a perl script to replace every occurrence of an identifier with a link to its definition.

Oh, so you reimplemented metapoint.

Luke Gorrie - Re: Reading, Writing, and Code
12/16/2003; 8:04:22 AM (reads: 1362, responses: 2)
I find Java programs especially hard to get started with. Deep directories, lots of files. You can't tell from the name whether a source file contains lots of meaty logic or just an interface declaration. When I need to see how a Java program works I start reading in order of file size:
find . -name "*.java" | xargs wc -l | sort -n
Ward Cunningham is smarter about this.
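
The file-size ordering can be wrapped in a small function. This is a sketch under stated assumptions: a POSIX shell with find, wc, awk, sort, and head, and a hypothetical function name largest_sources with hypothetical arguments. Compared with the one-liner above, it lists the largest files first, drops the "total" lines that wc prints when given several files, and caps the output.

```shell
# Hypothetical helper: show the source files with the most lines, largest first.
largest_sources() {
    # $1: root directory (default .), $2: file name pattern (default *.java),
    # $3: maximum number of files to list (default 20)
    find "${1:-.}" -type f -name "${2:-*.java}" -exec wc -l {} + |
        awk '$2 != "total"' |   # wc adds a summary line per batch; drop it
        sort -rn |
        head -n "${3:-20}"
}
```

For example, largest_sources . '*.java' 10 would list the ten longest Java files under the current directory, which is usually a reasonable guess at where the meaty logic lives.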

What I like is a small number of directories, each with a small number of files, each of which does something significant. The files should have names that indicate what they do, plus a one-line comment at the top spelling it out. That one-line "Purpose" comment is the most important thing! This seems to generally be how C, Lisp, and Erlang programmers do things.

I guess Smalltalkers have a totally different point of view, because they have their standard Browser. Perhaps Java programmers work like this too nowadays, and their sourcecode layout is just annoying from a Unix-centric viewpoint.

When I really can't tell which way is up, I teleport.

Luke Gorrie - Re: Reading, Writing, and Code
12/16/2003; 8:23:27 AM (reads: 1355, responses: 1)
In the past couple of years, I have also become a big fan of printing out programs and reading them. Trying to make a program read well linearly is the best code-improvement technique I know. Lisp programs tend to read well on paper (and generally to be well-written). Even well-written C programs tend to be horrible to read on paper, because subroutines usually come first.

Who else gets a warm and fuzzy feeling when they pull up a source file and see page-break characters in it? :-)

My latest habit is that, when I want to know how some program on my computer works, I do apt-get source name-of-package and ten seconds later I have the source. I'm really impressed at how easy it usually is to find what I'm looking for in the source to popular (e.g. GNU) programs.

Todd Coram - Re: Reading, Writing, and Code
12/16/2003; 9:45:04 AM (reads: 1330, responses: 0)
The first time I ever printed out (and read) a substantial program was in 1985, and the program was the source to Rutgers University Pascal (for DEC 20) written in Pascal ;-)

Then came Knuth's TeX and Metafont. I find Literate Programs a bear to maintain (funky markup) but a pleasure to read.

I've always printed out my own programs for offline reads. I find so many more inconsistencies, unused variables and poor names in my code through printed perusal than through online browsing.

Ehud Lamm - Re: Reading, Writing, and Code
12/16/2003; 9:51:25 AM (reads: 1341, responses: 0)
I start reading in order of file size

I also like this trick. It's easy, doesn't require fancy tools, and often the most effective technique.

Ehud Lamm - Re: Reading, Writing, and Code
12/16/2003; 9:57:39 AM (reads: 1338, responses: 0)
In the past couple of years, I have also become a big fan of printing out programs and reading them.

I agree. Drawing arrows with a red pen is also very helpful for figuring things out (especially if you can take points off, whenever the coding style annoys you...)

Todd Coram - Re: Reading, Writing, and Code
12/16/2003; 10:01:21 AM (reads: 1329, responses: 0)
IMHO, the only way to ascertain the "rhythm" of an application's source is to read it in printed form. It's only then that you can look at a program in its totality. Hopping around in a browser only gives you a low-level view. You can't exhume structure at that level.

Every once in a while I print out and read the source to AFT. It consists of hobby-time coding over a period of 7 years. It certainly doesn't represent my ideal approach to programming (and I am not a Perl guru), but I can see spots where I improved my Perl coding, got into the rhythm of a particular technique and got hopelessly confused on how to properly use regular expressions.

Is the code maintainable? I think so. Its overall structure captures how features can be added through function dispatching and how global (blech) state is accessed.

Sure, I cringe sometimes at the darker corners of the code, but the overall structure is sound (although not ideal). The interesting bit about all of this (the reason why I bother mentioning my own crufty code), is that I still need to see the "whole picture" every once in a while in order to understand how to keep my localized code changes in order with the rest of the app.

andrew cooke - Re: Reading, Writing, and Code
12/16/2003; 10:17:34 AM (reads: 1331, responses: 0)
Oh, so you reimplemented metapoint.

even with the name, i can't find info on this, so i guess i didn't find it at the time, either...

Ehud Lamm - Re: Reading, Writing, and Code
12/16/2003; 10:27:50 AM (reads: 1327, responses: 0)
Ward Cunningham is smarter about this.

Thanks for the link! I wasn't aware of this work, but it closely resembles my approach. I don't use cgi scripts, but I agree with Ward that ad-hoc, simple, browsing scripts are a great technique.

Mark Hecht - Re: Reading, Writing, and Code
12/16/2003; 11:58:11 AM (reads: 1293, responses: 4)
Would anyone care to suggest some examples of excellent code that are publicly available?

Luke Gorrie - Re: Reading, Writing, and Code
12/16/2003; 3:25:44 PM (reads: 1255, responses: 0)
I think Metapoint is also known as M-., usually bound to find-tag or its equivalent in Emacs. If you press it on the name of a function, it takes you to the definition of that function.

I call it Meta-dot :-)

Luke Gorrie - Re: Reading, Writing, and Code
12/16/2003; 3:40:41 PM (reads: 1252, responses: 0)
Would anyone care to suggest some examples of excellent code that are publicly available?

Since I'm over my Norvig-recommending quota for the month, allow me some flattery. Just download any program from Darius Bacon's homepage. There are a lot of languages to choose from. I recommend Awklisp!

Ehud Lamm - Re: Reading, Writing, and Code
12/17/2003; 5:46:44 AM (reads: 1205, responses: 1)
Great question. To make it even more appropriate for lambda, let me just say that some of the coolest code out there can be found inside language processing tools (interpreters, partial evaluators etc. etc.)

I am quite fond of Roger Hui's implementation of a J interpreter in C. The first couple of lines introduce macros that allowed Hui to code in an APL-ish style, instead of using the regular C idioms. A non-traditional example of language embedding.

I don't really remember why, but I remember that reading agrep was a lot of fun. Keep in mind, though, that I like the type of algorithms involved.

andrew cooke - Re: Reading, Writing, and Code
12/17/2003; 6:06:40 AM (reads: 1204, responses: 1)
Would anyone care to suggest some examples of excellent code that are publicly available?

this (self link) has some recommendations.

Ehud Lamm - Re: Reading, Writing, and Code
12/17/2003; 6:29:48 AM (reads: 1201, responses: 0)
Amusingly, I just compiled the code for the J interpreter I linked to, and it doesn't work properly (it crashes). I guess I'll have to read the code to find out what's going on...

Isaac Gouy - Re: Reading, Writing, and Code
12/17/2003; 7:47:06 AM (reads: 1176, responses: 0)
Luke: I guess Smalltalkers have a totally different point of view
The structure of Smalltalk is different - most Smalltalks keep all the code in a single 'image' repository - so the question of how to organize directories or name files never arises.

OTOH Smalltalk environments typically provide a way to comment a class definition, which corresponds to 'purpose'. This is the start of one class comment: "Instances of class Fraction represent some rational number as a fraction. They can be created as a result of an arithmetic operation if one of the operands is a Fraction and the other is not a Float. All public arithmetic operations return reduced fractional results..."

Smalltalk environments are designed for searching & browsing (for classes & methods) rather than reading sequential program texts.

Isaac Gouy - Re: Reading, Writing, and Code
12/17/2003; 8:04:24 AM (reads: 1191, responses: 0)
I need to have a "mental image" of the parent class
You need to have an IDE that distinguishes methods defined in the derived class, overridden in the derived class, or inherited from a particular superclass! (And allows you to specify the visibility of inherited methods.)

Paul Snively - Re: Reading, Writing, and Code
12/17/2003; 2:26:08 PM (reads: 1124, responses: 0)
As someone else pointed (heh) out, "Metapoint" is a function that also shows up in EMACS, but it first appeared on the old LispMs, with the exact described functionality: put the cursor in a symbol, hit M-. and you'd be taken to the definition for that symbol. It never ceases to amaze me how we moderns think that we invented the problem of "programming in the large" and some tools for helping with that (or not), while the Smalltalk and Lisp systems of the '70s and early '80s were often much larger than entire programs that most programmers write today, and really did explicitly tackle "programming in the large," with considerable success.

Darius Bacon - Re: Reading, Writing, and Code
12/17/2003; 10:02:27 PM (reads: 1068, responses: 0)
That J interpreter looks pretty nifty -- reminds me of Alan Kay's 1-page interpreter for Smalltalk-72. (Which I haven't seen anywhere, I'm afraid, though there's some description in Kay's paper on the history of Smalltalk.)

David B. Wildgoose - Re: Reading, Writing, and Code
12/18/2003; 1:50:23 AM (reads: 1055, responses: 0)
I've tried to find the text for the 1-page Smalltalk interpreter before. If anybody has a link then I'd be grateful to be pointed in the right direction.

Ehud Lamm - Re: Reading, Writing, and Code
12/18/2003; 3:54:47 AM (reads: 1068, responses: 0)
this (self link) has some recommendations.

Many good suggestions there. Thanks!

Mark Hecht - Re: Reading, Writing, and Code
12/18/2003; 8:21:58 AM (reads: 1026, responses: 0)
I've browsed some of the recommended code already (awklisp, agrep, J in C, and TCL 7.3), and as I have time I'll study it further.

I was kind of surprised by the J interpreter code. The lack of formatting was disturbing at first, but I can appreciate concise notation too.

Thank you all for your suggestions.

Luke Gorrie - Re: Reading, Writing, and Code
12/18/2003; 2:12:25 PM (reads: 991, responses: 3)
Speaking of Emacs and programming in the large: Emacs is a great example of programming in the large.

I'm not sure that many individual parts are especially breathtaking, but the Emacs design is worth studying. The Sawfish window manager became very popular with this design more recently. I'm still hoping the web browser folks will jump on board...

Frank Atanassow - Re: Reading, Writing, and Code
12/19/2003; 4:46:08 AM (reads: 982, responses: 2)
Emacs is a great example of programming in the large.

Emacs, like DrScheme, Java and the Express Project, is a programming language implementation extended to be a computing platform. This is a very powerful paradigm, and the best way I know to structure large programs.

It's too bad that every such project has to develop its own runtime. The JVM and .Net are OK, but IMO a serious programming language needs to have a native-compiled implementation.

Patrick Logan - Re: Reading, Writing, and Code
12/19/2003; 7:46:45 AM (reads: 944, responses: 0)
IMO a serious programming language needs to have a native-compiled implementation.

And so the GNU Compiler for Java.

http://gcc.gnu.org/java/

Luke Gorrie - Re: Reading, Writing, and Code
12/19/2003; 9:53:35 AM (reads: 933, responses: 0)
I don't think that every such project really has to write its own language and runtime. Just about any would do. But one or a few of them have to really catch on, and for progress they have to be compellingly better than what people use today.

To be clear, I'm interested in a better "computing environment", i.e. platform for writing programs that includes a coherent way of interacting with the user/programmer. Today we have a lot: Microsoft Windows, Unix shell, GTK/Gnome, KDE, Emacs, Wily, Squeak, to name a few.

"Better" is subjective of course. My favourite of the current crop is Emacs. Squeak Smalltalk seems much closer to my ideal, but I haven't been able to acquire a taste for it (I'm jealous of Squeak users). But I think it's okay to have different interfaces for different people (e.g. programmers vs. "grandmothers"), and only wasteful if a new interface can't attract enough users to sustain itself.

The subjectiveness is obvious since you say that the JVM and .NET are okay, whereas my brain switches off at the first mention of them. People working with those don't seem to share my sense of aesthetics, or are suppressing it because their intended users don't. Likewise people who are into GUIs based on GTK and so on - that's not the way I personally want to use my computer, it feels too low-level compared with Emacs.

The most promising development I see on the horizon is McCLIM. This is (as I understand it) a port of the Lisp Machine user interface to Common Lisp, so that you can run it on your PC. I view it as a step forward on the Emacs path by adding a presentation-based user interface and a more powerful (yet real-world proven) programming language, while otherwise being fairly similar in philosophy.

I would like to hear about other sophisticated computing environments that people are using today.

Ehud Lamm - Re: Reading, Writing, and Code
12/20/2003; 1:17:08 PM (reads: 1069, responses: 1)
This is a very powerful paradigm, and the best way I know to structure large programs.

I agree. The Mozilla platform (with Javascript, XUL etc.) is, perhaps, another example. Among the systems I am showing my students this semester, I chose Emacs for exactly this reason.

Alas, not everyone shares our perspective. Joel Spolsky writes:

Suppose you take a Unix programmer and a Windows programmer and give them each the task of creating the same end-user application. The Unix programmer will create a command-line or text-driven core and occasionally, as an afterthought, build a GUI which drives that core. This way the main operations of the application will be available to other programmers who can invoke the program on the command line and read the results as text. The Windows programmer will tend to start with a GUI, and occasionally, as an afterthought, add a scripting language which can automate the operation of the GUI interface. This is appropriate for a culture in which 99.999% of the users are not programmers in any way, shape, or form, and have no interest in being one.

Er.. no. It's not appropriate, since it leads to programs that are less flexible and modifiable. That's an engineering problem, regardless of the OS you work on.

And, by the way, I don't think this has anything to do with Windows vs. Unix. It's about good design vs. bad design. You can produce good design on Windows, so I don't see why Joel feels he needs to make excuses for bad design, simply because some people think that these designs represent the "Windows Culture".

Mark Evans - Re: Reading, Writing, and Code
12/21/2003; 4:25:49 PM (reads: 848, responses: 1)
Design Recovery Tool

Ehud Lamm - Re: Reading, Writing, and Code
12/22/2003; 1:02:11 AM (reads: 838, responses: 0)
Do people really find this kind of tool useful?

Isaac Gouy - Re: Reading, Writing, and Code
12/22/2003; 10:53:59 AM (reads: 802, responses: 0)
it leads to programs that are less flexible and modifiable
Are these absolutes, or qualities of the design to be specified alongside portability, usability, reliability...?

Joel Spolsky's next paragraph:

There is one significant group of Windows programmers who are primarily coding for other programmers: the Windows team itself, inside Microsoft. The way they tend to do things is to create an API, callable from the C language, which implements the functionality, and then create GUI applications which call that API. Anything you can do from the Windows user interface can also be accomplished using a programming interface callable from any reasonable programming language.

Isaac Gouy - Re: Reading, Writing, and Code
12/22/2003; 11:03:06 AM (reads: 794, responses: 0)
okay to have different interfaces for different people (e.g. programmers vs. "grandmothers")

And grandmothers who program (need for large text?)

Powerful Personas

Pierre Phaneuf - Re: Reading, Writing, and Code
1/29/2004; 9:43:18 AM (reads: 573, responses: 0)
Considering they have such great component systems that allow all your components to be automatically made scriptable (if you check the right checkboxes, of course) without much work on your part, this is silly.

Ehud Lamm - Re: Reading, Writing, and Code
3/18/2004; 12:57:04 AM (reads: 259, responses: 0)
Some relevant quotes can be found here.