LiteratePrograms wiki

Derrick Coetzee has recently announced an interesting new wiki called LiteratePrograms. LP is based on Wikipedia's MediaWiki system, but adds some capabilities from the noweb literate programming system. Quoting from the LP website:
LiteratePrograms is a unique wiki where every article is simultaneously a document and a piece of code that you can download, compile, and run by simply using the "download code" tab at the top of every article. See Insertion sort (C, simple) for a simple example. To date we have 3 articles.
Based on Donald Knuth's concept of literate programming, LiteratePrograms is a collection of code samples displayed in an easy-to-read way, collaboratively edited and debugged, and all released under the liberal MIT/X11 License (see LiteratePrograms:Copyrights) so that anyone can use our code and text for any purpose.
While it's obviously just getting started, and thus has fairly minimal content, I think that the idea behind LP is an interesting one. It seems like there's a lot of potential for the LP wiki to become both a handy resource for (well-documented) code-snippets, and a great educational tool.

Sounds great!

Does it do Ada? ;-) (Yes, it does!)

I actually planned something like this, but I want to be able to compile on the server, and integrate compiler errors into the wiki page. This is something I can use in my teaching.

Another use: The LtU-opedia will include a section on great software systems, I hope, and we can use something like this for excerpts worth discussing.

Compiling on the server

Hi Ehud, I'm the site originator. Compiling on the server and supplying compiler warnings/errors (and maybe even running unit tests in a secure sandbox) is on my to-do list. Ideally, it wouldn't allow you to even submit content that isn't clean. I'm just getting started, but I appreciate your feedback!

Uncompilable code

I've been thinking about that, actually. I guess you should report errors, and let the page editor decide if he wants to go forward and post the faulty code.

Since I am thinking about using such a thing for teaching purposes, I want students to post ideas, even if they aren't sure how to make them work - so that their fellow students can fix errors, refactor the code, or fix the design.

Compiling vs. Running?

I don't see why code that emits errors and warnings shouldn't be permitted to be posted. Indeed, I'd think that the server wouldn't care whether there are errors or not. Just capture whatever it is that the compiler emits and spit that out somewhere. The output could be anything from: "Compilation successful" to "Maximum number of Errors exceeded".

It'd be quite a different thing if we were talking about code that actually ran on the server. There you would need the concept of clean and safe code - something that'd be a little hard to prove. The only danger in compiling code is feeding it code that causes the compiler to crash or go into endless loops (or just spend way too much time grokking the code).
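To make the "just capture whatever the compiler emits" idea concrete, here is a minimal sketch in Python. The function name `capture_compiler_output` and the ten-second default are assumptions made for illustration, not anything the LP site is known to use. It runs whatever compiler command it is given, merges warnings and errors into one stream, and gives up if the compiler runs too long.

```python
import subprocess
import sys


def capture_compiler_output(cmd, timeout_s=10):
    """Run a compiler command and return (status, text) without rejecting anything.

    `cmd` is the full command line as a list, e.g. ["gcc", "-Wall", "-c", "x.c"].
    The name and the timeout default are hypothetical choices for this sketch.
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # warnings and errors end up in one stream
            text=True,
            timeout=timeout_s,         # guards against a compiler that never finishes
        )
    except subprocess.TimeoutExpired:
        return "timeout", f"Compiler did not finish within {timeout_s} seconds"

    output = proc.stdout.strip()
    if proc.returncode == 0:
        return "ok", output or "Compilation successful"
    return "failed", output or f"Compiler exited with status {proc.returncode}"
```

The wiki would then store or display the returned text next to the article, whatever it says, and leave the decision about posting to the editor.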

Just capture whatever it is

Just capture whatever it is that the compiler emits and spit that out somewhere.

In my imagination, compiler messages (let's not forget warnings) will appear below the line of code they relate to (similar to what compiler listings look like).
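A rough sketch of that listing idea, in Python. It assumes the compiler prints diagnostics in the common `file:line: message` or `file:line:col: message` shape (as gcc does); the function name `annotate_listing` and the output layout are made up for illustration. Lines of compiler output that do not fit the pattern are simply ignored.

```python
import re
from collections import defaultdict

# Matches "file:line: message" and "file:line:col: message".
# Assumption: file names contain no colon, so Windows drive letters would need extra care.
_DIAGNOSTIC = re.compile(r"^[^:\n]+:(\d+):(?:\d+:)?\s*(.*)$")


def annotate_listing(source, compiler_output):
    """Return the source as a numbered listing with each message under its line."""
    messages_by_line = defaultdict(list)
    for raw in compiler_output.splitlines():
        match = _DIAGNOSTIC.match(raw)
        if match:
            messages_by_line[int(match.group(1))].append(match.group(2))

    listing = []
    for number, text in enumerate(source.splitlines(), start=1):
        listing.append(f"{number:4d} | {text}")
        for message in messages_by_line.get(number, []):
            listing.append(f"     | ^ {message}")
    return "\n".join(listing)
```

A wiki page would render the same thing with markup rather than plain text, but the grouping step (messages keyed by line number) would be the same.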

Compiler messages and environments

One issue with compiler messages is that the server can only operate in a single environment (a particular setup of Gentoo Linux), and I expect to take code submissions for Windows and other OS's I don't even own a copy of (ref List of environments). Some of this can be done on other machines over the LAN. This isn't such a big deal for portable code, but more so for stuff using the Win32 API or the .NET Framework like on The Code Project.

You are right, of course.

You are right, of course. You don't want to set up an infrastructure of the kind big companies' QA departments use. I say, support compilation only for code that uses the language+standard libraries.

Vandalism prevention

For the LP wiki I suspect that the best approach will be to have any compiler errors reported directly on the page in question. That would make it a lot easier for editors to find and fix any vandalism that might occur. I suppose that a similar rationale would apply to educational uses like Ehud is suggesting, except that the "vandals" in that case would be students who didn't fully understand what they were doing :-)

What about a user

What about a user downloading and running what he thinks is the code for the "Hello World!" Python example when in fact the code has been replaced with the equivalent of 'rm -rf /*'? Compiles fine, no errors. However, the output is not quite what the user is expecting.

More than one way to skin a cat

Well, I'm not suggesting that compiler errors would be the only way to catch vandalism. It simply aids the process.

Good point, just be careful

This is a good point, and I don't have a really good story for this right now other than "read before you run". Since it's targeted more at programmers than end customers I don't think this is too big of a deal, although a vandal could obscure malicious code. Maybe the MediaWiki validation feature, where a new version is not made available for download until someone verifies it's okay, could be helpful. I don't think I'm going to take any preemptive action just now though.
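As a toy model of that "not available for download until verified" rule, here is a short Python sketch. It is not MediaWiki's implementation or API; the `Revision` and `Article` classes and their methods are hypothetical names used only to show the behaviour: edits are always accepted, but the download always serves the newest verified revision.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Revision:
    code: str
    verified: bool = False  # set once a reviewer has read the code


@dataclass
class Article:
    revisions: List[Revision] = field(default_factory=list)

    def edit(self, code: str) -> int:
        """Store a new, unverified revision and return its index."""
        self.revisions.append(Revision(code))
        return len(self.revisions) - 1

    def verify(self, index: int) -> None:
        """Mark one revision as reviewed."""
        self.revisions[index].verified = True

    def downloadable(self) -> Optional[str]:
        """Return the newest verified code, or None if nothing is verified yet."""
        for revision in reversed(self.revisions):
            if revision.verified:
                return revision.code
        return None
```

With this rule a vandalised "Hello World!" would still show up in the page history, but the "download code" tab would keep serving the last revision somebody had actually read.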

Running code on the server

Actually I think the risk of code running on the server is less than you think as long as it runs in a jail. It's okay if the code or compiler crashes or runs forever. If it crashes, it only kills that process, and I can use gdb to collect a stack trace for display. If it runs too long, I just kill it. I can create a special restricted user account and use chroot and quotas to limit what it can do. Of course I can't run code that only works in other environments (especially if those environments can't be made secure like Windows 95).
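Part of that can be sketched in Python, assuming a Unix-like server. The sketch covers only the "kill it if it runs too long" and "a crash only kills that process" parts, using a wall-clock timeout and a CPU-time limit; the chroot, restricted user account, quotas and gdb stack trace are not shown. The function name `run_limited` and the default limits are assumptions for illustration.

```python
import resource
import subprocess
import sys


def run_limited(cmd, wall_s=5, cpu_s=2):
    """Run `cmd` with a CPU-time limit and a wall-clock timeout (Unix only).

    Returns a dict with a "status" of "exited", "signal" or "timeout".
    """
    def apply_limits():
        # Runs in the child process just before the program starts.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_s, cpu_s))

    try:
        proc = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=wall_s,           # the child is killed if it exceeds this
            preexec_fn=apply_limits,
        )
    except subprocess.TimeoutExpired:
        return {"status": "timeout", "output": ""}

    if proc.returncode < 0:
        # A negative return code means the child was terminated by a signal,
        # e.g. a crash or the CPU limit being hit.
        return {"status": "signal", "signal": -proc.returncode, "output": proc.stdout}
    return {"status": "exited", "code": proc.returncode, "output": proc.stdout}
```

For example, `run_limited([sys.executable, "-c", "while True: pass"], wall_s=20, cpu_s=1)` should come back with status "signal" once the CPU limit is reached, while the server process itself carries on.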

ptrace

When running code on the server one should also prevent network access and probably a handful of other things. A friend of mine running a public test server for algorithmic contests (where you supply a solution as source code) uses the ptrace syscall to disallow file and network access. This is quite easy under Linux, where you can, for example, allow only the few syscalls that are going to be needed.

Xen & VMWare

Running untrusted code on the server is a problem tailor-made for Xen and VMWare. Both have free versions for something like this; Xen is open source, VMWare has a "VM player" version. Set up a virtual machine with policies for network access, disk space, etc. and let the code run in it. When it's done (or enough time has elapsed), just tell the hypervisor to kill the VM.

You could also set up the VM to have read-only access to an NFS share on the server, running code directly off it. This way you wouldn't need to set *any* limits on the running code or set any special policies. Anything the code did to the VM would disappear when the VM was killed.

It's also OS-neutral; Linux, BSD, Solaris, Windows, etc..

The only downside is the heavier processing requirements; corporate servers typically run 15-20 VMs per server box (~4 VMs per CPU core).

--Bryan

Re: Uncompilable code

That's a good point. It also fits with the general wiki spirit that it's okay to leave things undone. I'm also considering ideas for allowing articles to use code from other articles, or even extend/modify another article's code (sort of a "literate patch"). If you're interested in setting up your own copy for teaching I could post a patch of my changes, but it'd be even better if they could contribute directly to the main site. Thanks for your interest.

Thanks

I think setting up something local is the best approach, if I want students to feel safe. But I'll have to see how our tech people will react. I'll get in touch...

Thanks.

Compromise

Perhaps the finished version of student pages (maybe reviewed by you first) could be moved onto the main LP site? That would allow the students to feel safe while they were learning and developing a particular piece of code, but also help to fill out the content on LP.

Is the patch available?

Can I download the changes you made to MediaWiki from somewhere?

Cheers!

svn

This page explains how to get the lp source from svn.

Teaching tool

My own thoughts on how LP might be used in an educational setting were more along the lines of a reference for well-written and (hopefully) very well-explained code. Kind of a dynamically generated, online textbook. I hadn't thought about getting students to collaborate on code. But I can see that being a useful teaching tool too.

Try it out in the Sandbox

Hi all. I just wanted to make sure everybody knew that if you just want to try out the features and markup, you can make any edit you like in the Sandbox, which is for experimentation. You can also read about the new markup at How to write an article.

literate programming

Visit wiki.axiom-developer.org.

Axiom is a large, free, general purpose computer algebra system
written in Common Lisp.

Axiom uses noweb as its literate programming tool and has for the
last few years. Everything in Axiom is a literate document. There
is no C, Lisp, Spad, Boot, Aldor, shell scripts or Makefiles. They
are all LaTeX documents and the code is automatically extracted at
build time as well as the documentation.
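
For readers who have not seen noweb, the extraction step ("tangling") can be
sketched in a few lines of Python. This is a simplified illustration, not
noweb itself and not Axiom's build: it uses the noweb conventions that
`<<name>>=` opens a code chunk, a line starting with `@` returns to
documentation, and `<<*>>` is the default root chunk. It only expands
references that sit alone on a line and does not detect cyclic references.
The function names are made up for the example.

```python
import re

CHUNK_DEF = re.compile(r"^<<(.+)>>=\s*$")       # "<<name>>=" opens a code chunk
CHUNK_REF = re.compile(r"^(\s*)<<(.+)>>\s*$")   # "<<name>>" alone on a line


def parse_chunks(text):
    """Collect code chunks by name; repeated definitions are concatenated."""
    chunks = {}
    current = None
    for line in text.splitlines():
        definition = CHUNK_DEF.match(line)
        if definition:
            current = definition.group(1)
            chunks.setdefault(current, [])
        elif line == "@" or line.startswith("@ "):
            current = None                       # back to documentation
        elif current is not None:
            chunks[current].append(line)
    return chunks


def tangle(chunks, root="*", indent=""):
    """Expand chunk references recursively, starting from `root`."""
    out = []
    for line in chunks[root]:
        reference = CHUNK_REF.match(line)
        if reference:
            # Keep the indentation the reference had in the parent chunk.
            out.extend(tangle(chunks, reference.group(2), indent + reference.group(1)))
        else:
            out.append(indent + line)
    return out
```

Feeding it a document whose `<<*>>` chunk contains `def main():` followed by
an indented `<<body>>` reference yields the function with the body chunk
spliced in at that indentation; everything outside the chunks is the
documentation that a typesetting pass would handle instead.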

The wiki has an example of the code at
http://wiki.axiom-developer.org/SandBoxPamphlet
which is the actual source text of the axiom sources.

Axiom is involved in a second project at
http://sourceforge.net/projects/doyencd
which is building a LiveCD (boot linux from CD and run).
Of note here is that the liveCD version has a browser and
a web server built in. It will be possible to drag and drop
a pamphlet file (literate program) onto the page and have
the code extracted, compiled, and added to the system as
well as having the documentation extracted, latexed and
added to the system.

The noweb literate programming effort is being driven by the
experience of not being able to understand my own code after
20 years. I now believe that it is much more important to write
code intended for a human than it is to write code intended for
a machine. The only thing that keeps code "live" will be that
someone can maintain, modify, and extend it after you finish
with it. Code dies because it can't be maintained.

Write code that people want to sit and read. It's the only viable
strategy for code to live past the 30 year horizon.

Tim Daly
Axiom Lead Developer

web-based ide

Why just use this as a teaching tool? Couldn't this idea be used to create a collaborative multi-user web-based IDE? Then programming would become something similar to editing Wikipedia. The requirements for the software project, documentation, tutorial etc. could be developed as in Wikipedia. The code would be automatically run through unit tests and a compiler. Anyone would be able to edit the program. This would lower the barriers to contributing to an open source project and also make open source projects more collaborative and transparent. I think this could be a more effective method of implementing open source projects.

Excellent point

The LP backend could ultimately become quite a useful collaborative development tool. And a kind of version control is already built in to the system via page histories. Something to think about for the future I suppose...

Large projects and fragility

This is an interesting idea, but there's some risk with scaling up from one-article code samples to larger projects and it's not yet clear to me how to address this. Even if changes are restricted to a fixed list of trusted developers, large projects are inherently more fragile in that a change to one area can break many other areas. If the software had some kind of branching support, it might be possible to have a scheme where modules are independently edited on private branches without breaking other modules, then committed to a public view when they're believed to be safe.