
Faith, Hope, and Love: An essay on software science’s neglect of human factors

Continuing with the crazy PL/HCI papers, here is another one, this time a recent Onward! essay. Here is the abstract (sorry, the ACM link was the only thing I could dig up):

Research in the area of programming languages has different facets – from formal reasoning about new programming language constructs (such as type soundness proofs for new type systems) over inventions of new abstractions, up to performance measurements of virtual machines. A closer look into the underlying research methods reveals a distressing characteristic of programming language research: developers, which are the main audience for new language constructs, are hardly considered in the research process. As a consequence, it is simply not possible to state whether a new construct that requires some kind of interaction with the developer has any positive impact on the construction of software. This paper argues for appropriate research methods in programming language research that rely on studies of developers – and argues that the introduction of corresponding empirical methods not only requires a new understanding of research but also a different view on how to teach software science to students.

more of the same

I recommend the development of a Lisp-like language. I admit that I do not understand much of the discussion here, but I do not assume that the organization of software is intrinsically tied to type safety. To explore the possibilities for programming language design, I emphasize expressiveness and simplicity. First, base types can be more expressive than machine types. String objects reduce the complications of relying on internationalization libraries. Bignums supply the behavior that the programmer expects from numbers. Even logical data can benefit, as it can have more than two values.

Encapsulation should be added. A naive implementation is to attach a tag to every cons and to check the tag on every car and cdr. A more restrictive means of enforcing encapsulation would likely interfere with the flexibility of the language. Namespaces are a practical necessity for reuse. An extensible form of namespace may be prudent. Traditional Lisp dialects have difficulty with interoperability (including separate compilation and separate memory management). A purely functional language has simpler semantics (and no need for quote).

While such a language would not need macros (except for encapsulation, which can be moved up; see placeholder below), it may be useful to switch between alternative notations. We can use a regular grammar for our language, so that most of the work of creating a homomorphism with another grammar can be done by a standard parsing algorithm. Some extra rules will need to be added because of the ambiguity of the reverse mapping.

This language gives us a lot of flexibility, but if we want different semantics we can write an interpreter. In keeping with our goal, it would be good to include a library for creating interpreters. With a standard library there is increased opportunity for debugging and for optimized code paths (even if only occasionally applicable).

It should be clear that a compiler for our language will need all the help it can get. One method of helping the compiler is to use a placeholder function in the standard library. Such a placeholder function would return its first argument, so that the other arguments are available to encode information outside the semantics of the core language. Depending on compiler flags, the compiler (or other parser: IDE, SCM/VC, test framework, bug tracker, embedded, etc.) would direct the extra information to some algorithm. Obvious examples of what sort of information could be passed this way include: type casting, scope restrictions, limited extent, rich/hypertext comments, revision number of a piece of text, assertions.
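A rough sketch of the placeholder idea in Python follows. It is a run-time simulation only: the post imagines a compiler or other tool consuming the extra arguments, whereas here registered handlers see them when the call executes. The names placeholder, register_handler, assertion_handler, and the (kind, payload) annotation format are all assumptions made for illustration.

```python
from typing import Any, Callable, List, Tuple

Annotation = Tuple[str, Any]          # hypothetical format: (kind, payload)
Handler = Callable[[Any, Tuple[Annotation, ...]], None]

# Stands in for "compiler flags": tools that want to see the extra information.
_handlers: List[Handler] = []


def register_handler(handler: Handler) -> None:
    _handlers.append(handler)


def placeholder(value: Any, *annotations: Annotation) -> Any:
    """Return the first argument unchanged; hand the extras to registered tools."""
    for handler in _handlers:
        handler(value, annotations)
    return value


def assertion_handler(value: Any, annotations: Tuple[Annotation, ...]) -> None:
    """Example tool: check every ("assert", predicate) annotation, ignore the rest."""
    for kind, payload in annotations:
        if kind == "assert" and not payload(value):
            raise AssertionError(f"annotation failed for value {value!r}")


def area(width, height):
    return placeholder(
        width * height,
        ("assert", lambda v: v >= 0),
        ("comment", "area in square units"),
    )
```

With no handler registered, area behaves exactly like plain multiplication, so the annotations cannot change the result. After register_handler(assertion_handler), a call such as area(-2, 3) raises AssertionError, which corresponds to turning a checking flag on.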

There is nothing to stop such a mechanism from being extended to execute arbitrary code. There is no way to ensure that the meaning of the program is not changed (see the equivalence problem), but it is important to realize that this is very easy to test. The placeholder acts as a one-way barrier for information (tied to syntax rather than to types, as it is with monads). The combination of the added information with the core language can be thought of as a new language. This mechanism can be used strategically to distinguish the code for improved performance from the meaning of the program. Separating high-level and low-level code should (eventually) result in substantial gains in productivity.

The flexibility of a language is also an intrinsic ambiguity that requires dynamic checking. This kind of overhead is what makes a language slow. Static analysis can remove some checking, while statistical analysis can remove much of the rest. Even with a good test suite, the process will not be fully automatic. The remaining checks can be examined by the programmer (see manual optimization below). Keep in mind that some checking will be left in place because it is inconsequential or even worthwhile. A less desirable alternative to (optionally) adding annotations, though one that would initially seem desirable to many, would be to require that such information be provided at the outset. Our language can temporarily be turned into a static language by using a compiler flag that requires information to be supplied that satisfies the criteria of the registered algorithm. Performance considerations do not end with checking. In a functional language, lambda lifting and memory flattening are very important as well.

Much of the success of our strategy depends on how easy it is to add these annotations. If the kinds of optimizations I have written about so far are largely successful, then we have only come close to the speed of static languages before optimization. On the other hand, our language has the advantage of being simple (to analyze and modify). A manual form of optimization may also be handy: analysis is done to find prospective optimizations, profiling is done to rank the prospects according to their estimated performance impact, the suggested optimizations are displayed to the user/programmer, the programmer verifies some of the guesses, and the compiler puts the respective optimizations into use.
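The manual optimization loop can be sketched as a small ranking step. Everything here is hypothetical: the Candidate record, the per-call saving estimates, and the call counts are made-up stand-ins for what a real analyzer and profiler would report.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Candidate:
    site: str               # where the prospective optimization applies
    description: str        # the guess the programmer is asked to verify
    saving_per_call: float  # assumed estimate from analysis, arbitrary units


def rank(candidates: List[Candidate], call_counts: Dict[str, int]) -> List[Candidate]:
    """Order prospects by estimated impact: saving per call times profiled calls."""
    def impact(c: Candidate) -> float:
        return c.saving_per_call * call_counts.get(c.site, 0)
    return sorted(candidates, key=impact, reverse=True)


def select(ranked: List[Candidate], confirm: Callable[[Candidate], bool]) -> List[Candidate]:
    """Keep only the guesses the programmer has verified."""
    return [c for c in ranked if confirm(c)]


# Made-up analysis output and profile data.
candidates = [
    Candidate("parse", "drop tag check on token list", 0.2),
    Candidate("sum_list", "assume elements are small integers", 0.5),
    Candidate("init", "skip bounds check", 1.0),
]
call_counts = {"parse": 1000, "sum_list": 10000, "init": 1}

ranked = rank(candidates, call_counts)
approved = select(ranked, lambda c: c.site != "init")  # programmer rejects one guess
```

The ranking puts sum_list first because it is called most often, even though its per-call saving is not the largest. The confirm callback stands in for the programmer reviewing each suggestion before the compiler acts on it.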

Tracking run-time behavior is even more important for debugging than for profiling. It should be possible to schedule a combination of checkpointing (modified GC) and logging (instrumented) with hierarchical granularity, so that resources are concentrated on time-sensitive procedures and long-lived data (or otherwise volatile / suspect code, also for security).

This stuff seems to go together well. Maybe you would rather try some other combination. How well statistical, user-directed optimization would work, and which type / constraint systems would be best to use with the core language, are open questions. Hmm, I think this is the same as my last post. Yes, I know I am an idiot, but a specific question would help me. Thanks.

Build Your Own Blocks (BYOB)

Scratch has previously been discussed here, but I recently learned about BYOB (Build Your Own Blocks):

Welcome to the distribution center for BYOB (Build Your Own Blocks), an advanced offshoot of Scratch, a visual programming language primarily for kids from the Lifelong Kindergarten Group at the MIT Media Lab. This version, developed by Jens Mönig with design input and documentation from Brian Harvey, is an attempt to extend the brilliant accessibility of Scratch to somewhat older users—in particular, non-CS-major computer science students—without becoming inaccessible to its original audience. BYOB 3 adds first class lists and procedures to BYOB's original contribution of custom blocks and recursion.

Also check out Panther, another great advanced spinoff of Scratch with a somewhat different point of view. Panther team member sparks has created a Blocks Library here that includes a collection of downloadable BYOB blocks contributed by users. Thanks, sparks!

—Jens and Brian

Such a project poses some interesting questions: are lambdas really too hard for Scratch's audience (or has the design process been led astray by groupthink on this point), are uncontrolled side effects and concurrency really a good idea for beginner programmers, how should a system like Scratch integrate older audiences (and its own aging one) without sacrificing the sharing component, etc. Talking to one of the designers, it seems like they have some fascinating ideas going forward and are looking for collaborators!