Teaching & Learning
We've mentioned some empirical studies of programming languages a few times, but I haven't seen a comprehensive list we can use as a reference.
Fortunately, I just came across this pretty decent overview of existing literature on how types impact development. Agree or disagree with Dan Luu's position, the comprehensive list warrants a front-page post in my opinion.
One point worth noting is that all the studies used relatively inexpressive languages with bland type systems, like C and Java, and compared those against typed equivalents. A future study ought to compare a more expressive language, like OCaml, Haskell or F#, which should I think would yield more pertinent data to this age-old debate.
Part of the benefits of types allegedly surround documentation to help refactoring without violating invariants. So another future study I'd like to see is one where participants develop a program meeting certain requirements in their language of choice. They will have as much time as needed to satisfy a correctness test suite. They should then be asked many months later to add a new feature to the program they developed. I expect that the maintenance effort required of a language is more important than the effort required of initial development, because programs change more often than they are written from scratch.
This could be a good thread on how to test the various beliefs surrounding statically typed and dynamically languages. If you have any studies that aren't mentioned above, or some ideas on what would make a good study, let's hear it!
Using Commutative Assessments to Compare Conceptual Understanding in Blocks-based and Text-based Programs, David Weintrop, Uri Wilensky. Proceedings of the eleventh annual International Conference on International Computing Education Research. Via Computing Education Blog.
Blocks-based programming environments are becoming increasingly common in introductory programming courses, but to date, little comparative work has been done to understand if and how this approach affects students' emerging understanding of fundamental programming concepts. In an effort to understand how tools like Scratch and Blockly differ from more conventional text-based introductory programming languages with respect to conceptual understanding, we developed a set of "commutative" assessments. Each multiple-choice question on the assessment includes a short program that can be displayed in either a blocks- based or text-based form. The set of potential answers for each question includes the correct answer along with choices informed by prior research on novice programming misconceptions. In this paper we introduce the Commutative Assessment, discuss the theoretical and practical motivations for the assessment, and present findings from a study that used the assessment. The study had 90 high school students take the assessment at three points over the course of the first ten weeks of an introduction to programming course, alternating the modality (blocks vs. text) for each question over the course of the three administrations of the assessment. Our analysis reveals differences on performance between blocks-based and text-based questions as well as differences in the frequency of misconceptions based on the modality. Future work, potential implications, and limitations of these findings are also discussed.
There is little public about Eve so far, no precise design documents, but the development team has a public monthly Development Diary that I found fairly interesting. It displays an interesting form of research culture, with in particular recurrent reference to academic works that are coming from outside the programming-language-research community: database queries, Datalog evaluation, distributed systems, version-control systems. This diary might be a good opportunity to have a look at the internals of a language design process (or really programming environment design) that is neither academic nor really industrial in nature. It sounds more representative (I hope!) of the well-educated parts of startup culture.
Eve is a functional-relational language. Every input to an Eve program is stored in one of a few insert-only tables. The program itself consists of a series of views written in a relational query language. Some of these views represent internal state. Others represent IO that needs to be performed. Either way there is no hidden or forgotten state - the contents of these views can always be calculated from the input tables.
Eve is designed for live programming. As the user makes changes, the compiler is constantly re-compiling code and incrementally updating the views. The compiler is designed to be resilient and will compile and run as much of the code as possible in the face of errors. The structural editor restricts partially edited code to small sections, rather than rendering entire files unparseable. The pointer-free relational data model and the timeless views make it feasible to incrementally compute the state of the program, rather than starting from scratch on each edit.
The public/target for the language is described as "non-programmers", but in fact it looks like their control group has some previous experience of Excel. (I would guess that experimenting with children with no experience of programming at all, including no Excel work, could have resulted in very different results.)
Posts so far, by Jamie Brandon:
- Eve so far (October 2014): a summary of the work from January to September 2014. The general vision; a move from a functional pidgin to a datalog-like language motivated by non-programmers testing; discussions of algorithms for incremental Datalog evaluation (ending with a poor-man solution).
- October: bootstrap editor, experiments, aggregates: the team picks five concrete use-cases and draws lessons for their language needs
- November: more experiments, better performance, integrity constraints, zztrees
- December: more zzjoin, communication, process spawning
- January / February: GUIs, time, joins and aggregates
- Version control, collaborative editing and undo
Some random quotes.
Excited, we presented our prototype to a small number of non-programmers and sat back to watch the magic. To our horror, not a single one of them could figure out what the simple example program did or how it worked, nor could they produce any useful programs themselves. The sticking points were lexical scope and data structures. Every single person we talked to just wanted to put data in an Excel-like grid and drag direct references. Abstraction via symbol binding was not an intuitive or well-liked idea.
Our main data-structure was now a tree of tables. Rather than one big top-level function, we switched to a pipeline of functions. Each function pulled data out of the global store using a datalog query, ran some computation and wrote data back. Having less nesting reduced the impact of lexical scope and cursor passing. Using datalog allowed normalising the data store, avoiding all the issues that came from hierarchical models.
At this point we realised we weren't building a functional language anymore. Most of the programs were just datalog queries on normalised tables with a little scalar computation in the middle. We were familiar with Bloom and realised that it fit our needs much better than the functional pidgin we had built so far - no lexical scoping, no data-structures, no explicit ordering. In late March we began work on a Bloom interpreter.
Where most languages express state as a series of changes ('when I click this button add 1 to the counter'), Eve is built around views over input logs ('the value of the counter is the number of button clicks in the log'). Thinking in terms of views makes the current language simple and powerful. It removes the need for explicit control flow, since views can be calculated in any order that is consistent with the dependency graph, and allows arbitrary composition of data without requiring the cooperation of the component that owns that data.
Whenever we have tried to introduce explicit change we immediately run into problems with ordering and composing those changes and we lose the ability to directly explain the state of the program without reference to data that no longer exists.
In a traditional imperative language, [context] is provided by access to dynamic scoping (or global variables - the poor mans dynamic scope) or by function parameters. In purely functional languages it can only be provided by function parameters, which is a problem when a deeply buried function wants to access some high up data and it has to be manually threaded through the entire callstack.
Eve processes can now spawn subprocesses and inject code into them. Together with the new communication API this allowed much of the IDE architecture to be lifted into Eve. When running in the browser only the UI manager lives on the main thread - the editor, the compiler and the user's program all live in separate web-workers. The editor uses the process API to spawn both the compiler and the user's program and then subscribes to the views it needs for the debugging interface. Both the editor and the user's program send graphics data to the UI manager and receiving UI events in return.
Conor McBride gave an 8-lecture summer course on Dependently typed metaprogramming (in Agda) at the Cambridge University Computer Laboratory:
Dependently typed functional programming languages such as Agda are capable of expressing very precise types for data. When those data themselves encode types, we gain a powerful mechanism for abstracting generic operations over carefully circumscribed universes. This course will begin with a rapid depedently-typed programming primer in Agda, then explore techniques for and consequences of universe constructions. Of central importance are the â€œpattern functorsâ€ which determine the node structure of inductive and coinductive datatypes. We shall consider syntactic presentations of these functors (allowing operations as useful as symbolic differentiation), and relate them to the more uniform abstract notion of â€œcontainerâ€. We shall expose the double-life containers lead as â€œinteraction structuresâ€ describing systems of effects. Later, we step up to functors over universes, acquiring the power of inductive-recursive definitions, and we use that power to build universes of dependent types.
The lecture notes, code, and video captures are available online.
As with his previous course, the notes contain many(!) mind expanding exploratory exercises, some of which quite challenging.
In their survey paper and their website, Leo Meyerovich and Ari Rabkin take Jared Diamond approach to explaining Programming Language adoption.
Why do some programming languages fail and others succeed? What does the answer tell us about programming language design, implementation, and principles? To help answer these and other questions, we argue for examining the sociological groundings of programming language theory: socio-PLT.
Researchers in the social sciences have studied adoption in many contexts. We show how their findings are applicable to programming language design.
There are also videos of talks available from Splash 2012 and Google Tech Talks.
See also previous discussions.
We will be making this course, Brown's upper-level programming languages offering, available for free on the Web. People anywhere are welcome to view the lectures, read the materials, and do the assignments
This is a great opportunity! I have relied heavily on Shriram's lecture notes when I was starting out.
It is nice to see that he promises to give personal recognition for those who participate, and even has a system in place for giving partial credit to busy professionals who cannot spare the time to do all the assignments and projects.
My only misgiving is that the course uses Racket; I wish it was in Scheme.
Following on from recent discussions about graphical languages in the Russian space program, here's a recent story about Google's new visual programming language.
Cade Metz, "Google Blockly Lets You Hack With No Keyboard", Wired Enterprise.
Now available on Google Code â€” the companyâ€™s site for hosting open source software â€” the new language is called Google Blockly, and itâ€™s reminiscent of Scratch, a platform developed at MIT that seeks to turn even young children into programmers.
As the Blockly FAQ says, "Blockly was influenced by App Inventor, which in turn was influenced by Scratch." So if you've seen Scratch before, this will look very familiar. If you haven't seen Scratch, and want to have a go with Blockly, you can find the maze demo from the Wired story here.
Interactive Tutorial of the Sequent Calculus by Edward Z. Yang.
This interactive tutorial will teach you how to use the sequent calculus, a simple set of rules with which you can use to show the truth of statements in first order logic. It is geared towards anyone with some background in writing software for computers, with knowledge of basic boolean logic. ...
Proving theorems is not for the mathematicians anymore: with theorem provers, it's now a job for the hacker. â€” Martin Rinard ...
A common complaint with a formal systems like the sequent calculus is the "I clicked around and managed to prove this, but I'm not really sure what happened!" This is what Martin means by the hacker mentality: it is now possible for people to prove things, even when they don't know what they're doing. The computer will ensure that, in the end, they will have gotten it right.
The tool behind this nice tutorial is Logitext.
Kalani Thielen's The Algebra of Data, and the Calculus of Mutation is a very good explanation of ADTs, and also scratches the surfaces of Zippers:
With the spreading popularity of languages like F# and Haskell, many people are encountering the concept of an algebraic data type for the first time. When that term is produced without explanation, it almost invariably becomes a source of confusion. In what sense are data types algebraic? Is there a one-to-one correspondence between the structures of high-school algebra and the data types of Haskell? Could I create a polynomial data type? Do I have to remember the quadratic formula? Are the term-transformations of (say) differential calculus meaningful in the context of algebraic data types? Isnâ€™t this all just a bunch of general abstract nonsense?
(hat tip to Daniel Yokomizo, who used to be an LtU member...)
From the syllabus of the Cambridge course on Usability of Programming Languages
Compiler construction is one of the basic skills of all computer scientists, and thousands of new programming, scripting and customisation languages are created every year. Yet very few of these succeed in the market, or are well regarded by their users. This course addresses the research questions underlying the success of new programmable tools. A programming language is essentially a means of communicating between humans and computers. Traditional computer science research has studied the machine end of the communications link at great length, but there is a shortage of knowledge and research methods for understanding the human end of the link. This course provides practical research skills necessary to make advances in this essential field. The skills acquired will also be valuable for students intending to pursue research in advanced HCI, or designing evaluation studies as a part of their MPhil research project.
Is this kind of HCI based research going to lead to better languages? Or more regurgitations of languages people are already comfortable with?