Teaching & Learning

Tensor Considered Harmful

Tensor Considered Harmful, by Alexander Rush

TL;DR: Despite its ubiquity in deep learning, Tensor is broken. It forces bad habits such as exposing private dimensions, broadcasting based on absolute position, and keeping type information in documentation. This post presents a proof-of-concept of an alternative approach, named tensors, with named dimensions. This change eliminates the need for indexing, dim arguments, einsum- style unpacking, and documentation-based coding. The prototype PyTorch library accompanying this blog post is available as namedtensor.

Thanks to Edward Z. Yang for pointing me to this "Considered Harmful" position paper.

Seven Sketches in Compositionality: An Invitation to Applied Category Theory

Seven Sketches in Compositionality: An Invitation to Applied Category Theory

2018 by Brendan Fong and David I. Spivak

Category theory is becoming a central hub for all of pure mathematics. It is unmatched in its ability to organize and layer abstractions, to find commonalities between structures of all sorts, and to facilitate communication between different mathematical communities. But it has also been branching out into science, informatics, and industry. We believe that it has the potential to be a major cohesive force in the world, building rigorous bridges between disparate worlds, both theoretical and practical. The motto at MIT is mens et manus, Latin for mind and hand. We believe that category theory—and pure math in general—has stayed in the realm of mind for too long; it is ripe to be brought to hand.
A very approachable but useful introduction to category theory. It avoids the Scylla and Charybdis of becoming incomprehensible after page 2 (as many academic texts do), and barely scratching the surface (as many popular texts do).

Simon Peyton Jones elected into the Royal Society Fellowship

Simon Peyton Jones has been elected as a Fellow of the Royal Society. The Royal Society biography reads:


Simon's main research interest is in functional programming languages, their implementation, and their application. He was a key contributor to the design of the now-standard functional language Haskell, and is the lead designer of the widely-used Glasgow Haskell Compiler (GHC). He has written two textbooks about the implementation of functional languages.

More generally, Simon is interested in language design, rich type systems, compiler technology, code generation, runtime systems, virtual machines, and garbage collection. He is particularly motivated by direct use of principled theory to practical language design and implementation -- that is one reason he loves functional programming so much.

Simon is also chair of Computing at School, the grass-roots organisation that was at the epicentre of the 2014 reform of the English computing curriculum.

Congratulations SPJ!

Static vs. Dynamic Languages: A Literature Review

We've mentioned some empirical studies of programming languages a few times, but I haven't seen a comprehensive list we can use as a reference.

Fortunately, I just came across this pretty decent overview of existing literature on how types impact development. Agree or disagree with Dan Luu's position, the comprehensive list warrants a front-page post in my opinion.

One point worth noting is that all the studies used relatively inexpressive languages with bland type systems, like C and Java, and compared those against typed equivalents. A future study ought to compare a more expressive language, like OCaml, Haskell or F#, which should I think would yield more pertinent data to this age-old debate.

Part of the benefits of types allegedly surround documentation to help refactoring without violating invariants. So another future study I'd like to see is one where participants develop a program meeting certain requirements in their language of choice. They will have as much time as needed to satisfy a correctness test suite. They should then be asked many months later to add a new feature to the program they developed. I expect that the maintenance effort required of a language is more important than the effort required of initial development, because programs change more often than they are written from scratch.

This could be a good thread on how to test the various beliefs surrounding statically typed and dynamically languages. If you have any studies that aren't mentioned above, or some ideas on what would make a good study, let's hear it!

Using Commutative Assessments to Compare Conceptual Understanding in Blocks-based and Text-based Programs

Using Commutative Assessments to Compare Conceptual Understanding in Blocks-based and Text-based Programs, David Weintrop, Uri Wilensky. Proceedings of the eleventh annual International Conference on International Computing Education Research. Via Computing Education Blog.

Blocks-based programming environments are becoming increasingly common in introductory programming courses, but to date, little comparative work has been done to understand if and how this approach affects students' emerging understanding of fundamental programming concepts. In an effort to understand how tools like Scratch and Blockly differ from more conventional text-based introductory programming languages with respect to conceptual understanding, we developed a set of "commutative" assessments. Each multiple-choice question on the assessment includes a short program that can be displayed in either a blocks- based or text-based form. The set of potential answers for each question includes the correct answer along with choices informed by prior research on novice programming misconceptions. In this paper we introduce the Commutative Assessment, discuss the theoretical and practical motivations for the assessment, and present findings from a study that used the assessment. The study had 90 high school students take the assessment at three points over the course of the first ten weeks of an introduction to programming course, alternating the modality (blocks vs. text) for each question over the course of the three administrations of the assessment. Our analysis reveals differences on performance between blocks-based and text-based questions as well as differences in the frequency of misconceptions based on the modality. Future work, potential implications, and limitations of these findings are also discussed.

Eve: the development diary of a programming environment aimed at non-programmers

In spring 2012 Chris Granger successfully completed a Kickstarter fundraising and got $300K (instead of the requested $200K) to work on a live-feedback IDE inspired by Bret Victor "Inventing on principle" talk. The IDE project was called Light Table. It initially supported Clojure (the team's favourite language) only, but eventually added support for Javascript and Python. In January 2014, Light Table was open sourced, and in October 2014 the Light Table development team announced that they decided to create a new language, Eve, that would be a better fit for their vision of programming experience.

There is little public about Eve so far, no precise design documents, but the development team has a public monthly Development Diary that I found fairly interesting. It displays an interesting form of research culture, with in particular recurrent reference to academic works that are coming from outside the programming-language-research community: database queries, Datalog evaluation, distributed systems, version-control systems. This diary might be a good opportunity to have a look at the internals of a language design process (or really programming environment design) that is neither academic nor really industrial in nature. It sounds more representative (I hope!) of the well-educated parts of startup culture.

Eve is a functional-relational language. Every input to an Eve program is stored in one of a few insert-only tables. The program itself consists of a series of views written in a relational query language. Some of these views represent internal state. Others represent IO that needs to be performed. Either way there is no hidden or forgotten state - the contents of these views can always be calculated from the input tables.

Eve is designed for live programming. As the user makes changes, the compiler is constantly re-compiling code and incrementally updating the views. The compiler is designed to be resilient and will compile and run as much of the code as possible in the face of errors. The structural editor restricts partially edited code to small sections, rather than rendering entire files unparseable. The pointer-free relational data model and the timeless views make it feasible to incrementally compute the state of the program, rather than starting from scratch on each edit.

The public/target for the language is described as "non-programmers", but in fact it looks like their control group has some previous experience of Excel. (I would guess that experimenting with children with no experience of programming at all, including no Excel work, could have resulted in very different results.)

Posts so far, by Jamie Brandon:

Some random quotes.

Retrospective:

Excited, we presented our prototype to a small number of non-programmers and sat back to watch the magic. To our horror, not a single one of them could figure out what the simple example program did or how it worked, nor could they produce any useful programs themselves. The sticking points were lexical scope and data structures. Every single person we talked to just wanted to put data in an Excel-like grid and drag direct references. Abstraction via symbol binding was not an intuitive or well-liked idea.

[...]

Our main data-structure was now a tree of tables. Rather than one big top-level function, we switched to a pipeline of functions. Each function pulled data out of the global store using a datalog query, ran some computation and wrote data back. Having less nesting reduced the impact of lexical scope and cursor passing. Using datalog allowed normalising the data store, avoiding all the issues that came from hierarchical models.

At this point we realised we weren't building a functional language anymore. Most of the programs were just datalog queries on normalised tables with a little scalar computation in the middle. We were familiar with Bloom and realised that it fit our needs much better than the functional pidgin we had built so far - no lexical scoping, no data-structures, no explicit ordering. In late March we began work on a Bloom interpreter.

October:

Where most languages express state as a series of changes ('when I click this button add 1 to the counter'), Eve is built around views over input logs ('the value of the counter is the number of button clicks in the log'). Thinking in terms of views makes the current language simple and powerful. It removes the need for explicit control flow, since views can be calculated in any order that is consistent with the dependency graph, and allows arbitrary composition of data without requiring the cooperation of the component that owns that data.

Whenever we have tried to introduce explicit change we immediately run into problems with ordering and composing those changes and we lose the ability to directly explain the state of the program without reference to data that no longer exists.

[...]

In a traditional imperative language, [context] is provided by access to dynamic scoping (or global variables - the poor mans dynamic scope) or by function parameters. In purely functional languages it can only be provided by function parameters, which is a problem when a deeply buried function wants to access some high up data and it has to be manually threaded through the entire callstack.

December:

Eve processes can now spawn subprocesses and inject code into them. Together with the new communication API this allowed much of the IDE architecture to be lifted into Eve. When running in the browser only the UI manager lives on the main thread - the editor, the compiler and the user's program all live in separate web-workers. The editor uses the process API to spawn both the compiler and the user's program and then subscribes to the views it needs for the debugging interface. Both the editor and the user's program send graphics data to the UI manager and receiving UI events in return.

Dependently-Typed Metaprogramming (in Agda)

Conor McBride gave an 8-lecture summer course on Dependently typed metaprogramming (in Agda) at the Cambridge University Computer Laboratory:

Dependently typed functional programming languages such as Agda are capable of expressing very precise types for data. When those data themselves encode types, we gain a powerful mechanism for abstracting generic operations over carefully circumscribed universes. This course will begin with a rapid depedently-typed programming primer in Agda, then explore techniques for and consequences of universe constructions. Of central importance are the “pattern functors” which determine the node structure of inductive and coinductive datatypes. We shall consider syntactic presentations of these functors (allowing operations as useful as symbolic differentiation), and relate them to the more uniform abstract notion of “container”. We shall expose the double-life containers lead as “interaction structures” describing systems of effects. Later, we step up to functors over universes, acquiring the power of inductive-recursive definitions, and we use that power to build universes of dependent types.

The lecture notes, code, and video captures are available online.

As with his previous course, the notes contain many(!) mind expanding exploratory exercises, some of which quite challenging.

Socio-PLT: Principles for Programming Language Adoption

In their survey paper and their website, Leo Meyerovich and Ari Rabkin take Jared Diamond approach to explaining Programming Language adoption.

Why do some programming languages fail and others succeed? What does the answer tell us about programming language design, implementation, and principles? To help answer these and other questions, we argue for examining the sociological groundings of programming language theory: socio-PLT.
Researchers in the social sciences have studied adoption in many contexts. We show how their findings are applicable to programming language design.

There are also videos of talks available from Splash 2012 and Google Tech Talks.

See also previous discussions.

Brown CS: CSCI 1730: Programming Languages: On-Line Offering

We will be making this course, Brown's upper-level programming languages offering, available for free on the Web. People anywhere are welcome to view the lectures, read the materials, and do the assignments

This is a great opportunity! I have relied heavily on Shriram's lecture notes when I was starting out.

It is nice to see that he promises to give personal recognition for those who participate, and even has a system in place for giving partial credit to busy professionals who cannot spare the time to do all the assignments and projects.

My only misgiving is that the course uses Racket; I wish it was in Scheme.

Google Blockly Lets You Hack With No Keyboard

Following on from recent discussions about graphical languages in the Russian space program, here's a recent story about Google's new visual programming language.

Cade Metz, "Google Blockly Lets You Hack With No Keyboard", Wired Enterprise.

Now available on Google Code — the company’s site for hosting open source software — the new language is called Google Blockly, and it’s reminiscent of Scratch, a platform developed at MIT that seeks to turn even young children into programmers.

As the Blockly FAQ says, "Blockly was influenced by App Inventor, which in turn was influenced by Scratch." So if you've seen Scratch before, this will look very familiar. If you haven't seen Scratch, and want to have a go with Blockly, you can find the maze demo from the Wired story here.

XML feed