Insights on teaching computer programming

I have been teaching programming for nearly ten years now, to university students from the first to the fourth year, to both CS majors and non-CS majors (all in technical fields). I have tried to distill results from programming language research into these courses (and wrote a book in the process, CTM). Recently, I had two insights:

1. The best year in which to teach a programming course based on programming concepts is the second year. In the first year, students are not mature enough. In the third and later years, students get conservative. In the second year, they have enough maturity to understand the concepts and enough openness to appreciate them. At UCL, we have been doing this for two years now for all engineering students (more than 300 in Fall 2005) using a first-year course based on Java, and it works remarkably well (better than I expected).

2. Students can be taught programming semantics in the second year. This can succeed if: (1) the semantics requires very little mathematical baggage, just sets and functions, (2) the semantics is factored so it can be taught incrementally, and (3) the semantics is simple and uncluttered so that students can work out a program's execution with paper and pencil. The semantics of CTM is one example that fits these conditions. The students consider the semantics the hardest part of the course. I think it's because they've never been exposed to this kind of formalism, so they are a bit unsettled by it. But I am convinced that students in any technical field should be taught programming language semantics at least once in their careers. It should be as basic as algebra or calculus.

What do you think? I would appreciate hearing your experience and comments. Here are the course notes of the latest version of my course (they are in French; there are English course notes too but they are not as nice for the second year).

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Two stories

I just noticed that another story on undergraduate education was posted at just about the same time as this one. Maybe this story can help answer some of the questions of the other one.

Book!

It sounds like a sound approach.

I'm adding that book to my wishlist.

Don't wait until second year!

I am teaching semantics in a first programming course in first year (125 students this fall). The reason I can do this is that I am using Scheme and the book HtDP which uses nested language subsets. It has some support for reduction semantics (buried in "Intermezzo"s between major sections) but I bring it out more, and augment the students' existing understanding with each new feature, as you suggest.

Of course, CTM (which is on my bookshelf, though I have not read it in depth yet) has grander ambitions, and I was being facetious in my title. But there are some parallels: my students also find the semantics hard, and cannot always trace the reduction of small programs accurately, even though this is the most mechanical thing I request of them. But I contend that the difficulty many students have with more complex languages (read: Java) is precisely the lack of a proper (or even informal) semantic model, without which they are generalizing (often imperfectly) from examples.

I agree with your notion of timing for a concepts course. First and second year are magic times. We need to be doing more with students in this period, instead of plodding through basic programming, data structures, and OO design. --PR

First year

I am curious to hear more about your experience teaching semantics the first year. My students have enough trouble learning programming, which is already quite a precise discipline (the rigors of syntax and algorithmic thinking: telling the computer exactly what it should do!). During the first year I find it is enough to give them precise, natural-language definitions of certain concepts (scoping, encapsulation, assignment, and so forth). This helps them to understand programming. But I don't give a semantics formalism as such.

I have it easier

I have an easier time of it than you, since HtDP uses small subsets of Scheme (with some added constructs, e.g. structures) as a precursor to Java (a second-term course). This offers three advantages: first, simplicity; second, a purely functional world until week 10; third, reduction semantics (meaning you don't have to go outside the language to specify the meaning of a program).

To this, I personally can add a fourth advantage: my students are entering a Faculty of Mathematics based on their performance in high-school math. So not only do the semantics nicely parallel the notion of substitution with which they are familiar from algebra, but they are good at it.

At three points (corresponding to Beginning Student with List Abbreviations, Intermediate Student with lambda, and Advanced Student in the HtDP language hierarchy) I make available summaries of the syntax and semantics to date, and each one fits onto two typeset pages in 12pt type. Furthermore, the rules are pretty intuitive, for the most part.

I also sprinkle "traces" using the semantics throughout; complete ones at the beginning, but then "condensed" ones later on, though I try to condense in such a way that it's evident what is omitted (usually conditionals, multi-argument special forms such as "and", and applications of accessors for structures and lists).

Back when Pascal and Fortran were the teaching languages, students used to trace programs using tables: at line 10, here are the values of all the variables. There is very little of that now. The best students (e.g. everyone reading this comment) don't need that; they understand the abstractions. The rest of the students don't quite know where to turn when their intuitive notions of what their program does clash with the only real source of "meaning" they have, namely running the program. I am fortunate in that I can be a little more rigourous about semantics than someone teaching Java or C++ as a first language. But I believe going further down this road than most current practice is the key to restoring some effectiveness to first courses.

My course home page is here if you wish to look at my materials. The course follows HtDP fairly closely, with perhaps a bit more rigour. --PR

Don't needlessly mathematize computer programming

I am a bit confused by your message; let me explain why.

This offers three advantages: first, simplicity;

I don't understand; my approach is quite simple too.

second, a purely functional world until week 10;

I don't consider this an advantage, on the contrary. I try to give my students exposure to more than functional programming, such as state, object-oriented principles (encapsulation, polymorphism, inheritance), abstract data types, exceptions, and dataflow concurrency. It is nice to start with functional programming but there are many useful concepts outside of it that students should see. There is absolutely nothing wrong with state, on the contrary it is an essential concept that is needed to achieve modularity in programs.

third, reduction semantics (meaning you don't have to go outside the language to specify the meaning of a program).

Again, I don't consider this an advantage. Defining something in terms of itself can be very confusing to students. I think it is much better to define something in terms of something else. If you have a kernel language, then you can define a rich language in terms of the kernel language. But the kernel language itself should be defined in terms of something else: something the students already understand like simple mathematics. In my experience, new concepts should always be grounded in existing concepts.

a fourth advantage: my students are entering a Faculty of Mathematics based on their performance in high-school math.

Computer programming is not the same subject as mathematics. I think that the two fields are quite different, and one should not needlessly mathematize computer programming. From the viewpoint of computer programming, mathematics is just a tool.

Mathematics as a scaffold for lax teaching?

I agree completely that you need rigor, but one doesn't need a lot of mathematical baggage to be rigorous. I do think I understand your point though: teachers who don't mathematize much tend not to be rigorous. It's a lot of effort to be rigorous without overmathematizing (pardon the neologism), but it's not impossible.

I'm not

I am a bit confused by your message

And I am by yours! You suggested that programming in first year was hard enough without introducing semantics; I was pointing out why our situations were different. I was not claiming that what I am doing in first year is in any way a replacement for what you are doing in second year. So when you say I try to give my students exposure to more than functional programming, such as state, object-oriented principles (encapsulation, polymorphism, inheritance), abstract data types, exceptions, and dataflow concurrency you are comparing an introductory programming course for those who may lack prior computing experience to a second-year concepts course.

So let me back up and state my premise again, perhaps more clearly this time. I think that a certain amount of semantics in first year is possible and in fact can help students in understanding the art of programming, and prepare them for further study in a course such as you are teaching. But I think that Scheme-then-Java is preferable to Java-first.

I agree with your points about kernel languages and richer languages, except where a first course is concerned. I don't think the students' understanding of mathematics, even my incoming Math majors, is sufficiently strong on which to base semantics in a first course, though I am willing to be convinced otherwise. (They are weak on set theory, for example, until they get through their first university math course which they take concurrently with my course.) Saying "we rewrite programs according to these rules until we get down to a sequence of values" is pretty simple, and every intermediate step is itself a legal program. It is particularly valuable when introducing recursion.

As for "needlessly mathematizing", I don't think I am doing that, though again if anyone can point out specifics in my materials, I will correct this. But if students have math skills -- in particular, if they understand elementary algebra -- then these can be put to good use by giving them semantics whose manipulations resemble what they are familiar with. I can't think of anything else approaching a formal system that they understand at a similar level when they enter university. --PR

Semantics in the first year, etc.

I agree with everything you say. I think that I misunderstood the context of your earlier message. Some semantics in the first year is beneficial, if it is well integrated and unobtrusive.

I think that the ultimate solution will be that the basic concepts of programming have to be taught correctly in high school. Whatever happened to the Logo project of Seymour Papert from the 1980s? The book 'Mindstorms' that documents his experience is still worth reading today. It seems that computer science education in high school has regressed since then.

But in the meanwhile, university courses have to be able to correct the students' bad beginnings and fill in the gaps. It seems that this can be done in either the first year (see the work of Matthias Felleisen and his colleagues and the book HTDP) or the second year (this is my experience).

Mindsorms

The book 'Mindstorms' that documents his experience is still worth reading today.

I second that. His later work isn't all that exciting, but this book, and the Logo experience in general is worth exploring. (Disclosure: My first programming language was Logo...)

Some Thoughts

I've only had a bit of experience teaching undergraduates (1st and 2nd year) but none the less I have some comments.

Firstly, I think background makes an enormous difference. Our entry requirements have changed recently, and the difference in the students is noticeable. The more maths or science background the students have, the easier it seems to teach them.

Java is a terrible language to teach in. It's too complex is too many ways, starting with syntax.

I agree that semantics should be presented with as little mathematical baggage as possible. However I think it would be possible to present it in the first year. Certainly I had a 1st year student who was really helped when I showed him how a program could be reduced. He really had no concept of the semantics of Java, and the natural language descriptions in the notes just appeared to confuse.

Reduction

I think that it might be useful to show examples of reduction and/or other program transformation. It might even make sense to use this to explain the semantics of the programming language being taught (i.e, in CS1, not the PL course). However, you can do all that without defining what reduction semantics are. Just show them how it's done, using example.

Second, if you want to use this technique, choose an applicative language, with little compile time checking (i.e., static semantics, which aren't going to be explained by the reduction relation).

The I/O model

This is one place where I think more formal semantics might help.

More of the languages that are being taught offer a complex and rich set of I/O operations. How these interact with each other and with the input buffer is often explained informally, leading students to experiment until the code "just works". Even if you ask students to formally prive their progrmas correct, the one place left undefined is I/O, and the handling of I/O errors (which are a fact of life, of course).

For quite some time I am of the opinion that we should offer a limted set of I/O operations, formally defined, and that we should include I/O in at least some of the correctness proofs presented.

Real programs do I/O, and if we neglect this aspect of their behavior we are implying that the way to produce programs that work correctly is to hack until things seem to work (e.g, "let's flush the buffer here"). This is sending the wrong message.

Mapping to natural language syntax

I also teach programming, usually of the object-oriented variety.

I have always wondered whether trying to map natural language syntax to formal computer language commands would make it easier for students. This is in addition to teaching algorithms per se.

For instance, take the English sentence "You with the fat mouth, eat your fish with alacrity, but don't choke on it!" This sentence can be parsed into subject, object, verb, adjective, adverb structure, including phrases or sub-clauses of the major clause that represent (return the values of) subjects, objects, adjectives, and adverbs. I can then easily map the sentence into standard programming language syntax, say, in C# or Java. The phrases or sub-clauses that play the non-verb roles are, for example, procedures that return a single value to the outermost clause. (Please don't ask me to go through it with you here though, if you don't mind.)

Now, is it possible that humans, especially programming beginners, actually solve algorithms better than we think, but run into problems partly because they trip on translating human language into computer language? And, is it possible that many of the difficulties in teaching programming stem from the fact that students of today seldom know much about natural language syntactical structures (which also limits their ability to easily learn foreign languages, by they way)?

Does anyone know of any good research in this area, or have opinions of their own? Every time I suggest that this is problem to my colleagues, they pooh-pooh me, saying that by far the major problem is that the students can't solve algorithms. But, as I said, is inability to map syntax from one language to another really a good part of that problem with algorithmic thinking, as well as a problem on its own?

they pooh-pooh me, saying

they pooh-pooh me, saying that by far the major problem is that the students can't solve algorithms.

I can imagine ways of testing this statement, for example, you could have a knowledgeable programmer "translate" for the students or you could have one student write down an algorithm in natural language and give it to a second student to perform, the second student not knowing what the ultimate goal is, and other ideas like this with varying levels of subjective/uncontrollable elements that take out the "translation" aspect. Also, when an instructor writes some code on a chalkboard and explains how it performs the algorithm, how well do the students follow along, at least as far as understanding how the algorithm accomplishes the task if not how the code corresponds to the algorithm?

I have no idea if any studies along this line have been carried out. That said, while it would be ridiculous to think that the translation has no overhead (if it didn't you wouldn't even need to learn a new language, you would immediately be able to code in it, in fact the speed in which new programming languages can be picked up can be used to support the following), my personal anecdotal evidence not as an educator is that your colleagues are mostly right, for example, I've seen students have difficulty understanding what is going on in box-and-pointer diagrams and these have almost zero "translation" overhead.

All this said, it might still be useful. Any aspect that can be made simpler is certainly helpful so that more time can be spent on algorithms rather than the proper semicolon placement.

Natural language to computer syntax sub-thread

Perhaps you're somewhat missing my point. Who cares about semi-colon placement. I'm talking about mapping the subjects, objects, adverbs, adjectives, and verbs, and phrases or clauses that play these roles, to corresponding program statements. For instance, in the example I gave you, here are two possible answers, at different levels of explicitness or complexity, in C# syntax:

Simple version: person::location1.getPersonName(“with big mouth”).eat(“fish”, "with alacrity”, "not too quickly")

More complex version: person::location1.getPersonName(“with big mouth”).eat(food::fish.getFishName(), "with alacrity”, person::location1. getPersonName(“with big mouth”).setEatSpeed("not too quickly"))

Role assignment in, for example, the more complex version (simple version should be obvious from this):

- Subject: person::location1.getPersonName(“with big mouth”)
- Verb: eat
- Object: food::fish.getFishName()
- Adverb #1: ”with alacrity”
- Adverb#2: person::location1.getPersonName(“with big mouth”).setEatSpeed("not too quickly"))

The point is also, in my opinion, to generate interest in programming, by making the student see the similarities in computer and human speech. At the same time, they start to understand the semantics.

By the way, check this page on my website, in case you think I'm not interested in algorithms: http://www.searchhelpcenter.com/search-discussion-article-thesaurus-relationships.html

I am going to add some important things to the scheme of object role relations as described on that page, namely, system relations, such as higher level plans and relations that guide sequences of events. That makes a better bridge between the object and system levels of mental templates, and also enables me to explain the connection to my previous works on systems.

Perhaps you're somewhat

Perhaps you're somewhat missing my point. Who cares about semi-colon placement.

I think you are misinterpreting what I wrote. My last sentence was exactly along the lines of "who cares about semi-colon placement". My point was that in my, admittedly small, experience complete beginners spend a lot of time on trivial syntactic elements whose mastering adds nothing to ones ability to program. However, I'll agree with you about somewhat missing your point. When you say:

I'm talking about mapping the subjects, objects, adverbs, adjectives, and verbs, and phrases or clauses that play these roles, to corresponding program statements.

my response is "To what end?" While I expect programmers to see the Subject-Verb-Object nature of OO and to be able to use that, I don't see what you are doing much differently from normal OO education and/or what you hope to gain by it.

The point is also, in my opinion, to generate interest in programming, by making the student see the similarities in computer and human speech.

<don't take too seriously>Yes, connecting programming with grammar, a part of language proficiency many students find exciting.</don't take too seriously>

in case you think I'm not interested in algorithms

What did I say that gave you that idea? I did not think your ideas were particularly aimed at "algorithms", rather more at understanding of the basics of the languages particularly OO ones and at modelling(-esque) relationships. If I'm mistaken then since we are talking about students and beginning students, how would this be applied in the teaching of a sort, say? (bubble sort, insertion sort, quicksort, or whichever you prefer)

Finally, you mentioned you teach. Have you applied these ideas in your classrooms? How have they come across? How did you present them? Did it seem to make a difference? How well did the students do in later programming classes? And so forth.

To what end? Have I tried it out? Answers.

I'm teaching college students, on the one hand, and college computer technology students who may not go to university, not university level science and engineering students like you. I'm not sure some of them even see the subject-verb-object-etc. relationship in natural language, never mind in computer programming language. (In fact, I sometimes wonder if they qualify for a programming course, that being one of the reasons. I feel let down by the educational system just by the fact that they don't already know these kinds of things, at least not in terms of roles in a sentence played by phrases and clauses.) So, we may face different kinds of students, with different levels of problems.

But, even with your kinds of students, I think that many miss the structural analogies. For example, verbs in programming language are often stated using nouns from human language, whereby the noun is used in the role of a verb. This may make some students miss the structural mappings with human language sentences.

You don't? I have yet to read a book that models the subject-verb-object-etc. structures of programming languages, that shows that, for instance, most OO languages places the subject before the verb, and pass the objects and adverbs as arguments inside brackets. In fact, I have never found a decent explanation in any programming book of what arguments are in the first place. All the books say is that they pass "extra information" to the called procedure, but they don't give an idea of the function of that extra information in terms such as object and adverb, if in any terms at all. I personally find that clarifying, perhaps because I have had the experience of learning several human languages (including your French, by the way, which I speak frequently, being from Montreal and living in Ottawa).

It would simply give students background of another kind. The linguistic mapping I am proposing solves just one aspect of teaching students programming; it is not a silver bullet, of course. There is still the algorithm per se, whether it’s solved in natural or computer language, or in terms of, say, diagrams. It’s question of whether there is another aspect of problems students have with solving algorithms related to not being able to think in terms of language that is familiar to them.

I tried it a little last semester, in an introductory Visual Basic course. I spent about five minutes on it, and a few of the students got the idea. I am just developing the approach further now, now having the time after classes are over for the year. I don’t know if it would make a difference.

But, linguistic mapping is just one aspect of an approach I would like some day to try out on higher level students. I also have a book, as you know, called “Functional Modeling of Systems.” That would probably be of interest in teaching how to structure algorithms using templates. The book addresses the system level, but you can think of a program as a system (a “system” being just one of several levels of mental templates we project on phenomena, according to some philosophers). My latest material, which you see on my website, and which is still only partially developed, is a first step in tying my previous ideas to a theory of object role relationships in systems. The concepts for bridging between the theory of object role types and my previous materials is still in my head, and needs to be included in that webpage. (I need to add several further paragraphs there.) Unfortunately, I don’t have time for writing and research in my job as a college teacher, because the job is about teaching predefined courses, and there is little appreciation of research. If I worked in a university, it would be a different ballgame, of course, but it doesn’t seem that will come about any time soon (I’m 61 years old this June 9th).

A question about the meaning of "semantics"

By the way, in reading all the posts in this section of the blog, I keep wondering what exactly people mean by "semantics." I think semantics refers to meaning in relation to a particular domain, but it can also mean that the programmer is using a non-abstract vocabulary. Are you talking about the vocabulary of a particular programming language, and/or are you talking about applying the program logic to a particular domain? As I see it, these are two separate, albeit interrelated and somewhat overlapping ideas.

Syntax, as I understand it, refers to rules that a orthogonal to any particular vocabulary, although the syntactical rules themselves have to be stated in a particular vocabulary (in a semantic fashion). While the idea of syntax is usually applied to language, I imagine that it can also apply to the abstract (mathematical/logical) structure of, say, an algorithm. Do you agree.

Then there is the word "grammar." I'm not quite sure of how it differs in meaning from the word "syntax." Does it refer to certain aspects of syntax, or does it have a broader meaning than syntax? Or, do the words largely overlap in meaning, although there are some unique, non-overlapping differences.