Getting Started

It seems to me that LtU has gained many new readers and contributors since moving to the new site. That's great!

Yet it also means LtU now has readers with very different backgrounds: many haven't studied PL formally, while others come from academia and study PL theory for a living.

Since we have such a lively community, it occurred to me to start a thread of advice on where to begin, aimed at those who haven't studied PL theory yet want to follow the papers and books we discuss.

So the question, directed mostly at the old-timers: to which resources would you send a friend asking for advice on learning about the theoretical study of programming languages?

P.S. The early LtU archives may be helpful, since I used LtU to organize and collect the papers I read when I began my own studies in the field.


ReadScheme

For me, the real eye-opener was my discovery of Scheme. Specifically, The Original 'Lambda Papers' by Guy Steele and Gerald Sussman.

(Thanks to Matt Hellige for pointing me in that direction more than a year ago!)

After that, I got interested in basic category theory (mostly because of Frank) - see this or this.

In general, googling for "Wadler", "Danvy", or "Filinski" always provided me with interesting and valuable reading. (This is not to say there are no other great authors)

On the practical side, I would recommend at least installing ML, Scheme, Haskell, and Erlang and working through a tutorial for each. I was impressed by how I could express in several lines of JoCaml code what takes many classes with intricate synchronization and type casting in Java. You never forget that feeling of power and simplicity, but on the other hand you will never be happy with Java again :-(

Scheme here too

I should probably be ashamed to admit that it was Paul Graham's promotion of Lisp as the mind-tool of the programming übermensch that made me want to give it a try. I settled on Scheme because I had GNU Guile already up and running on my Linux box at home, although I suppose Emacs Lisp would have done just as well.

I started out by reading through The Little Schemer and The Seasoned Schemer, and would recommend them to anyone at the beginner level who doesn't tire quickly of pizza. Meanwhile a CS-graduate colleague of mine was amusing himself by taking my snippets of Scheme code and rewriting them for me in Haskell. Later on he lent me his copy of Bird's Introduction to Functional Programming using Haskell, which helped move things along quite a lot. It is always useful to know people who know just a bit more than you.

Non-programming books that have helped me a great deal are Hodges's Logic and Halmos's Naive Set Theory. There may be better books on both topics, but those are the ones I read.

The venerable SICP is available online, of course, and it's still possible to hunt down the draft pdf of CTM (although I really would recommend buying it).

I found Hutton and Meijer's paper on Monadic Parser Combinators brought a lot of functional programming topics into perspective for me. Hutton's Tutorial on the universality and expressiveness of fold is also a gem. Proceed from there to Meijer, Fokkinga and Paterson's Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire.
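
To give a taste of Hutton's point (a minimal sketch in Haskell, not taken from the paper itself; the names deliberately shadow the Prelude), several familiar list functions are just foldr in disguise:

    import Prelude hiding (sum, length, map)

    -- each function is the fold determined by what it does to (:) and []
    sum :: [Int] -> Int
    sum = foldr (+) 0

    length :: [a] -> Int
    length = foldr (\_ n -> 1 + n) 0

    map :: (a -> b) -> [a] -> [b]
    map f = foldr (\x ys -> f x : ys) []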

Advanced Scheme?

I think I've got most of the basics of Scheme down. Can anyone recommend a book that goes beyond an overview of the more advanced topics, like call/cc, macros, the Y combinator, etc.?

Write an interpreter

...perhaps with the guide of SICP. That will expose enough detail of the language model to make both call/cc and macros clear. As for the Y combinator, I don't have any recommendations offhand... probably time to start digging in the papers indexed on readscheme.org.
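
To give the flavor of the exercise, here is a minimal sketch (in Haskell rather than SICP's Scheme, and far short of supporting call/cc or macros) of an environment-passing evaluator with closures; the datatype and names are mine, not SICP's:

    data Expr = Lit Int
              | Var String
              | Add Expr Expr
              | Lam String Expr
              | App Expr Expr

    data Value = IntV Int
               | Closure String Expr Env

    type Env = [(String, Value)]

    eval :: Env -> Expr -> Value
    eval _   (Lit n)   = IntV n
    eval env (Var x)   = maybe (error ("unbound variable: " ++ x)) id (lookup x env)
    eval env (Add a b) = case (eval env a, eval env b) of
                           (IntV m, IntV n) -> IntV (m + n)
                           _                -> error "Add expects integers"
    eval env (Lam x e) = Closure x e env
    eval env (App f a) = case eval env f of
                           Closure x body cenv -> eval ((x, eval env a) : cenv) body
                           _                   -> error "applied a non-function"

For example, eval [] (App (Lam "x" (Add (Var "x") (Lit 1))) (Lit 41)) yields IntV 42. Extending an evaluator like this is where the language model starts to become clear.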

doing it

I just finished my first Lisp interpreter (in Perl). I found Paul Graham's Roots of Lisp to be a simple and straightforward guide. But will implementing call/cc and macros in an interpreter really give a person a sense of how to put them to good use in practice? (No doubt it'll flesh out how they really work.)

Y

I learned my "advanced Scheme" from papers and such, so I'll have to think about a good reference (but do check out the Scheme library, and Dybvig's TSPL).

As to Y: Take a look at this paper. The paper uses ML; if you want Scheme versions of the code, email me and I'll try to dig them out for you.

CiteSeerX link

Although I don't fully unders

Although I don't fully understand the Y combinator yet, I found Richard Gabriel's short essay helpful.
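
For what it's worth, the trick that makes Y typeable in a typed language is to hide the self-application behind a recursive type. A sketch in Haskell (my example, not from Gabriel's essay):

    newtype Rec a = In { out :: Rec a -> a }

    -- behaves like y f = f (y f), but the recursion comes from self-application
    y :: (a -> a) -> a
    y f = w (In w)
      where w r = f (out r r)

    -- factorial without writing a recursive definition ourselves:
    fact :: Integer -> Integer
    fact = y (\rec n -> if n == 0 then 1 else n * rec (n - 1))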

Thanks for the thread!

I am in that boat of people new to LtU and without formal PL background =) Pointers much appreciated!

Pointers much appreciated, bu

Pointers much appreciated, but let's keep it real, eh guys? :-)

"Keep It Real?"

What in God's good name does that mean?

Keeping it real

For starters, no square roots of negative numbers:-)

Re: Keeping it real

Oh, and there I thought "keep it real" was an appeal for solid foundations and robust epistemology! ;)

For example, the OP might be interested in denotational semantics, because of the mapping it provides from programming languages to well-defined Math, right down to the appeal to geometric intuition regarding lattices. Assuming one is willing to acknowledge the reality of anything, how much more real than that does it get?

Speaking of which, I mentioned it the other day, but I liked Schmidt's Denotational Semantics: A Methodology for Language Development, and the first few chapters could be useful as a general intro to some formal PL topics.

Another resource I haven't seen mentioned is Frank Atanassow's Programming language theory texts online, a list of online books, lectures, tutorials etc. on programming language theory.

Frank's PLT has moved

Frank Atanassow's PLT is no longer at that location (and his homepage listed in his LtU profile also went 404; I don't know if this is permanent or transient).

However, I did find a copy here which Frank posted on The Software Technologist wiki.

PL Design Launching-Off Points

My favorites are:
EOPL
LiSP
TAPL
CTM

Some may question the inclusion of CTM, since it isn't a PL design text, but I think its coverage of the various concepts that one might wish to include in a new language, and its pedagogical mechanism of gradual concept inclusion in a virtual machine specification, are extraordinarily useful.

Books

I teach from EOPL2, and I like the book, but the parts on OOP and typing are a bit weak.

LiSP is a bit better if you are really interested in building Lisp/Scheme interpreters.

Notice that TAPL is the first theory book mentioned: the others don't teach formal techniques.

Formal techniques in CTM

Notice that TAPL is the first theory book mentioned: the others don't teach formal techniques.

Actually, this is not quite true. CTM gives a formal semantics for all the concepts it introduces in the main body of the book, both as an abstract machine and as a structural operational semantics. It also explains logical semantics and axiomatic semantics, and gives a nod to denotational semantics. From our experience in teaching the semantics to engineering students, we find that the level of semantics given in CTM is just about right for a practicing programmer.

Meandering

I agree that EOPL is a bit weak on OOP and typing. Obviously, my feeling is that TAPL addresses those weaknesses, but it's definitely richer meat, if you will, than EOPL, while I feel EOPL will prepare the reader so that their entry into TAPL will be somewhat easier.

LiSP I appreciate mostly for its coverage of the issues in continuations and closures, but also in its treatment of issues in namespaces.

TAPL, then, seems to expand on all of these subjects, deepen them, and put them on a truly rigorous footing. Calling EOPL "informal" may be accurate in some, er, formal sense, but its treatment of algebraic methods and the CPS transformation seems quite formal to me, albeit presented informally.

So not only do I recommend the texts I mentioned, but apparently I also recommend them in the order that I mentioned them.

EOPL

Lest it be misunderstood, let me clarify that I am a fan of EOPL2. I just meant that by itself it doesn't provide you with the formal notation you will find in many of the papers discussed on LtU.

my own...

I think I got "serious" about programming languages when SML kept nagging at me. At first I hated the language, but it got under my skin. Finally something clicked and I wanted to go a little deeper, so I picked up The Definition of Standard ML. The striking elegance of the design and the clarity, beauty and rigor of the formal treatment hit me really, really hard, and I've never looked back. Of course it took me quite a while and a lot of work to develop the conceptual tools to really make sense of much of that book, but after years of dabbling in compilers and language design (I'm sure I thought the K&R C book was as rigorous as it got), the Definition opened new vistas.

I also really liked the presentation in Elements of Functional Programming by Chris Reade. It covers a lot of very diverse ground in a really enjoyable way, dabbling in theory, design, implementation and actual applications of FP.

Only later did I really come to appreciate Scheme, EOPL, etc...

Another...

I've always been interested in mathematics and have been interested in programming for about as long, so looking for the theory side of things seemed natural. However, since I was (and am) a self-taught programmer with nothing to guide me, the mainstream dominated my view of programming. That would've been fine except that there is a rather wide chasm between theory and practice for mainstream languages (C++ for me) and techniques (read OOP). While I certainly looked at non-mainstream languages (including Scheme), the language that prompted my programming-theory renaissance was Haskell (admittedly, Scheme was the closest language to Haskell that I knew, but I was explicitly looking to learn pure FP at the time). One of the most striking things about Haskell is how close theory and practice are. The main examples are: monads, folds and unfolds, and the Curry-Howard correspondence holding fairly strongly.

I do want to emphasize the Curry-Howard correspondence to those unfamiliar with it. Lectures on the Curry-Howard Isomorphism (pdf) is a large, comprehensive paper covering it. Google will turn up lighter articles, but some minimal familiarity with the ideas in programming language theory (or related fields) is probably required to know and appreciate what it is saying.
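
A tiny illustration of the correspondence (a Haskell sketch of my own; read -> as implication, pairs as conjunction, and Either as disjunction): each total term below is a proof of the proposition its type expresses.

    -- "A and B implies B and A"
    andComm :: (a, b) -> (b, a)
    andComm (x, y) = (y, x)

    -- "(A implies B) and (B implies C) implies (A implies C)"
    compose :: (a -> b) -> (b -> c) -> (a -> c)
    compose f g = g . f

    -- "A or B implies B or A"
    orComm :: Either a b -> Either b a
    orComm (Left x)  = Right x
    orComm (Right y) = Left y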

Haskell is also how I got interested in category theory; Barr and Wells's lecture notes are still one of my favorite online resources. They don't require much of the reader, and do a reasonably good job of motivating you to find out about the things they do assume you know.

Moonbase

Just a random request to finish the Moonbase tutorial.

More general suggestions

General book: PLAI by Krishnamurthi

Embedding Interpreters by Benton -- ties a number of threads (interpreters, monads, reflection) together into a neat bow.

Others to google: Friedman, Kiselyov, Wand, Felleisen, Hughes.

I'd have a few questions

I'm not sure I know anyone who would ask me where to start in PL theory.  I'd want to know what the basis for their interest was and if they had an idea of the discussions and papers that they would like to be able to follow.  Probing a little deeper, I'd ask whether there was something they wanted to be equipped to do or participate in that brings up the interest in PL theory.

I'm not sure how I would answer this myself were I a newcomer.  These days I would say I'm mostly interested in how a PL theory bridges from the world of mathematics, with its well-definedness presumption, to the world of computational behavior with operational state and interaction, preserving applicability of mathematics in so doing.

Suggestions for learning formal semantics

Here's another question for everyone: I've a decent grasp on PL theory (or so I think) but don't know much about the formal models of PL semantics. Now the situation has arisen where I need to know more. What is a good introduction to the various different models and what are some pointers to deeper material on each? Your suggestions are much appreciated!

Maybe you find this book usef

Maybe you'll find this book useful: Syntax and Semantics of Programming Languages. It introduces denotational, algebraic and axiomatic semantics.

Polymorphic type inference intro

Here's a nice intro to Polymorphic Type Inference, by Michael Schwartzbach. It's not quite as specific as it might sound — it defines a "tiny functional language" in order to illustrate type inference, and also covers the relationship between types and logic, including the Curry-Howard isomorphism.
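
To give a taste of what such inference buys you (a small Haskell file of my own devising, not from Schwartzbach's notes): none of the definitions below carries an annotation, yet each receives its most general, polymorphic type, shown in the comments.

    twice f x = f (f x)
    -- inferred: twice :: (a -> a) -> a -> a

    flipPair (x, y) = (y, x)
    -- inferred: flipPair :: (a, b) -> (b, a)

    selfApply g = \x -> g x x
    -- inferred: selfApply :: (a -> a -> b) -> a -> b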

HTDP as a text

I'm of the belief that all you need to know about functional programming (or programming in general) and programming languages can be learned from HTDP. Everything from the basics of functions, to the universality of fold, to polytypic programming, is there, if you look for it.

Programming _languages_, right?

I'm assuming you mean programming languages rather than programming. Otherwise it sounds analogous to "all you need to know about bridge-building can be learned from the first volume of the Feynman notes." I'm trying to think of how knowing about the universality of fold will help my students program their compilers...

The universality of fold and programming languages

Fold for implementing programming languages:

  • Many analyses and transformations are expressible as folds
  • Compositional interpreters/compilers are exactly the ones expressible as folds, i.e. if you express your interpreter as a fold, it's immediately compositional (see the sketch after these lists)
  • Folds always terminate (if their arguments always terminate)
  • More generally, proving that each argument is correct implies that the whole fold is correct by universality

Folds for language design:

  • Folds are the only elimination form you need for a(n inductive) data type
  • And they're polytypic (so a compiler can generate them automatically)
  • Again, folds always terminate if their arguments do (can be used for a strongly normalizing (subset of the) language or at the type level)
  • Folds have nice properties
  • Folds have clear categorical and type theoretical interpretations making it easy to combine, extend, and analyse them with those frameworks
  • And of course, this leads to unfolds, hylomorphisms, etc.
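
To make the first point concrete, here is a minimal sketch in Haskell (the expression type and names are mine): foldExpr is the fold for a tiny expression language, and both an evaluator and a pretty-printer fall out as instances, compositional by construction.

    data Expr = Lit Int
              | Add Expr Expr
              | Mul Expr Expr

    -- the fold: a node's meaning is built only from its children's meanings
    foldExpr :: (Int -> r) -> (r -> r -> r) -> (r -> r -> r) -> Expr -> r
    foldExpr lit add mul = go
      where
        go (Lit n)   = lit n
        go (Add a b) = add (go a) (go b)
        go (Mul a b) = mul (go a) (go b)

    evalE :: Expr -> Int
    evalE = foldExpr id (+) (*)

    showE :: Expr -> String
    showE = foldExpr show
                     (\a b -> "(" ++ a ++ " + " ++ b ++ ")")
                     (\a b -> "(" ++ a ++ " * " ++ b ++ ")")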

Two ideas that may make an interesting component or basis for a language are:
Dealing with Large Bananas and
Metamorphic Programming

Not my point

I fully agree that compilers are a lot about tree transformations, and as such can be expressed as folds. But have you ever been a student at, or TAed at, an American state university? They are training a lot of potential programmers, and hylomorphisms (had to look that up) and folds are not even on the radar.

I was really just making the following uninteresting observation: The OP's claim that "all you need to know about programming [which to him evidently means 'functional programming'] is folds" is absurd. Lambda is, and should remain, a forum primarily for PL theory, and PL theory is part of practical programming knowledge. However, there is much to practical programming beyond PL theory, and precise language cools flamefests like the one on OO, and avoids the appearance of arrogance and condescension.

Teaching solid foundations

The OP's claim that "all you need to know about programming [which to him evidently means 'functional programming'] is folds" is absurd.

Actually, what the OP wrote was "all you need to know about functional programming (or programming in general) and programming languages can be learned from HTDP", which is rather more defensible, if one is talking about students. You were the one who raised the question of how folds apply to compilers, which was answered quite comprehensively.

But have you ever been a student at, or TAed at, an American state university?

HTDP is being used at high schools and universities, including state universities. Indications are that it gives students a better grounding for learning and understanding languages like Java and C++ than more traditional approaches.

Are there other things students might need to learn? Of course. But if you were putting together a list of core concepts they should understand well, the contents of HTDP would rank very high.

"compilers" vs compilers

You were the one who raised the question of how folds apply to compilers, which was answered quite comprehensively.
First, it was a statement of doubt, not a question. Second, though, it was doubt not about whether the abstractions were relevant to the theory or implementation structure of compilers, but about whether folds would be helpful to the students of a class called "compilers." In short, the course isn't, from the students' perspective, about compilers, but about programming in the (somewhat) large.

In my opinion, the students' difficulties stem from varied and poor backgrounds acquired at a dozen universities and community colleges, and from a lack of basic engineering skills (abstraction, modeling, automation of mundane tasks, ordering of one's environment, etc.). You may reply that knowing some PLT will encourage these skills, and I would agree, but only in so far as any mathematical training teaches one to think abstractly.

if you were putting together a list of core concepts they should understand well, the contents of HTDP would rank very high.
I agree -- I just don't want to ignore the "other things", even though I think LtU probably isn't the place for them (hi, Frank!). FWIW, I had a PLT-taught, Scheme-based intro programming course, and though I didn't understand this at the time, I believe it served me well later.

I have in fact

But have you ever been a student at, or TAed at, an American state university? They are training a lot of potential programmers, and hylomorphisms (had to look that up) and folds are not even on the radar.

Well, I have in fact been a TA at an American state university. I even held the lofty position of full-time "instructor" there for a year.

I can report that catamorphisms (folds) and anamorphisms (unfolds) did cross students' radar, right in CS1.

Lest you think this is some sort of anomaly, I can also report that many schools teach catamorphisms and (list) anamorphisms by at least CS2, using terms like "visitor pattern" and "implement the Iterator interface".

I guess what you're claiming is that at most places, students are kept in the dark about the fact that visitors and iterators have any mathematical properties at all. This is probably true, but probably also suboptimal when training students who intend to engineer our software.

hah

Well, I have in fact been a TA at an American state university. I even held the lofty position of full-time "instructor" there for a year.

Yet I notice you didn't stick around ;). Your guess about my claims is correct, but not complete. I also claim that a lack of coding skills, from varied and poor backgrounds, makes even upper-division courses more about how to interact with a programming environment than about how to reason abstractly about the structure of programs. There is a large body of "common sense" knowledge necessary to program effectively, normally learned implicitly in the course of some other task (learning about PLT, being a sysadmin, writing a numerical model). I'm claiming that at least at one place, this common sense is not being absorbed, and gets in the way of learning.

Folds and Unfolds

There's a classic thread that's relevant on folds, unfolds and regular programming.

In particular Neel Krishnaswami's post highlights the value of understanding these morphisms:

The reason that unfolds and folds are so convenient is precisely because they are special cases, with powerful invariants coming along for free. Better still, a fold and an unfold composed together can describe any primitive recursive function, and that's a huge class of computations, which describes almost all of the programs I've written.

It's a good exercise to recast operations as either folds or unfolds: it helps you understand better both the problem being solved and the specific solutions.
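
Neel's point in miniature (a Haskell sketch; the factorial example is mine, not from the thread): an unfold builds a structure, a fold consumes it, and the composition (a hylomorphism) covers an enormous class of functions.

    import Data.List (unfoldr)

    -- unfold n into [n, n-1, .., 1], then fold the list down with (*)
    factorial :: Integer -> Integer
    factorial = foldr (*) 1 . unfoldr step
      where
        step 0 = Nothing
        step k = Just (k, k - 1)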

It's all there

Other people have said lots of worthwhile things in response, so I'll just mention a couple points.

1. I've TAed HTDP-based courses at Northeastern University, which is comparable to a middling flagship state university (I think, having not studied at any).

2. HTDP doesn't tell anyone about the universality of fold. But it's all there, if you look for it.

3. Proving theorems about polytypic programming isn't going to help most programmers, regardless of their background. But the important insights are still comprehensible by anyone.

4. Finally, you should actually take a look at HTDP. It's nothing like Feynman. It's more like saying that a high school physics text can teach you all about string theory (which is ridiculous for physics).

My Recommendations

Types and Programming Languages -- Pierce
This is an amazing book. You will never look at programming languages the same way after reading this.

The Definition of Standard ML (Revised) -- Milner, et al.
A slim but dense volume. The insights you will gain are not unique to ML, but it is best to have played around with ML first. I think that Elements of ML Programming (Ullman) is a good, gentle introduction to ML.

Essentials of Programming Languages -- Friedman
Not formal, but very useful. Quite helpful to refer to when the formal definition isn't sinking in right.

Compiling with Continuations -- Appel
Not the latest word on compilation, but still an excellent book to read if you are interested in computation over programs.

Basic Category Theory for Computer Scientists -- Pierce
Good to have alongside Types and Programming Languages (with a good Set Theory book if you're weak there). Read the first two chapters before TAPL gets into subtyping.

Categories for Types -- Crole
A possible alternative or companion to Category Theory for Computer Scientists. I must confess that I only recently obtained this and just started to look through it. But I like how it is written and organized.

Structure and Interpretation of Computer Programs -- Abelson & Sussman
On Lisp -- Graham
Neither of these is very formal, so they don't really belong on this list. Nonetheless, they can be very useful for thinking differently about what a program is. Note that Paul Graham has strong opinions and may occasionally present them as revealed truths.

Lambda Calculus

This is a good place to start.

We had differing views about Hankin's LC book.

Assortment of LC links

Good introductory book: "Concepts in Programming Languages"

I'm reading it, and I must say it is complete, with just the right balance between theory, code samples and historical background. It covers many different PL themes and makes (IMHO) good choices, like starting with a not-so-short LISP introduction.

I think it's a good book for students who want a broad introduction to past and current programming language concepts.

It has been written by John Mitchell, and is published by Cambridge University Press.

Good beginners book

I think it's an excellent book to start with if you're an absolute beginner. A good balance between imperative and functional is maintained so people ingrained in imperative programming will get a gentle introduction to many new concepts. It also includes a large section on object-oriented programming covering Smalltalk, C++ and Java.

Curry-Howard

Another topic mentioned here quite frequently is the Curry-Howard isomorphism (or correspondence).

A popular science introduction by Wadler.

More here:

Morten Heine B. Sørensen and Pawel Urzyczyn. Lectures on the Curry-Howard Isomorphism. (gzipped PS)

Jean-Yves Girard, Paul Taylor and Yves Lafont. Proofs and Types. (link)

Why Functional Programming Matters

Many of the discussions on LtU relate to functional programming languages in one way or the other, since this is a major area of research and since many new and important ideas come from functional languages.

If you are not familiar with functional programming, a nice introduction (covering things like why there is a ban on assignment in [pure] functional languages, referential transparency, lazy evaluation etc.) can be found in the paper Why Functional Programming Matters by John Hughes.
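
One of Hughes's own examples, in miniature (a Haskell transcription; the paper uses Miranda-style notation): laziness lets the producer of Newton-Raphson approximations be written separately from the decision about when to stop.

    -- an infinite stream of ever-better approximations to sqrt n
    sqrtApprox :: Double -> [Double]
    sqrtApprox n = iterate (\x -> (x + n / x) / 2) 1.0

    -- consume approximations only until two successive ones are close enough
    -- (the stream is infinite, so the pattern below always matches)
    within :: Double -> [Double] -> Double
    within eps (a : b : rest)
      | abs (a - b) <= eps = b
      | otherwise          = within eps (b : rest)

    mySqrt :: Double -> Double
    mySqrt n = within 1e-9 (sqrtApprox n)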

I Second This!

This is a truly great (and short) paper. It neatly bundles up a wide range of reasons why functional languages are interesting, and has prompted me to attack Haskell and SML more seriously on at least two occasions.

Could this be always visible from the front page somehow?

This is my request for a "getting started" link on the left sidebar that's always visible and points to this thread.

I think the link from the

I think the link from the FAQ should be enough. But if others think we should put it on the sidebar, we can do that.

Here's one vote...

...for linking this from the sidebar. (I think some people would be more inclined to click on a link called "Getting Started" than an "FAQ", and there's a whole host of really valuable links here).

OK. I added the link to the

OK. I added the link to the sidebar.

Definitional Interpreters

One paper which I think belongs here is Definitional Interpreters. The CMU link on that page seems dead, though. There's a version of it here, and the "Revisited" summary is here.

Why is X interesting

We should keep a collection of links to the Why Are X interesting threads, for X in {type systems, continuations, monads...}

lazy evaluation ?

Hello,
I am not so familiar with functional language semantics. What does lazy evaluation mean?

Thanks

lazy evaluation ?

The operational principle of a language refers to how its code is interpreted or executed. My first sentence uses imperative-language terminology (instructions, execution); an imperative language is normally interpreted or executed with a sequential, or von Neumann, operational principle.

In a functional language there should generally be no instructions or commands; everything is an expression, which must be evaluated. Expressions may be nested, as in ((a + b) * c), where a + b is the inner expression and x * c with x = a + b is the outer one.
For expression evaluation there are at least two operational principles, namely data-driven (eager evaluation) and demand-driven, or reduction (lazy evaluation).
In eager evaluation, an expression is evaluated as soon as its operands are known, hence from the inner to the outer expressions. In the above example, first a + b is evaluated, then it is multiplied by c.
In lazy evaluation, an expression is evaluated when it is needed for the final result, hence from the outer to the inner. In the above example, first there is an attempt to compute x * c, then a + b is evaluated only because it is needed. The advantage of lazy evaluation is that unnecessary operations may be skipped.
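
A small Haskell illustration of the demand-driven principle just described (my example, not the poster's):

    main :: IO ()
    main = do
      -- 'bottom' would loop forever if it were ever evaluated...
      let bottom = length (repeat ()) :: Int
      -- ...but fst never demands it, so this prints 42
      print (fst (42 :: Int, bottom))
      -- only the first three elements of the infinite list are ever computed
      print (take 3 (map (* 2) [1 ..]))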

See also http://en.wikipedia.org/wiki/Lazy_evaluation

It was also used in some experimental massively parallel multicomputers in the late 80s, as an alternative to dataflow computer arrays (see, e.g., http://www.springerlink.com/content/y60648652243l31m )

Church's Thesis and Functional Programming

This is allegedly a very good summary of many LtU topics.

Better generic implementation approach

Which of the following is a better approach to implementing generic algorithms: overloading or templates? Justify with the help of examples.

Homework

LtU is not the place to cheat on homework assignments.

Felleisen's lesson plan for self-study

From the PLT Scheme mailing list:
The plan:

So to revisit this again, what do you need to learn of the lambda calculus relative to FP?

A working programmer should have seen:

  • syntax
  • reductions (eval is a function via the Church-Rosser and Curry-Feys Standard Reduction theorems)
  • scope vs de Bruijn indices (see the sketch at the end of this post)
  • beta vs beta-value, normal form [relevant in math] vs value [relevant in cs]
  • basic denotational semantics (typed PCF suffices)
  • typed lc plus type system, SNF
  • extended with some amount of fancy type-system stuff (say, explicit polymorphism)

That's approx 10 weeks worth of course material. Or 2 weeks of self-study with a one-hour lunch break.

The texts:

What would your reading list be for the self study, apart from Barendregt?

When I present this piece of the puzzle in PhD seminars, I use original literature as much as possible:

  • Barendregt, The Lambda Calculus, North Holland
    chapters 2 and 3

  • Plotkin, Call-by-name, call-by-value, and the lambda calculus
    Theoretical Computer Science, 1975

  • Plotkin, LCF considered as a programming language
    Theoretical Computer Science, 1977

Optionally, if you wish to hook up pure LC with effects:

  • Felleisen and Friedman, Control operators, the SECD-machine, and the lambda-calculus.
    Formal Description of Programming Concepts III, North Holland, 1986

See also the NU PLT pubs page.

Thanks to Grant Rettke and Chris Stephenson for asking the questions quoted above.
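
Not part of Felleisen's plan, but to make two of the bullets (de Bruijn indices and reductions) concrete, here is a small TAPL-style sketch in Haskell of normal-order beta reduction for the untyped lambda calculus; all names are my own:

    data Term = Var Int          -- de Bruijn index
              | Lam Term
              | App Term Term
      deriving Show

    -- shift free variables at or above the cutoff c by d
    shift :: Int -> Int -> Term -> Term
    shift d c (Var k)   = Var (if k >= c then k + d else k)
    shift d c (Lam t)   = Lam (shift d (c + 1) t)
    shift d c (App f a) = App (shift d c f) (shift d c a)

    -- substitute s for variable j in t
    subst :: Int -> Term -> Term -> Term
    subst j s (Var k)   = if k == j then s else Var k
    subst j s (Lam t)   = Lam (subst (j + 1) (shift 1 0 s) t)
    subst j s (App f a) = App (subst j s f) (subst j s a)

    -- one step of normal-order (leftmost-outermost) reduction
    step :: Term -> Maybe Term
    step (App (Lam b) a) = Just (shift (-1) 0 (subst 0 (shift 1 0 a) b))
    step (App f a)       = case step f of
                             Just f' -> Just (App f' a)
                             Nothing -> App f <$> step a
    step (Lam b)         = Lam <$> step b
    step (Var _)         = Nothing

    normalize :: Term -> Term
    normalize t = maybe t normalize (step t)

    -- normalize (App (Lam (Var 0)) (Lam (Var 0)))  ==>  Lam (Var 0)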

Regarding SPIN model checker

Is there any way to obtain a good trace (i.e. when no error occurs in the PROMELA model and the model satisfies the specification)?

At present SPIN provides only the bad traces (i.e. the execution trace when an error occurs).

So I want to find a way to get the good trace from the SPIN model checker.

Wrong site

This isn't the right forum for your question. There are lots of SPIN resources here.

Novice

I found this site when searching for information about how programming languages are formed, and stumbled upon Let's make a programming language!. I read through the many comments there and got a bit of insight into the subject. I suppose this isn't directly related to programming language theory, but the major area I lack understanding in is moving from assembly language to machine code. Can any of you recommend formal texts dealing with how machine code interfaces with a processor (x86 is the one I would go with, since all of my computers at the moment use it)?

refs

Intel has manuals on line.

Google for: "compiler+writer's"+guide
IBM has a great compiler writer's guide for PowerPC.

Hennessy and Patterson have some textbooks that may help with terminology. Traditionally the interface between machine code and the processor is called the "instruction set architecture."

Thanks, this looks like

Thanks, this looks like exactly what I needed.

APL and related (e.g., J)

Is anybody here supporting APL or its derivatives, such as J?

Thanks

I am not sure what you mean

I am not sure what you mean by "supporting". Some people here are interested in APL/J, but that's not a very frequent topic of discussion. This site is not for questions/answers about particular languages or tools (please see the FAQ).