Lambda the Ultimate

inactiveTopic Concepts, Techniques and Models of Computer Programming
started 6/18/2003; 12:53:18 PM - last post 7/2/2003; 10:04:48 AM
Dan Shappir - Concepts, Techniques and Models of Computer Programming  blueArrow
6/18/2003; 12:53:18 PM (reads: 2883, responses: 29)
Concepts, Techniques and Models of Computer Programming
Google's cached HTML version can be found here.

From a Slashdot thread:

If you are ready to take on a heroic task and read thru all 976 pages of Concepts, Techniques, and Models of Computer Programming (draft) (pdf file, 3MB, intro here) written by Peter Van Roy and Seif Haridi you won't regret it. Just finished reading it and I feel like I have read the Bible.

With a recomendation like that I couldn't very well ignore it, now could I. From the Preface:

We mention many programming languages in the book and relate them to particular computation models. For example, Java and Smalltalk are based on an object-oriented model. Haskell and Standard ML are based on a functional model. Prolog and Mercury are based on a logic model. Not all interesting languages can be so classified. We mention some other languages for their own merits. For example, Lisp and Scheme pioneered many of the concepts presented here. Erlang is functional, inherently concurrent, and supports fault tolerant distributed programming.

We single out four languages as representatives of important computation models: Erlang, Haskell, Java, and Prolog. We identify the computation model of each language in terms of the book's uniform framework. For more information about them we refer readers to other books. Because of space limitations, we are not able to mention all interesting languages. Omission of a language does not imply any kind of value judgement.

And from the Abstract:

The book is suitable for undergraduate and graduate courses in programming techniques, programming models, constraint programming, distributed programming, and semantics.

At the very least, looks interesting.
Posted to teaching/learning by Dan Shappir on 6/18/03; 1:01:54 PM

andrew cooke - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/18/2003; 1:06:36 PM (reads: 2108, responses: 0)
this is a great book (although i seem to have stalled on page 368 at the moment). it's also about to be published (by mit press no less). when it is, the download will disappear (if it's being slashdotted then maybe it will go before). on the other hand, you'll be able to buy it in a nice format (for less than it cost me to get mine printed and bound at a local copy-shop, i suspect!).

Isaac Gouy - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/18/2003; 4:10:32 PM (reads: 2081, responses: 0)
Coincidentally (?) Peter Van Roy posted this to the Mozart/Oz list:

Regarding the book 'Concepts, Techniques, and Models of Computer Programming', MIT Press has graciously allowed us to keep it on the Web site until the book actually appears in print. The current Web version is the actual version to be published. The printed version will differ only in correction of grammatical errors and formatting changes. The Web version has all chapters complete and many corrections compared to the previous draft (from April 26). So if you are interested in a free copy, now's the time to download it.

william cook - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/22/2003; 11:27:41 AM (reads: 1732, responses: 2)
I just looked at the section on Object-Oriented programming. I haven't read much of it, but just about everything I have read is wrong or confused. They continue this common misconception that objects are a form of ADT (abstract data type) and the inheritance is the only important concept in object-oriented programming. Some examples of their confusion:

"We can loosely define object-oriented programming as programming with encapsulation, explicit state, and inheritance."

On inheritance: "... stateful abstract data types are a very useful concept for organizing a program. In fact, a program can be built in a hierarchical structure as ADTs that depend on other ADTs. This is the idea of component-based programming."

"In the most general case, a class is an incremental definition of an ADT, that defines the ADT as a modification of other ADTs. There is a rich set of concepts for defining classes. We classify these concepts into two sets, according as they permit the class to define an ADT completely or incrementally:

*Complete ADT definition*...

-Defining the various elements that make up a class (Section 7.2.3), namely methods, attributes, and properties. ...

-Taking advantage of dynamic typing. This gives first-class messages Section 7.2.5) and first-class attributes (Section 7.2.6). This allows powerful forms of polymorphism that are difficult or impossible to do in statically-typed languages. ...

*Incremental ADT definition* These are all the concepts related to inheritance, that is, they define how a class is related to existing classes."

As far as I can tell the entire presentation is completely muddled. If people are interested I can give references and more detailed discussion for their confusion. For example, they don't seem to understand the difference between subtyping and inheritance (extends and implements in Java). If I get time, I will write to the authors to complain. But there is probably nothing that can be done about it before publication at this point.

William Cook wcook.org

Isaac Gouy - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/22/2003; 12:54:40 PM (reads: 1737, responses: 1)
Some examples of their confusion...
What "loose" definition do you prefer?

william cook - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/22/2003; 1:38:51 PM (reads: 1697, responses: 1)
It is inherently hard to argue with a "loose definition" and so perhaps I should not have included it in my comments. However, I think a better definition would be something like "the combination of behavior and state to form *objects* in which procedural interfaces encapsulate state, and extensive use of *polymorphism* based on multiple implementations of the same interfaces."

If you allow me to separate inheritance from subtyping (where inheritance is "code sharing" and subtyping is "interface similarity" -- these corespond to extending classes and interfaces in Java) then I would argue that inheritance is not essential to object-oriented programming. You can write perfectly good and powerful object-oriented programs in Java without ever extending a class with another class. One example is Microsoft COM, which is very object-oriented but doesn't have a basic notion of inheritance.

For a discussion of ADTs versus Objects, see the paper "User-Defined Types and Procedural Data Structures as Complementary Approaches to Data Abstraction" by John Reynolds, in Theoretical Aspects of Object Oriented Programming Languages: Types, Semantics, and Language Design (http://www.cis.upenn.edu/~gunter/publications/documents/taoop94.html) You can also see my comments on it in http://www.wcook.org/papers/OOPvsADT/CookOOPvsADT90.pdf

andrew cooke - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/22/2003; 2:03:59 PM (reads: 1721, responses: 0)
have you forgotten that oz is dynamically typed? the equivalent of java's "implements" isn't going to explicitly exist in an oz based oo system, i would guess (i'm still on chapter 5). hence, from an oz point of view, all that's left of inheritance is code sharing.

william cook - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/22/2003; 2:16:03 PM (reads: 1671, responses: 1)
Static versus dynamic typing is not significant here. Smalltalk is also dynamically typed. But again, you can do great object-oriented programming in Smalltalk without ever using inheritance. This may seem like a crazy idea, but it is true. All you have to do is create multiple classes that define the name method names. See, for example, my analysis of inheritance and subtyping in Smalltalk collection classes (http://citeseer.nj.nec.com/cook92interfaces.html). There are many important cases where the same method is defined in two classes that are not related by inheritance. For example, Dictionary and MappedCollection both implement at:put: but this method is not inherited from any common superclass.

andrew cooke - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/22/2003; 3:11:25 PM (reads: 1707, responses: 0)
i didn't say it wasn't possible, i said that it wouldn't be explicit. in oz that kind of functionality is not restricted to objects - it's true of any adt. so why would you expect it to be presented in the section on objects?

the chapter you are looking at is specifically about functionality related to objects - in the context of that langauge, and that book, that means code inheritance.

[added later] what i'm trying to say is that while what you say is probably true (i've not tried to understand it all, but i presume you're saying something along the lines of "code inheritance often leads to nasty messes and what is more importance is constructing relationships between types"), and i would expect that to appear in the book, somewhere, i don't think it will be in the objects section.

the book doesn't take one paradigm (objects, for example), and show all the cool things possible in associated languages. it starts with simpler paradigms and proceeds incrementally. so by the time you get to chapter 7, which is a fair way through the book, it's not surprising that they're focussing more on the "extra bits" that oo implies, since adts and subtypes are more general (simpler, in the sense i used earlier in this paragraph) than objects.

on the other hand, i think it might be a valid statement to say that the book doesn't focus sufficiently on types and the realtionships between them. but then it's a rather practical book based on a dynamically typed language that already covers a huge amoount of ground. maybe they should have added another 500 pages and included alice (oz's statically typed sister)... [again, i have to add that i'm only 1/3 of the way through the book].

it feels like you've walked up to a conversation where you've overheard two people saying something, placed it in an assumed context, and started shouting at them. a book is a conversation between reader and author - taking a chapter in isolation is dangerous.

[added even later] i hope i'm not being too assertive here. i'm not 100% confident in all that i say. i think i need a clearer understanding of dispatch in oz (under different paradigms). also, my emphasis on static typing is purely practical - if you're going to teach about types surely it's easier if their explicit in the language. and finally, another question - to what extent are module systems and namespaces consdiered as types?

what does seem clear to me, however, is the genral motivation of the authors at that point in the book - they're trying to explain the mechanics of implementing objects. hopefully they also treat other issues like good style in oop. i must read more...

william cook - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/22/2003; 7:53:44 PM (reads: 1652, responses: 0)
It is true that I am looking at the OOP chapter in isolation. I think you are on the right track in your comments. And yes, I suspect that their explanation of dispatch using Oz is likely to be confusing as much as clarifying.

At this point I am not sure if this is a book about the language Oz and how it can be used to explain (or simulate) different paradigms, or if it is a book that tries to explain paradigms using the most effective explanation for each paradigm. It seems to me that it is the former. The title tricked me into thinking it was the latter. I see now that the key clue is the idea that they "identify the computation model of each language in terms of the book's uniform framework". The uniform framework is Oz, which happens to be another particular programming language. I think that a more neutral (and perhaps more theoretical) framework is better foundation for doing these kind of comparison between paradigms.

Matt Hellige - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/23/2003; 9:08:53 AM (reads: 1687, responses: 0)
I think the OO waters are already so hopelessly muddied that no single presentation will please even a significant fraction of "OOP people." This isn't a defense of Roy and Haridi's exposition (I haven't read it), but just a pre-emptive devil's advocation.

William Cook - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/23/2003; 2:53:29 PM (reads: 1623, responses: 0)
I posted a large comment on slashdot. Here it is, without formatting. Things are pretty muddled, but I can't help trying to clear them up.

I have skimmed a lot of your book and studied the sections discussed below in some detail.

Pages 240-255:

You are using the term "abstract data type" in a dynamic language, and this doesn't make very much sense. The original idea of a user-defined abstract data type is that one can define a new type that works just like the built-in types "integer", "char" in Pascal. The idea is that the type is abstract in that the implementation is hidden.

Now I agree that your idea of "secure abstract data type" is in effect very similar to the traditional ADT in, for example, CLU. But it bothers me that the "type", which is a key part of the "abstract data type" concept is not explicit. You are not using "type abstraction" to enforce representation hiding. The commonly-accepted analysis of this form of abstraction is with existential types (see ADT1 [cs.kun.nl] or ADT2 [cmu.edu] for example)

Instead, you have created a form of "abstract data value" that can only be operated upon via the associated operations.

Basically I think that you should replace the word "abstract data type" with the word "data abstraction" everywhere. Then things would make much more sense. What you are talking about is various forms of data abstraction. Abstract data types are one very specific kind of data abstraction.

p 470-481

I actually like the basic idea behind this discussion. I hadn't read it when I started critiquing the OOP chapter. However, there are some issues hidden here that are important.

Most of the content in this section should be in the object-oriented chapter if you ask me. Your development of "bundled" data abstractions is basically an explication of objects.

I think the most important one is hinted at in your comment on page 474: "This version is secure, declarative, and bundled. Note that it does not use wrapping, since wrapping is only needed for unbundled ADTs."

This is one of the key difference between objects and ADTs: objects don't need type abstraction or any kind of explicit data hiding. Instead, objects use procedural abstraction alone. This calls into question the orthogonality of your attributes: In particular, bundled+open doesn't make sense.

To me, the difference between unbundled and bundled is the difference between ADTs and Objects. Either ADTs or Objects can be declarative or state-based. The open/secure distinction is not interesting because objects are always secure, and ADTs are only really useful if they are secure.

As a result, I would say that Figure 6.2 incorrectly classifies {Secure, Stateful, Unbundled} as a "unbundled variant of the object-oriented style". Instead, it is simply the normal stateful ADT.

The section "comparing two popular versions" should be better.

I don't agree that "The implementations of both versions have to do actions when entering and exiting an operation. The calls of Unwrap and Wrap correspond to calls of @ and :=, respectively." The interface for the stateful bundled version is not clear. You should not list Push Pop and Empty as operations, since they are methods. The type should be fun {NewStack}: < op(push:proc {$ T}, pop:fun {$}: T, isEmpty:fun {$}: Bool)> This is the clearest illustration that this object version is not an ADT. The type of the return value is completely visible; it is not an abstract type <Stack T>. Note that declarative bundled versions require recursive types.

It is very telling that you completely miss a very important difference between the two versions: this is that anybody can create their own implementation of stack in the object paradigm, and all these different implementations can be used together and interchangeably without any clients every knowing the difference. This is the converse of your third point: "The declarative unbundled version needs no higher-order techniques to work with many stacks, since all stacks work with all operations." The fact that all stacks must have the same operations is one key reason why pure ADTs have never caught on, while objects are now ubiquitous.

p536-539

The chapter on OOP perhaps is really just about inheritance.

The historical notes are not good.

It doesn't seem right that "object-oriented programming made it clear that... programs should be organized as collections of ADTs." What about objects and polymorphism? Why the focus on ADTs, which are not really object-oriented? As far as I know, paper [142] had nothing to do with object-oriented programming. Instead you should talk about Simula and how objects each can have their own behavior, but have common interfaces. This made simulation really easy. I can't imagine why you think its origins were "timid". By the way, you have already introduced classes: the function NewStack described above. Section 7.1.1

You say "As we saw in the previous chapter, stateful abstract data types are a very useful concept for organizing a program. In fact, a program can be built in a hierarchical structure as ADTs that depend on other ADTs. This is the idea of component based programming." This is completely wrong. First off, objects are not ADTs. Second, what you described in the previous chapter was object-oriented programming not "component based programming".

You say "All these sequences share the basic, linear-order property of the concept of sequence. How can we implement them without duplicating the common parts?" Again, this is not the key question. The right question, which has nothing to do with inheritance, is "How can define them so that all the different implementations can be used together and interchangeably based on their common operations?" Inheritance is useful, but not essential.

As such, I completely disagree with the idea that "Inheritance is the essential difference between object-oriented programming and most other kinds of stateful programming." In fact, if you look at the fundamental definition of inheritance given in my thesis almost 15 years ago, and generally accepted programming language community, you will see that inheritance can be used in functional programming just as easily as in stateful programming! See wcook.org for link to my thesis.

Also, inheritance is not syntactic any more that a function call is like syntactic substitution.

I am going to stop here. If you are really interested in hearing more I can continue. I have tried to make my comments as acceptable as possible, but I am concerned that you will simply ignore them, because they are perhaps too far from your current views.

In any case, I hope this helps.

William Cook

Frank Atanassow - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/24/2003; 10:19:31 AM (reads: 1552, responses: 1)
This is one of the key difference between objects and ADTs: objects don't need type abstraction or any kind of explicit data hiding. Instead, objects use procedural abstraction alone.

I would like to hear an elaboration of this, and what your definition of `procedural abstraction' is. Do you mean the technique of hiding state in a closure? If not, in view of your comments in the earlier part of the same message, it sounds like your notion of `object' is the same as your notion of `data abstraction'. What is the difference?

The fact that all stacks must have the same operations is one key reason why pure ADTs have never caught on, while objects are now ubiquitous.

What do you mean by `same operations'? I guess you mean intensionally equal and that pure ADTs only allow one implementation per program.

On a side note, is this really such a big deal? An object-based implementation of stacks allows many different stack implementations to run simultaneously in one program; an ADT-based one allows only one implementation, but still permits the implementation to be replaced globally in between program runs. I think it is hardly ever the case that one specifically needs the ability to simultaneously support multiple implementations of an interface in the same program run.

Actually, now that I think of it, it seems to me that if you claim that an `ADT' must involve existential quantification, then your notion of ADT does in fact support multiple simultaneous implementations. Then I no longer see the difference between your notion of ADT and object.

(To me, BTW, an `ADT' is basically an algebra for an endofunctor, and an `object' is a coalgebra for an endofunctor. But I know this does not jibe with standard terminology.)

As such, I completely disagree with the idea that "Inheritance is the essential difference between object-oriented programming and most other kinds of stateful programming." In fact, if you look at the fundamental definition of inheritance given in my thesis almost 15 years ago, and generally accepted programming language community, you will see that inheritance can be used in functional programming just as easily as in stateful programming!

I may be wrong but, as I recall, the book equates OOP with stateful, class-based OO and neglects object-based OO or pure functional (what they call `declarative') subtyping, which explains why they see inheritance as the key ingredient here. However, it seems to me that many people share their opinion that `OOP' specifically denotes programming with state and classes.

william cook - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/24/2003; 10:41:29 AM (reads: 1544, responses: 1)
I just wanted to add that I have had a good offline discussion with the authors and they say that will address the comments I made above. If you really want to know, I can post some of the discussion.

Dan Shappir - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/24/2003; 10:57:59 AM (reads: 1573, responses: 0)
I think I speak for every one here when I state: we want to know (after all, knowing is the whole point of this community).

Matt Hellige - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/24/2003; 11:25:38 AM (reads: 1567, responses: 0)
On a side note, is this really such a big deal? An object-based implementation of stacks allows many different stack implementations to run simultaneously in one program; an ADT-based one allows only one implementation, but still permits the implementation to be replaced globally in between program runs. I think it is hardly ever the case that one specifically needs the ability to simultaneously support multiple implementations of an interface in the same program run.

This is definitely not the case when you have interface inheritance. It is essential to be able to define operations on all stacks, regardless of implementation details or extensions a given implementation provides.

For example, in Java there is a List class that has several implementations, e.g. LinkedList and ArrayList, which have different performance characteristics. In many cases, an operation would like to rely only on the fact that its argument a List rather than any particular implementation, to support the widest possible range of clients.

In the case of List and children, we're really talking about abstracting implementation details, but with the class Collection, of which List, Map and Set are subclasses, we abstract over various extensions to a particular, more general interface. So, if you have an operation on Collections that doesn't particularly care whether something is a List, why restrict it unnecessarily?

But I'm sure you know all this, and you're a very thoughtful guy, so I wonder if maybe you meant something else, something that excludes these examples? If so, I wonder if you could elaborate...

William Cook - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/24/2003; 9:22:59 PM (reads: 1535, responses: 1)
Here are the emails I exchanged with one author. Book author in blue, my comments in black.

I guess your main criticism is that:
1- That we do not distinguish in your view between abstract data types and objects
2-  That we do not use the word object-oriented programming when we are using 'object-based' entities.

Is this correct?

It is not so much that you don't distinguish ADTs from Objects -- you do make this distinction and I don't even mind that you are working hard to bring them both into the same formalism. This can be good for explanation. The problem is more that you don't distinguish between data abstraction and abstract data types.

* You use the word "abstract data type" to refer to all kinds of data abstraction. That is, you seem to view "abstract data type" and "data abstraction" as synonymous. This is pretty typical, because ADTs and data abstraction were discovered around the same time, and most people can't imagine any other kind of data abstraction besides and ADTs. But I believe that objects are data abstractions, but they are not ADTs. Reynolds used the term "procedural data abstraction" which is just about the best technical term for "object" that I have ever found.

* Based on this equivalence, you then say that "objects" are a specific kind of "abstract data type". I think it is much clearer to say

    - Objects are form of data abstraction

    - ADTs are a form of data abstraction

    - ADTs and Objects are complementary - one being "unbundled" and the other "bundled".

        This is a fundamental difference that has many consequences.

        For example, the unbundled one (ADTs) require some explicit form of hiding (type abstraction or Wrap/Unwrap). The bundled one does not.

* You seem to imply that "object-oriented programming" is not "programming with objects and polymorphism" but rather "programming with inheritance".

Let us discuss the issues:
First: You are using the term "abstract data type" in a dynamic language, and this doesn't make very much sense.

I retract this. See below.

Do you agree that: first a type is a set of values and its associated operations. The type is abstract if the details of the values are hidden or opaque.

Yes. Just keep in mind that objects types are not abstract in the same way that ADTs are abstract. Object types are fully visible types: records of procedures all of whose types are completely known. You say this yourself in another way: "Note that it does not use wrapping, since wrapping is only needed for unbundled ADTs". It is the procedures that are abstract, just as in normal functional abstraction.

Now we are using in the book a 'dynamically' typed language, which means the types of the variables (the variables are single assignment) are not known until runtime. Again this means that when a variable is bound the type is known. Now if the values belongs to an ADT in the sense I mentioned above, it makes sense to talk about ADTs in a dynamic language.

The original idea of a user-defined abstract data type is that one can define a new type that works just like the built-in types "integer", "char" in Pascal. The idea is that the type is abstract in that the implementation is hidden.

Incidentally in Oz (the language of the book) there is a construct that can take any concrete value and make it abstract by hiding the representation and associating a number of operations on this new type. This construct is called 'Chunk' that can hide a concrete value and by lexical scoping a number of operations are defined the 'sees' inside the cover and operates on the new type. Type equality can be test on these new abstract values. The abstract value is separated from the operations (unbundled). Does this fit your definition of an ADT.

Examples in Oz:
The stateful ADT  Dictionary
Dictionary.new: --> <Dictionary>
Dictionary.get: <Dictionary> <Key> --> <Value>
Dictionary.put : <Dictionary> <Key> <Value> -->

I said this: "I agree that your idea of 'secure abstract data type' is in effect very similar to the traditional ADT in, for example, CLU". Let me state this more strongly. I accept that your use of names (Wrap/Unwrap) for abstraction is a valid form of ADT. Thus your examples in 3.7.5 and 3.7.6 are a valid form of abstract data type. The type is not explicitly named because of dynamic typing, but the set of legal values is clearly defined and is properly opaque. (I don't think the Chunk idea adds much beyond Wrap/Unwrap).

In conclusion, I retract my comment that "You are using the term 'abstract data type' in a dynamic[ly typed] language, and this doesn't make very much sense".

Now let me understand what you said about objects, from you point of view the following is an object: It is an entity when it is created it bundle 'the value in a hidden way' and the operation together. So again we could expect the following contrasting example:

NewDictionary returns some dictionary object D, and then
D.get and D.put  are the procedures  to operate on the object D.

I would not say they are "the procedures to operate on the object D". That way of saying it implies that the procedures are somehow separate from the object. Instead say "D.get and D.put are procedural components that can be invoked to request the object do something". You can't even know if there is any state in the object. An object is the collection of procedures that can be invoked to perform operations. I sometimes say that the object is simply the collection of observations that can be performed on it.

Now you also say that the object view is more general than the ADT view, because of the polymorphism:
A client program:
fun {Client O} ..... {O.get ...}   {O.put ....} ... end
Could work on any value that has the 'get' and 'put' interface, which is not the case with the ADT above.

I would not say that objects are more general. Objects and ADTs are truly orthogonal: there are things you can do with objects that you cannot do with ADTs (mostly related to polymorphism) and there are things you can do with ADTs that you cannot do with objects (like inspecting the representation of multiple objects in a single operation. This is the well-known problem of 'binary methods' in object-oriented programming. It is also one reason why numbers are hard to implement as objects: the operations +, *, / etc all require access to representations unless you are willing to give up a lot of performance. The way Smalltalk handles numbers is really fascinating, and not very well known.)

Please keep in mind that inheritance is a truly general, semantic operation for modifying a self-referential (recursive) definition. It can be used on types, classes, ADTs, functions, modules, libraries, etc.

Did I get this right, or did I miss something?

Yes, this it the key example. I might recommend slightly different names. The argument O is not the Client, the function is the Client:

Client = Fun{Container O}.... {O.get...}  {O.put....} ... end.

It gets even more interesting when you have operations that take multiple object values, like "compare dictionary" or "append lists". Then you would have

Client2 =  Fun{Container O1, Container O2}.... {O1.append(O2)...}  {O2.compare(O1)....} ... end.

In this case the "class" of the two container objects O1 and O2 can be completely unrelated and everything will still work! This is a strong illustration of the power of objects. Again, see my paper on ADT vs OOP for more discussion. This is how numbers work in Smalltalk.

Note: if you work at it, you could bundle up the operations of an ADT in a record and pass them around as an "object". However, you how have an object that implements a (dynamically-typed) ADT. Note that you can't do this very easily in a statically-typed language. This would still not give you the same flexibility as pure objects, because all the values from a given ADT have to have the same representation and operations, as discussed above.

William Cook - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/24/2003; 9:29:44 PM (reads: 1517, responses: 0)
This is very clear to me, thanks a lot for a very clear discussion.

Last point: how would you define the general term: 'data abstraction'?

Your example with 'containers' is very illuminating. We will try to modify the draft to reflect this discussion, and of course you will be acknowledged in the book.

Last night I was thinking that I had not explained the comment "objects types are not abstract in the same way that ADTs are abstract" very well.

Let's just assume that "a type is a set of values and its associated operations. The type is abstract if the details of the values are hidden or opaque"

Now tell me about this type, which is the type of "stack objects" from your book:

T1 = < op(push:proc {$ T}, pop:fun {$}: T, isEmpty:fun {$}: Bool)>

What is the set of values? What are the operations? Are these values hidden or opaque?

Now consider this type:

T2 = fun {Integer}: Boolean

What is the set of values? What are the operations? Are these values hidden or opaque?

The problem is that the answers for these two types must be the same.

I am not sure I will have time to read the rest of the material. I didn't see anything that jumped out at me when I was skimming. What is you final publication date?

How to define data abstraction? A data abstraction is a representation of data in which the internal implementation details have been hidden. In the case of ADTs, it the internal details of the representation type are hidden and only accessible within the associated operations. For objects, the implementation details (including state) are hidden behind a set of procedures, or methods.

Does that work for you? If you get a chance, you should really look at the Reynolds paper in http://www.cis.upenn.edu/~gunter/publications/documents/taoop94.html

Some sample answers:

* T1 = The set of all possible records that contain 3 fields taken from the set of all possible functions of appropriate type for the push, pop, and isEmpty fields. This is a very large set, and contains many records that do not behave like stacks at all (in fact, very few of the values behave like stacks). Operations are: accessing the push, pop, or isEmpty fields and calling the appropriate funciton (the operations on the type are different from the operations *in* a given record of the type). The values in the type do not correspond to stacks, that is for sure. Note that if you start attaching specifications to the types, things get more interesting. I see you do this in a later section, and I also did it in my Smalltalk Collections papper.

* T2 = The set of all possible functions from integer to boolean. Operation is function application.

William Cook - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/24/2003; 9:32:26 PM (reads: 1512, responses: 0)
Now given your comments, I agree that the construction of objects as shown below as a record of procedures is less abstract than an ADT. However the object construction is Chapter 7 is more abstract where any object is just a one-argument procedure. In the real Oz system, Mozart, an object is just a Chunk (completely sealed entity) that include other information besides the one-argument procedure. So these are quite abstract to me. Procedures in Oz have associated names (completely unforgable tokens using only for identity comparisons).

I do not agree that "objects are less abstract than an ADT". I am arguing that they are different but not comparable: neither one is more or less abstract than the other, they are just different. Each has advantages and disadvantages.

The use of a chunk for objects is not necessary and does not make them more abstract than a record of procedures. Please keep in mind that a record of procedures is completely isomorphic to an appropriate one-argument procedure. If this is not clear, I can explain.

I completely agree that you should not spend a lot of time talking about Chunks in your book. They are likely to make this less clear, not more so. First-class procedures area already fully abstract, so they should not need chunks or names to distinguish them.

Let me say that I like your overall presentation (despite my having said that it was "wrong or confused"). You have good content, you are just a little fuzzy on the object-oriented side and using terms like "abstract data type" too loosely.

PS: Languages like C++ are confusing because you can implement both ADTs and objects using the single "class" construct (or a mixture of the two!); the use of virtual functions or not corresponds to your distinction between bundled and unbundled.

[And then a final reply]

We are now in complete agreement. I have discussed with Peter and we will change the book according to your recommendations, i.e. make clear the concepts: data abstraction, ADTs and objects, and the differences between objects and ADTs.

William Cook - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/24/2003; 9:41:01 PM (reads: 1511, responses: 1)
Ok, that is the discussion. I hope it makes some sense.

Re: Frank's comments. Perhaps the discussion quoted above will serve as a response. But a few specific notes:

Yes, procedural abstraction is pretty much the same as hiding state in a closure. Objects are closures.

You can use multiple ADTs in a program, but the values created by each ADT must be kept separate and only used with the operations from the ADT that created the value. Objects do not have this limitation. Stacks are too unintersting an example of an "object". What about servlets? Do you need more than one kind of servelet in a program? Or windows? Or numbers?

I know a little category theory: can you explain, or point me to an explanation of you comment "an `ADT' is basically an algebra for an endofunctor, and an `object' is a coalgebra for an endofunctor. But I know this does not jibe with standard terminology."

Ehud Lamm - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/25/2003; 12:09:22 AM (reads: 1519, responses: 0)
By the way, Liskov's and Guttag's Program Development in Java has a chapter dedicated to data abstraction (in the context of OOP programming in Java, of course) which you may find interesting.

andrew cooke - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/25/2003; 7:04:36 AM (reads: 1501, responses: 0)
thanks. there's also interesting discussion of combining oo and functional approaches (objects as closures etc) on the ll-1 list at the moment. see this post for an example that includes ideas about "lazy" resolution of superclasses.

Frank Atanassow - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/25/2003; 8:33:54 AM (reads: 1463, responses: 0)
[I wrote:] I think it is hardly ever the case that one specifically needs the ability to simultaneously support multiple implementations of an interface in the same program run.

Well, I guess I will retract this. Though in my experience it is not `essential' (Matt's words) for writing programs, it is a desirable feature.

[William wrote:] You can use multiple ADTs in a program, but the values created by each ADT must be kept separate and only used with the operations from the ADT that created the value. Objects do not have this limitation.

I don't see that the last statement is true. You cannot apply a method of one object to another any more than you can use an operation of an ADT on some other ADT. Sure, two distinct objects may have methods with the same name, but surely that's irrelevant in the absence of subtyping. And if one has subtyping for objects, then why not also for ADT's?

I know a little category theory: can you explain, or point me to an explanation of you comment "an `ADT' is basically an algebra for an endofunctor, and an `object' is a coalgebra for an endofunctor. But I know this does not jibe with standard terminology."

Oops, I should have said: a class is a coalgebra for an endofunctor, and an object is a value of a coalgebra.

The idea that an ADT is an algebra for some signature (= endofunctor) is standard, and you can find it in any CT text. The idea that a class is a coalgebra for an endofunctor is, AFAIK, due to Bart Jacobs. He has a couple of papers on the subject. Here are two:

B. Jacobs, Objects and classes, co-algebraically. In: B. Freitag, C.B. Jones, C. Lengauer, and H.-J. Schek (eds) Object-Orientation with Parallelism and Persistence Kluwer Acad. Publ., 1996, p. 83--103.

B. Jacobs, Inheritance and cofree constructions. In: P. Cointe (ed.) European Conference on object-oriented programming (ECOOP) (Springer LNCS 1098, Berlin, 1996) 210 - 231.

They're available from his publications page.

Basically the idea is that an `interface' is a set of typed operators which share a domain, and a coalgebra is an implementation of those operators. A morphism from the final coalgebra, describable by the unfold function, instantiates an interface with a concrete implementation, a class. Objects of the class are values of the (co)carrier.

Compare with ADTs/algebras: a signature is a set of typed operators which share a codomain, and an algebra is an implementation of those operators. A morphism from the initial algebra, describable by the fold function, assigns an implementation to the signature, an ADT. Values of the ADT are values of carrier.

william cook - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/25/2003; 10:07:34 AM (reads: 1444, responses: 0)
You say:

You cannot apply a method of one object to another any more than you can use an operation of an ADT on some other ADT. Sure, two distinct objects may have methods with the same name, but surely that's irrelevant in the absence of subtyping. And if one has subtyping for objects, then why not also for ADT's?

The key problem is that subtyping for objects and subtyping for ADTs is completely different. Objects have non-abstract record types and support subtyping very easily. The subtype rules allow programs created by different people at different times to interact through interfaces without any hidden or shared knowledge of each other.

ADTs on the other hand are based on a "secret" that is shared by some operations. If you try to use subtyping on the representation type, then the secret has to be shared -- this prevents independent definition of subtyped ADTs. It prevents modularity. If you use subtyping on the existential types of the ADT implementation, then you are back to the original problem that every ADT maintains its own unrelated set of values.

[New text] I have started reading the co-algebra papers. It is clear that co-algebras are much better than algebras as describing the semantics of objects. You can even use it to argue that ADTs and Objects are duals of each other.

But the model does not capture the full generality of objects. For example, they prohibit all binary operations, like the compare and append examples I gave earlier. This alone makes it impossible for this form of co-algebra to describe any practical object-oriented system.

I will keep reading. I am afraid that they are using the word "inheritance" to mean "subtyping", but I will have to dig deeper to be sure.

Frank Atanassow - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/26/2003; 4:45:27 AM (reads: 1409, responses: 0)
ADTs on the other hand are based on a "secret" that is shared by some operations. If you try to use subtyping on the representation type, then the secret has to be shared -- this prevents independent definition of subtyped ADTs. It prevents modularity. If you use subtyping on the existential types of the ADT implementation, then you are back to the original problem that every ADT maintains its own unrelated set of values.

Objects have the same problem, though: if some new method cannot be expressed in terms of the old methods you have to expose the representation/state.

Conversely, you can use subtyping with ADTs in a way which preserves abstractness of the representation when the new operation can be defined in terms of the old. If I have an ADT which supports, for example, a multiplication operator (and its unit), then I can add an exponentiation operator without problems by defining it via the old operators. Or am I missing something?

BTW there is a different notion of subtyping for ADT's which, I think, is more appropriate for them: subtyping w.r.t. sums rather than products, generated by A < A + B and B < A + B (with subsumption witnessed by injections in the term model). For example, the signature of a monoid, or any set with a binary operation and a constant, is F where F(X) = 1 + X*X. That signature extended with an exponentiation operator, or any another binary operation, is F' where F'(X) = F(X) + X*X. Hence you have F(X) < F'(X) for all X and fix F < fix F'. This is the opposite of the situation with objects, of course, but is exactly what you want since it says that I can assign a semantics to a term involving fewer operators than I need to account for. For example, I can use a non-empty list everywhere a possibly-empty list is required.

[Actually, at this point I have to admit that I have stopped reading `ADT' as `abstract data type' and am now reading it as `algebraic data type', which may be presuming some properties you don't demand.]

But the model does not capture the full generality of objects. For example, they prohibit all binary operations, like the compare and append examples I gave earlier. This alone makes it impossible for this form of co-algebra to describe any practical object-oriented system.

Hm...? They allow operators like f:X -> (X -> X), which looks like a binary method to me. But it's been a while since I've read it.

Anyway, I think coalgebras provide a simple, elegant and accessible model which can put at least a core fragment of OOP on a semantic foundation comparable with FP, without relying on so much obscure syntactic rigamarole and ad hoc prioritization (for example, dispatch methods).

william cook - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/26/2003; 11:34:09 PM (reads: 1341, responses: 0)
Atanassow: Objects have the same problem, though: if some new method cannot be expressed in terms of the old methods you have to expose the representation/state.

No, this is not right. You are thinking of inheritance, but I am talking about polymorphism. With objects I can create a completely new implementation with a new representation that is not related to any existing implementation at all -- yet this new class can interact with all existing objects and clients without any prior coordination. It can implement entirely new behaviors that are unrelated to any existing methods.

Conversely, you can use subtyping with ADTs in a way which preserves abstractness of the representation when the new operation can be defined in terms of the old. If I have an ADT which supports, for example, a multiplication operator (and its unit), then I can add an exponentiation operator without problems by defining it via the old operators. Or am I missing something?

How would it work if I wanted to add new values (representations) to the ADT, not just new operations? If you will take a look at my paper "Abstract Data Types versus Object-Oriented Programming" (www.wcook.org) you will see that adding new operations is easy in ADTs, while adding new representations is easier in OOP.

The paper says no page 8: "No binary methods (of the form X x X -> B + C x X) are allowed in the co-algebraic approach, since they lead to contravariant functors." He notes that contravariance is a well-known problem in typing objects -- but most typed object models at least handle them reasonably well.

I understand that algebraic models are very pure, but are you such a purist that denotational models based on lambda-calculus are considered "syntactic rigamarole"? My thesis, for example, gives a clean denotational semantics of objects, subtyping, inheritance, while proving it equivalent to the messy operational semantics based on dispatch.

William Cook - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/29/2003; 7:35:15 AM (reads: 1262, responses: 1)
Let me give you an example. Lets say you have the following object interface type (or co-algebraic signature):
C ={ add: T -> unit; 
     remove: unit -> T + error}
     isEmpty: unit -> Bool;
Let's let T stand for a generic type. I can implement classes that create objects of this type that will behave as stacks or queues. If you want, you can throw axioms on to the effect: remove returns values that have been added, and when isEmpty is true then remove returns error until add is called. I would not want the specification to be any more constrained than that (see below).

Now lets say I write a client program TreeSearch that is parameterized by objects of type C.

TreeSearch : C -> Tree -> Node
The object of type C is used to hold the items that still need to be searched. In this case T must be identified with Node.

If I pass a stack to TreeSearch, then I get a depth-first search. If I pass a queue, then I get a beadth-first search.

But the interesting thing is that anybody can implement their own object of type C to get other search behaviors: an object could return random Nodes to get a random search. Or if you know more about the type Node you could weight the values for a heuristic search. Or you could throw out Nodes for a pruned search. If you check for duplicates of previously added nodes, then you can use this algorithm to search graphs, not just trees. All these different objects still meet our loose specification above. It might be a challenge to verify all these programs, but verification people should look forward to challenges.

Finally, you could get more complex and allow Nodes to optionally specify an object of type C to use for searching their sub-nodes. If under certain situations these multiple objects of type C need to be merged together, then you would have to add a function to the signature:

     merge: C -> unit
The merge function makes the type C contravariant. There are other, better examples of contravarient methods, but I suppose this one isn't too bad. At least the first half of this example is pretty good, I think.

Ehud Lamm - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/29/2003; 1:41:49 PM (reads: 1269, responses: 0)
If I am not mistaken this examples appears in Mitchell and Plotkin's Abstract Types Have Existential Type.

william cook - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
6/29/2003; 8:30:20 PM (reads: 1256, responses: 0)
Yes, I did not remember that they used the same example. I completely agree with their treatment of ADTs. This work is not a suitable model for objects, however.

They do not mention the issue of merging different stack/queues together. Binary methods of this sort are one of the key issue that separates ADTs from Objects. Binary methods are used, for example, in implementing object-oriented numbers in Smalltalk.

Todd Millstein's work on modular extensibility in multiple dispatch languages (like Cecil) is the best recent work related to this problem. I think that multi-method dispatch is a hybrid that may unify the object and ADT paradigms. Millstein has a good contribution to supporting modular extensibility in the presence of binary methods.

Frank Atanassow - Re: Concepts, Techniques and Models of Computer Programming  blueArrow
7/2/2003; 10:04:48 AM (reads: 1237, responses: 0)
OK, William, I've had a careful look at your paper:

W. Cook. Object-oriented programming versus abstract data types. Proc. of the REX Workshop/School on the Foundations of Object-Oriented Languages (FOOL), LNCS 173, Springer-Verlag, 1990, pp. 151-178.

For the benefit of other readers (whom I urge, however, to also look at the paper itself), let me summarize the contents: it specifies a data abstraction for lists of integers and implements it in two ways, as an abstract data type (ADT) and as a set of classes (or procedural data abstraction, PDA, as you will). PDA's are, AFAICT, identical to the object calculus minus method update described in Abadi and Cardelli's Theory of Objects.

Based on the integer list example, the paper claims (p.2):

The difference between PDA and ADT concerns how they organize and protect the implementation of a data abstraction. The choice of organization has a tremendous effect on the extensibility of the implementation.

Well, I was going to write a long technical note remarking on the paper as I thought I had discovered a serious problem with the notion of PDA, but in the end I find that I'm unable to poke any holes in it, and I have too much work on my hands now to be more ambitious, so let me just adumbrate my thoughts. (I love that word, "adumbrate"...)

  1. I think the matrix argument and the observation that ADT's and PDA's organize things in different ways is on target. I think it holds even more exactly for algebraic datatypes and coalgebraic datatypes.

  2. My most serious complaint about PDA's will sound hollow to you: the abstraction relies on the fact that functions are opaque and, contrary to popular opinion, function equality is actually decidable in lambda-calculi when evaluation accounts for the eta-rule. See, for example,

    C. Barry Jay and Neil Ghani. The Virtues of Eta-expansion. JFP 95.

    No extant language actually implements such an evaluation scheme, but that is a defect of extant languages; PDA's seem to depend on this defect, which I think is an undesirable situation.

    OTOH, I think it is not sensible to ask a language supporting the sort of subtyping you want to admit the eta-laws since they say basically that there are no `hidden constructors' in a type. For example, the eta-law for function types says that every value is constructible via lambda-abstraction. One of the ideas of OO is precisely that the set of constructors is extensible and not fixed. This is a methodological issue; I prefer systems where both beta- and eta-laws hold, or are at least admissible, because I like the simplicity of equational reasoning.

  3. Your integer list ADT is not very realistic. You implemented it in such a way that every observation directly accesses the representation; few people would do so. In fact, because integer lists form an algebra, you only need one observation operation, namely the fold function; the defining property of an algebra says that every observation can be expressed as a fold, so fold is a `universal observation'. The rest, head, tail, null? and even equal can be defined as normal functions outside the abstraction. When changing the representation to handle intervals, then, you would only need to add one line to the body of fold, instead of changing the implementation of each and every operation in the scope of the ADT abstraction. Admittedly, there is a performance issue in the use of fold.

  4. Immediately after expounding the disadvantages of existential quantification, in the section on typing PDA's, you resort to (bounded) existential quantification for PDA's as well.

  5. I think I can implement every PDA as an ADT, in fact as an algebraic DT (which is just a special kind of ADT). For example, I implemented your list PDA as an algebraic datatype in Haskell. OTOH, I think it is not possible to implement every ADT as a PDA.

  6. I prefer algebraic DT's over ADT's. There is information in every algebraic DT which could be reasonably be exposed in the interface, namely that certain operations are constructors. This is what allows to decide exhaustiveness and disjointness in pattern-matching, and when the DT has initial semantics it is decidable from the repesentation when an operation is a constructor. This is information which is not decidable from the description of a PDA. (Conversely, if you replaced PDA's with coalgebraic datatypes one could reasonably expose information about which operations are observations.) In other words, I think PDA's are less declarative than algebraic DT's.

    OTOH, most ADT's, even ones whose constructor operations all have the right form to be an algebraic signature (all sharing a domain), don't have initial semantics. For example, the Interval ADT is not initial. And indeed sort of the whole point of existential abstraction is to be able to implement non-initial algebras (and coalgebras), so my argument that PDA's are not declarative here is not very strong. Actually, I can go on at length about this but haven't the time...

Let me now reply to some of your other posts:

How would it work if I wanted to add new values (representations) to the ADT, not just new operations?

I've done some inconclusive research on a related topic: how to write functions which are `polymorphic' over provably isomorphic representations of an algebraic datatype. No papers on this yet, sorry.

The paper says no page 8: "No binary methods (of the form X x X -> B + C x X) are allowed in the co-algebraic approach, since they lead to contravariant functors." He notes that contravariance is a well-known problem in typing objects -- but most typed object models at least handle them reasonably well.

Fair enough, but there are well-known ways to treat contravariance in the co-/algebraic world, for example:

Erik Meijer, Maarten Fokkinga and Ross Paterson. Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire. In Proc. FPCA '91.

I don't think Jacobs purports to address all the things one might want in an OOP language in his paper.

I understand that algebraic models are very pure, but are you such a purist that denotational models based on lambda-calculus are considered "syntactic rigamarole"?

Purity is not the point, and frankly I don't like that word because it sounds like a religious sentiment. The reason I like algebraic models is rather because they make proofs concise and relate languages to a very large, pre-existing body of literature, namely modern mathematics, which is, to my mind, the language of program specifications. I have no problem per se with establishing semantics by translation to lambda-calculus, but my experience with that is that authors do not bother to bring out the additional structure which is embodied in the image of the translation.

For example, I can describe algebraic datatypes in a functional language by translation to the pure impredicative polymorphic lambda-calculus (J-Y Girard showed this), but that does not tell me precisely what it is about imp. poly. LC that lets me model algebraic datatypes. The nice thing about algebraic and categorical models is that they make it very clear exactly what is needed for a model, and why. Do I need impredicativity, for example? It turns out not.

And, BTW, yes, I think there is a lot of syntactic rigamarole in LC, mostly in connection with the treatment of names.

My thesis, for example, gives a clean denotational semantics of objects, subtyping, inheritance, while proving it equivalent to the messy operational semantics based on dispatch.

I'll look at it.

They do not mention the issue of merging different stack/queues together. Binary methods of this sort are one of the key issue that separates ADTs from Objects.

How is that?