Lambda the Ultimate

inactiveTopic Towards the best collection API
started 11/18/2003; 5:01:30 AM - last post 11/23/2003; 11:46:52 PM
Ehud Lamm - Towards the best collection API  blueArrow
11/18/2003; 5:01:30 AM (reads: 13196, responses: 16)
Towards the best collection API
It is well know that given a cursor interface to a collection, we can implement an enumerator (aka for-each or fold). It is less appreciated that given an enumerator interface, we can always derive a cursor -- in an automatic way. We demonstrate that generic procedure for languages with and without first-class continuations.

Now that cursors and enumerators are inter-convertible, an implementor of a collection has a choice: which of the two interfaces to implement natively? We argue that he should offer an enumerator interface as the native one. The paper lists several reasons why enumerators are superior. We present a design of the overall optimal collection traversal interface, which is based on a left-fold-like combinator with premature termination. The design has been implemented and tested in practice.

Slides are also available.

Well worth your time.

The enumerator coll-fold-left described in this presentation has been chosen to be the primary traversal interface in Scheme Collections SRFI [SRFI-44]. A similar interface is being considered for an Oracle RDBMS binding in Haskell.


Posted to general by Ehud Lamm on 11/18/03; 5:04:41 AM

Avdi - Re: Towards the best collection API  blueArrow
11/18/2003; 7:54:34 AM (reads: 909, responses: 0)
I'm having a hard time understanding the distinction made between generators and enumerators. The document makes the claim that "Like cursors and streams, generators are demand-driven. A user must explicitly request a new value to advance the traversal". Icon, Python, and Ruby are given as examples of languages which support generators. I'm not very familiar with Python's generator implementation (and not at all with Icon's), but in Ruby the idiomatic way to iterate over each member of a collection 'c' is c.each {|x| ...}. In this case the method 'each' would seem to correspond to the "enumerator" in the paper's terminology, as it controls how many values are yielded and when the iteration terminates. If this is a "generator", I don't see how it differs from an enumerator. Can anyone enlighten me?

Matthew Maycock - Re: Towards the best collection API  blueArrow
11/18/2003; 9:54:55 AM (reads: 872, responses: 0)
Generators mentioned in ruby contexts can be found on comp.lang.ruby - here's a synthesis of a lot of ideas from there (and some of my own). I asked people there for any commentary (code style, maybe I'm an idiot, etc...) but all they did was refer me to the 1.8.x external iterator mainatainer - who didn't say much. Any commentary from the people around here would be nice.

(An error left from earlier unfinished update fixed..)

Alex Moffat - Re: Towards the best collection API  blueArrow
11/19/2003; 7:09:56 AM (reads: 665, responses: 0)
A sort of off topic comment. I've written scheme and elisp code in the past but I now write Java for a living. Seeing the sort of code presented in this paper always makes me wish I could write scheme again apart from one thing, ide support! With Java I'm currently using IDEA and whenever I have to write perl or scheme or any other code I really start to miss all of the features IDEA provides for, for example, autocompletion, refactoring etc. Some of these depend on knowing the types of the things being used and so might be difficult to implement in a scheme editor. So, I have a question, what's the current state of the art in development environments for scheme and, possibly in contrast, something like OCaml where there is more type info?

Anton van Straaten - Re: Towards the best collection API  blueArrow
11/20/2003; 11:40:36 PM (reads: 472, responses: 1)
Re IDEs for Scheme, have you looked at DrScheme? It has a good renaming capability, based on code analysis, as well as things like the ability to graphically show references to identifiers when you hover the mouse over them. (Use the Check Syntax button to get this feature.) Some of these features don't exist in commercial environments. However, it doesn't have autocompletion, or refactoring of the sort that IDEA and Eclipse have. IMO, the power of the Scheme language more than makes up for it - I doubt Eclipse would know what to do with my Scheme functors anyway.

Andris Birkmanis - Re: Towards the best collection API  blueArrow
11/21/2003; 12:37:55 AM (reads: 465, responses: 0)
I strongly believe that refactoring and autocompletion are based on static type information. While it is theoretically possible for IDE to infer (some imprecise euristic) type info even for dynamically typed languages, I doubt feasibility of this.

The rescue is coming - everyday we hear of more and more statically typed "schemes" :-)

Personally I believe that very soon some offspring of Scheme with static (inferred) types and tamed call/cc (cupto?) will be the next language of choice for the next hackers :-)

Luke Gorrie - Re: Towards the best collection API  blueArrow
11/21/2003; 7:55:08 PM (reads: 404, responses: 0)
Andris, can you describe why IDEs for dynamically typed languages would need to infer types at all for refactoring or completion? This hasn't been necessary (yet) in my experience.

I have mixed feelings about tools like IDEA. On the one hand, I see that IDEA does nice things and I can believe it's better for Java programming than Emacs. On the other hand, some aspects feel like a "high-tech solution to the wrong problem."

In particular, it seems like one of the biggest problems for people writing and refactoring Java is typing all the type declarations. IDEA seems to tackle this by writing a full type-inferencer within the editor, so that it can figure this out for you. That's clever and fancy, but doesn't it seem more appropriate to do this in the language/compiler than in the editor? This would seem to make refactoring Java considerably harder than (say) ML or Erlang as I far as I can see.

P.S., my favourite all-round development environment is Emacs Lisp in Emacs - really outstanding I reckon!

Isaac Gouy - Re: Towards the best collection API  blueArrow
11/21/2003; 11:38:54 PM (reads: 389, responses: 1)
I strongly believe that refactoring and autocompletion are based on static type information

ANNOUNCE: Smalltalk Refactoring Tool 1994-05-11

The paragraphs on Rename Method (Dynamic Refactoring) are to the point.

The current Refactoring Browser standard in VW includes these:

    Class-oriented Refactorings
  • Create a Subclass
  • Rename a Class and its References
  • Safely Remove a Class
  • Change a Class to a Sibling
  • Add a Variable
  • Rename a Variable and its References
  • Remove a Variable
  • Move a Variable to/from a Subclass
  • Create Variable Accessors
  • Make a Variable Abstract/Concrete
    Method-oriented Refactorings
  • Move a Definition to Another Component
  • Rename a Method and its References
  • Safely Remove a Method
  • Add a Parameter to a Method
  • Inline all Sends to Self
  • Move a Method to/from a Superclass
    Statement-oriented Refactorings
  • Extract a Method
  • Inline a Temporary Variable
  • Convert a Temporary to an Instance Variable
  • Remove a Parameter
  • Inline a Parameter
  • Rename a Temporary Variable
  • Move a Temporary to an Inner Scope
  • Extract to a Temporary
  • Inline a Message

Andris Birkmanis - Re: Towards the best collection API  blueArrow
11/22/2003; 2:32:35 AM (reads: 389, responses: 0)
I strongly believe that refactoring and autocompletion are based on static type information

Mea culpa, I should have said that knowing static type info enables more refactoring types :-)

But seriously, refactoring by definition must preserve functionality. This means preserving various invariants. And what is type info if not (incomplete) specification of invariants?

I do not say you absolutely need this type info in the language, it just helps. Otherwise your refactoring tool tries to come up with invariants all by itself.

Autocompletion is less influenced by presence of types, if you mean just selecting from a list of all tokens of an appropriate metatype visible in the workspace.

My working knowledge is limited to Java, lately developing on Eclipse for Eclipse. There are many places where the list of tokens is dramatically reduced giving greater usability. E.g., after a dot you see just names of members, in the scope of a class you see just members and classes, etc.

If your autocompletion tool filters names like this, it already infers static type info, doesn't it?

Waiting for a counter-blow ;-)

Isaac Gouy - Re: Towards the best collection API  blueArrow
11/22/2003; 9:13:47 AM (reads: 359, responses: 1)
Waiting for a counter-blow
When my working knowledge was limited to Smalltalk, I imagined that Refactoring was a Smalltalk thing. Then I found that William Opdyke's work had targeted C++ (and was developed in Common Lisp)! (There seems to have been some previous Smalltalk work by Edward Rak.)
Humility isn't required, but it helps ;-)

static type info enables more refactoring types
What additional refactoring types? Would they make sense in an untyped language?

refactoring by definition must preserve functionality
OTOH if we can define new rewrite rules for the refactoring tool then we can change functionality in powerful ways: Transformation of an Application Data Layer.

Andris Birkmanis - Re: Towards the best collection API  blueArrow
11/22/2003; 9:55:10 AM (reads: 363, responses: 0)
I am retreating :-) My current thesis is then:

Knowing more than AST of code (e.g., pre- and postconditions, variants/invariants, type info, etc.) makes code transformation tool more user-friendly (e.g., less error-prone, easier to use, more automated).

OTOH: Ignoring this knowledge by the tool effectively forces the user of the tool to manage this knowledge by her/himself (themselves for pairs).

Humility isn't required, but it helps ;-)

I could state stronger version, with "plain text" instead of "AST", but I am trying to be humble ;-)

Alex Moffat - Re: Towards the best collection API  blueArrow
11/22/2003; 10:10:01 AM (reads: 352, responses: 0)
One of the great advantages of autocompletion for me is just the reduction in typing. It lets me use longer and more meaningful names without having to type them fully each time I want to reference them. I think this is a better approach than that of arc, for instance, which seems to want to use short names to make writing code easier, I prefer longer names that make reading code easier and tools to help me with writing.

I agree that writing elisp in emacs is good, but not as nice as Java in IDEA. I'd really like to know what type of thing each variable is meant to hold (of course this requires more types in elisp but ...)

Patrick Logan - Re: Towards the best collection API  blueArrow
11/22/2003; 8:01:19 PM (reads: 332, responses: 1)
Knowing more than AST of code (e.g., pre- and postconditions, variants/invariants, type info, etc.) makes code transformation tool more user-friendly (e.g., less error-prone, easier to use, more automated).

I thought the Smalltalk RB did do some amount of type inference. Of course, it could do type inference, and given a "whole world" assumption that inference could be extensive as with the Stalin compiler for Scheme.

I am recalling a lot from several years ago based on an earlier implementation. As it is, I believe there are some changes that are performed by instrumenting code, catching them at run time, and handling them in some way. Maybe it does some patching at runtime, I am getting out on a limb here.

Neither does the Smalltalk RB force the programmer to keep more in her head. So I do not buy this argument: Ignoring this knowledge by the tool effectively forces the user of the tool to manage this knowledge.

That seems like speculation and does not match with my experience. Maybe I am unaware of what I am keeping in my head when using Smalltalk, but I can say at a minimum it is not a burden that makes me less productive so far, relative to any statically checked language.

But what it all comes down to for me is that the Smalltalk RB is roughly as expressive as those for Java, if not moreso. So while in theory an RB based on explicit type information *could* be more powerful than one based on implicit type information, in *practice* that delta does not yet appear to be significant.

Isaac Gouy - Re: Towards the best collection API  blueArrow
11/22/2003; 11:58:41 PM (reads: 330, responses: 0)
some patching at runtime
A Refactoring Tool for Smalltalk see the section named "Dynamic Analysis and Method Wrappers"

Andris Birkmanis - Re: Towards the best collection API  blueArrow
11/23/2003; 4:53:06 AM (reads: 318, responses: 0)
I need to apologize for my statements not being specific enough. I did not mean to start yet another holy war between static/dynamic types in programming languages. The idea, probably too vague even for myself, was:

1. Refactoring (and generally code transformation) is powered by knowledge about code. Whether this knowledge is encoded in the types of the language, or in IDE's metadata, or is inferred by the refactoring tool, or just sits in the developer's head - influences only "non-functional" aspects. 2. The same knowledge helps improving quality of autocompletion.

I am not against dynamically-typed languages, some of my favorite languages are dynamically-typed ;-)

Patrick Logan - Re: Towards the best collection API  blueArrow
11/23/2003; 9:18:06 AM (reads: 310, responses: 1)
I did not mean to start yet another holy war between static/dynamic types in programming languages.

Your intentions have nothing to do with it. You are merely a pawn in the force underlying the original density fluctuations of the universe!

Andris Birkmanis - Re: Towards the best collection API  blueArrow
11/23/2003; 11:46:52 PM (reads: 299, responses: 0)
Your intentions have nothing to do with it. You are merely a pawn in the force underlying the original density fluctuations of the universe!

I should consult the Oracle about my destiny and oneness ;-) Or was it the Microsoft :-D