Tim Bray: Dynamic-Language IDEs

Tim Bray writes about advanced IDE feature support for dynamic languages, a subject that was debated here at length a few times.

Lest this turn into the usual flamefest, I suggest concentrating this time on practical implementation approaches that can help with implementing dynamic language support in IDEs.

questions

Are there also easy refactorings, such as renaming classes or local variables, or moving methods?
If you rename a method, how about renaming all methods with the same name? Will that work for dynamic languages, or is that not even possible, or is it not what we need?
Can you make refactoring user-assisted or something?
What exactly is the difference that it is possible in smalltalk to do refactoring but why is it difficult in ruby

there also easy refactorings

See Refactoring Browser - Refactorings

"renaming all methods with the same name"
See Method Refactorings #2 Rename, in the Refactorings list. (Note: the problem is that you might want to rename method foo declared in class A but not method foo declared in class B.)

thanks for the link

So the exact kind of refactoring you get in Java is not possible in Smalltalk either. But maybe that kind of method rename fits the semantics of a dynamic language better: it can find method definitions that need renaming anyway because of how they are used, even when they are not "related" by inheritance.

What exactly is the

What exactly is the difference that it is possible in smalltalk to do refactoring but why is it difficult in ruby

Smalltalk refactoring browsers have the same sorts of problems. If you look at the documentation for a Smalltalk refactoring browser, the description of each refactoring needs to list the situations in which it won't work, or might not do what you expect. Contrast this to Java refactoring in say, Eclipse, where ignoring code which uses the reflection API, many refactorings can be 100% accurate, due to the available static type information.

For example, take renaming a class method. In Java, as long as no code invokes the method using the reflection API, it's possible for the compiler to positively identify every place in the code where that method of that class or a subclass is being used, and to ignore methods with the same name which belong to unrelated classes.

This is not possible in general in a dynamically typed language, so the Smalltalk refactoring browser can't be as fine-grained. Instead the "rename method" refactoring renames all methods with that name no matter what class they belong to, and in fact also renames all symbols it finds with the same name. If it tries to be more selective and identify only method invocations on a particular class, then it can no longer guarantee 100% accuracy.
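To make the ambiguity concrete, here is a minimal Ruby sketch. The class and method names (Account, Window, close, shut_down) are made up for illustration and come from no real code base. The point is only that the call site carries no information about which close it means.

```ruby
# Two unrelated classes that happen to share a method name (hypothetical names).
class Account
  def close
    :account_closed
  end
end

class Window
  def close
    :window_closed
  end
end

# Nothing in the source says which `close` this call refers to; the class of
# the receiver is only known once the program runs.
def shut_down(thing)
  thing.close
end

results = [Account.new, Window.new].map { |obj| shut_down(obj) }
p results
```

A tool asked to rename Account#close has to decide what to do with thing.close: rename it (and then Window#close too, to stay safe), or leave it and risk breaking the Account path.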

It's in the air

It seems many people have the same ideas at the same time. Using UTs to record all kinds of data is just a natural approach for dynamic languages. I've worked on a prototype implementation of this same idea using Python and EasyExtend (after finishing a code coverage tool as an EasyExtend demo). The goal is to create an ad hoc type system using runtime type tag information. This will yield an incomplete but far more accurate picture of the covered paths than global static inference on these languages will ever do. There are also methodological considerations for all kinds of postprocessing activities; refactoring is just one among many. So it might happen that a statically typed program is just an end product of prototyping with a dynamically typed language. It also addresses G. Bracha's concerns about "default typesystems" and his proposal for "pluggable" type systems that shall not alter the runtime semantics. An ad hoc type system is always created such that no succeeding test case fails after the types are fixed. Fixing types may lead to failure of new tests, but fixing types, liberating from static types for new tests to succeed, fine-tuning of type assertions etc. will all be part of the development process.

It's hard to say anything about the coverage properties of programs. So it might happen that rejecting a program within the type recovery process depends on coverage data, local type inference and optional declarations.
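For readers who want to see the general idea of recording runtime type tags during a test run, here is a rough sketch in Ruby using the standard TracePoint API. It is not the Python/EasyExtend prototype mentioned above, and the classes and the single "test" are hypothetical stand-ins.

```ruby
class Account
  def close; :account_closed; end
end

class Window
  def close; :window_closed; end
end

def shut_down(thing)
  thing.close
end

# Map each method name to the receiver classes actually seen at run time.
observed = Hash.new { |hash, key| hash[key] = [] }

trace = TracePoint.new(:call) do |tp|
  seen = observed[tp.method_id]
  receiver_class = tp.self.class
  seen << receiver_class unless seen.include?(receiver_class)
end

# Stand-in for a unit test run: only the Account path is exercised.
trace.enable do
  shut_down(Account.new)
end

p observed[:close]
```

The recorded picture is exactly as incomplete as the coverage: Window never shows up because no test sent close to a Window, so a tool relying on this data can only speak about the paths that were actually run.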

Why is it in the air?

"it seems many people have the same ideas at the same time"
Are there similarities between those people, are they Java programmers (experienced in refactoring with Eclipse and IntelliJ) who are now starting to look at other languages?

There's something funny about the suggestion of using type recovery from unit test runs to provide type information for refactoring - as we're basing this on unit tests, we could just do the method rename and then see which unit tests fail. (I've been told that's also the approach in the Java world when method names appear in configuration files unknown to the IDE.)
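A toy version of "rename and see what fails", assuming a known good state beforehand. Everything here (Report, header, footer, the two checks) is invented for illustration; the lambdas stand in for real unit tests.

```ruby
class Report
  def title
    "Q3"
  end
end

# Two call sites that a rename of Report#title would have to reach.
def header(report)
  "== #{report.title} =="
end

def footer(report)
  "end of #{report.title}"
end

# Stand-ins for unit tests; each returns true on success.
checks = {
  header_check: -> { header(Report.new) == "== Q3 ==" },
  footer_check: -> { footer(Report.new) == "end of Q3" }
}

# Names of the checks that fail or raise NoMethodError.
def failing_checks(checks)
  checks.select do |_name, check|
    begin
      !check.call
    rescue NoMethodError
      true
    end
  end.keys
end

before = failing_checks(checks) # known state before the change

# "Rename" title to heading, but update only one of the two call sites.
class Report
  alias_method :heading, :title
  remove_method :title
end

def header(report)
  "== #{report.heading} =="
end

after = failing_checks(checks)
p before
p after
```

The failing check points at the call site the rename missed, but only because a check happened to cover it; an uncovered call site would stay silently broken.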

iirc (doubtful) early versions of the Smalltalk Refactoring browser used method wrappers for type recovery, but afaik that approach was dropped and the other type recovery tools that do exist for Smalltalk don't seem to have been widely adopted.

Maybe we should look to a combination of type recovery and type inference - people are still making progress on type inference techniques (Demand-Driven Type Inference with Subgoal Pruning: Trading Precision for Scalability).

we could just do the method rename and then see which unit test

That kind of approach can be made better if you have a debugger where you can change code and continue, as in Squeak (and probably others). You could maybe even make the debugger smarter by letting it know about your refactorings, so that it automatically tries to resolve the problem and continue.

Refactoring in configuration files

I've been told that's also the approach in the Java world when method names appear in configuration files unknown to the IDE.

True, although there seem to be more and more moves to make IDEs aware of common configuration file structures, specifically to empower refactoring and search. For the last year or so, all of the configuration files I've used have had IDE adapters available, and refactoring and usage search have worked transparently.

have worked transparently

Excellent!

Cold code is dead

Well, I thought about UTs as the primary source for delivering runtime data in a disciplined manner. It's about UTs only because they sit well apart from the compile-fix-run-fix development cycle. They provide a definite step for making decisions (running tests is usually the last action before delivering software / closing development, whether TDD is used or not). Using UTs to stabilize the programs behaviour before applying refactoring moves is a yet established technique. So people like Michael Feathers, who work on legacy transformations and "putting systems under test", and now Tim Bray come up with similar ideas. Admittedly the racial or ethnic affiliation is that of Java programmers. I myself was just tired of at most mildly fruitful attempts at type inferring Python, and I have to admit that I consider 30-45% of exact type information from DDP after 20 years of research on Smalltalk a devastating result, not a strong indication of progress.

When Tim Bray says "but my interpreter is already running all the time", then why not use the information generated by running the interpreter? We don't really have strong pressure from limited memory resources. We can apply several filtering and mining techniques on any CPU of our contemporary n-core computers. With just a few UTs I get an average statement coverage of 70-85% on my samples. The corner cases might indeed be type inferred or inspected by automatically generated data and UTs, but this won't really change the fact that cold code is dead.

hundreds of thousands of lines

"I consider 30-45% of exact type information from DDP after 20 years of research on Smalltalk a devastating result, not a strong indication of progress."
That was "... it infers precise types for roughly 30% to 45% of the variables in a program with hundreds of thousands of lines" - do we have a % figure for "quite successful for programs with up to tens of thousands of lines of code" ?

Sorry but I don't understand what you meant by this: "Using UTs to stabilize the programs behaviour before applying refactoring moves is a yet established technique."

That was "... it infers

That was "... it infers precise types for roughly 30% to 45% of the variables in a program with hundreds of thousands of lines" - do we have a % figure for "quite successful for programs with up to tens of thousands of lines of code" ?

Yes, this was the passage I was referring to. I'm not aware of any such statistics as you are asking for, but I guess the authors know what they are talking about and have actually made progress, although its merits are quite relative.

Sorry but I don't understand what you meant by this: "Using UTs to stabilize the programs behaviour before applying refactoring moves is a yet established technique."

One among the many eXtreme programming practices is checking refactoring moves by unit tests. Without a test harness refactoring is not even recommended. Obviously TDD fits quite naturally into this scheme. Hence UTs are applied before code is being refactored and after. Only testing after refactoring makes no sense because a test failure cannot be attributed to refactoring in that case. Note that this approach is independent of tools and their particular capabilities and heuristics. If a programmer wants to refactor code he does it with or without particular tools and their strengths, weaknesses and accuracy guarantees.

assuming sense

It would be interesting to know if "quite successful" meant precise types for 95% of the variables in 40,000 LOC, or what...

"Only testing after refactoring makes no sense because a test failure cannot be attributed to refactoring in that case."
Which is why, when I wondered why we wouldn't simply "do the method rename and then see which unit tests fail", I was assuming we had a previous known state ;-)

The premise that dynamic

I'm suspicious of the premise that dynamic languages are harder to write IDEs for. I can see how it would look this way if you compare Java with Smalltalk/Ruby/Python because these languages are object-oriented and static type information can help to recover information from the incessant indirection of OO. However I don't see that you have any of these problems in the first place in a non-object-oriented dynamic language like Erlang. What's so hard about globally renaming a function when even grep can reliably find all the callers, reflection aside?

I'm glad that your complicated languages are providing you all with so much amusement. :-)

spot on, man! indeed IDEs

spot on, man!

indeed IDEs are a solution for a non-existent problem in the expressive languages realm. :)

Eh, I Dunno

IDEs remain helpful even in expressive languages like Haskell, even if only as a navigational mechanism that offers a better fit to human cognition in the face of a phase-space larger than that which we can normally carry around entirely in our heads.

yes, i agree quickly

yes, i agree quickly navigating and finding stuff is important for programming.

But really such features aren't characteristic of IDEs: any half-baked text editor should allow for quickly and easily navigating through text and text-buffers and finding chunks of text. The best ones do the job just fine.

IDEs add a lot of nice features like tight debugger integration, project management and syntactic completion, but text navigation and search isn't really one of them...

Namespaces and semantic searches

It is rare for large projects to be developed nowadays without some set of namespace partitioning. This may be done organically to the programming model (as in object orientation), as an add-on to the programming model (C++ namespaces, SML Modules), or through various ad-hoc solutions involving directory trees and whatnot. For that matter, in most languages procedures/methods/functions define their own small namespaces, for locally defined entities. No half-baked text editor is going to be able to correctly unravel all these namespaces, so they will usually give a lot of incorrect search results which the developer then has to sort through, at an enormous usability hit.

Similarly, what I'm usually looking for when I do a search isn't "show me all uses of variables named foo". What I'm searching for is "show me all assignments to this particular variable foo, organized by containing method". Again, this is beyond a text editor's capabilities.

If I were to pull a statistic out of thin air, I'd guesstimate that the labor savings from modern IDEs comes first from auto-complete, second from semantics-aware navigation, third from automatic background error-checking, with everything else trailing far behind.

well

"No half-baked text editor is going to be able to correctly unravel all these namespaces, so they will usually give a lot of incorrect search results which the developer then has to sort through"

Yes, but it's not half as bad as you describe.

The "principle of locality" in statically scoped, modular languages ensures that most of the time, the nearest variable or function declaration is also within near textual reach of the search function of said editors. So, just to stay with vim or emacs completion mechanisms as an example, hitting CTRL+P is likely to give you the correct completion as the first option anyway. Global module-scoped declarations are perhaps the second or third options in a CTRL+P completions "menu".

IDEs semantic completion is a great tool for quickly navigating through a module/class and discovering its functions/methods without ever going physically there in the file. Once you know what function/methods you'll be calling most times and the buffer is already sprinkled with lots of said calls, i'd say textual completion is far faster than hitting ctrl+space, waiting as the semantic completer does its job and generates a drop-down menu, then going through it and selecting the desired one among lots of other useless unrelated function/methods...

I do a lot of Delphi coding here and i'd gladly trade the lame editor for vim, were it not for the code-generation dependency on the graphical Form Designer. That's because business rules code itself is always the same method calls all over, and it would benefit from pure vim text editing.

Smalltalk or Lisp isn't expressive enough for you?

Smalltalk is pretty much defined by its environments, and Lispers use Slime or a commercial environment. The top plugins for Vim are invariably something that makes it more IDE-like. Eclipse is just a big plugin platform to mold into anything you want.

Note that Luke worked on SLIME, and even though Common Lisp does OO, methods/functions don't belong to classes, so it doesn't suffer from the completion problem that other dynamic OO languages do.

Have fun developing with notepad.

you know i don't dig

you know i don't dig notepad. ;)

i'm not in the mood for another flamefest... :P

No I don't know

indeed IDEs are a solution for a non-existent problem in the expressive languages realm. :)

I can write Java in Vi too. But what isn't expressive about Smalltalk? Or better yet, what's wrong with having software to do some grunt work for you, whether that's in Java, Lisp, Haskell, or Smalltalk?

nothing at all

There's nothing wrong with having code writing code for you. Heck, i get a kick out of Lisp macros!

The problem is when your language is so constricted that you resort to automatic code generation -- read: generating all those lame java-alike declarations -- a lot more than you actually write meaningful code by yourself. Because tons of generated code really get in the way.

The need for lots of code generation is actually a sign of weakness rather than a feature. And IDEs proliferate for handling these handicapped languages...

Not talking about code generation

So the brunt of your discontent with IDEs is code generation? Most IDE functionality has nothing to do with code generation.

The problem is when your language is so constricted that you resort to automatic code generation -- read: generating all those lame java-alike declarations -- a lot more than you actually write meaningful code by yourself. Because tons of generated code really get in the way.

Was it Pike who wrote about various levels of programmers, where the highest level were those programmers who wrote code to generate code? You can make the case that a language like Java makes code reading problematic because of a lack of density, but that has nothing to do with code generation in general.

The need for lots of code generation is actually a sign of weakness rather than a feature. And IDEs proliferate for handling these handicapped languages...

Does it make you feel better that you can run a Ruby file to generate a ton of Rails code instead of pressing a button in an IDE?

Even in Java I don't see that much code generation, except for gui code. Most of it just automates getters, setters, sets up a class skeleton that will fill in interface stubs and such. Most people have macros in regular editors to do that stuff (but not as well) anyway.

yes!

"Does it make you feel better that you can run a Ruby file to generate a ton of Rails code instead of pressing a button in an IDE?"

yes, exactly! Such code is generated during execution. The code the programmer deals with is clear and uncluttered.

Once I hit a button in an IDE I get tons of trash code together with meaningful code.

thanks for the perfect example.

Nope

yes, exactly! Such code is generated during execution. The code the programmer deals with is clear and uncluttered.

Actually, the last time I checked the initial scaffolding of a Rails app is not done at app runtime, and generates a ton of code (far more than what would be generated by most normal IDE code generations), which is separate from the runtime metaprogramming.

Once I hit a button in an IDE I get tons of trash code together with meaningful code.

What button is generating what trash code?

"the initial scaffolding of

"the initial scaffolding of a Rails app is not done at app runtime, and generates a ton of code"

I've been away from web programming for a while, so am not really used to frameworks like Rails and i can possibly be wrong. But are you talking about the scripts that generate stub files for you to fill in? If so, i don't really consider those trash code, as they are needed for the actual logic of the app. They are nice usage of code writing.

"What button is generating what trash code?"

the one generating all the annoying, pointless wrappers for getters, setters, GUI attributes, private static this, public final that etc...

So it's all about Java

I've been away from web programming for a while, so am not really used to frameworks like Rails and i can possibly be wrong. But are you talking about the scripts that generate stub files for you to fill in? If so, i don't really consider those trash code, as they are needed for the actual logic of the app. They are nice usage of code writing.

It's a bunch of boilerplate generating MVC stuff, and not really logic of the program.

the one generating all the annoying, pointless wrappers for getters, setters, GUI attributes, private static this, public final that etc...

You don't have a problem with logic code generation, but have a problem with something that would have to be written anyway, like getters and setters? It seems you would have less of a problem with boilerplate stuff, than something guessing logic for you.

So you don't like the verbosity of a language like Java. That doesn't take away the utility of something like SLIME. "Expressive" language programmers get a lot of benefit from IDEs too. Smalltalk is basically all about the environments.

So it's a non-existent problem in Java too, as with any language that you can code in a text editor. It's just that people would rather let a tool do some of the lookups and grunt work than do it manually.

But I'll agree with you (if I'm getting your point) that Java without the IDEs is a lot less appealing, but if the tool can make up for some of its perceived weaknesses, then so be it.

I just don't throw out the baby with the bathwater though.

yes

"a bunch of boilerplate generating MVC stuff, and not really logic of the program."

from what i remember from toying around with Rails, it's MVC, yes, except it mostly generates stubs for the controller and model parts, rather than view. It certainly has a lot to do with the logic of the program.

"have a problem with something that would have to be written anyway, like getters and setters?"

why does it need to be written anyway?

"Smalltalk is basically all about the environments."

it's because it's historically tied to GUI development.

"It's just that people would rather let a tool do some of the lookups, grunt-work, rather than manually doing it."

fine. I'm not against IDEs, just think expressive enough languages don't need them as much as lesser ones. I personally don't enjoy bloat and 120% more features i would ever need, but that's just me.

from what i remember from

from what i remember from toying around with Rails, it's MVC, yes, except it mostly generates stubs for the controller and model parts, rather than view. It certainly has a lot to do with the logic of the program.

Not really. The big selling point of Rails is that all the scaffolding is laid out for you up front when you run the app script. It's "opinionated" software (kind of like an IDE is) and lays things out for you so you don't have to think about it. Any logic generation would be done via the metaprogramming in the model.

"have a problem with something that would have to be written anyway, like getters and setters?"

why does it need to be written anyway?

Encapsulation

"Smalltalk is basically all about the environments."

it's because it's historically tied to GUI development.

It's much more than that. The whole thing about Smalltalk is that it's image-based, so you have this running environment to be tweaked via browsers and inspectors.

fine. I'm not against IDEs, just think expressive enough languages don't need them as much as lesser ones. I personally don't enjoy bloat and 120% more features i would ever need, but that's just me.

Emacs was known as, what, "Eight Megs And Constantly Swapping" at one time. I guess Eclipse will be considered light-weight down the road.

clarification


why does it need to be written anyway?

Encapsulation

You didn't get it. It could be just like in Ruby:


class Foo
  attr_accessor :bar, :boz
end

f = Foo.new
f.bar = "foobar"

there. you don't get lots of trash wrapper lines for default get/set behaviour. And yet, it's fully encapsulated.
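For anyone who doesn't read Ruby, a small sketch of what is being saved here. FooByHand is a made-up name for the spelled-out version of one attribute; Foo is the class from the example above.

```ruby
# Handwritten equivalent of `attr_accessor :bar` (hypothetical class name).
class FooByHand
  def bar
    @bar
  end

  def bar=(value)
    @bar = value
  end
end

# The one-line form from the example above.
class Foo
  attr_accessor :bar, :boz
end

by_hand = FooByHand.new
by_hand.bar = "foobar"

generated = Foo.new
generated.bar = "foobar"

p by_hand.bar == generated.bar
p Foo.public_instance_methods(false).sort
```

Both objects are used the same way from the outside; the difference is only in how many lines the class author has to look at.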

The whole thing about Smalltalk is that they're image-based, so you have this running environment to be tweaked via browser and inspectors.

i guess that's an implementation detail that really doesn't have anything to do with the classy language itself.

Emacs was known as, what, "Eight Megs And Constantly Swapping" at one time. I guess Eclipse will be considered light-weight down the road.

but then, it won't be able to handle the hyper-complex and convoluted commercial languages and tools of the day. Or perhaps AI will take care of programming and won't need IDEs anyway... :)

Encapsulation

Encapsulation

You didn't get it. It could be just like in Ruby:

class Foo
  attr_accessor :bar, :boz
end

I obviously get that Ruby is not Java.

The whole thing about Smalltalk is that they're image-based, so you have this running environment to be tweaked via browser and inspectors.

i guess that's an implementation detail that really doesn't have anything to do with the classy language itself.

I guess you consider Smalltalk an expressive language, since you called it a "classy" language. Smalltalkers aren't going to agree with indeed IDEs are a solution for a non-existent problem in the expressive languages realm

Emacs was known as, what, "Eight Megs And Constantly Swapping" at one time. I guess Eclipse will be considered light-weight down the road.

but then, it won't be able to handle the hyper-complex and convoluted commercial languages and tools of the day. Or perhaps AI will take care of programming and won't need IDEs anyway... :)

Hmmm, ok Kreskin, what info are you privy to regarding hyper-complex and convoluted future languages that the rest of us aren't?

But seriously, I could complain about the bloat of modern Gnome and KDE for my poor little Thinkpad 380ed that lives under the bed, but it would be silly.

"I obviously get that Ruby

"I obviously get that Ruby is not Java."

yes, but that's not the point: the point is that more expressive languages reduce the need for IDE code-generation "assistance" by expressing the same abstractions in a far less convoluted way, like the example of default encapsulation in ruby shows.

"Smalltalkers aren't going to agree with indeed IDEs are a solution for a non-existent problem in the expressive languages realm"

like i said, st is firmly historically tied to GUIs. and IDEs are the shell of the GUI guy.

"Hmmm, ok Kreskin, what info are you privy to regarding hyper-complex and convoluted future languages that the rest of us aren't?"

I heard VB will bring monads and functional programming to the masses. it's a preview of insane things to come: like structured assembly or imperative haskell... i'm warning you! ;P

"it would be silly."

i agree: put some Fluxbox love into your thinkpad...

"I obviously get that Ruby

"I obviously get that Ruby is not Java."

yes, but that's not the point: the point is that more expressive languages reduce the need for IDE code-generation "assistance" by expressing the same abstractions in a far less convoluted way, like the example of default encapsulation in ruby shows.

Java doesn't need IDE code-generation any more than "expressive" languages. "Expressive" languages don't need hippy completion or a little elisp tied to a keyboard-combination to generate code in Emacs either.

"Smalltalkers aren't going to agree with indeed IDEs are a solution for a non-existent problem in the expressive languages realm"

like i said, st is firmly historically tied to GUIs.

Like I said, Smalltalkers find their IDEs a productive tool, and that has nothing to do with producing GUIs.

and IDEs are the shell of the GUI guy.

That's, for obvious reason, wrong. Most code written in IDEs has nothing to do with GUIs or being a "GUI" guy.

"Hmmm, ok Kreskin, what info are you privy to regarding hyper-complex and convoluted future languages that the rest of us aren't?"

I heard VB will bring monads and functional programming to the masses. it's a preview of insane things to come: like structured assembly or imperative haskell... i'm warning you! ;P

VB 9.0 is getting closer than you think. http://lambda-the-ultimate.org/node/967.

"it would be silly."

i agree: put some Fluxbox love into your thinkpad...

Already on there, but I'm waiting for a nice Smalltalk-like environment for Ruby on my development machines.:)

indeed

"Java doesn't need IDE code-generation any more than 'expressive' languages."

It sure does. Unless you want to handle the stupid encapsulation grunt work by hand, rather than having it a clean and smart language feature to keep coding dense.

"'Expressive' languages don't need hippy completion or a little elisp tied to a keyboard-combination to generate code in Emacs either."

They don't, if you're not feeling cheap. They can go entirely on reusable textual macros and notepad (or at least a parenthesis-matching aware editor) rather than typing the same key combos every time, if they need...

"has nothing to do with producing GUIs... Most code written in IDEs has nothing to do with GUIs or being a 'GUI' guy."

I didn't say it's for producing GUIs.

"I'm waiting for a nice Smalltalk-like environment for Ruby on my development machines.:)"

I said things are getting weird... :P

"Java doesn't need IDE

"Java doesn't need IDE code-generation any more than 'expressive' languages."

It sure does. Unless you want to handle the stupid encapsulation grunt work by hand, rather than having it a clean and smart language feature to keep coding dense.

Maybe you haven't heard, but there weren't IDEs when Java first came out. You have to write code at some point, even with "clean and smart " language features.

"'Expressive' languages don't need hippy completion or a little elisp tied to a keyboard-combination to generate code in Emacs either."

They don't, if you're not feeling cheap. They can go entirely on reusable textual macros and notepad (or at least a parenthesis-matching aware editor) rather than typing the same key combos every time, if they need...

Hmm, textual macro...what do you think inserting getters and setters is?

"has nothing to do with producing GUIs... Most code written in IDEs has nothing to do with GUIs or being a 'GUI' guy."

I didn't say it's for producing GUIs.

What's a GUI guy then?

"I'm waiting for a nice Smalltalk-like environment for Ruby on my development machines.:)"

I said things are getting weird... :P

An image-based environment would be a stretch, but I'm surprised you haven't heard of the Smalltalk-Ruby connection. But for the sake of baby steps, we'll wait for Ruby to get a real VM first (YARV).

it's a farewell

"there weren't IDEs when Java first came out."

yeah, back when it was called Oak and was meant as a little embeddable language and there was none of these huge multi-frameworks and app servers... good times...

"Hmm, textual macro...what do you think inserting getters and setters is?"

a macro, say, like (attr_accessor attr1 attr2 ...) that generates get/setters in one go, just before execution, without cluttering the source code with such nonsense, is not the same as an IDE actually textually filling the source with thousands of lines of such redundant stuff.

"What's a GUI guy then?"

a guy for which functionality either exists as a button or not.

"we'll wait for Ruby to get a real VM first (YARV)."

ruby runs on java. there's your VM for VM lovers.

well, that's it for today! more Lopez/Malafaia duels some time soon, folks!

See ya

"there weren't IDEs when Java first came out."

yeah, back when it was called Oak and was meant as a little embeddable language and there was none of these huge multi-frameworks and app servers... good times...

You mean before Java was Java? In case you haven't noticed, these huge multi-frameworks and app servers aren't part of the Java language. But back to the point, when Java was Java and not Oak and actually released there weren't IDEs.

"Hmm, textual macro...what do you think inserting getters and setters is?"

a macro, say, like (attr_accessor attr1 attr2 ...) that generates get/setters in one go, just before execution, without cluttering the source code with such nonsense, is not the same as an IDE actually textually filling the source with thousands of lines of such redundant stuff.

Last time I checked Ruby didn't have macros. Of course, any decent editor is going to have code folding, and none of this has anything to do with the utility of an IDE for a language like Haskell or Smalltalk or any other geek language.

"we'll wait for Ruby to get a real VM first (YARV)."

ruby runs on java. there's your VM for VM lovers.

And there's ruby now for AST-interpreted lovers. Is it even possible to not ship Ruby source to the end user?

well, that's it for today! more Lopez/Malafaia duels some time soon, folks!

As a side note, using an IDE doesn't lead down an irreversible, slippery slope of transformation to a VB programmer. Of course, now with VB9 around the corner, I guess the VB programmer insult is now the Java programmer insult:)

so long!

"these huge multi-frameworks and app servers aren't part of the Java language."

great! it makes it much more easy to drop the bloated IDE since you'll be coding just with the standard library.

"when Java was Java and not Oak and actually released there weren't IDEs."

Java the language got a lot more convoluted since then, about as much as C++: checked exceptions, generics, anonymous classes, metaprogramming annotations... yes, some of them were thought to be syntactic sugar, but in the end they brought more syntax to the table, more keywords, more contextual behaviour...

and they got huge frameworks, which, while not a part of the language, are a part of most java programmers' everyday jobs.

more importantly, though, java got a lot of so-so developers who need an IDE to do anything at all.

"Last time I checked Ruby didn't have macros."

it doesn't, it was an example in scheme borrowing from the previous example in ruby, which, while not a macro, is a special class method which automatically generates the get/setters at run time. Thought you'd realize that.
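For anyone who hasn't seen the Ruby idiom, here is a minimal sketch of the same idea in Python rather than Ruby or Scheme. The decorator name `attr_accessor`, the `get_`/`set_` naming scheme, and the `Book` class are all made up for illustration; the point is only that the accessors are generated when the class is created and never appear in the source text.

```python
def attr_accessor(*names):
    """Class decorator that adds get_<name>/set_<name> methods when the class is created."""
    def decorate(cls):
        for name in names:
            # Bind `name` through a default argument so each closure keeps its own value.
            def getter(self, _name=name):
                return getattr(self, "_" + _name, None)

            def setter(self, value, _name=name):
                setattr(self, "_" + _name, value)

            setattr(cls, "get_" + name, getter)
            setattr(cls, "set_" + name, setter)
        return cls
    return decorate


# Hypothetical example class: no accessor is written out by hand.
@attr_accessor("title", "author")
class Book:
    pass


book = Book()
book.set_title("SICP")
```

Note that an IDE looking only at the text of `Book` sees no `get_title` at all, which is exactly the completion problem being argued about in this thread.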

"any decent editor is going to have code folding"

yes, when a language is at fault, let's rely on support tools...

"the utility of an IDE for a language like Haskell or Smalltalk or any other geek language."

what? to make dense expressions even denser? identifiers in such languages are usually so short that even completion isn't all that meaningful...

"Is it even possible to not ship Ruby source to the end user?"

thankfully, not! let's spread some free software love, man. ;)

Bye

"these huge multi-frameworks and app servers aren't part of the Java language."

great! it makes it much more easy to drop the bloated IDE since you'll be coding just with the standard library.

Heh, so you didn't know? But of course since you're not programming with a framework or app-server(uhmm?), we can get back to programming with notepad. Yeah, that's the ticket.

"Last time I checked Ruby didn't have macros."

it doesn't, it was an example in scheme borrowing from the previous example in ruby, which, while not a macro, is a special class method which automatically generates the get/setters at run time. Thought you'd realize that.

Oh, I did realize that, but since you didn't mention scheme and I wasn't reading your mind...

"any decent editor is going to have code folding"

yes, when a language is at fault, let's rely on support tools..

Yeah, let's get rid of code folding in that bloated Vim and maybe flog whomever did or is planning on implementing it in Emacs.

"the utility of an IDE for a language like Haskell or Smalltalk or any other geek language."

what? to make dense expressions even denser? identifiers in such languages are usually so short that even completion isn't all that meaningful..

Notepad is your perfect tool

"Is it even possible to not ship Ruby source to the end user?"

thankfully, not! let's spread some free software love, man. ;)

I'll pass on your kool-aid offer.

I personally don't enjoy

I personally don't enjoy bloat and 120% more features i would ever need, but that's just me.

120% more features than you would ever need isn't equivalent to 120% more features than everybody would ever need. You, Dave, and I may each only use 40% of IDE X's features, but the 40% you use isn't the same 40% that Dave uses, which is different again from the 40% that I use (they may overlap, but they are not the same). Just because an IDE has 120% more features than you would ever need doesn't mean that it's bloated.

Sadly, no

Just because an IDE has 120% more features than you would ever need doesn't mean that it's bloated.

It might be a prettier world if this statement were true, or at least an easier one for software developers, but sadly it's dead false. "Bloated" is a subjective judgement by a customer, and as such is Always Right. If a piece of software has so many features that it gets in the way of the user, or lacks responsiveness in the user's core workflow, then it's bloated, and the customer is perfectly justified in chucking it, no matter how many other customers need the extra features or are happy with the responsiveness.

All is not lost, however. There are a variety of interaction design techniques available which will allow more features to be crammed into an IDE without triggering feelings of bloat. It's not easy, but it can be done. Sadly, many commercial IDEs (and all the free ones I'm aware of) haven't used these techniques, or have used them only spottily. The situation is improving, but I'm not yet ready to tell the VIM users that they are always wrong.

Bloat

All is not lost, however. There are a variety of interaction design techniques available which will allow more features to be crammed into an IDE without triggering feelings of bloat.

Plugins?

external processes running

external processes running in an "inferior mode"?

where...

Where, in the word "plugin," do you get all that? What you've described depends entirely upon the design of the program, which hardly seems relevant when the discussion isn't anywhere near that low level.

actually, it's another

actually, it's another option to plugins, like emacs does with its inferior modes...

ah

Ah, I see. Nevermind :-)

Yes, inferior mode

Notepad?

No, of course not.

Plugins are a deployment technology, and don't have anything to do with interaction design. I was talking about stuff like mode-linked menu options, inline analyses and code-assists, obsessive defaulting, and auto-configuration for projects. It's a lot of work to make features ignorable, but it's well worth the effort.

Interesting Point

But I think the point really is about the difficulty of accurately performing completion in the face of various forms of overloading that you typically find in OO languages, vs. not having overloading in a functional language like Erlang. So I suspect that a good way to explore that would be to contrast, e.g. a C++ or Java IDE with, e.g. a Standard ML or O'Caml IDE. That is, I think you can remove the qualifier "dynamic" from "non-object-oriented dynamic language like Erlang" and your point still stands—it has nothing to do with being static or dynamic.

Can dynamic programs be 'executed' in the context of an IDE?

Could it be possible to 'execute' dynamic programs as they are written in order to get the possible underlying types of the variables? I am not talking about full execution, but smart execution that ignores the side effects and only computes the types of operations.

Input dependencies

It doesn't take much before the types of your data depend on the value of inputs read from some external source. Most programs nowadays read from a file or network. One of the pleasures of dynamically typed languages is that you don't need to specify wire formats and build parsers for this sort of thing. That means that your partial execution idea would run aground fairly quickly.

No

It's impossible to execute code maintaining its control flow but ignoring side effects, because the control flow may depend on side effects, e.g. may result from GUI interaction: the paths of code depend on user input.

Could input be faked?

Could input be faked? That is, the tool selects a possible value that is found to affect the control flow of the code; for example, if the code contains an 'if x > 3', then 3 is a possible value to select for computation. At least some type information could be retrieved in that manner. Of course this looks like dependent type system analysis...

Or rather, abstract

Or rather, abstract interpretation.
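To make that concrete, here is a toy sketch in Python of what abstract interpretation over types might look like. It is an illustration under heavy assumptions, not how any real IDE works: it handles only constants, names, assignments, and `if`, and the function names `infer_expr` and `run_block` and the call `read_input()` in the sample are invented. Instead of picking a fake input, it follows both branches of every `if` and joins the sets of possible types.

```python
import ast


def infer_expr(node, env):
    """Return the set of type names an expression may evaluate to."""
    if isinstance(node, ast.Constant):
        return {type(node.value).__name__}
    if isinstance(node, ast.Name):
        return env.get(node.id, {"unknown"})
    # Calls, operators, and so on are not modelled in this sketch.
    return {"unknown"}


def run_block(stmts, env):
    """Abstractly execute statements and return the resulting type environment."""
    for stmt in stmts:
        if isinstance(stmt, ast.Assign) and isinstance(stmt.targets[0], ast.Name):
            env = dict(env)  # copy so the other branch of an `if` is unaffected
            env[stmt.targets[0].id] = infer_expr(stmt.value, env)
        elif isinstance(stmt, ast.If):
            # The condition's value is unknown, so follow both branches and join.
            then_env = run_block(stmt.body, env)
            else_env = run_block(stmt.orelse, env)
            env = {
                name: then_env.get(name, {"unbound"}) | else_env.get(name, {"unbound"})
                for name in set(then_env) | set(else_env)
            }
    return env


SOURCE = """
x = read_input()
if x > 3:
    y = 1
else:
    y = "small"
z = y
"""

types = run_block(ast.parse(SOURCE).body, {})
```

Here `types["y"]` comes out as the set containing `int` and `str`, while `x` stays `unknown` because its type depends on external input, which is the objection raised above.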

You guys think this is a

You guys think this is a "problem" with "dynamic languages" or with implicit typing? Would it be just as hard to get IDEs to handle Haskell or OCaml code?

I mean, the IDE would have to come with its very own type inferencer to provide meaningful completions for function arguments, for example... and that's no easy job.

If it works, it's easy

Compared to a lot of stuff going on under the hood of a modern IDE, Hindley-Milner type inferencing is child's play, if the program is well-typed. Where things get tough is when portions of your program can't be typed. This affects anything which references those portions, propagating outward without any explicit type declaration or typeable code to block the propagation. If you're not careful, you'll end up with a bunch of incomprehensible error messages from a single typo.

Other than that, there really aren't any enormous challenges to IDEs for implicitly typed languages, other than money. A modern IDE is a large piece of work, and there just aren't any implicitly typed languages with a large enough community to justify the effort (yet). There is a JetBrains project being launched to add Scala support to IntelliJ IDEA, but that's probably a ways out.

Reminder

A good approach to incremental type inference is to use a type system that supports principal typings, as opposed to principal types. I wrote about this briefly here, but you should search the site for other references. It's also worth noting that it's generally accepted that the Milner-Damas system doesn't have principal typings, but that there's one paper out there that claims that it does, once you have the "right" definition of principal typings. It posits such a definition, non-mainstream, but that at least has the property of typing everything that Milner-Damas does. I personally would love to see more effort in this direction, because I think that even very experienced, intelligent, well-intentioned people often mistakenly attribute qualities to some vector, whether it's using-IDE vs. not-using-IDE, statically-typed vs. dynamically typed, OO vs. non-OO, that are really accidents of the current state of the art and road-taken vs. road-not-taken choices.

ML has Principal Typings

The paper is ML has Principal Typings. If I'm reading it correctly, it's the Damas-Milner type system the paper replaces, not the definition of principal typings. They formulate a different type system which accepts exactly the same programs as Damas-Milner and assigns equivalent types, but structure their judgements differently so that it has principal typings. I haven't found any other papers or understood this system well enough to suggest how well their approach might scale to more complicated type systems.

For a simpler technique, following the Haskell fashion of giving each top-level function a type annotation seriously limits the spread of type-errors.

Righto

That's the paper, all right. And from it, we find:

The definition of whether a typing for a given expression is principal or not should be based on a partial order on typings, which in turn should be based on both an ordering of types and on an ordering of typing contexts. The definitions given in this paper do exactly that. Somewhat unfortunately, our definition of principal typing turns out to be original. But its intent is the same as in other approaches, namely, to capture the simple idea of representing the set of all typings that can be obtained in derivations for a given expression in a given type system.

My point wasn't that I think their definition of "principal typings" is bad; it was merely that it's different from other definitions in the literature, and that there hasn't been enough additional research to suggest (to me, at least) whether those differences are necessary for any reason other than to support the claim that ML has principal typings or not.

Definitions

The paper The Essence of Principal Typings is two years more recent, and claims to offer the first system-independent definition of "principal typing". Given this, the note in "ML has Principal Typings" that their definition turns out to be slightly original is unsurprising. I haven't worked out whether their definition is an instance of the general definition, but at least (x x) has the typing (b, {x:a->b, x:a}), which seems to be principal under the general definition. The proof that HM lacks principal typings shows a descending chain of typings for (x x), with contexts {x : forall a . T_i}, where T_0 = a->a and T_i+1 = T_i -> T_i. All these contexts are less informative than {x : a->b, x : a}, which doesn't permit a term to contain the application (x id).

The problem

The problem, as Luke implied, is that with dynamically typed object-oriented languages you could have something like this:

class Foo:
    def baz(self):
        return "Called Foo.baz"

    def f(self):
        return "Called Foo.f"


class Bar:
    def baz(self):
        return "Called Bar.baz"

    def g(self):
        return "Called Bar.g"


def callbaz(obj):
    print(obj.baz())

The type of obj cannot be determined statically. So, how can the IDE know whether to give f or g as an auto complete option on obj? Obviously the situation could get much more complex as the number of classes with common method names grows.

Because Haskell and OCaml are statically typed, all types have to be determined statically (unlike in the example above where the type of obj isn't determined until runtime). As I understand it, this means if there is an ambiguity that cannot be resolved statically the type inference algorithm can't resolve the type and static type annotations will be required.

not quite

"Haskell and OCaml are statically typed, all types have to be determined statically (unlike in the example above where the type of obj isn't determined until runtime)"

they are statically typed... but just as long as you've already typed the body of the function!

Haskell/OCaml implicit typing of function arguments works from the inside out: once you've applied the operators to the arguments in the body, their types are known. It won't help you with autocomplete options, though.

am i right?

*edit, just to clarify*
It won't help you with autocomplete options while writing the function body, ok?

Correct

Specifically, you'd have to find a way to do Milner-Damas type inferencing incrementally. I just posted elsewhere in the thread about this.

thanks. sounds interesting.

thanks. sounds interesting.

Good point.

I guess I didn't catch that one. Obviously you could annotate all your types. Type inference isn't the biggest selling point of ML and Haskell type systems, so you don't lose too much if you just choose to annotate. There are still lots of things you can do in an IDE with ML/Haskell style type systems that are, in general, difficult or even impossible to do with a dynamically typed object-oriented language because of the indirection offered by such languages (method renaming for example).

Anybody have other ideas?

It won't help you until

It won't help you until there's enough body to work with - but that doesn't have to be much, especially if you allow annotations on patterns (in which case explicit types for the parameters can be given).

But that's not necessarily so big a loss - how many bodiless identifiers do we typically have in scope at once anyway? We already know we're only worried about identifiers from within the current module.

The direction of inference

The direction of inference is completely an artifact of the inference algorithm. If your top-level definition has a type signature (usually the first thing a Haskell programmer writes), then there is information to propagate in from the outside. GHC will begin typing such a function in a mode where it pushes the expected type inwards (see Boxy types: type inference for higher-rank types and impredicativity).

Inference algorithms that explicitly maintain constraint sets, or the graph rewriting approach of Concoqtion, offer even more freedom in the order types are resolved. In these frameworks it should be easy to make an incomplete inference problem from an incomplete program, and resolve the types as much as possible. Partially known types would still be useful for narrowing down search results if autocomplete is using some sort of unification-based search like Hoogle, which seems necessary anyway if autocomplete is supposed to suggest polymorphic functions.
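As a rough illustration of narrowing completions with a partially known type, here is a small first-order unifier in Python. Everything in it is an assumption made for the sketch: the type encoding (lowercase strings are type variables, capitalised strings are concrete types, tuples are constructor applications), the tiny `LIBRARY` of signatures, and the function names. It is not how Hoogle or GHC are implemented.

```python
def is_var(t):
    """Type variables are lowercase strings; concrete types start with a capital."""
    return isinstance(t, str) and t[:1].islower()


def resolve(t, subst):
    """Follow variable bindings until reaching an unbound variable or a non-variable."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t


def occurs(var, t, subst):
    """True if `var` appears inside `t`, which would make the binding circular."""
    t = resolve(t, subst)
    if t == var:
        return True
    return isinstance(t, tuple) and any(occurs(var, arg, subst) for arg in t[1:])


def unify(a, b, subst):
    """Return an extended substitution, or None if a and b cannot be unified."""
    a, b = resolve(a, subst), resolve(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return None if occurs(a, b, subst) else {**subst, a: b}
    if is_var(b):
        return unify(b, a, subst)
    if isinstance(a, tuple) and isinstance(b, tuple) and a[0] == b[0] and len(a) == len(b):
        for x, y in zip(a[1:], b[1:]):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None


def freshen(t, suffix):
    """Rename the variables of a library signature so they cannot clash with the query."""
    if is_var(t):
        return t + suffix
    if isinstance(t, tuple):
        return (t[0],) + tuple(freshen(arg, suffix) for arg in t[1:])
    return t


# Hypothetical library of signatures; ("->", A, B) stands for the function type A -> B.
LIBRARY = {
    "length": ("->", ("List", "a"), "Int"),
    "reverse": ("->", ("List", "a"), ("List", "a")),
    "show": ("->", "a", "String"),
    "not": ("->", "Bool", "Bool"),
    "map": ("->", ("->", "a", "b"), ("->", ("List", "a"), ("List", "b"))),
}


def complete(partial_type):
    """Names whose signature unifies with a partially known type."""
    return sorted(name for name, sig in LIBRARY.items()
                  if unify(freshen(sig, "_lib"), partial_type, {}) is not None)


takes_int_list = complete(("->", ("List", "Int"), "hole"))  # argument known, result unknown
returns_int = complete(("->", "hole", "Int"))               # result known, argument unknown
```

With this toy library, knowing only that the argument is a list of Int narrows the suggestions to `length`, `reverse`, and `show`, and knowing only that the result is Int narrows them to `length`.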

An IDE probably would need to offer (function argument) and argument OP function application order, depending on what you know and what you want more help completing.

The type of obj cannot be

The type of obj cannot be determined statically. So, how can the IDE know whether to give f or g as an auto complete option on obj? Obviously the situation could get much more complex as the number of classes with common method names grows.

As a workaround, I first use a class (or module) identifier for autocompletion instead of the parameter name, and finally replace it with the identifier that should be used. Using a type declaration would of course have the same effect on an IDE, but it can be much slower to track a variable. Last time I used Eclipse I didn't find the editor particularly responsive. Talking about IntelliSense together with C++ and VS 2005 for bigger than small projects would lead to just another internet rant. So I stop my remarks here.

The solution, of course, is

The solution, of course, is to give both. They're both possibly correct, so notify the user that the method might not exist and let him make the decision. Just to clarify: by notify, I mean in an unobtrusive manner; popping up a dialog box every time that happened would be a boneheaded way to go about it.
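A minimal sketch of "give both" in Python, using the standard `ast` module on the Foo/Bar example from earlier in the thread. The helper names and the certain/uncertain flag are my own invention, and the analysis is deliberately naive: any class that defines every method already called on the parameter counts as a candidate.

```python
import ast

SOURCE = '''
class Foo:
    def baz(self): return "Called Foo.baz"
    def f(self): return "Called Foo.f"

class Bar:
    def baz(self): return "Called Bar.baz"
    def g(self): return "Called Bar.g"

def callbaz(obj):
    print(obj.baz())
'''


def class_methods(tree):
    """Map each top-level class name to the set of method names it defines."""
    return {
        node.name: {item.name for item in node.body if isinstance(item, ast.FunctionDef)}
        for node in tree.body if isinstance(node, ast.ClassDef)
    }


def attributes_used(func, param):
    """Collect attribute names accessed on `param` inside the function body."""
    return {
        node.attr for node in ast.walk(func)
        if isinstance(node, ast.Attribute)
        and isinstance(node.value, ast.Name) and node.value.id == param
    }


def completions(source, func_name, param):
    """Map each suggested method to (classes offering it, whether all candidates offer it)."""
    tree = ast.parse(source)
    classes = class_methods(tree)
    func = next(n for n in tree.body
                if isinstance(n, ast.FunctionDef) and n.name == func_name)
    used = attributes_used(func, param)
    # Keep only classes that define everything already called on the parameter.
    candidates = {name: methods for name, methods in classes.items() if used <= methods}
    suggestions = {}
    for name, methods in candidates.items():
        for method in methods:
            suggestions.setdefault(method, set()).add(name)
    # A suggestion is "certain" only if every candidate class defines it.
    return {m: (sorted(owners), len(owners) == len(candidates))
            for m, owners in suggestions.items()}


result = completions(SOURCE, "callbaz", "obj")
```

For `obj` in `callbaz`, this offers `baz` as certain and both `f` (from Foo) and `g` (from Bar) as uncertain, which an editor could render greyed out instead of popping up a dialog.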

Suppose the program in question is a compiler

Which program would it try to compile? What is the "neutral" control flow when the input is unknown?

Executing a type program

Could it be possible to 'execute' dynamic programs as they are written in order to get the possible underlying types of the variables?

Here's some previous discussion on that topic.

or...

Or, you can just detect when you won't be able to get all the possible uses and alert the user, or just be liberal and search for possible references.
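Here is one way the liberal search might look for a method rename, sketched in Python with the standard `ast` module. The function name `possible_references` and the three labels are assumptions for illustration. It reports every definition and attribute access with the given name, and also any string constant equal to the name, since that may be a reflective use the tool cannot resolve and should warn about.

```python
import ast


def possible_references(source, method_name):
    """List (line, kind) pairs for everything that might refer to method_name."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef) and node.name == method_name:
            hits.append((node.lineno, "definition"))
        elif isinstance(node, ast.Attribute) and node.attr == method_name:
            hits.append((node.lineno, "attribute access"))
        elif isinstance(node, ast.Constant) and node.value == method_name:
            # A string equal to the name may feed getattr() or similar reflection.
            hits.append((node.lineno, "string, possibly reflective"))
    return sorted(hits)


SOURCE = '''class A:
    def foo(self):
        return 1

class B:
    def foo(self):
        return 2

def use(x):
    return x.foo() + getattr(x, "foo")()
'''

refs = possible_references(SOURCE, "foo")
```

On this sample it finds the definitions on lines 2 and 6, plus an attribute access and a suspicious string on line 10. It cannot tell whether `x.foo()` means A's foo or B's, so the user still has to decide which hits to rename.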