Objective scientific proof of OOP's validity? Don't need no stinkin' proof.

Having just discovered LtU, I've been reading many of the very interesting topics on here. As a reasonably accomplished programmer myself, I find the discussions on the validity of OOP and OOAD particularly thought-provoking, especially those that refer to published articles that speak out against OO.

Having been a programmer in some sort of professional capacity for the better part of a decade, I have made the transition to OO myself, initially with a lot of inertia from my procedural ways.

In my own work and play with programming, I have absolutely no doubt that it is a wonderful way of designing and writing code, and I have myself reaped the benefits of OO that are its raison d'être. Sure, much of my work is still done procedurally, but that's mostly because of the event-driven nature of modern GUI software. But when I create a complex system, I can now only ever think in terms of modelling system structure and behaviour after the real-life systems it's designed to automate/facilitate. OO techniques are perfect for my type of work, and I could never go back to pure procedural ways.

What bothers me is this: for me as a programmer, OO is a wonderful, useful thing that saves me lots of time and mental effort. Why, then, are some so vehement in their critique of it? Does that mean that, as a former procedural programmer, my ways were so bad that OO helped to make me better? And if OO is still bad, does my choice of paradigm brand me as a hopeless monkey?

If OO is so bad, then, is there some other panacea that I am not seeing? Personally, I need no scientific, objective proof that OO is worthwhile...I can see and feel the improvement between the old and the new. If other programmers (as I assume the authors of the aforementioned articles are) are not seeing that improvement, what are they measuring OO against?

Or perhaps I am simply not grasping the critique properly?


Which articles?

Maybe it would help to focus on specific critiques, and address the particular points they make; a discussion about whether or not OOP is "bad" is unlikely to generate a lot of enlightenment. It would also help to clarify what is meant by OOP in this instance.

Also worth bearing in mind is that the discussion here is often about language and language concepts; the criticism may be of the design choices informing mainstream OO languages, rather than of OOP as a general programming style.

As a working C#-er, I would want to question the notion that OOP involves modelling system structure and behaviour after the structure and behaviour of real-life systems.

OO modelling is no less abstract than any other kind of modelling, and the purported correspondence between such abstractions as classes and objects and the "real world" entities they're used to model seldom stands up to detailed scrutiny.

Nor would a closer correspondence necessarily be a sign of a better, or more conceptually pure, OO model - the appropriate level of granularity and coupling for application code is unlikely to be the same as that of physical (or business-process-conceptual) entities. OO models are simulacra, not snapshots of reality.

Very Briefly...

...I don't think anyone here will say that object-oriented programming has no value. What I personally will say is that:
  • In my current C++ job we've had at least one programmer who established a public inheritance relationship just because instances of the derived class needed to be able to "send" a value somewhere whose type was of the base class. When there was a member function name collision, because both classes had the notion of a "name" value but the "names" meant entirely different things, he just changed the name of one of the member functions. And he didn't see anything wrong with any of this, even when it became necessary for one of the derived class's member functions to return more than one instance of the base class.
  • I've watched the antipattern of the One True Base Class turning into a dumping ground for shared behavior (not so bad) and shared state (very, very bad) over and over again.
  • I've dealt with way too much code coming from the Microsoft tradition of having public data members accessed willy-nilly, essentially turning an already-complex relationship of classes into an even more complex finite (?) state machine with no documentation and no abstraction.
  • Related to the above, I've dealt with too much code that has had implicit invariants, and then I or someone else comes along and violates those invariants because there's no application of Design by Contract.
  • In languages that conflate subclassing and subtyping, you can't always adhere to the Liskov Substitution Principle anyway, even if you try desperately, as Oleg Kiselyov pointed out (I believe this is the critique you're referring to).
By contrast, I can look like a miracle worker by writing some code that does something time consuming, and when I'm asked to somehow break the task down into discrete chunks so that feedback can be provided to the user, and by the way, we don't have threads, an hour later I can be done and it just works. Why? Because I've employed a well-known, well-documented, and here's the kicker, semantics-preserving and automated Continuation-Passing-Style source transformation on my code. Note that I could still be doing this in C++; the benefit is in understanding programming as an algebraic process and how to apply that algebraic process in whatever language you happen to be working in.
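
For readers who haven't seen the trick, here is a minimal sketch of the idea (in Haskell for brevity, though the code in question was C++; the names are illustrative, not the actual code):

-- Direct style: one monolithic, uninterruptible call.
sumTo :: Int -> Int
sumTo 0 = 0
sumTo n = n + sumTo (n - 1)

-- After the CPS transformation: every call takes an explicit continuation k.
-- The meaning is preserved, but control flow is now a first-class value.
sumToK :: Int -> (Int -> r) -> r
sumToK 0 k = k 0
sumToK n k = sumToK (n - 1) (\s -> k (n + s))

-- Once in CPS, chunking is easy: return the pending continuation instead of
-- calling it, so the caller can report progress between steps -- no threads.
data Step r = Done r | Pause (() -> Step r)

sumToSteps :: Int -> (Int -> Step r) -> Step r
sumToSteps 0 k = k 0
sumToSteps n k = Pause (\() -> sumToSteps (n - 1) (\s -> k (n + s)))

runWithProgress :: Step Int -> IO Int
runWithProgress (Done r)  = return r
runWithProgress (Pause k) = putStr "." >> runWithProgress (k ())

-- e.g. runWithProgress (sumToSteps 10 Done) prints one dot per step, then returns 55.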

Having said that, I do vastly prefer to work in languages that provide better support for programming as an algebraic process than C++, and those, rather obviously, are the functional languages. So once again, you're not a "bad person" (whatever that would mean) for doing procedural programming or for "seeing the light" of object-oriented programming. I fully expect to take advantage of the "O" in "O'Caml" in my current personal project. But 25 years of personal experience, including very early exposure to objects in Lisp and Smalltalk, gives me, I think, enough ammunition to compare OO tools and methodologies to other ones, and at this point I, personally, given the choice, will choose a functional toolset over an OO toolset each and every time, if I must. Of course, if I get to choose O'Caml or Oz, then I don't have to choose between functional and OO tools, and I still get appreciably, measurably better productivity than I do in C++, Java...

Tools?

That sounds very cool. What tools did you use for the transformation?

In That Case...

...I confess that I did it by hand. Since then, I've thought multiple times of trying to come up with a template metaprogramming implementation of the CPS transformation.

That's too bad. I've spent t

That's too bad. I've spent the last 12 hours imagining neat things I could do if I could say "CPS transform *this*, then wrap *that* continuation like *so*". Even having to do the transformation once, then edit the resulting text, wouldn't be too bad.

Ivory Towers

I think the majority of OO critics you find here are a bit more academic than your average programmer. That is to say, they are much more likely to be computer *scientists* rather than software *engineers*. Therefore, they can afford the luxury of using tools with sparse toolchains and spotty support for industry standards.

That being said, the currently most popular paradigm among the theorists is basically functional programming, which, ironically, is also one of the oldest paradigms. Its appeal lies in its closeness to mathematical notation and to the theoretical systems used to talk about programming languages in the abstract. For writing very pure, very elegant code, FP is hard to beat.

However, the average programmer usually has not been exposed to FP style programming directly and would be shocked to see how it works. In the ideal, it is stateless (only pure functions, no side effects). This is about as diametrically opposite to the imperative paradigm as you can get (though declarative programming can be pretty bizarre to the imperative programmer as well).

There are provably good reasons to want many of the properties offered by the FP paradigm, but the one reason perhaps that FP has not caught on in the industrial world is control. That is, for the biggest, baddest, most performance sensitive applications (like operating systems, telephone switches, financial transaction middleware, etc.) you absolutely need to be able to say where critical bits and bytes get stored and how. You don't have the luxury of hoping your tail-recursive algorithm gets properly optimized by the compiler.

Unfortunately, one of the drawbacks of FP is that for all of the elegance and purity you get, you must pay a price in control. FP very much likes to make things on the fly, such that most, if not every FP language is garbage collected. That means that if you have a large, performance-sensitive, intricate data structure whose storage you must absolutely control, you pretty much have to resort to C/C++ or something similar. An illustrative example is sorting. I'm sure any number of FP experts here could write a beautifully elegant sorting algorithm in their favorite FP language in just a few lines...barely a paragraph. But when it comes to sorting a 100 MB file, I would pit C++'s std::sort() against it any day of the week. That's because the typical FP way involves creating many many intermediate data structures and many many copies of the data, whereas the C++ version will use a very efficient merge sort or heap sort, using few heap allocations and a very minimal number of copies.

The advantage of the procedural paradigm is that it more closely reflects how humans tend to solve non-computer related problems. Just look at a cake recipe or a car repair manual for examples. This makes it easy for most programmers to understand intuitively. They can leverage their real-world experience to learn the concepts involved. OO is a natural evolution of the imperative/procedural paradigm which tries to fix many of its shortcomings while staying in touch with how the common man thinks about problems. It succeeds to varying degrees depending on who you are and how you ask the question. It obviously hasn't delivered on all of its promises, but it works well enough that most people creating a new imperative language wouldn't dream of ignoring OO altogether (and even some FP langs have eyed OO with envy...witness CLOS, for instance).

Having said all that, it is almost certainly worthwhile for you to learn one of the popular FP languages, if for no other reason than to expose you to a new way of thinking about programming. What you will find is that there is a convergence of paradigms. C++, for instance, uses some FP concepts very heavily in its Standard Library, even though it is a canonical imperative language. And as I mentioned above, some FP langs are moving towards OO features. Someday (maybe soon, who knows?) someone will design a very clever language that combines the power of imperative and functional paradigms in a natural way. Until then, sample what's out there, but remember that imperative languages generally put bread on the table.

Ask, and you shall receive...

An illustrative example is sorting. I'm sure any number of FP experts here could write a beautifully elegant sorting algorithm in their favorite FP language in just a few lines...barely a paragraph.

qsort [] = []
qsort (x:xs) = qsort less ++ [x] ++ qsort more
  where less = filter (<x) xs
        more = filter (>=x) xs

Complexity?

I may be wrong (as I don't have a working intuition for non-strict evaluation), but isn't append (++) hurting the complexity of this qsort implementation?

Yes.

Yes. It has horrible space complexity. I was just trying to live up to what the original poster was looking for...

But when it comes to sorting a 100 MB file, I would pit C++'s std::sort() against it any day of the week. That's because the typical FP way involves creating many many intermediate data structures and many many copies of the data, whereas the C++ version will use a very efficient merge sort or heap sort, using few heap allocations and a very minimal number of copies.
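
For the record, the cost of the repeated (++) can be removed with a standard accumulator transformation (a sketch, not from the thread; the two filter passes per level remain):

-- Illustrative accumulator variant: `rest` is the already-sorted suffix,
-- so consing onto it replaces every use of (++).
qsort' :: Ord a => [a] -> [a]
qsort' xs = go xs []
  where
    go []     rest = rest
    go (p:ys) rest = go less (p : go more rest)
      where less = filter (< p)  ys
            more = filter (>= p) ys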

Conceptual clarity has value

It's not a practical sorting algorithm.... but it's conceptually very clean. In fact, it was this example that turned one of my friends on to functional programming.

He was using SML for a Discrete Math class, and he was not getting it very well. So I sat down with him for an hour or two and thoroughly explained the basics of syntax, types, computation-by-substitution, and how to turn a recursive definition of both factorial and fibonacci into an iterative definition using accumulators. He picked up on my explanations pretty well, and afterwards he had no trouble with his programming assignment.
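
For reference, the accumulator transformation described above looks like this (a sketch in Haskell rather than the SML his class used):

-- Naive recursion defers a multiplication at every step...
factorial :: Integer -> Integer
factorial 0 = 1
factorial n = n * factorial (n - 1)

-- ...while the accumulator version is a loop in disguise (tail-recursive).
factorialIter :: Integer -> Integer
factorialIter n = go n 1
  where
    go 0 acc = acc
    go k acc = go (k - 1) (k * acc)

-- The same idea for fibonacci: carry the two previous values along.
fibIter :: Integer -> Integer
fibIter n = go n 0 1
  where
    go 0 a _ = a
    go k a b = go (k - 1) b (a + b)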

The next day he read "Why Functional Programming Matters," and saw that example of quicksort. He'd implemented quicksort before, but I believe that's when he first understood how quicksort actually *worked*, conceptually. After that he was a fan of SML.

While I wouldn't use this definition in anything past a demonstration, it must be pointed out that there *are* perfectly efficient ways to implement quicksort in SML and Haskell.

Example?

While I wouldn't use this definition in anything past a demonstration, it must be pointed out that there *are* perfectly efficient ways to implement quicksort in SML and Haskell.

While I don't doubt the truth of this claim, I would like to see an implementation so as to consider the sacrifices in clarity and elegance necessary to achieve performance in these languages.
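
For concreteness, here is roughly what such an implementation might look like (a sketch only: an in-place quicksort on a mutable ST array in Haskell; the names are illustrative, not code from the thread):

import Control.Monad (when)
import Control.Monad.ST (ST)
import Data.Array (elems)
import Data.Array.ST (STArray, runSTArray, newListArray, readArray, writeArray)

-- Load the list into a mutable array, sort in place, read it back out.
-- No intermediate lists are built during the sort itself.
qsortST :: Ord a => [a] -> [a]
qsortST xs = elems (runSTArray (do
    arr <- newListArray (0, n - 1) xs
    sortRange arr 0 (n - 1)
    return arr))
  where
    n = length xs

sortRange :: Ord a => STArray s Int a -> Int -> Int -> ST s ()
sortRange arr lo hi = when (lo < hi) $ do
    p <- partition arr lo hi
    sortRange arr lo (p - 1)
    sortRange arr (p + 1) hi

-- Lomuto partition: the pivot is the last element of the range.
partition :: Ord a => STArray s Int a -> Int -> Int -> ST s Int
partition arr lo hi = do
    pivot <- readArray arr hi
    let go i j
          | j >= hi   = swap arr i hi >> return i
          | otherwise = do
              x <- readArray arr j
              if x < pivot
                then swap arr i j >> go (i + 1) (j + 1)
                else go i (j + 1)
    go lo lo

swap :: STArray s Int a -> Int -> Int -> ST s ()
swap arr i j = do
    a <- readArray arr i
    b <- readArray arr j
    writeArray arr i b
    writeArray arr j a

Whether this counts as a sacrifice of clarity and elegance is exactly the judgment call being asked about.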

Even assuming

Even assuming these languages need to get as ugly as C++/whatever to be efficient, it is still a gain that they can express these algorithms more concisely if that is desirable.

c++ needn't be that ugly

This is not so far from the Haskell version.
And is far more efficient.

(from http://erikhaugen.blogspot.com/)

// (angle brackets restored - the forum ate them, as the reply below notes;
// assumes Boost.Lambda's _1 and std::stable_partition)
template<typename Iter>
void quicksort(Iter begin, Iter end)
{
    if(begin == end)
        return;

    Iter pivot = std::stable_partition(begin, end, boost::lambda::_1 < *begin);
    quicksort(begin, pivot);   // everything less than the pivot value
    quicksort(++pivot, end);   // the pivot itself is already in place
}

However...

It's only valid if you include a lambda library, which kinda reinforces the point that FP is more elegant. Also, you need to translate angle brackets even within a pre block. ;>

You have a good point

C++ can use functional style. I was really saying "as ugly as imperative-style C++/imperative style in general", so my earlier comment was inaccurately worded; that was not my intention.

You are unloading the implementation onto a library...

That is not a fair comparison. Everything that is essential to the quicksort algorithm is contained within the Haskell definition, whereas you've called a library routine to provide the bulk of your functionality. If I didn't know how quicksort worked, I wouldn't have a clue what your code did without looking at the library you are calling. Not so with the Haskell version.



And, by request, here's the definition that GHC provides for Data.List.sort:

mergesort :: (a -> a -> Ordering) -> [a] -> [a]
mergesort cmp = mergesort' cmp . map wrap

mergesort' :: (a -> a -> Ordering) -> [[a]] -> [a]
mergesort' cmp [] = []
mergesort' cmp [xs] = xs
mergesort' cmp xss = mergesort' cmp (merge_pairs cmp xss)

merge_pairs :: (a -> a -> Ordering) -> [[a]] -> [[a]]
merge_pairs cmp [] = []
merge_pairs cmp [xs] = [xs]
merge_pairs cmp (xs:ys:xss) = merge cmp xs ys : merge_pairs cmp xss

merge :: (a -> a -> Ordering) -> [a] -> [a] -> [a]
merge cmp xs [] = xs
merge cmp [] ys = ys
merge cmp (x:xs) (y:ys)
 = case x `cmp` y of
        GT -> y : merge cmp (x:xs)   ys
        _  -> x : merge cmp    xs (y:ys)

wrap :: a -> [a]
wrap x = [x]

I'd say this is certainly competitive with a mergesort algorithm written in an imperative language.
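
(If you want to try the definition above yourself, a quick check might look like this:)

-- assumes the mergesort definition above is in scope
main :: IO ()
main = print (mergesort compare [3,1,4,1,5,9,2,6])
-- prints [1,1,2,3,4,5,6,9]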

I wonder...

...how it compares performance-wise.

Performance critical applications

... but the one reason perhaps that FP has not caught on in the industrial world is control. That is, for the biggest, baddest, most performance sensitive applications (like operating systems, telephone switches, financial transaction middleware, etc.) you absolutely need to be able to say where critical bits and bytes get stored and how. You don't have the luxury of hoping your tail-recursive algorithm gets properly optimized by the compiler.

You are talking about performance-critical things; an example would be the AXD301, a scalable 10-160 Gbit/sec ATM switch - built using Erlang. Some old notes from Joe Armstrong (via Philip Wadler's homepage).

Functional programming in the real world might be an interesting thing to read.

OO is a natural evolution of the imperative/procedural paradigm which tries to fix many of its shortcomings while staying in touch with how the common man thinks about problems. It succeeds to varying degrees depending on who you are and how you ask the question. It obviously hasn't delivered on all of its promises, but it works well enough that most people creating a new imperative language wouldn't dream of ignoring OO altogether (and even some FP langs have eyed OO with envy...witness CLOS, for instance).

Why do we put people through college for so many years if they are supposed to think like the common man? Didn't they do that before they started college already? The common man does not worry about convolution, double integrals, and other mathematical things either, yet they are excellent tools for solving certain kinds of problems.

Paul Snively has already mentioned O'Caml and Oz, googling for reactive objects gives me quite a few hits on Timber, O'Haskell, rHaskell and so on. All of them could be seen as functional languages.

Erlang is FP's show pony of last resort.

I knew Erlang would be trotted out. What feature(s) does Erlang have that make it more functional than Smalltalk? Modern Smalltalks have closures, map, filter, and fold. Pattern matching? Someone's done that in Smalltalk as well, though I've not tried it.

OO-style polymorphic dispatch, especially predicate dispatch (which subsumes pattern matching), is much better at preventing nonlocal changes to code when new implementations of types are added than FP's textual-order pattern matching is. And OO-style dispatch guarantees that the most specific method will be executed, which has to be maintained manually with textual-order pattern matching.
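
To make the locality claim concrete, compare the two styles (a sketch, not from the post above; the "OO-style" side is encoded here as a simple record of methods):

-- FP side: the variants are closed. Adding a Triangle variant later forces
-- an edit to every function that pattern-matches on Shape.
data Shape = Circle Double | Square Double

area :: Shape -> Double
area (Circle r) = pi * r * r
area (Square s) = s * s

-- OO-style encoding: each "class" bundles its own behavior, so adding a new
-- shape is a purely local change -- no existing definitions are touched.
data ShapeObj = ShapeObj { areaOf :: Double }

circle, square :: Double -> ShapeObj
circle r = ShapeObj { areaOf = pi * r * r }
square s = ShapeObj { areaOf = s * s }

triangle :: Double -> Double -> ShapeObj   -- the later addition
triangle b h = ShapeObj { areaOf = b * h / 2 }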

Such a nice show pony - it trots, it jumps, it makes money

What feature(s) does Erlang have that make it more functional than Smalltalk?

Was a design goal of Erlang to be more functional (whatever that means) than Smalltalk?

To Have or Not To Have

What feature(s) does Erlang have that make it more functional than Smalltalk?

It's rather the features that it lacks, namely mutable state. ;-)

Erlang has mutable state

Erlang has plenty of mutable state. It is trivial to write the classic bad OO example of a movable point using an Erlang process.
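
That encoding looks something like the following sketch (Haskell, with a Chan standing in for an Erlang mailbox; all names are illustrative, and this is not Erlang code):

import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)

data Msg = Move Int Int | Get (MVar (Int, Int))

-- The "mutable point" is a recursive loop: its state lives in its arguments.
pointLoop :: Chan Msg -> (Int, Int) -> IO ()
pointLoop mbox (x, y) = do
  msg <- readChan mbox
  case msg of
    Move dx dy -> pointLoop mbox (x + dx, y + dy)
    Get reply  -> putMVar reply (x, y) >> pointLoop mbox (x, y)

-- Spawn the "object", mutate it with messages, then query it.
main :: IO ()
main = do
  mbox <- newChan
  _ <- forkIO (pointLoop mbox (0, 0))
  writeChan mbox (Move 3 4)
  reply <- newEmptyMVar
  writeChan mbox (Get reply)
  takeMVar reply >>= print   -- (3,4)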

Yes... but

I know, but you have to encode it using recursive processes, and the process layer is very clearly separated from the underlying pure functional core layer. Hence you are not easily seduced into messing around with state in-the-small.

You are talking about perform

You are talking about performance-critical things; an example would be the AXD301, a scalable 10-160 Gbit/sec ATM switch - built using Erlang. Some old notes from Joe Armstrong (via Philip Wadler's homepage).

Nice link. First of all, the presentation implies that Erlang was written with the express intent of supporting real-time systems. But how many mainstream FPs have this as a design goal? Also, on page 23 we have this very revealing comment:

Already Erlang is being chosen instead of C++, Java, even C because of "time to market" - we have examples of pre-study to product of nine months. We are now aiming at development times of less than one year. Here we have a significant advantage over conventional imperative languages. There is a "performance gap" - but we try to run on the fastest available processors, then the gap is less of a problem. We are "sufficiently fast".

That's a bit of a peculiar comment for the FP touted as being a performance go-getter. Maybe GC isn't always as fast as manual memory management. ;> Which is not to say that GC is bad; it certainly has its place. Just that my original point mostly stands.

Why do we put people through college for so many years if they are supposed to think like the common man? Didn't they do that before they started college already? The common man does not worry about convolution, double integrals, and other mathematical things either, yet they are excellent tools for solving certain kinds of problems.

However, there is no such thing as a "commodity mathematician", whereas "commodity programmer" is unfortunately the rule rather than the exception. In an age where not so much as a calculus course is required to get hired as a "software engineer", the reality is that your average programmer is barely more mathematically literate than your average homemaker. Which is probably at least partly why Java has become so incredibly popular (that and the awesome toolchain, but establishing causality there might be tricky).

Paul Snively has already mentioned O'Caml and Oz, googling for reactive objects gives me quite a few hits on Timber, O'Haskell, rHaskell and so on. All of them could be seen as functional languages.

Ok, so I'm not current on my FP langs. Point taken.

FP performance

First of all, the presentation implies that Erlang was written with the express intent of supporting real-time systems. But how many mainstream FPs have this as a design goal?

The point is that there's nothing inherent in FP that would preclude it from being used in real-time applications. That's not the same as saying that every FP language is suitable for real-time applications.

From your previous comment:
FP very much likes to make things on the fly, such that most, if not every FP language is garbage collected.
Uh... last I checked, so was Java. And yet it seems to be perfectly acceptable for all sorts of financial middleware and other "mission critical apps".

Mission Critical != Real Time

While many "mission critical" apps are also "real time" applications (such as medical equipment); there are plenty of examples of apps which are one, the other, or neither.

Some of the financial software used by banks, etc., to manage multimillion dollar transactions is mission critical. In most cases, such software is not real-time.

The software that controls the print engine in my $100 HP all-in-one is real-time software; however, it isn't mission critical. If it screws up, the result is a messed-up page (and a pissed-off customer), but nobody dies or goes bankrupt.

Microsoft Notepad is neither mission-critical nor real-time.

Garbage collection is widely considered inappropriate for "hard" real-time applications (at least with standard implementations of GC), as it often introduces unpredictable (and unbounded) latencies into the system, many of which exceed typical real-time limits. There are some "real time" GC implementations, but this is still an active area of research.

GC may, or may not, be inappropriate for mission-critical work. For some mission-critical systems, ALL dynamic memory allocation is verboten; but for many other classes of systems, the reliability advantages of GC outweigh the disadvantages. Use of Java in critical financial apps seems perfectly acceptable; I see no (technical) reason why use of an FP language with GC would be less acceptable.

I wasn't claiming that it was

I wasn't claiming that it was. I was responding to the OP's assertion that garbage-collected languages were inappropriate for "performance sensitive applications...such as financial transactions middleware". I am well aware that real-time GC is an active research area. However, there is a distinct difference between "hard real-time" (i.e. deadline sensitive) and "really fast" (which is presumably what the OP meant by "performance sensitive"). That is the only issue I was trying to address.

Real-time GC

There isn't a whole lot of mystery to the real-time garbage collector. There are perfectly workable collectors that reclaim resources in a bounded period of time and are guaranteed to consume less than X amount of CPU time in Y units of clock-time.

The downside is that a real-time GC consumes noticeably more CPU resources: the work is broken up into smaller slices so that garbage is guaranteed to be collected within a bounded amount of time. Thus, if real-time is not a requirement, you probably don't want to use a real-time collector.

Several of HP's printers are written in Eiffel, and are garbage collected.

FP Acceptance

The point is that there's nothing inherent in FP that would preclude it from being used in real-time applications. That's not the same as saying that every FP language is suitable for real-time applications.

The point is that until FP tool authors begin to write tools that are *perceived* by the industry as being competitive with existing tools on the performance playing field, FP will remain a niche market. I never said that FP *couldn't* compete with mainstream tools. Merely that it *isn't*. That's because the FP community is concerned with things like elegance and provability and adding the latest theoretical feature, while industry is concerned with things like toolchain availability, user support, and performance.

FP very much likes to make things on the fly, such that most, if not every FP language is garbage collected.

Uh... last I checked, so was Java. And yet it seems to be perfectly acceptable for all sorts of financial middleware and other "mission critical apps".

I personally don't think it's acceptable, and I wouldn't recommend that a performance sensitive project be implemented in Java (that doesn't mean hard real-time...just that all the transactions post in the same day). My guess is that people who implement such software systems simply throw more hardware at it until the performance is acceptable. But having used some fairly beefy Java programs, "speedy" is one of the last words I would use to describe them.

FP niches

My guess is that you probably mean "pure" FP languages. After all, there are many C++, Perl and Python practitioners doing FP these days.

IDEs, not languages

Mainstream programmers don't dig FP languages for the very same reason they don't dig C or Perl: there are no IDEs of the same caliber as Eclipse or VisualStudio for such languages.

I believe it's indeed a very strong point. Mainstream programmers think all languages are the same, with just syntactic differences, and that what truly matters is the integration and productivity tools supporting programming in such languages...

Thing is, most FP languages are very concise, small languages, and their expressiveness allows for saying a lot ( doing a lot of work ) in a few sentences, thereby not really needing a huge, bloated IDE. So development concentrates on the languages themselves and associated libraries, rather than on such "useless" tools.

Fact is, though, that that is perceived as a huge weakness by the mainstream folks, especially those managing teams of average programmers...

sad.

Interesting Point...

...just a few days ago I reiterated something significant to me to an old programming friend and colleague: having spent the weekend working with O'Caml in EMACS with Tuareg Mode and Tuareg imenu, and having also recently spent a weekend with SBCL in EMACS with SLIME, I could honestly say that the primary productivity enhancer is interactive development. Everything else pales in comparison: static vs. dynamic, interpreted vs. compiled, whatever. And this is borne out in my use of Eclipse, too: the big benefit is having the project kept consistent at every save of the code, whether it's Java code, C++ code... It's not quite "compile as you type," but it's close enough that I don't slip out of the zone while I'm coding, and that's key.

So I think it's true that oftentimes people emphasize the language per se too much (I myself am guilty as charged) and don't emphasize enough that you have tools like Tuareg Mode and Tuareg imenu to make developing in the language extremely productive and even fun.

yes

and when i said C or Perl don't have a proper IDE, i was talking about mainstream programmers, for whom Emacs is a lousy reject.

If it isn't big, bloated, slow, graphical, with tons of little icons, drop-down menus, "intelisense" and doesn't use the "standard" notepad "ease" of use and keybindings for "common" editing operations, it isn't worth it...

Sure...

...what I find interesting is how these IDEs still fall far, far short of the environments of, say, Smalltalk-80 or Genera in terms of supporting programmer productivity. To see how far you can go with a good interactive programming language and IDE in EMACS, however, you may wish to see the SLIME video available here.

"far, far short" - oh really?

There's at least one ex-Smalltalk programmer who disagrees:

"For me IntelliJ was first leap forwards in IDEs since Smalltalk. Power and usability that suddenly made everything else second rate. We had entered the post-IntelliJ era..."

- Martin Fowler

I'm not going to watch a 150Mb video, but the stuff on the SLIME feature list are things we already have in IntelliJ and Eclipse.

Well...

Martin Fowler: For me IntelliJ was first leap forwards in IDEs since Smalltalk. Power and usability that suddenly made everything else second rate. We had entered the post-IntelliJ era...

I'm just going to have to respectfully disagree with Mr. Fowler, having also used both Smalltalk-80 and IntelliJ.

skybrian: I'm not going to watch a 150Mb video, but the stuff on the SLIME feature list are things we already have in IntelliJ and Eclipse.

I've also used Eclipse (and still do, when I must work in Java), and no, you don't. The 150Mb video exists because in the end, feature-lists aren't anywhere near adequate means to determine how your productivity will be affected by any tool more sophisticated than ed. But remember, I wasn't talking about SLIME; I was talking about Smalltalk-80 and Genera. Maybe a 50Mb Genera video will be more palatable to you. In any case, the point remains that we're only now (Eclipse, IntelliJ), circa 2000-2005, seeing IDEs for languages like Java or C++ that might be productivity-competitive with Smalltalk and Lisp environments from 25 years ago. Unfortunately, even with IntelliJ or Eclipse, you still don't have the level of interactivity that you do in Smalltalk-80 or Lisp because these are interactive languages while Java, C++, etc. are still batch compiled (yes, I know about the scrapbook, etc. It's a start, but it's a different runtime/namespace and doesn't affect your project).

But all of this is just words and will remain so unless/until you actually try out one of these 25-year-old environments, or at least something that gets reasonably close to it like SLIME. That's not to say that IntelliJ and Eclipse aren't good environments—they are. It's merely to say that it's possible to do even better, particularly if your language design helps you to do so from the outset, and that it was done better 25 years ago using languages of that type.

Might be true for java, but...

For C++, Microsoft Visual Studio has "Edit and Continue" debugging, which allows one to edit the code, recompile, and continue with the current programme state; one can also arbitrarily change the current statement. All in all, it makes for a reasonable development experience, at least after one has finished doing a lot of heavy code editing, for which I prefer to use XEmacs.

programming environment not language

you don't have the level of interactivity that you do in Smalltalk-80 or Lisp because these are interactive languages

Smalltalk-80 (and Cincom Smalltalk, and...) provide(s) highly interactive programming environment(s).

Smalltalk The Language is no more interactive than any other chunk of text - as we rapidly learn writing chunk-format Smalltalk in Notepad for GNU Smalltalk.

except of course...

"SLIME feature list are things we already have in IntelliJ and Eclipse."

except for a decent expressive programming language, of course... :)

IDEs and average programmers

A great programmer may be able to deliver a project quickly, reliably and cheaply using Vi (ignoring the maxim about choosing two out of the three). A 'bad' programmer may need a full IDE to help him/hir deliver the same project. I don't see how that is a bad thing.

Secondly, if these tools help an average developer become more productive, surely a better programmer could make use of them as well. If the worst crimes of an IDE are that it helps developers remember syntax and library functions, underlines possible errors on-the-fly, keeps bookmarks, refactors sensibly, and identifies components of programs (interface, class, etc.) with little icons, then we need a huge crime wave for all languages. :)

Various developer tools may be many things, but 'useless' is certainly not one of them.

well

The original point i was trying to make was not that you won't be productive with an IDE, but that FP and other less mainstream languages fail to go mainstream not because of bizarre syntax or paradigms, but rather exactly because there's no IDE backing them up so that average programmers can pick them up.

Of course a good IDE helps a lot. I'm particularly fond of the interactive module and class navigation system that most IDEs call "code-completion". Other than that, one thing that truly annoys me to death is the fact that most of these IDEs come with really lame subpar code editors... gimme emacs anyday. And i'm used to VisualStudio and Delphi...

well, i guess nothing's perfect and there's always a price to pay...

I wonder if this is a self-pe

I wonder if this is a self-perpetuating state of affairs - programmers who do pick up FPs and don't need IDEs have better things to do than spend their time developing fully fledged IDEs. That said, there are some very popular development modes for (X)Emacs for functional languages that their users find give them the kind of support they want, like Tuareg and Slime.

Key Observation

Marcin Tustin: That said, there are some very popular development modes for (X)Emacs for functional languages that their users find give them the kind of support they want, like Tuareg and Slime.

Very well put—and it's worth remembering that in the Lisp and ML communities, programming support via what would be called an IDE today has had some 30-45 years of evolution behind it, so "the kind of support they want" isn't as idiosyncratic as it might sound offhand.

Indeed. And it's worth noting

Indeed. And it's worth noting that the kind of thing that a lot of people still think is pretty cool - code completion - can be largely accomplished with hippie-expand.

The "code completion" features that most modern IDEs provide seem to exist to allow complete novices who can't even look up documentation on their own to be productive. This isn't me being snide - this is an honest observation of people I've seen who think that programming in Java is hard, but that C# is fantastic.

Code Complete Ain't Just for Noobs

For library-impoverished languages, code completion may indeed seem redundant. But when it's possible to pick up a library you've never used or seen before and get 90% of its functionality just by looking at the methods and constants available (plus a few examples, say), then it becomes apparent that code completion is very handy. It's also handy when you can't remember the exact spelling of a long identifier, or simply don't want to type it out.

In fact, code completion + Javadocs are probably one of the reasons why Java with a decent IDE is such an easy language to use, despite having more libraries than you could shake a trie at. The simple fact is, many well-designed libraries can communicate most of their functionality through their interfaces. So unless the library does something tricky, there is often not a need to study the documentation in detail; especially for canonical uses that are perfectly well illustrated by a few examples.

Having said that, I've never tried C#, so I don't know how code completion between that and Java fares, but I would feel pretty insecure with a programmer that says "Java is hard!" However, I suspect that people who say that simply haven't set up the Javadoc references correctly in their Eclipse projects.

I find that I can use a web b

I find that I can use a web browser or even man to read API documentation. As for saving keystrokes, almost every single time I want to do that hippie-expand and dabbrev-expand do just fine. The last time I needed an aid to reading was when I was following along the line with my finger; perhaps the next time I need language- and library- aware code completion will be when my memory begins to fail me.

I suspect that novices find such technologies helpful because they don't understand the way that programme elements can be composed. Unfortunately, cruddy libraries with poor documentation and very complex compositionality rules can still confuse, and code completion still doesn't help with that.

Code Completion + Sensible Types == Win

I find that I can use a web browser or even man to read API documentation.

Sure, but many times I don't even *have* to read documentation. For several handy Java libs, I've just installed a JAR, referenced the Javadocs, looked at a few examples, and wrote the necessary code to use it without ever looking at the documentation proper (or at least not until I needed to know some subtle point about its behavior).

As for saving keystrokes, almost every single time I want to do that hippie-expand and dabbrev-expand do just fine. The last time I needed an aid to reading was when I was following along the line with my finger; perhaps the next time I need language- and library- aware code completion will be when my memory begins to fail me.

Or when you go to use a library that is not in your memory because it's the first time you've used it. In which case, informative identifiers and well-designed types go a long way in making the usage intuitive and straightforward (and types + code completion can help you use the library correctly without actually knowing the library simply by limiting your available expressions to the ones matching the expected type). Also, code completion can help narrow your documentation search when you do need to actually look at the full docs.

I suspect that novices find such technologies helpful because they don't understand the way that programme elements can be composed.

It's really about productivity, not what can and can't be done.

Unfortunately, cruddy libraries with poor documentation and very complex compositionality rules can still confuse, and code completion still doesn't help with that.

True dat. But reasonably-designed libraries with good Javadocs are actually very well served by code completion.

So, your point is that you do

So, your point is that you don't have to read documentation with code completion because javadoc is documentation? Pray tell what "javadoc" stands for?

Whether or not this is documentation, I still find it quite readable without support from my editor. Well designed types (and better yet sets of operations over types) are easier to use whether or not you have code completion. I really have met very few competent programmers who honestly find code-completion does more than save them keystrokes. When deprived of it, that is what they bemoan, not that they suddenly have to use their windowing environment to switch between an editor window and a browser window.

Clarification...

Well, Javadoc is a tricky beast. If you don't have a nice IDE like Eclipse, it's apparent why you wouldn't immediately get what I meant. Code completion in Eclipse is really more than code completion. If you type the prefix to a method, you not only get a list of available completions in a popup box, but when you highlight one of the selections, you get a small summary of the documentation for that method/value. So what most people call "documentation", as in "a set of web pages or hardcopy that describes a program/library" does not actually need to be referenced in its typical form.

So the neat thing about Javadoc is that Java IDEs know about it. You can either view it as traditional documentation, a page at a time, and look up what you need to by index or text searching, or you can just get a code completion on the relevant term, which combines searching, documentation, and code assist all in one easy-to-use and intuitive operation.

Besides the documentation aspect, I use code completion to accurately spell things for me. When you end up with identifiers like REALLY_LONG_CLASS_CONSTANT, it is easy to miss a letter or two. Code completion helps you by not making you type out every letter of an identifier whose purpose you understand, but whose exact spelling you really don't need to memorize. It's also useful when you have a lot of constants in an enumeration and you would know what the name of what you want is if you could see it, but you can't think of it off the top of your head (and if you use a lot of third party libraries, or really big third party libraries, this happens more than often enough).

And looking up documentation is more than switching between the editor and browser windows (which is annoying enough all on its own). You also have to do a search for the item in question, which may or may not involve loading a different documentation page, or even searching for the set of documentation that you need to search in. Again, when you use a lot of libraries, this happens frequently enough to be annoying.

Code completion saves me time, but being a 72+ wpm touch typist, I really couldn't care less about saving a few keystrokes. And I find that when I'm coding in C++ in an impoverished IDE, and I need to look something up, I spend a considerable amount of time doing so as compared to Java.

I fully understand how code c

I fully understand how code completion/documentation browsing works - that is my entire point about supporting novice programmers. However, for people who do understand the language, it is my observation that they can cope with scanning their eye down the library documentation, where the library does not have an obscure design, and be perfectly productive with the minor difference in procedure of having to switch to a browser occasionally. Clearly, your perception is different on the matter of how much time looking up documentation consumes. In this case, only a decent study can resolve the matter. Where library design is obscure, I doubt that integrated documentation display would help, but it may be otherwise.

It's the context switch that matters

It's not the keystrokes that matter, it's the mental context switch needed to switch from "coding" to "reading" mode. I can code as fast as I can type as long as it has my full attention. But once I need to start looking things up, I start losing time remembering what I was just doing, and figuring out what I need to do next, and working out what I need to look up in order to do it.

I don't have any studies for you, but I have my personal experience. I find that looking up an unfamiliar method with code completion or intellisense takes about 3-5 seconds. Switching to a browser takes at least a minute (including the mental context switch), assuming I know exactly what I'm looking for. If I have to hunt, it's easily 5 minutes. That's a factor of 10-50, so I find it's often worth using Java even though it takes 3-5 times more code to accomplish the same task.

not so fast!

It really isn't a feature just for novices: it's a very handy tool to have in your toolchain.


import foo
import bar
...
foo.some.deeply.nested().attrib...

code-completion definitely is handy, particularly in OO environments. Actually, it makes me wonder if OO became so prevalent in the industry because of convenient support tools like IDEs featuring such an easy way to navigate through the class and module trees; or if it is the fact that the paradigm needs such powerful ( and costly ) tools that has made tool vendors push OO like mad to the market?

"OO: the solution to your problems, and ours too! hoho"

btw, still waiting for proper, semantic-level code-completion in Emacs, other than the text-search hippie-expand: when that new function from the module you just imported isn't there yet, you either type it from scratch ( hoping it's short ) or copy-and-paste from your webbrowser ( hoping it's not Lynx, though it's so sweet for browsing html docs ).

Noobies and experienced programmers alike get huge productivity gains from code-completion.

Why can't you copy from lynx?

Why can't you copy from lynx? I copy from my terminal all the time. As to huge productivity gains, I type pretty quickly, and find that typing is not the bottleneck in code development.

My experience is that XEmacs-style statistical expansion works far more frequently than completion in Visual Studio, which typically fails to find expansions for anything except some things in the standard library, like type names. I am pleasantly surprised when it does find methods on non-standard libraries.

quick

"I type pretty quickly"

well, just imagine that + code-completion!

the fact is, one of the purposes of programming is to automate repetitive and boring tasks: code-completion basically reduces your whole cycle of
look up functions in the module;
choose and copy the function and parameters to the editor buffer;
to a single ctrl+enter after the desired module name ( or class )

"typing is not the bottlenexk in code development"

"Money doesn't provide happiness, but sure helps a lot"

^_^

I use code completion in XEma

I use code completion in XEmacs, and one of the reasons I like XEmacs is because, as I said (on edit), code completion works more often than it does in Visual Studio. But I think it was pretty clear that code completion is not a big win - speeding up my typing has minimal impact, but money "sure helps a lot". Do you see the difference?

I don't look up functions every time I use one, because as I've said before, my memory works pretty well; when looking at documentation for the first time, it provides more than the names and signatures of operations.

Also, you haven't explained why you can't copy from lynx. IS IT BECAUSE YOUR ARGUMENT LACKS CREDIBILITY???????

Easy there

Don't blow a gasket, we're just talking about IDEs here. Not a huge deal.
Detachment is a good thing.

How can you say? Detachment i

How can you say that? Detachment is a tool of the UN Shadow Government. If you condone detachment, it will lead to you being forced to programme all your applications in LaTeX. The worst part is that once they're finished with you, you'll like that.

hippie-expand != code-completion

"I use code completion in XEmacs, and one of the reasons I like XEmacs, is because, as I said (on edit), code completion works more than it does in Visual Studio."

Yes, i understand: it works on text that is not code, including words in comments, for instance. That is exactly why it doesn't help much in coding: it is just a convenient backwards-text-search, not an interactive class and module browser.

"I don't look up functions every time I use one, because as I've said before, my memory works pretty well; when looking at documentation for the first time, it provides more than the names and signatures of operations."

Your memory may work well, but wouldn't it be far better if your fingers could type just about as fast as you remember the function names? Code-completion gives you that.

And that's just half of the story: for modules you've never seen before, code-completion shows you their interface at a glance. If the function names are descriptive enough, you have a win. It isn't a substitute for docs, but for people used to C headers as the only docs, it's not much of a big deal...

"Also, you haven't explained why you can't copy from lynx. IS IT BECAUSE YOUR ARGUMENT LACKS CREDIBILITY???????"

perhaps i just don't know how to? care to enlighten me? i'm using a stupid terminal called rxvt, which doesn't even seem to use xterm conventions...

it isn't important, anyway: this whole discussion is about people like you bitching about such a great interactive automation tool and editing enhancer as code-completion, urging us instead to rely on memory or manual lookups and copy-pasting...

i'm wondering if you ever used code-completion in an IDE just for purposes of comparison?...

"i'm wondering if you ever us

"i'm wondering if you ever used code-completion in an IDE just for purposes of comparison?..."

I refer you to my previous comment comparing the performance of Visual Studio code completion against that of XEmacs.

Edit: Have you ever performed a comparison of your productivity using only an editor that does not support code completion as against using your IDE, not using any "integrated" features other than code completion/documentation referencing?

Not a VS fan...

Or a fan of C++ IDEs in general. After 12+ years of coding in C++, I haven't found a single one that is as good as Eclipse is for Java. That's really pathetic. However, I dare you to compare Eclipse 3.1 + JDK 1.5 to your favorite FP IDE and tell me that code completion isn't useful, or is just for noobs. As someone who is more or less forced to use Java, I have to say that Eclipse is the one compiler tool I've used (of any language) that makes a barely tolerable language somewhat fun to use. I couldn't imagine trying to write Java in a raw text editor, even if I were an "expert" (though I know C++ well enough to do so and do so regularly). Even if I could do it to prove a point, I would not want to as a matter of course.

Frankly, I can't stand eclips

Frankly, I can't stand eclipse. The extended set-up procedure before being allowed to edit a file seemed to have absolutely no pay-off when I tried it.

If you tried...

...to use it to do something other than Java, then I agree that there is little to no payoff. I haven't tried the C++ plugin yet, but I don't expect great things from it. However, it's absolutely essential for Java development (though if you're allergic to Java, then there's no reason for you to try it).

I've written a lot of java in

I've written a lot of java in my time, for employers as diverse as engineering companies and banks, and I've never been impressed by eclipse, which I found excessively cumbersome in every single way: from starting it up and finding that it required me to perform some heavy configuration before it would allow me to even open a file, to becoming even more confusing when I wanted to do such advanced things as manage the files in my project, even to the level of files in different directory trees. Frankly, I'd rather be using the old Visual Basic 6 IDE to programme Visual Basic, because all I had to contend with there was incorrect documentation.

Which version?

It's definitely true that Eclipse is a big beast. My installation takes a good 90-120 seconds to load up, whereas C++Builder usually fires up in seconds (except when loading a large project, of course). However, I'm not sure I know what "heavy configuration" you needed to do to open a file. I just select "File | Open File..." like I would in any IDE. If it's a file in my project, I just double-click it. I don't recall having to configure anything to get to that point. The configuration I've done is things like tweaking the syntax highlighter's colors, installing plugins, etc., which were all optional to the actual development process.

The VB6 IDE is faster, but compared to Eclipse, rather impoverished (especially the editor and syntax highlighter). I like Borland's IDEs better (though they have their own problems too).

I'm referring to being challe

I'm referring to being challenged to select a bewildering array of directories on first running eclipse, and then another bewildering array when attempting to start a project - something compulsory, as far as I could tell, before it was possible to actually edit a file. This makes eclipse seem more like an elaborate joke than a tool.

Ahh...yes.

You must be talking about the workspace. I think common practice is to set up one workspace and make that the default until/unless you need more than one. Once you do that, you don't have to select any more directories than you would in a CLI environment.

now i know what your problem is

you think small. You don't seem used to big professional software projects. Perhaps you're just too young?

Fact is: no one uses eclipse to open a file, edit it, then close it and wait another half hour as it initializes again to edit another one ( well, you don't do that in emacs either, do you? ).

People use eclipse to manage huge software projects covering hundreds or thousands of classes and modules scattered across several files. They start it when they get to work and close it when they go home and work is done.

BTW, just to answer you: i use Borland Delphi and ObjectPascal at work and have my personal projects coded in C + GTK at home, using Emacs and hippie-expand, with the GTK html docs opened in lynx. Yes, i'm positive i'm not as productive as with Delphi's code-completion.

As i said before, manually looking up interface declarations and copying them to the text buffer is not as fast as an automation tool that does the same in one keystroke.

Almost orthogonal persistence

They start it when they get to work and close it when they go home and work is done.

Just for the record: I start it when I boot my laptop, and usually do not close it until Windows crashes (or a new security update requires me to reboot it). So my average Eclipse session lasts more than a week :-)

Usage patterns like that really test for memory leaks in software (this used to be an issue only for server software - not anymore).

On memory leaks...

And unfortunately there're all too many. I regularly close and reopen Firefox, Thunderbird, BitTorrent, and Gaim because they've grown to hundreds of megabytes of RAM usage, even though my normal usage pattern is to leave them open indefinitely. The worst was AIM 4.7, which would regularly grow to over 500 megs after a couple days of usage.

Though (to drag this semi-back-on-topic) Eclipse seems fairly good in that regard. It's a pig in general, but at least it's not a pig that keeps getting piggier.

I'm not talking about opening

I'm not talking about opening and closing it, which is why I didn't say that. I said that it takes "too much setup". This is the second time where you have clearly failed to read what I said, and now you've thrown in a low insult for good measure.

What we do with IDEs is open files, edit them, then compile a bunch of them together, then maybe use some other tools to debug and analyse the programme. Because I've used other IDEs than eclipse, I know that they do not require an extended configuration phase before you can even start programming - I'm talking about JBuilder, Visual Studio, Borland's C++ environment, and several others I've toyed with. Oh, and if you find that opening and closing programmes is a big hit on your time, consider maybe not turning your computer off.

Please explain how I "think small".

thinking small

"This is the second time where you have clearly failed to read what I said"

sorry. wasn't on purpose.

"they do not require an extended configuration phase before you can even start programming - I'm talking about JBuilder, Visual Studio, Borland's C++"

Commercial IDEs, which don't offer that level of extensibility via plugins and don't have to deal with such disparate external tools as Ant or JSP servers, can't really compare to an open-source tool like Eclipse. The features are all there, but you'd better take some time to tweak them.

"onsider maybe not turning your computer off"

my boss won't like it. at home i have power bills to pay for...

"how I 'think small'"

somehow, the idea of telling my colleagues to drop automatic code-completion and instead rely on memory, fast typing, manual lookup of functions in modules and then copying to the text buffer, because we'll be as productive as ever, seems kinda awkward...

the productivity argument

"somehow, the idea of telling my colleagues to drop 'automatic code-completion and instead rely on memory, fast typing, manual lookup of functions in modules and then copying to the text buffer, because we'll be as productive as ever,"
the productivity argument in this case is always that these methods force better understanding of the objects being used by their nature, thus leading to higher productivity later.
The code-completion argument has no problem if it is assumed that one actually does read the documentation for objects being used, and does frequent manual lookup of functions. However the code-completion reality seems to be that people install x, get an example of code for x, start typing and guessing at the methods to run on x using their 'common sense'.
as problems are encountered only the manual section referencing the problematic method for x is used.

Thus there are people who observe this state of affairs and conclude that the IDE with code-completion is actually a productivity waster.

Documentation leads to productivity?

The productivity argument in this case is always that these methods, by their nature, force a better understanding of the objects being used, thus leading to higher productivity later.

Is that actually the case, though?

I read a lot of documentation for the hell of it. Not work-related stuff - I just like exploring the manuals for new languages, new libraries, and new tools that might or might not be useful later.

Only very rarely does anything I've read actually turn out to be a productivity booster.

That's okay - it's certainly more productive than watching TV or looking at porn, which is what most other people do with their free time. But I'd be skeptical of an approach that advocates reading documentation as part of a daily work pattern.

In my experience, top productivity comes from doing as little as possible to accomplish the task, and hence getting as many tasks accomplished as possible in a given period of time. You memorize the stuff that's actually important then, not the stuff you think is important. The two very rarely coincide.

Could you point to where I ha

Could you point to where I have told you to stop using an IDE? Instead, I have questioned your claim that experienced programmers derive a genuine increase in productivity, to which you have responded not with anecdotal evidence or even reasoned argument, but with pooh-poohing and low insults. Now, when challenged to justify your insult, you go on to an irrelevance based on a factual inaccuracy.

Anecdotal evidence? I'm givi

Anecdotal evidence? I'm giving you plenty but you don't listen to it.

No way an experienced programmer will be as productive by manually looking up functions and copying them to a buffer or relying on photographic memory and fast typing as he will be with fast typing and code-completion automating those silly steps in a keystroke. No way!

And I'm done here, since arguing with you is as productive as arguing with a doorstep.

Productivity is a hard measurement

Do we measure productivity in terms of volume of code produced? And is the speed of coding even indicative of the overall level of productivity for a project? Most of the evidence in this industry slants towards the anecdotal and conjectural. Excuse me if my warning bells of scepticism start ringing loudly when anyone makes claims of huge productivity gains.

In terms of anecdotes, I find writing code to be extremely easy. Between books and google, I rarely get writer's block. After I code in a language for a while, the available routines get burned into my brain. If I start coding in libraries that I have never used before, I have more problems than just locating the particular property or method that I need. I have the problem of learning that library, so that I don't shoot myself in the foot.

I do have my days where I can code up a storm - thousands upon thousands of lines of code can slip through my fingertips. Yet these momentary bouts of epiphany can keep me busy for weeks on end, cleaning up the messes left in their wake.

My personal goal on any project - once I get past the prototyping stage - is that the volume of code for a project should peak early and actually come down as the product gets more mature. This in spite of the fact that the number of features becomes larger and larger. On a mature project, I measure success in terms of how many lines of code I can eliminate (not vice versa). As I work more and more with the software, I stumble upon better abstractions and the true patterns of the domain.

At any rate, that's just me. I've seen programmers who whip out code that makes your head spin with much cruder tools than vi and emacs. Programming is a game of logic, and good tools are no substitute for good thought. That said, good programmers can use tools to good effect and be more productive. But I wouldn't say that the productivity gain is orders of magnitude. It's more likely a matter of convenience than an earth-shattering change in the way programming tasks are accomplished.

quantity != quality

but tell that to managers! They love huge amounts of lines filled in, be it comments or declarations, with a few loops and conditionals here and there to spice it up.

Really, code-completion can be a blessing, allowing quick navigation through a huge library tree of many modules and classes, new and old. But it can quickly translate into badly written code, something you only notice during repair time.

For instance, because of such ease of navigating through classes, you can quickly end up with a long sequence like this:

class.method().attribute.method( obj.attribute, obj.method() ).getSomething();

This, for the sake of clarity, would be much better decomposed into several steps:

obj1 = class.method().attribute;
arg1 = obj.attribute;
arg2 = obj.method();
obj1.method( arg1, arg2 );

but this generally doesn't happen, partly because it involves declaring a few more things, but especially because code-completion makes it so damn easy to write it like the first example...

it can be a curse at times, especially in the hands of newbies...

"I measure success in terms of how many lines of code I can eliminate"

100% agreed.

Better-thought-out software doesn't benefit much from code-completion: most of the time, you see the errors you committed in the first place, then begin to see patterns here and there, copy and paste chunks of code, parameterize them, and put them into functions for reuse in those other parts...

Let's measure your success

class.method().attribute.method( obj.attribute, obj.method() ).getSomething();

1 line.

obj1 = class.method().attribute;
arg1 = obj.attribute;
arg2 = obj.method();
obj1.method( arg1, arg2 );

4 lines.

hmm...

The problem I see with this style is that names like obj1, arg1, and arg2 aren't useful, and often there is no useful name for them, so I consider the first style more concise and better.
Then again, I'm a functional programmer, and the first style is, in a sense, "more functional".

yes

"that names like obj1, arg1, and arg2 aren't useful, and often there is no useful name for them,"

Of course; it was an example of a long spaghetti line generated by too much reliance on code-completion. I find it clearer to decompose it into meaningful steps and give them appropriate names.

"Although, I'm a functional programmer, and the first style is, in a sense, 'more functional'."

But in Lisp, for instance, you hide all that confusing function-call-upon-function-call nesting by properly indenting the code. There's none of that when repeatedly chaining method calls with code-completion: you just end up with a very long line.

It's annoying, when reading code, to have to side-scroll in order to make sense of such a mess. It is because of this that I don't mind spending a few more lines to be clearer in my purpose.

I mostly agree...

...with your sentiments, but must observe that there is a huge difference between products that take months or years to develop and ship out the door, and products that take weeks or months to develop and are used in-house. The difference is that the former generally are (or at least should be) crafted, while the latter are generally "manufactured". By "manufactured", I mean that they don't involve intricate and subtle program logic, they can use many off-the-shelf components, they don't need to be pretty, or even proven correct by mathematical standards.

For large, crafted projects, you definitely should spend more time thinking about and refining the design. But manufactured apps are exactly the ones that benefit from RAD features like code completion and GUI builders. Since the average programmer is a code manufacturer and not a code artiste (due to economics, skills, or whatever), I dare say that most coders do get a true productivity boost from rich IDEs that offer "noob" features like code completion.

On the fringes

Unlike many, I'm more than willing to admit that the kinds of software I develop are usually on the fringes - the place where some of the most interesting software is developed. I've done software projects ranging from several minutes from inception to design to completion, all the way to several years (a decade in one instance) - ranging from in-house apps to hosted services to packaged products. At the moment, I have the entrepreneurial bug, which has its own set of characteristics. No bureaucracy to blame - just my own stoopid self. :-)

Whether the noobs get enhanced productivity out of the tools, I suppose I couldn't disagree. But the level of productivity boost is an open question. And I've found that many get addicted to the tools and the manner in which they develop software is influenced in subtle ways.

Psychology

Whether the noobs get enhanced productivity out of the tools, I suppose I couldn't disagree. But the level of productivity boost is an open question. And I've found that many get addicted to the tools and the manner in which they develop software is influenced in subtle ways.

I notice that in hacker culture (and I mean "hacker" in the MIT sense, not the ABC News sense), there is a certain "warrior honor" in using impoverished tools as a sadomasochistic way of proving one's worth. I will be the first to admit that I would open up vi or emacs to prove a point to some young punk who thinks he's da bomb but never coded outside an IDE. However, I've also spent a lot of time building tools, and respect how difficult it can be, which is why I'm not above using the latest tools, even if they are perceived as enabling unqualified programmers.

I'm almost certain that if IDEs were primarily used by the programming elite, and beginners used plain text editors and CLI tools, the arguments in this thread would go the other way, holding all the technical points constant. Yes, I'm basically saying that there's a certain technical snobbery at work in the anti-IDE movement. emacs is a wonder of software engineering, but when it comes to the command line, I use jed because it looks and feels like emacs, but doesn't take 5 seconds to load. No, I can't watch Towers of Hanoi or chat with Eliza, but if I feel an overwhelming need to have an interactive Lisp session, emacs will be right there waiting. Trying to compare emacs to a richly featured IDE like Eclipse is very much like trying to argue that writing raw HTML in Notepad is superior to using the latest web authoring tools. There was a time when such tools were in such poor shape that using a text editor was actually better. But those immature tools have matured, in both arenas, and the only real reason to not use them is simple stubbornness.

EMACS vs. Eclipse

David B. Heid: Trying to compare emacs to a richly featured IDE like Eclipse is very much like trying to argue that writing raw HTML in Notepad is superior to using the latest web authoring tools.

The risk inherent in this statement is that EMACS can be as richly featured an IDE as someone is willing to invest the time in making it, and in the case of something like SLIME, people are investing a great deal of time indeed. I still urge people to give that 150MB video a look to see what I mean.

Well...

The risk inherent in this statement is that EMACS can be as richly featured an IDE as someone is willing to invest the time in making it, and in the case of something like SLIME, people are investing a great deal of time indeed. I still urge people to give that 150MB video a look to see what I mean.

Eclipse can be as richly featured as people are willing to make it, and judging by the plugins available, people have been willing with a vengeance. The difference is a console app vs. a windowed app. Let me describe what I see in my Eclipse window, and you compare it to what you see in your favorite emacs mode...

Taking up most of the space is a code window syntax highlighted in my favorite color scheme (which happens to be the Turbo C++ IDE default colors). Non-essential portions of the code have been folded into one line (imports, header comments, etc.). Decorations in the left gutter indicate which methods are overriding or hiding base class methods. Decorations in the right gutter indicate the location of TODO comment items within the file, as well as the location of warnings and errors (Java may not be an interpreted language, but Eclipse makes it extremely interactive on the syntax level). Each source code tab has an icon indicating which editor has opened that file (the default Java editor, a Swing GUI builder, a MyEclipse editor, etc.), and a decoration on the icon indicating whether that file has any warnings or errors in it.

The remainder of the space is taken up by the project window which shows me all the files and resources in the project in a treeview. Each file with errors or warnings is decorated with an icon, and those are propagated up to the package and project levels. I can see which jar files in my project have references to source or documentation specified, so that I know whether I can hit F1 on a method call into that jar and get help or step through its implementation. For projects in source control, I can see the path to the project within SCCS, I can see which files have been changed but not committed, and which files have been added but not committed. I can also see the version number of the file and what date it was last committed, as well as by whom. There are decorations for resources that are in SVN, and for the various project types (Java, simple, or some other type).

All of this information is available at a glance in a typical snapshot of my development window. I'm fairly certain that this amount of information would be difficult, if not impossible, to provide in an easily understandable format within emacs. And it's not really a fair comparison, because emacs is essentially a console program. But then, that's kinda my point.

Definitions

I think you'd have to define what you mean by "essentially a console app" in an era when EMACS has been doing windows, multiple fonts, colors, and even images for about a decade now.

I honestly don't know what else to say. I use Eclipse 3.1; I use GNU EMACS 21.x.y. I tend to do my Java development in Eclipse and everything else in EMACS, and not once have I found myself saying "Gee, I wish __________ Mode in EMACS did __________ like Eclipse does." On the contrary, I have found myself having to hand-integrate processes into Eclipse painfully, but that has more to do with Java's relatively poor support for interacting with other processes in the system than with Eclipse. I also wasn't claiming that Eclipse isn't extensible or that it isn't more popular than EMACS, so I don't see the relevance of the plug-in point. My point was that EMACS + a good mode or collection of modes can be, and for some of us is, a perfectly good IDE, that's all. I fail to imagine why this is even controversial, given EMACS' longevity and popularity in hacker circles.

console

"EMACS has been doing windows, multiple fonts, colors, and even images for about a decade now"

but it's still a console app: you can do all that (except for the images) in a console as well. emacs -nw

Being a console app means that, yes, it has windows, but they are just more console output in bigger windows.

I understand that he's trying to say that a GUI-designed IDE like Eclipse makes far better use of the available visual space by having useful information nicely formatted in the form of little icons, tabs, status bars, tooltips and drop-downs, rather than a plethora of confusing tiny windows showing nothing but loads of disparate text, as in an essentially console app like Emacs...

But That's Just It...

rmalafaia: I understand that he's trying to say that a GUI-designed IDE like Eclipse makes far better use of the available visual space by having useful information nicely formatted in the form of little icons, tabs, status bars, tooltips and drop-downs, rather than a plethora of confusing tiny windows showing nothing but loads of disparate text, as in an essentially console app like Emacs...

EMACS doesn't impose "a plethora of confusing tiny windows showing nothing but loads of disparate text." I've got drop-downs, I've got icons, tabs, status bars, tooltips... I think the problem here is lack of familiarity: if you aren't using one of the better IDEs in EMACS, then you might not be aware that you can. :-)

measuring

'Do we measure productivity in terms of volume of code produced? '
a manager measures productivity that way

a programmer measures productivity in terms of how little code was produced that did the task to spec
:)
ok, maybe not, but it sounded pithy, dammit.

Problems with libraries?

If I start coding in libraries that I have never used before, I have more problems than just locating the particular property or method that I need. I have the problem of learning that library, so that I don't shoot myself in the foot.

I'm curious - what problems do you typically run into? My experience with libraries has been that the bulk of time goes into looking up method names, types, and parameters. Yes, there are occasional gotchas. But if I can cut method lookup time by a factor of 10, that gives me a ridiculous amount of time to read the documentation, view the source code, step through a debugger, and contact the original developers when I do run across that obscure library "feature".

"Treat every problem as if it can be solved with ridiculous simplicity. The time you save on the 98% of problems for which this is true will give you ridiculous resources to apply to the other 2%."

There are no IDEs for C that are

There are no IDEs for C that are of the same caliber as Visual Studio or Eclipse? What about Visual Studio? Is that not as good as Visual Studio?

wouldn't that be for C++?

wouldn't that be for C++?

anyway, there's always Emacs, which has a pretty powerful and capable c++-mode...

or Anjuta, which I hope eventually becomes really good...

Visual Studio supports C. As

Visual Studio supports C. As for using Emacs, I'm not aware of any stand-alone debuggers that could be integrated with Emacs (or XEmacs) to provide Visual Studio's "Edit and Continue" style of interactive development, which is why I'm currently using both.

There are very nice IDEs

Commercial Common Lisp IDEs (Allegro or LispWorks) are very nice (no flames about how CL "is not FP").
Bigloo Scheme also has a very nice development environment. Oz (multiparadigm) has a pretty complete environment based on Emacs (goes beyond the usual stuff). So, part of the problem is that people simply don't know this stuff exists. It's hard to compete with some PR departments.
I think, eventually, the FP community will get there. However, the sheer number of people using Java is hard to beat.

Maybe the answer lies in metaprogramming

Would you entertain the idea that languages that support some form of metaprogramming ease the pain of not having a supercharged IDE?

The common understanding is to use tools when programming in such an environment. Development proceeds not so much in language X as in language X within tool Y. I think Ruby on Rails takes the opposite approach, where the underlying code is as terse as possible using Ruby constructs and metaprogramming (like has_many above), perhaps because Rails aficionados lack a custom-fit IDE like VisualStudio.NET.

Could Rails have been built without Ruby?

So it seems mainstream programmers are learning (Ruby is mainstream in my book...)

A good question.

How about the inverse: Does having a good IDE (one with automated support for refactoring and other goodies, not to mention as much support as possible for incremental/interactive development even if using an edit-compile language) ease the pain of not having good metaprogramming support? Is editor meta-programming a valid notion? What about visual programming tools?

And what about environments with the best of both worlds?

Performance of Erlang

First of all, the presentation implies that Erlang was written with the express intent of supporting real-time systems. But how many mainstream FPs have this as a design goal?

How many mainstream imperative languages have that as a design goal? Yet people (ab)use C in real-time systems.

That's a bit of a peculiar comment for the FP touted as being a performance go-getter. Maybe GC isn't always as fast as manual memory management. ;> Which is not to say that GC is bad; it certainly has its place. Just that my original point mostly stands.

Armstrong's notes were written in 1998, so they are a bit out of date. His thesis is a recommended thing to read. The High Performance Erlang Group at Uppsala University has done a lot of work on an Erlang compiler to native code for several platforms. I have not been able to find any claims in their articles that the GC or recursion are the big performance problems in Erlang.

... whereas "commodity programmer" is unfortunately the rule rather than the exception. ... (.. establishing causality there might be tricky).

Your causality comment was about Java, but it goes for the commodity programmer as well. Did the bad programmers show up before the bad languages, or the other way around?

Fast Erlang

First of all, the presentation implies that Erlang was written with the express intent of supporting real-time systems. But how many mainstream FPs have this as a design goal?

How many mainstream imperative languages have that as a design goal? Yet people (ab)use C in real-time systems.

The point being, naive implementations of FPs are not going to be performance-competitive, as we all know. Whereas, even a naively constructed C compiler is going to give reasonable performance. It's the nature of the languages involved.

The High Performance Erlang Group at Uppsala University has done a lot of work on an Erlang compiler to native code for several platforms. I have not been able to find any claims in their articles that the GC or recursion are the big performance problems in Erlang.

In this case, they felt that emulation was the biggest performance hog, which should come as no surprise. Also, it's the lowest-hanging fruit, from an effort perspective. There's not much optimization that can be done with the GC since it has soft real-time requirements, and tail-call optimization is the most obvious thing you can do to help recursion, and FP compilers pretty much have to implement that to be remotely competitive. But the other obvious optimization for function calls is inlining, and the HiPE paper specifically implies that this is important because hot-code loading makes it harder. On the other hand, C lived without inline functions for decades and still offered pretty good performance, while inlining in C++ is one of the things that actually makes it faster than C in many cases (as well as function objects). The point being that FPs can't really invoke inlining as an optimization because they need to do it just to get acceptable performance.
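To make the function-object remark concrete, here is a minimal C++ sketch (my own illustration, not from any of the cited papers). qsort can only reach its comparator through a function pointer, so every comparison is an indirect call; std::sort is instantiated with the comparator's concrete type, so the comparison is a direct, inlinable call:

#include <algorithm>
#include <cstdlib>
#include <vector>

// Comparator for std::qsort: called indirectly through a pointer.
int cmp_ints(const void* a, const void* b) {
    int x = *static_cast<const int*>(a);
    int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);
}

int main() {
    std::vector<int> v = {3, 1, 2};

    // C style: opaque pointer, usually not inlined.
    std::qsort(v.data(), v.size(), sizeof(int), cmp_ints);

    // C++ style: the lambda's type is part of the template
    // instantiation, so the comparison can be inlined.
    std::sort(v.begin(), v.end(), [](int a, int b) { return a < b; });
}

Whether the indirect call actually survives optimization depends on the compiler, of course; the point is only that the template hands the optimizer the comparison's identity statically.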

Did the bad programmers show up before the bad languages, or the other way around?

Depends on what you call a "bad language". You could blame a lot of things on BASIC, but is it really a "bad" language? The bad programmers showed up when the economy demanded more programmers and the market was not able to ramp up production while maintaining quality. Blame it on the dot-com bubble. Those dang internet market investors created a generation of bad IT professionals!

short and sweet answer

"You could blame a lot of things on BASIC, but is it really a "bad" language?"

yes.

Inlining

I feel a need to reply to this post with a couple of small remarks. A naive compiler implementation could trivially be made worse; there is no point in discussing how bad one could make a compiler.

.. C lived without inline functions for decades and still offered pretty good performance,

C lived without the *keyword* inline for decades. I do not have a copy of some old C standard lying around, but I doubt there is anything explicitly forbidding compilers from inlining functions as they see fit.

The point being that FPs can't really invoke inlining as an optimization because they need to do it just to get acceptable performance.

This brings up the point of partial evaluation. I don't have any references handy at the moment though.

Not standard

C lived without the *keyword* inline for decades. I do not have a copy of some old C standard lying around, but I doubt there is anything explicitly forbidding compilers from inlining functions as they see fit.

And because you were not guaranteed that all your target platforms would inline your functions, most C programmers elected to use macros, which of course come with a whole host of problems (no typechecking, multiple evaluation of args, etc.).
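The multiple-evaluation problem in one small, hedged, textbook-style C++ example (MAX_MACRO and max_fn are illustrative names of my own):

#include <iostream>

#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

inline int max_fn(int a, int b) { return a > b ? a : b; }

int main() {
    int i = 10, j = 10;
    int x = MAX_MACRO(i++, 5); // expands to ((i++) > (5) ? (i++) : (5)):
                               // i is incremented twice, and x is 11
    int y = max_fn(j++, 5);    // j++ is evaluated exactly once:
                               // j becomes 11, y is 10
    std::cout << i << ' ' << j << '\n'; // prints "12 11"
}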

Interesting...

Peter Jonsson said:

I have not been able to find any claims in their articles that the GC or recursion are the big performance problems in Erlang.

While this other reference you cited isn't about Erlang per se, it is interesting to note that two of the things Boquist explicitly sets out to do are:

We develop an interprocedural register allocation algorithm, with a main purpose of decreasing the function call and return overhead.

...and...

A combined compile time and runtime approach is used to support garbage collection in the presence of aggressive optimizations...without imposing any mutator overhead.

And two of the keywords for the paper are "interprocedural register allocation" and "garbage collection".

Since the Boquist paper is about compiling functional programs to native code, it stands to reason that it would not address issues like emulation optimization. From that, I conclude that were Erlang a natively compiled language, it would be subject to the same issues addressed in the Boquist paper.

Strict vs lazy

Since the Boquist paper is about compiling functional programs to native code, it stands to reason that it would not address issues like emulation optimization.
I agree that it does not address emulation optimization.
From that, I conclude that were Erlang a natively compiled language, it would be subject to the same issues addressed in the Boquist paper.
No, Boquist's thesis is about lazy functional languages. Erlang is a strict, dynamically typed language, so the problems are not identical. Ennals' thesis (quoted in another post) basically claims that if one wants performance, laziness is a bad idea. The interprocedural register allocation is something I find interesting in itself, but calling that code from 'the outside' (not compiling the entire program at once, that is) opens a can of worms. Efficiently Compiling a Functional Language on AMD64: The HiPE Experience makes the claim
HiPE can be used as either a just-in-time or ahead-of-time compiler, and compilation can start either from bytecode or from source.
Section 5.6 of this says there are new instructions added for switching between native and emulated mode, so I am not sure your claim 'were Erlang a natively compiled language' is entirely correct. (The paper seems to have been written in 2001 and talks about the lack of SSA form in the compiler; the AMD64 paper describes the compiler as using SSA form internally. I am guessing it has progressed quite a lot in the last couple of years.)

Upon rereading the article I notice section 5.1 in the AMD64 paper talks about tail recursion and branch prediction. I'll give your earlier claims about functional languages converting recursion to iteration some credit, but apparently the implementation (for Erlang at least) keeps the recursion in an "optimized" form.

Section 8.1.1 of Boquist's thesis states that the question discussed in chapter 8 is how to find the root pointers. This question needs to be addressed for all garbage-collected languages, I believe, functional or not.

Sorting, sorting, sorting

the C++ version [std::sort] will use a very efficient merge sort or heap sort, using few heap allocations and a very minimal number of copies

The specification of std::sort was tailored towards quicksort; it must not use more than a constant amount of extra memory (except stack depth, I think), it does not have to be stable, and it can run in O(n^2). Stepanov's implementation, OTOH, uses "introspective quicksort", which keeps track of the stack depth so that it can detect n^2 behaviour and switch to heapsort, thus guaranteeing O(n log n).
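To illustrate the introspective idea, here is a toy C++ sketch of my own (not Stepanov's actual code; pivot selection and partitioning are deliberately naive): quicksort as usual, but once the recursion depth exceeds roughly 2*log2(n), assume we have hit a bad-pivot case and finish the range with heapsort, capping the worst case at O(n log n):

#include <algorithm> // std::make_heap, std::sort_heap, std::swap
#include <cmath>
#include <iostream>
#include <vector>

void introsort(std::vector<int>& v, int lo, int hi, int depth) {
    if (hi - lo <= 1) return;
    if (depth == 0) { // quicksort is misbehaving: fall back to heapsort
        std::make_heap(v.begin() + lo, v.begin() + hi);
        std::sort_heap(v.begin() + lo, v.begin() + hi);
        return;
    }
    int pivot = v[lo + (hi - lo) / 2]; // middle element as pivot
    int i = lo, j = hi - 1;
    while (i <= j) { // Hoare-style partition
        while (v[i] < pivot) ++i;
        while (v[j] > pivot) --j;
        if (i <= j) std::swap(v[i++], v[j--]);
    }
    introsort(v, lo, j + 1, depth - 1);
    introsort(v, i, hi, depth - 1);
}

int main() {
    std::vector<int> v = {5, 3, 8, 1, 9, 2, 7};
    int n = static_cast<int>(v.size());
    introsort(v, 0, n, 2 * static_cast<int>(std::log2(n) + 1));
    for (int x : v) std::cout << x << ' '; // 1 2 3 5 7 8 9
}

Real implementations also cut over to insertion sort for tiny subranges; that refinement is omitted here for brevity.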

For a really nice article on implementing quicksort, see "Implementing Quicksort Programs," Robert Sedgewick, Communications of the ACM 21, 10, 1978, although it is a little dated. Also, I would recommend Sedgewick's homepage, http://www.cs.princeton.edu/~rs/, for links to more recent research.

--
Mikael 'Zayenz' Lagerkvist

Yup

Sorry, I was being lazy. You're right that merge sort is implemented in stable_sort(), not sort(). I knew that std::sort() was adaptive, but I forgot that the primary algorithm was quicksort.

My comments on OO

Since someone asked :), my two cents on the subject.

First, I think OO has contributed mightily to the discipline of programming; subtype polymorphism (the key defining feature of OO) is MARVELOUSLY useful in many programming tasks.

That said, a bunch of criticism of OO itself, OO as implemented in many languages, and OO practitioners:

* OO is often touted as the One True Paradigm; IMHO it's not. (No paradigm deserves that distinction). This isn't a fault of OO itself, but much of the anti-OO backlash springs forth due to the belief, common in industry, that OO is somehow magical and everything else unnecessary.

* Many OO languages did things (such as conflating inheritance with subtyping) which, in retrospect, might have been mistakes. C++ and Java are the primary examples of this.

* In the programming communities/cultures of many OO languages, one finds the "sea of objects" approach to programming, in which oodles of (unmanaged) widely mutable state are scattered 'bout the code willy-nilly and changed here and there, making many forms of reasoning about the code difficult. Some PL cultures (Smalltalk and Java) go further, encouraging concurrent use of all of these mutable variables. Getting such programs correct is a difficult task even for experts, let alone your average junior-level IT programmer.

* OO communities also over-use inheritance; I often see it used in many places where composition or other forms of aggregation are more beneficial.

Object orientation has failed.

Object-orientation has failed.

This is evident from the sheer number of objects being written over and over that do the exact same thing, with only minor differences that are nonetheless important enough to prevent reuse of code.

It is also evident from the sheer number of failures, lines of code, and debugging hours spent in modern-day projects.

Object-oriented programming is helpful because it helps organize code: instead of putting everything into 'if' statements, you put code into messages with the same interface. That's the good part of OOP. The bad things about OOP are:

  • run-time matching of type and code is done only on one argument of a procedure.
  • mixing and reuse in the context of implementations is very very difficult and time consuming (I've had such a problem recently and asked the LtU community; how would you solve it?)
  • it's very hard to think about inheritance in terms of interfaces; most people easily fall into the trap of thinking about implementation-state inheritance.
  • Categorization with classes fails: most entities have more than one categorization.

Don't be fooled by the progress of automation tools and programming language environments. If there were a C-like language with garbage collection and exceptions but without object-oriented features, it would still be the most-used programming language. The success of Java is largely because of its 'safety', which allows code to be developed quickly without introducing quirky memory-related bugs that can bring whole systems down, not because it is object-oriented. The only reason I like C++ over Java is not because it is OO but because of templates and operator overloading (and because I can call C functions directly!).

So the only good thing about OOP is that you are forced to put things together...something that disciplined programmers can easily achieve without explicit support from OOP.

The other side of the story, functional programming, is a whole different ball game. Functional programming deals with everything as values (there are no pointers, nor references), and all it does is allow the programmer to declare some transformation rules: a value is transformed into another value with the help of real mathematical functions. Functional programming does not allow state change, for reasons of referential transparency.
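A minimal C++ rendition of that "values in, values out" style (my own sketch, standard library only): the input is never mutated; a new value is produced instead.

#include <algorithm>
#include <iostream>
#include <vector>

// A pure transformation: the same input always yields the same output,
// and the argument is left untouched.
std::vector<int> doubled(const std::vector<int>& xs) {
    std::vector<int> ys(xs.size());
    std::transform(xs.begin(), xs.end(), ys.begin(),
                   [](int x) { return x * 2; });
    return ys;
}

int main() {
    const std::vector<int> xs = {1, 2, 3};
    for (int y : doubled(xs)) std::cout << y << ' '; // 2 4 6
}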

In fact, FP recognizes the number one problem with programming: program invariants can easily be forgotten. Most successful imperative programmers thrive on having better memory than others: they tend to remember more of the invariants of an imperative program, thus producing code with fewer bugs.

Of course, current functional programming languages suffer badly, in my opinion, in many ways: a) their syntax tends to be peculiar, b) the available libraries leave a lot to be desired in terms of quality/documentation/support, c) tools are not available, d) they are slow. But all these problems will be overcome when someone works out the theory of how to convert a functional program into an imperative one, keeping what's needed and updating whatever must be updated (ahem, I suspect it is doable, as I have said in another thread).

By the way, all the above comes from a cast-in-stone C++ programmer.

Pick your paradigm

a) their syntax tends to be peculiar

Count me in the camp that thinks that all syntax tends to be peculiar. Every PL has its own set of dark corners to contend with. If people spent as much time learning and working with Haskell and ML as they had on the C family of languages, they'd probably come to the conclusion that C++ has a strange syntax. So the question of peculiar is colored by the countless hours people have spent hammering away at their current favorite language.
b) the available libraries leave a lot to be desired in terms of quality/documentation/support,
The availability of libraries, docs & support is mostly a function of popularity. Had IBM spent the resources on Smalltalk that it did on Java... Or had MS invested money in making a Visual Studio version of Haskell... And had a million programmers been working in thousands of different domains... In short, this is just a recipe for saying that languages are better because people have made them better (popularity breeds popularity).
c) tools are not available
Ditto. But any tools that arise from the disparate PL communities will invariably be usurped in some respects by the Borg. After all, if people like a tool or language feature that is available in a not-oft used language, someone will invariably try to emulate that in their popular language.
d) they are slow.
And if the resources for optimization were poured into these unpopular languages... But then speed is but one consideration in most domains. In my observations, programmers tend towards being very egotistic, thinking that the way they do things is best. And thinking that the kinds of problems they solve are somehow typical of all the different kinds of problems. And thinking that the compromises that they are wanting to see in a PL are the kinds of compromises that all PLs should make.

Personally I want it all. I want speed. I want maintainability. I want vast libraries. I want excellent tools. I want elegance. I want a language best geared for the domain I work in. I want more money for less effort. I want... I want... I want...

Given that no language can deliver on all these, I have to choose the language that best suits the problem, and even then compromise based on the legacy software that's been written and the manpower and knowledge at our disposal. I find it tiring to say that any language meets my requirements, given that a language to suit all these requirements does not exist (and I wonder if it ever will).

In the interim, I keep my eyes out. I use about a dozen programming languages at varying levels for paying the bills. I spend time exploring many more, because PLs happen to be a part-time hobby of mine. I find that through time, I have become aware that I know less-and-less about more-and-more. All the languages that I encounter have their strengths and weaknesses. Sometimes it grates me when I have to concentrate on the workarounds for the weaknesses.

Overall, I think it would be a mistake to say that OOP has failed. I also think it would be a mistake to say that FP will never gain traction. Programming languages tend to interact and breed with each other in mighty strange ways. C++ is barely 20 years old, and has only been standardized in the last 10. Templates were barely usable a mere 10 years ago, and Boost is of fairly recent vintage. Java is just over 10 years of age, and it's had to reinvent itself several times during that period in order to survive. Python and Ruby and PHP and... are just toddlers.

Then there are programming languages that never gain much following. Yet some of these languages are highly influential. The languages that you use have had to bend themselves because of the likes of Simula, Smalltalk, Clu and ML. You may not program much in these languages, but the languages that you use are very much a response to the perceived threats.

And then there are the kinds of problems that you have not yet dealt with. Very close on the radar, and possibly threatening to blow away the kinds of programming we do, are these things that only a few foresee. My job as a programmer post-internet is very much different than it was prior to that time. Hard to see where the next crater will open up, though I'd speculate it'd be something to do with concurrency and distributed processing (which would totally rearrange the whole equation about which languages produce fast code).

Bottom line for me: opinions on which languages are fast and which ones have the best libraries quickly become boring. They assume a static environment of programming, an environment which, given 10 to 20 years, can go from Cobol for business apps, to Visual Basic, to Java and C#.

Byzantine syntax agreement

If people spent as much time learning and working with Haskell and ML as they had on the C family of languages, they'd probably come to the conclusion that C++ has a strange syntax. So the question of peculiar is colored by the countless hours people have spent hammering away at their current favorite language.

Absolutely. I've never understood the contention that one language or another has a "peculiar" or "byzantine" syntax (with the exception of those languages designed to have an impenetrable syntax). It's rarely a case of a syntax being bad or complex, but rather of not being what one is used to. Axilmar talks about FP syntax being peculiar, but I've seen just as many complaints about the syntax of Smalltalk and Objective C simply because they don't look enough like C++ or Java.

And, as noted above, it works the other way as well: I came up with Pascal, C, and COBOL, but after using OCaml almost exclusively for the past five or so years, going back and reading C-style syntax has become almost painful, much like reading Ernest Hemingway...

-30-

So the question of peculiar i

So the question of peculiar is colored by the countless hours people have spent hammering away at their current favorite language.

I agree that it's a matter of familiarity, but why don't FPs follow more conventional syntax? What's wrong with the classic way of invoking functions:

identifier ( arguments )

I think if FPs like Haskell or ML used more conventional syntax (by more conventional, I mean closer to more mainstream programming languages), they would be more easily accepted.

The availability of libraries, docs & support is mostly a function of popularity.

Again, I agree. But, at the end of the day, I can't persuade my boss to choose OCaml over Java. It's not only the different approach, but the quality of the tools.

Personally I want it all. I want speed. I want maintainability. I want vast libraries. I want excellent tools. I want elegance.

Me too..me too!

Given that no language can deliver on all these

I am not sure efforts are focused enough on truly bringing us a language with all the capabilities we want. The problem is that the forces behind PLs are distant: on one side there is the commercial sector, which cares about the 'do it now', and on the other the academic sector, which cares about 'theoretical correctness'. I think there must be efforts from both sides to come together, for the benefit of us all.

They assume a static environment of programming.

That's because it is... actually static! I have experience in real-time application programming for the military, as well as web applications. The same principles govern both, although the means are different. Programming is just the same in all environments, actually.

more of the same

"I think if FPs like Haskell or ML used more conventional syntax (by more conventional, I mean closer to more mainstream programming languages), they would be more easily accepted."

But then, it would just be more of the same. And, come on! Syntax can be much more expressive than the common syntax following a language developed in the 70s (C). We have far better parsers and compilers!

But then, it would just be mo

But then, it would just be more of the same

Yes, that's the point: C syntax with functional programming. It would rock! It would enable imperative programmers to easily get into functional programming.

We have far better parsers and compilers!

C is very easy to parse...and it is context free, unlike C++.

Of course, some changes would be made to it to cope with FP features.

Actually, I think you'll find

Actually, I think you'll find that C is not context-free - for example, multiple declarations of a variable are an error.

However, a C-style syntax would have the problem that it was designed for a completely different style of programming, in a language that simply lacks many of the features one requires in a functional programming language, even something as simple as currying. This is why we have different syntaxes. Sticking to a C-style syntax would not rock - it would be so obviously flawed that imperative programmers would, probably rightly, conclude that they would be better off sticking with C++.

If you don't want to take my word, try this exercise: construct a C-like syntax for Common Lisp, a highly impure, object-oriented functional language supporting global mutable state. Afterwards, compare it with Dylan, and ask yourself a) why anyone would prefer it over Dylan, and b) whether anyone would prefer it over s-expressions, given the history of suggested syntaxes for Lisp.

C syntax does not rock at all

C syntax with functional programming. It would rock!

See, this is how tastes differ: I think it would suck big time - waaaay too verbose. And type expressions would be totally incomprehensible.

But if you like that approach you could have a look at Scala, which explores that road [on edit: except for a much more readable type syntax].

C is very easy to parse...

Not really; lots of quirks, inconsistencies and ambiguities. I do not believe it's context-free either. And beware its terrible types-become-keywords hack. More seriously, C's syntax simply does not scale to more expressive languages (which already becomes obvious by looking at C++).

in my country, all I see is C, C++, Java and Visual Basic...

That sounds bad, although I would at least expect familiarity with the Algol/Pascal/Modula family of languages.

Parsing C.

Others have already commented on how 'easy' it is to parse C. The behaviour of the preprocessor is part of C99, and until one has implemented a preprocessor conformant to C99, I'd say one should be careful about stating that it is easy. The devil lies in the details.

Syntax

why don't FPs follow more conventional syntax? What's wrong with the classic way of invoking functions: identifier ( arguments )

That it may be too inflexible for higher-order programming? Hence it was generalised in most FPLs (by either making argument tuples first-class, or by preferring currying right away).

Moreover, I somewhat question calling it the "classic" way - in maths you tend to omit parentheses, too.
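For what it's worth, curried application can even be mimicked with parenthesised call syntax; a small C++ sketch of my own (illustrative only) of why applications chain once functions return functions:

#include <iostream>

int main() {
    // curried add: takes one argument at a time
    auto add = [](int x) { return [x](int y) { return x + y; }; };

    auto increment = add(1);            // partial application
    std::cout << increment(41) << '\n'; // 42
    std::cout << add(2)(40) << '\n';    // 42; in ML or Haskell: add 2 40
}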

I think if FPs like Haskell or ML used more conventional syntax (by more conventional, I mean closer to more mainstream programming languages)

As far as I can tell, (S)ML has a very ALGOLish syntax. Nevertheless I prefer Haskell's - which feels not that different from Python. So I don't understand your argument, except if you equate mainstream with C and its offspring.

in maths you tend to omit par

in maths you tend to omit parentheses, too

It's very hard, though, for people who are not mathematicians. I was showing the Haskell sort algorithm to a colleague of mine, and he kept staring at it as if it were Chinese.

So I don't understand your argument, except if you equate mainstream with C and its offspring

I don't know about the rest of the people here, but in my country, all I see is C, C++, Java and Visual Basic... I doubt I could find another programmer within, let's say, a radius of 100 km who has even heard of ML, Haskell and the like. I suspect that's the case in most other countries, judging from the ads I see in computer magazines.

Getting them to switch syntax is the easy part

Getting them to think differently is the more difficult part of the journey. We've had a whole host of C-based object-oriented languages around for a couple of decades. Yet the quality of object oriented programming is still pretty (being kind here) uneven. And the chasm between procedural and object-oriented is much shallower than the one that's required to be crossed for FP.

Functional programming requires what is, for many, counter-intuitive. It asks me to give up my state-bucket tool - a tool I've used to solve hundreds of thousands of problems (big and small). Cutting the Pavlovian response of using state to code a quick-and-dirty solution requires a lot of stimulus in the opposite direction. The biggest problem you face when using a familiar syntax (or an add-on to an existing PL) is that you have a lot of habits that have to be broken. Because the programmer is in the "familiar", breaking those habits is hard because of the temptations.

Probably varies from programmer to programmer, but my experience is that learning a new syntax and a new PL (even one with a strange syntax) is usually the shortest path to actually learning new Concepts, Techniques and Models. For example, I programmed with C++ compilers for a couple of years. In that time, object-oriented programming just never clicked in my mind. I had programmed in C for 5 years prior to that, and C++ just made for a better C. Then I finally got around to messing with Squeak and Eiffel, and I really didn't have to do much work in these languages in order to finally get the synapses aligned. (Of course, the downside of all this is that one usually comes to see their favorite PL in a new light - and it's not necessarily favorable.)

That said, those who want to learn FP without stepping outside of their syntax comfort zone have plenty of material to work with. For Perl Programmers - find MJD's book on Functional Programming for Perl. For Java Programmers - go look at Nice. For C++ programmers - go look at Boost and FC++. (For VB Programmers, well, just carry on with VB.) And that's just the tip of the iceberg. Of course, each of these FP communities will likely point you to Haskell resources. So you'll probably end up having to grok the Haskell syntax to understand where we've been and where we're going. Me, I like going straight to the source of ideas, so I'd just as well go learn Haskell, and then come back to these more familiar syntaxes once I've figured out the full potential of the paradigm (then again, sometimes I wish I could stay on the mountain and not have to go back to work in the mines).

Less than Failure, but Less than Success

I think the single most important concept in OOP is encapsulation. And for any stateful paradigm, I think we can all agree that encapsulation is a Good Thing(TM). It cracks me up when dogmatic C programmers insist that you can code in the OOP paradigm in C by using opaque types, as if that were a first-class form of data hiding.
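For reference, this is the opaque-type idiom being alluded to, sketched as a single C++ translation unit with illustrative names of my own. In the real two-file split, struct Stack is defined only in the implementation file, so client code cannot name its fields at all:

// --- what would live in stack.h: clients see only this ---
struct Stack; // incomplete type: layout hidden
Stack* stack_new();
void   stack_push(Stack* s, int v);
int    stack_pop(Stack* s);
void   stack_free(Stack* s);

// --- what would live in the implementation file ---
struct Stack {
    int data[64];
    int top;
};

Stack* stack_new()                 { return new Stack{}; } // zeroed
void   stack_push(Stack* s, int v) { s->data[s->top++] = v; }
int    stack_pop(Stack* s)         { return s->data[--s->top]; }
void   stack_free(Stack* s)        { delete s; }

int main() {
    Stack* s = stack_new();
    stack_push(s, 42);
    int v = stack_pop(s); // 42
    stack_free(s);
    return v == 42 ? 0 : 1;
}

Whether that counts as first-class data hiding is, of course, exactly the point in dispute.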

Inheritance is problematic because it's a powerful tool that allows you to do many things, many of which you are probably better off not doing. But if anything, I would say that the main failure of inheritance is that it is often used as a poor man's type parameterization. In my opinion, templates/generics are what come to the rescue of inheritance abuse. Not only do templates rescue the programmer from the singly rooted object hierarchy (or the oppressive Mother of All Base Classes), but they work synergistically with inheritance to build powerful, reusable components, which is what OOP originally promised.

Witness Boost. It's a collection of libraries that actually delivers on the reusability promise of OOP, but it took type parameterization to make it a reality. However, templates without inheritance would have made the whole framework difficult, if not impossible. I believe that policy-based design gives imperative programmers a tool for building generic components on par with what FP has to offer. It's not quite as powerful as pure FP, but it gets about as close as you're going to get within a solidly imperative paradigm that lacks native FP support. Of course, it does so by leveraging the FP aspects of the template engine.
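A tiny sketch of what policy-based design looks like, with illustrative names (in the spirit of Alexandrescu's formulation, not code taken from Boost): the behaviour is a compile-time parameter rather than a virtual call through a base class.

#include <iostream>
#include <string>

struct StdoutLogger {
    static void log(const std::string& msg) { std::cout << msg << '\n'; }
};

struct NullLogger {
    static void log(const std::string&) {} // optimizes away entirely
};

// The logging policy is mixed in at compile time; no virtual dispatch.
template <typename LoggingPolicy>
class Connection {
public:
    void open() { LoggingPolicy::log("connection opened"); }
};

int main() {
    Connection<StdoutLogger> verbose;
    Connection<NullLogger>   quiet;
    verbose.open(); // prints
    quiet.open();   // does nothing, at zero run-time cost
}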

The problem with excluding pointers and references is that you are at the mercy of the compiler to make your code efficient. For programs that do not have complex data relationships, that is a realistic and achievable goal. For programs that involve large, intricate data structures that tax the available memory, believing that a compiler can produce the best code without any help is a bit of fanciful thinking.

The problem is that what's best in theory is often not what's best for performance. Recursion is expensive, which is why FPs must translate it to iteration whenever possible. I see the problem as similar to why we even need relational databases. OOP factors programs by code, separating out data into objects and operating on that data with common functions. Relational theory factors programs by data, separating data into its components and reassembling it with code. Almost every non-trivial normalized database stores data in a form that is not directly suitable for end-user consumption. And yet RDBMSes only get more popular every year, for the simple fact that factoring the data is the most efficient way to deal with it on the storage end, even though it gets reassembled into objects on the client end.

We can whine all we want about how an object-based database would obviate the need to perform queries when bringing data into an application for use. The simple fact is, storing data as objects is not as efficient as storing it by components. In the same way, we can point out that FP is more logically consistent and mathematically elegant; but assembly language is vehemently imperative, and that is the ultimate target of the FP compiler. Any language that lets you get closer to that bare metal is going to have certain advantages, whether the language overall merits more attention or not.

So like it or not, I'm convinced that the imperative paradigm is going to be with us for a long time.

Performance of functional languages.

The problem is that what's best in theory is often not what's best for performance. Recursion is expensive, which is why FPs must translate it to iteration whenever possible.
I have seen several comments from you similar to the one above; my conclusion is that you do not trust your compiler. Considering how little faith you have in your compiler when it comes to producing efficient code, how can you trust it at all to produce correct code?

Performance in functional languages might be poor, but not for the reasons you think. Robert Ennals writes in his thesis:

Perhaps the biggest weakness of Optimistic Evaluation is that it is an implementation technique for non-strict languages. During the course of this research, we have come to the conclusion that, although non-strict languages seem superficially appealing, they are not, in general, a good idea. While Lazy Evaluation is often useful, we do not believe that it is wise to make it the default evaluation strategy for all expressions. Although much has been written about the supposed expressive beauty of non-strict languages, most non-strict programs we have investigated contain only a small number of expressions for which laziness is useful, and it is usually obvious which expressions these are. We now believe that, if a program makes essential use of laziness, then this should be a deliberate design decision, and the way in which laziness is used should be stated explicitly in the program text. This does not however make Optimistic Evaluation completely redundant. Optimistic Evaluation obtains many of its biggest wins when applying chunky evaluation to infinite data structures, a technique that is equally applicable in languages with explicit laziness. In addition, non-strict languages are just one example of a general class of problems in which a computer may find itself having to decide whether or not to do some work that may or may not turn out to be useful. We thus believe that, even if it turns out that non-strict languages are a bad idea, the basic principles of Optimistic Evaluation still have practical value.
There are other theses written on the subject as well; Urban Boquist's has an in-depth treatment of code optimisation for lazy functional languages. Several other posts have pointed out that, given the same amount of work, FP would be more on par with imperative languages.

Sea, Islands

A sea of strictness with islands of laziness always seemed the best arrangement to me. Though the Haskell adage that 'it keeps you honest' has some credence (w.r.t. keeping imperative features out of the language, or hidden).

Regarding relational databases

Relational theory factors programs by data, separating data into its components and reassembling it with code. Almost every non-trivial normalized database stores data in a form that is not directly suitable for end-user consumption. And yet RDBMSes only get more popular every year, for the simple fact that factoring the data is the most efficient way to deal with it on the storage end, even though it gets reassembled into objects on the client end.

We can whine all we want about how an object-based database would obviate the need to perform queries when bringing data into an application for use. The simple fact is, storing data as objects is not as efficient as storing it by components.

I just have to comment on this because I think you're giving the relational model much less credit than it deserves. The real benefit of storing data the way a RDBMS does is not that it's the most efficient storage mechanism. The reason why relational databases are so powerful lies in their extreme flexibility. In OO languages you have to encode all the access paths you need. This means that you have to think really carefully about how you need to access the data. The benefit of this is of course speed, but this comes at the cost of a great deal of flexibility. Predicting what paths through the data you will need is extremely hard, resists change, and tends to produce tangled structures that can be hard to navigate in. This was why network databases died out. OO suffers from exactly the same problems.

When your data is in a properly normalized relational database, you don't need to predict how you will access the data. Everything can be accessed; you just need to write a query. In some cases you may need to add an index or two to speed up access, but that's a very simple thing to do, and is completely independent of the rest of the system as such. It is this extreme flexibility that makes the relational model so important for data management. Of course, it offers a lot more. But in the context of this discussion I wanted to highlight this particular point.
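
To make this concrete, here's a toy sketch using Python's built-in sqlite3 module (the schema and data are invented for illustration). Nothing in the schema commits to an access path; an unanticipated question is just another query, and speeding it up is an index added after the fact:

import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         customer_id INTEGER REFERENCES customers(id),
                         total REAL);
    INSERT INTO customers VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 5.0);
""")

# A question nobody designed for: who spends the most?
for name, spent in db.execute("""
        SELECT c.name, SUM(o.total) AS spent
        FROM customers c JOIN orders o ON o.customer_id = c.id
        GROUP BY c.name ORDER BY spent DESC"""):
    print(name, spent)

# If this access path turns out to be hot, the fix is one statement,
# independent of the rest of the system:
db.execute("CREATE INDEX idx_orders_customer ON orders(customer_id)")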

Don't be mistaken...

...I use databases very heavily to put bread on the table, so I have all the respect in the world for the relational model. When you're talking about million+ record tables, it soon becomes apparent that storing full-blown objects is simply not going to cut it. But really, it has more to do with space efficiency than time efficiency. When you need to query three or more tables at once that are all in the million+ row range with low row selectivity, you start to think about denormalization schemes, and how much space you are willing to trade for some time.

Really, the only time that the relational model is outright faster is when you need to take statistics over one or a few relations. Otherwise, it's pretty much about the most space-efficient storage of the data and the fastest way to get at it once it's in that form. Also, indexes only help queries with high selectivity; a smart query engine will do a sequential scan for low-selectivity queries. For queries that reassemble objects from their attributes (meaning, joins over several tables), it would usually be faster to store the entire denormalized objects than to search each individual table. But it would also be much fatter because of the redundancy.

The Relational Model

How can you talk about the relational model being "faster"? That's like saying "Euclidean geometry is faster than non-Euclidean geometry." The relational model of database systems is a mathematical construct designed to let you abstract away from the data storage issues, which is the polar opposite of making things faster or slower.

As soon as you start "denormalizing for performance," you're tripping over a DBMS that's no longer letting you abstract properly. One ought to be able to deal with the performance issues (including physical table layout and indexing) completely separately from the logical design of the database itself.

not.

run-time matching of type and code is done only on one argument of a procedure

This depends on what OO language you are using. Dylan, Cecil, CLOS all support multimethods.

mixing and reuse in the context of implementations is very very difficult and time consuming (I've had such a problem recently and asked the LtU community; how would you solve it?)

This seemed to be poor design insight on your part and was answered adequately in that thread.

it's very hard to think about inheritance in terms of interfaces

Doesn't seem very hard to me. -- anyway, it is important to be able to distinguish interface from implementation. FP doesn't save you from this.

Categorization with classes fails: most entities have more than one categorization.

First, in Smalltalk or Objective-C you can use objects in any context where they support the messages that will be sent to them. The problem you have is with the nominal typing of C++ and Java. In Smalltalk, the type of an object is not its class.

Second, not all FP people would agree with you that most entities have more than one categorization.

By the way, all the above comes from a cast-in-stone C++ programmer.

By the way all this comes from someone who took Smalltalk as a base and implemented a language with currying, tail call optimization, list comprehensions, and who writes DSP code in C++ for his day job.

This depends on what OO langu

This depends on what OO language you are using. Dylan, Cecil, CLOS all support multimethods.

My comment was aimed at the most popular OOPLs: C++, Java, C#, VB, Perl/Python, in no particular order.

This seemed to be poor design insight on your part and was answered adequately in that thread.

Actually, nobody offered a solution - everyone spoke in generalities without giving a specific answer.

The problem you have is with the nominal typing of C++, Java

Yep, C++ and Java are the main targets of my comments.

By the way all this comes from someone who took Smalltalk as a base and implemented a language with currying, tail call optimization, list comprehensions, and who writes DSP code in C++ for his day job.

Bravo! We need more people like you! How did you achieve constant-time message lookup?

picky

First, in Smalltalk or Objective-C you can use objects in any context where they support the messages that will be sent to them.
Someone might misunderstand this and think in Smalltalk we can only use objects in a context "where they support the messages that will be sent to them".

In Smalltalk we can use objects in any context, period.

If they don't support the message sent to them, we get:
Unhandled exception: Message not understood:

same as Scheme, Erlang

In Smalltalk we can use objects in any context, period.

just as you can in any dynamically typed FP lang.

Goes for Oz as well


class Figure
   % otherwise(M) is Oz's catch-all: it receives any message M
   % for which no matching method exists
   meth otherwise(M)
      raise undefinedMethod end
   end
end

And IIRC, Ruby also supports a message-not-understood hook (method_missing).
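
Python behaves the same way; a minimal sketch (class names invented for illustration): by default an unknown message raises AttributeError, Python's flavor of "Message not understood", and __getattr__ lets you install a catch-all in the spirit of Oz's otherwise or Ruby's method_missing:

class Figure:
    pass

try:
    Figure().draw()                  # normal lookup fails...
except AttributeError as e:
    print(e)                         # ...Python's "Message not understood"

class ForgivingFigure:
    def __getattr__(self, name):
        # Called only when ordinary lookup fails: a catch-all handler.
        def handler(*args, **kwargs):
            print(f"message not understood: {name}")
        return handler

ForgivingFigure().draw()             # intercepted instead of raising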

let me rephrase

Let me rephrase. You made a qualified statement about Smalltalk which may have misled people into thinking that methods could only be used where they were "safe" (where the context implemented the message).

My point was that there's no such guarantee, and you can prevent misunderstanding by not qualifying the statement with "where they support the messages that will be sent to them".

of course

And neither would I want you to give the impression that dynamically typed FP languages are any different.

Object-orientation has failed

Object-orientation has failed.

Failed to do what?

This is evident by the sheer number of objects being written over and over that do the exact same thing, only with minor differences -- differences nonetheless important enough to prevent reuse of the code.

Compared to which paradigm, in which reuse is trivial and commonplace? The barriers to successful code re-use in most places are as much political/economic ("Who pays for this reusable component? Who oversees its development? Who is responsible for its maintenance? Who has the authority to modify it? And are app developers actually required to use it?") as technical.

It is also evident by the sheer number of failures/lines of code/debugging hours spent in modern-day projects.

And OO is somehow unique in this, how?

Object-oriented programming is helpful because it helps organize code: instead of putting everything into 'if' statements, you put code into messages with a shared interface. That's the good part of OOP. The bad things about OOP are:

* run-time matching of type and code is done only on one argument of a procedure.

True in languages like Java and Smalltalk (and to a lesser extent C++, though numerous multiple-dispatch libraries exist for it). Not true for things like CLOS or Dylan.


* mixing and reuse in the context of implementations is very very difficult and time consuming (I've had such a problem recently and asked the LtU community; how would you solve it?)

I'm not sure I understand this point.


* it's very hard to think about inheritance in terms of interfaces; most people easily fall in the trap of thinking about implementation-state inheritance.

I'll agree somewhat here; though this strikes me as a fault of current OO languages, not OO in itself.


* Categorization with classes fails: most entities have more than one categorization.

You've committed a sin related to the one you complain about above; classes are for object creation, as opposed to categorization.


Don't be fooled by the progress of automation tools and programming language environments. If there was a C-like language with garbage collection and exceptions but without object-oriented features, it would still be the most used programming language.

How about ML? It lacks subtype polymorphism but has GC. While usually considered a "functional" language, it permits side effects and can certainly be used as an advanced procedural language if you want.


The success of Java is largely because of its 'safety' that allows code to be quickly developed without introducing quirky memory-related bugs that can bring whole systems down, not because it is object-oriented.

The success of Java is largely due to a billion-dollar marketing effort by Sun Microsystems (and the fact that Java is "good enough"). Though I suspect that if Java weren't OO, it wouldn't have gotten a foot into the door (OO being in vogue in many IT shops at the time). Likewise, C/C++ owe much of their success to being the de facto systems language(s) for the Unix and Windows platforms.


The only reason I like C++ over Java is not because it is OO but because of templates and operator overloading (and because I can call C functions directly!).

Of course, C++ doesn't force you at all to use OO if you don't want. You can make extensive use of the above and never type the words "class" or "virtual". Consider the C() dialect of C++. :)


So the only good thing about OOP is that you are forced to put things together...something that disciplined programmers can easily achieve without explicit support from OOP.

If you think that's the whole point of OO, you have failed to understand OO.

[CS101 lecture on FP removed]

By the way, all the above comes from a cast-in-stone C++ programmer.

There are quite a few ways in which C++'s implementation of OO has been shown to be outdated by modern research on the topic. Which is OK--C++ is still genuinely useful (though lots of folks here think C++ is about as useful as a disposable diaper in a washing machine). But it is folly to dismiss OO based on its implementation in the various industrial curly-brace languages in common use.

Failed to do what? failed

Failed to do what?

failed to make writing programs less bug-ridden; failed to reduce testing time; failed to solve design issues in terms of ease of modification etc etc

Compared to which paradigm, in which reuse is trivial and commonplace?

Compared to the ideal paradigm of just writing the program, testing it a little and, voila, it's ready. Isn't that the purpose of programming language research? We spend 10 minutes writing an algorithm, 10 weeks debugging it, 5 months writing the test documents, and 4 weeks of auditing and qualification...whereas it could have been much less.

The barriers to successful code re-use in most places are as much political/economic as technical

Yep, that's true. But it also shows that computer science does not yet provide a way of developing programs that stands out from the rest...hence the political differences.

And OO is somehow unique in this, how?

Wasn't OO promising fewer lines of code, fewer bugs, less testing? Well, these promises are not fulfilled, as far as I am concerned.

I'm not sure I understand this point.

If a programmer thinks about inheritance in terms of implementation details, instead of interface concepts, then his life will be difficult when it comes to designing an application. I had 4 different implementation classes which I couldn't mix nicely without overcomplicating my design...and that was because I was thinking of implementation inheritance instead of interface inheritance.

You've committed a sin related to the one you complain about above; classes are for object creation, as opposed to categorization.

Hey, it's not my fault! The first thing we are taught about OO is categorization with classes. The classic example is about the animal kingdom: a dog says 'woof', while a cat says 'mew', but they all 'talk'...hence dog and cat both inherit from mammal, which inherits from animal, etc.

How about ML?

I like ML, because it was the first FP I learned. Unfortunately its syntax is very different from C's.

The success of Java is largely due to a billion-dollar marketing effort by Sun Microsystems

Let me disagree here. No matter how good the marketing, if the programming environment (Java is not just a programming language) is not good enough (as you said), then it would not succeed.

Though I suspect that if Java weren't OO, it wouldn't have gotten a foot into the door

I agree, but not because OO is anything substantially better than procedural, but because OO was hyped as the next big thing.

Actually OO is procedural: one procedure is executed right after the other. The taxonomy between OO and procedural is totally inaccurate. There are only two kinds of programming languages: imperative and functional. There are many ways to structure code: object-oriented and not object-oriented.

If you think that's the whole point of OO, you have failed to understand OO.

I don't think so. After lots of OO development, the hype wears off. OO is not about polymorphism, because polymorphism exists in non-OO languages. OO is not about encapsulation, because there are non-OO languages that have perfect encapsulation, including C (opaque types). So what's left? Inheritance...i.e. a way to better structure your code.

But it is folly to dismiss OO based on its implementation in the various industrial curly-brace languages in common use.

But I don't dismiss OO. I am just saying it is not the holy grail of programming, as it was presented a few years back (or still being presented as such in many areas of the world).

More on OO has failed

sj> Failed to do what?

ax>> failed to make writing programs less bug-ridden; failed to reduce testing time; failed to solve design issues in terms of ease of modification etc etc

But which extant paradigm has succeeded? It seems to me that by these criteria, all paradigms have failed.

And I'll rebut further: I believe OO has led to many improvements, compared to the procedural + global variables approach which was previously in vogue. It's hard to say for sure--SW has gotten considerably more complex.

sj> Compared to which paradigm, in which reuse is trivial and commonplace?

ax>>Compared to the ideal paradigm of just writing the program, testing it a little and, voila, it's ready. Isn't that the purpose of programming language research? We spend 10 minutes writing an algorithm, 10 weeks debugging it, 5 months writing the test documents, and 4 weeks of auditing and qualification...whereas it could have been much less.

That's not a paradigm. "Write the program, test it a little, and voila, it's ready" can be accomplished in any paradigm. Many of the difficulties in SW development (including several you allude to) have nothing to do with OO or any other paradigm; instead they have to do with difficulties in:

* requirements gathering and documentation. On any large (>1 person) project you have the problem of 2 people thinking the code ought to do different things.
* industrial SW development processes--many of which are designed around keeping the bosses' job secure, rather than efficiently and predictably producing software. (And many industrial SW houses value predictability over efficiency, even when the process was designed in good faith). If you spend 10 minutes coding and 10 weeks testing--that isn't the fault of OO (or any paradigm).

sj> The barriers to successful code re-use in most places are as much political/economic as technical

ax>> Yep, that's true. But it also shows that computer science does not yet provide a way of developing programs that stands out from the rest...hence the political differences.

But that doesn't mean OO has failed. If OO has failed, then so have structured programming, database-centric programming, prototype-based programming, and multi-paradigm programming. FP, constraint programming, logic programming, etc. get a free pass, if for no other reason than the fact that industry hasn't embraced these yet (and exposed their practitioners to the realities of commercial development). Rest assured, if F# ever gets productized (Microsoft's ML on .NET project) and widely deployed, we'll have rants about how functional has failed, too.

sj> And OO is somehow unique in this, how?

ax>> Wasn't OO promising fewer lines of code, fewer bugs, less testing? Well, these promises are not fulfilled, as far as I am concerned.

OO didn't promise anything. Vendors of OO tools promised the moon and the stars; but again--how is OO unique? If FP (or another paradigm) jumps the barrier and becomes a commercially attractive industry buzzword (and a goldmine for tool houses to exploit), we'll see ridiculous claims of productivity increases for FP tools (which will make the claims of FP advocates here on LtU seem positively minimalistic in comparison).

sj> I'm not sure I understand this point.

ax>>If a programmer thinks about inheritance in terms of implementation details, instead of interface concepts, then his life will be difficult when it comes to designing an application. I had 4 different implementation classes which I couldn't mix nicely without overcomplicating my design...and that was because I was thinking of implementation inheritance instead of interface inheritance.

OK. Many OO languages are migrating toward greater use of interface inheritance.

sj> You've committed a sin related to the one you complain about above; classes are for object creation, as opposed to categorization.

ax>> Hey, it's not my fault! The first thing we are taught about OO is categorization with classes. The classic example is about the animal kingdom: a dog says 'woof', while a cat says 'mew', but they all 'talk'...hence dog and cat both inherit from mammal, which inherits from animal, etc.

Classes can be used for categorization--after all, all objects of a class are in a different category from the set of objects not in the class. But oftentimes useful categories exist which are not congruent with classes. Again, many OO languages provide for this.

sj> How about ML?

ax>> I like ML, because it was the first FP I learned. Unfortunately its syntax is very different from C's.

sj> The success of Java is largely due to a billion-dollar marketing effort by Sun Microsystems

ax>> Let me disagree here. No matter how good the marketing, if the programming environment (Java is not just a programming language) is not good enough (as you said), then it would not succeed.

OK, but part of the billions of dollars that Sun spent was spent on producing an adequate toolset (and documentation); and distributing it far and wide.

sj> Though I suspect that if Java weren't OO, it wouldn't have gotten a foot into the door

ax>>I agree, but not because OO is anything substantially better than procedural, but because OO was hyped as the next big thing.

I do believe that OO has advantages over procedural. I don't believe that OO (or anything else) could have lived up to the hype.

ax>> Actually OO is procedural: one procedure is executed right after the other. The taxonomy between OO and procedural is totally inaccurate. There are only two kinds of programming languages: imperative and functional. There are many ways to structure code: object-oriented and not object-oriented.

Excluding Haskell (and other lazy languages you might consider), most FP languages are also "procedural" in this sense. If you want to get picky and claim there are only two types of languages, I'd draw the line at "imperative" (including functional) and "declarative" (Prolog, Mercury, parts of Oz, etc.).

sj> If you think that's the whole point of OO, you have failed to understand OO.

ax>> I don't think so. After lots of OO development, the hype wears off. OO is not about polymorphism, because polymorphism exists in non-OO languages. OO is not about encapsulation, because there are non-OO languages that have perfect encapsulation, including C (opaque types). So what's left? Inheritance...i.e. a way to better structure your code.

The one feature that is unique to OO languages is subtype polymorphism. Inheritance is a means to achieve it, popular in most industrial OO languages--but many OO languages (Smalltalk, Python, Ruby) can achieve polymorphism without inheritance (in which case inheritance becomes a means for code reuse more than for polymorphism). C++ and Eiffel both allow inheritance without polymorphism. Some languages, like Self, primarily use delegation rather than inheritance as a means of code reuse.
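
A toy Python illustration of that last point (class names invented): the two classes below share no base class at all, yet any code that sends draw works on both, because the polymorphism comes from the message, not from an inheritance relationship:

class Circle:
    def draw(self):
        print("circle")

class Glyph:                  # no relationship to Circle whatsoever
    def draw(self):
        print("glyph")

def render(shapes):
    for s in shapes:          # works for anything that answers draw()
        s.draw()

render([Circle(), Glyph()])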

Also note that "subtype polymorphism" is more specific than polymorphism in general. ML has polymorphism; it does not have subtyping. O'Caml does have subtyping. O'Caml is considered an OO language, plain ML is not.

You are correct in that encapsulation is orthogonal to OO. Many "modular" languages of the 70s and early 80s (Modula-2, early Ada dialects) have encapsulation without subtyping. Many OO languages like Python and CLOS don't bother with compiler-enforced encapsulation.

sj> But it is folly to dismiss OO based on its implementation in the various industrial curly-brace languages in common use.

ax>> But I don't dismiss OO. I am just saying it is not the holy grail of programming, as it was presented a few years back (or still being presented as such in many areas of the world).

If you had said that, I would have agreed with you.

You said "OO has failed", which to many of us parses as "OO sucks" and/or "OO is worthless". While OO is certainly far from perfect (and is not the holy grail of anything); it remains a viable and useful paradigm.

Tools that model

"The success of Java is largely because of its 'safety' that allows code to be quickly developed without introducing quirky memory-related bugs that can bring whole systems down, not because it is object-oriented."

Probably true. Java IDEs help in that respect. Recently I've been thinking, while reading Memory as a Programming Concept in C and C++ by Frantisek Franek, why there aren't any tools to model how memory works in C/C++ in an IDE. People achieve substantially complex things in other areas (I'm thinking of simulation and video games); how complex can this be compared to those? People model a lot of things, people write emulators. Is it impossible to write an IDE that incorporates simulated memory management? Surely, it's not nuclear physics.
Here's an IDE written in Java that displays some visual information about control structures: http://www.eng.auburn.edu/grasp/

be amused

It cracks me up when dogmatic C programmers insist that you can code in the OOP paradigm in C by using opaque types, as if that were a first-class form of data hiding.
Then this comp.lang.object thread is for you.

So like it or not, I'm convinced that the imperative paradigm is going to be with us for a long time
Cobol is going to be with us for a long time as well; the more interesting question would be: is it going to remain the dominant paradigm for a long time?

Myths

There are several old myths recurring in this thread. Here is the standard debunking:

FP is slow

Obsolete argument. Several FP languages nowadays have compilers that are on par with your average C++ most of the time (occasionally even better), and are likely to beat the hell out of Java.

Not to mention that the higher level of functional programming usually allows algorithmic tuning beyond the capabilities of mainstream languages, at least for more complex algorithms.

GC is slow

For all but hard real-time applications GC is likely to perform better than manual memory management. This is because it is usually highly optimized as an integral part of the runtime system, can use smarter global approaches, and can exploit low-level knowledge not available on the language level. Note in particular that reference counting, which is the usual fall-back in GC-less languages, is the least efficient form of managing memory. Perceived slowness of languages is usually due to other factors than GC itself, as counter examples show (see the previous point).

FP can be done in any language

One characteristic of FP, if not the central one, is non-trivial [on edit: and frequent] use of first-class functions and closures. This practically demands GC. Any claim that FP can be done in something like C++ because of the possibility of occasional maps or folds holds no more water than claiming that the occasional use of a function put in a record is OOP.

Tragically, few mainstream languages have true closures. As a secondary requirement, many FP techniques also rely on a lean syntax for creating functions/closures on the fly, in order to keep code readable. Even fewer mainstream languages meet this requirement.

OOP provides for better extensibility

True only on one dimension: you can easily add new cases/types to an existing design. However, it is much more difficult than in other paradigms to extend along the operations axis: you have to touch all classes in a hierarchy to add an operation over all cases (and if it implements a non-trivial algorithm, scattering it over the whole program decreases readability and maintainability).
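
For the record, here's a small Python sketch of the two axes (shape classes invented; the match form needs Python 3.10+). In the OO layout a new type is one new class but a new operation touches every class; in the functions-over-cases layout the trade-off is exactly reversed:

# OO axis: adding a type (say Triangle) is one new class; adding an
# operation (say perimeter) means editing every class in the hierarchy.
class Circle:
    def __init__(self, r): self.r = r
    def area(self): return 3.14159 * self.r ** 2

class Square:
    def __init__(self, s): self.s = s
    def area(self): return self.s ** 2

# Operations axis: a new operation over all cases lives in one place,
# but now adding a type means revisiting every such function.
def perimeter(shape):
    match shape:
        case Circle(): return 2 * 3.14159 * shape.r
        case Square(): return 4 * shape.s

print(perimeter(Circle(2)), Square(3).area())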

No.

"First-class functions and closures" is a tool, not a means.

The point of FP is to provide run-time safety by imposing stringent compile-time checks on program logic.

No to what?

"First-class functions and closures" is a tool, not a means.

I sort of agree, but how does it counter what I said? I said their use is characteristic, I said nothing about the reasons or motivation. [I should have said "non-trivial and frequent", though.]

The point of FP is to provide run-time safety by imposing stringent compile-time checks on program logic.

Here I have to disagree. Neither does this in any way cover untyped FPLs like Scheme, Erlang, etc., nor is it specific enough to distinguish it from all other kinds of strongly-checked or declarative approaches.

Heh.

I should clarify -- FP is a specific way of ensuring programs have fewer bugs by removing side-effects from programs.

Basically, this is the only reason FP exists. This is important, since unless a functional programming language keeps this in mind and demonstrates a real and obvious decrease in bugginess, nobody is really going to be interested. (Witness the current popularity of OCaml.)

Correctness is only one aim

IMO, the primary advantage of FPLs is their higher level of programming and the expressive abstraction facilities they provide. This increases productivity and allows one to approach problems from whole new angles.

It also pays off in correctness, as you get lost in error-prone tedium less frequently.

This is a dead-end.

"Expressive abstraction" cannot be measured.

If you're starting to argue on the merits of "expressiveness", then you've already lost by stepping onto the slippery slope that leads directly to those guys that argue over the correct placement of braces and the proper colors for syntax highlighting.

In short, correctness is a logical concept that can be objectively demonstrated, whereas "expressiveness" is a purely aesthetic feel-good emotion. (Remember -- Cobol was designed to be "expressive", too, after its own fashion.)

Well, one can probably induce

Well, one can probably induce a partial order over different programming languages based on their expressiveness. Criteria could be the number of control structures and the amount of code required for a given task, with constraints such as, for instance, prohibiting runtime string-parsing.

For certain definitions of "expressive"...

In short, correctness is a logical concept that can be objectively demonstrated, whereas "expressiveness" is a purely aesthetic feel-good emotion.

Well, there has been work done to quantify the expressiveness of programming languages. For example, Are Ours Really Smaller than Theirs? by Booth and Jones.

-30-

tkatchev: "Expressive abstrac

tkatchev: "Expressive abstraction" cannot be measured...In short, correctness is a logical concept that can be objectively demonstrated, whereas "expressiveness" is a purely aesthetic feel-good emotion.

Leaving aside this not being true on the basis of Kolmogorov complexity measures alone, around here, when we talk about the expressive power of programming languages, we tend to be referring to... well... On the Expressive Power of Programming Languages. Highly recommended.

Wow.

So much verbiage.

I skimmed the article briefly, but it seems to me that the main point boils down to the fact that the ideal uber-expressive language is some sort of a God-awful version of Perl.

True for some meaning of "expressive", but this is not a meaning that I am willing to live with. Though perhaps my wishes go against the grain of the general public.

Forest/Trees

tkatchev: I skimmed the article briefly, but it seems to me that the main point boils down to the fact that the ideal uber-expressive language is some sort of a God-awful version of Perl.

I'm confused. I can't even see where the authors make any attempt to define a programming language, let alone one that resembles Perl either semantically or syntactically.

tkatchev: True for some meaning of "expressive", but this is not a meaning that I am willing to live with. Though perhaps my wishes go against the grain of the general public.

Are you sure we read the same paper? The one I read is precisely an attempt to define "some meaning of 'expressive'" by developing some mechanisms for comparing the expressive power of any programming languages in a meaningful way. It's just an initial attempt to reduce the amount of he-said/she-said in the process. After all, programs are an artifact of a logic, and programming languages likewise. It seems only reasonable that we should actually be able to measure some things about them, and I don't mean just line counts!

The larger point in the context of LtU is that many of us, when we make claims that one language is more expressive than another, make some level (not necessarily comprehensive!) of effort to back up the claim more-or-less formally, and this paper is a common launching-off point for such efforts. Unfortunately, even that doesn't prevent as much of the he-said/she-said as one would like, particularly if at least one side of the "debate" rejects at least the specific definition of "expressive" being applied or, worse, is one of those poor pedants who insists that "all modern programming languages are Turing equivalent" is not only a meaningful observation, but the last word on the subject, as if there were no measurable difference between programming in Lazy-K and Epigram.

By the way, if the means of comparing the expressive power of programming languages in the paper aren't to your liking, by all means, propose another set of means. The authors make very clear in their abstract that they're only trying to get the ball rolling, so to speak. I'm keenly interested in good metrics for this sort of task.

Briefly,

The definition of "expressiveness" that the paper proposes is a definition that I find wrong and harmful.

Which is the point -- expressiveness is a subjective measure of aesthetics.

Remember, some people find Cobol expressive -- and who are we to say that they are wrong?

Without Any Point of Comparison...

tkatchev: The definition of "expressiveness" that the paper proposes is a definition that I find wrong and harmful.

OK, I'll bite. Why? It's extremely hard to take such a claim seriously, particularly without evidence or a contrasting set of standards or mechanisms to compare with.

tkatchev: Which is the point -- expressiveness is a subjective measure of aesthetics.

No, it isn't. Merely because you want it to be doesn't make it the case. Again, there's a big difference between saying "I would measure that a different way," which is something that can usefully be looked at and tested against the way that was presented, and saying "that is immeasurable," which, given that people are doing so, is merely false. If that's your position, then yes, you'll find yourself swimming against the current in a group of people who study and even create programming languages.

tkatchev: Remember, some people find Cobol expressive -- and who are we to say that they are wrong?

Without having a definition of "expressive," we would indeed have a hard time. It's not hard to imagine that in COBOL's case, there's a definition of "expressive" that has to do with a short conceptual distance between an English description of a business process and the COBOL code to implement that process. Even ignoring that context, they aren't necessarily wrong: one could make the case that COBOL is indeed more expressive than assembly language, FORTRAN, or Lisp 1.0 or 1.5, which is what COBOL was competing with. But we know that's not the context we're talking about here. Instead, loosely speaking, we're talking about expressiveness as the ability to effectively handle a large number of domains with a minimum of boilerplate. It's difficult to see how there could be anything "wrong and harmful" about that, especially when the means would allow us to compare, e.g. COBOL and C, let alone languages that might be considered "exotic" and therefore complicate the comparison by lack-of-familiarity issues being confused for lack-of-expressiveness.

Definitions are key.

You said: "expressiveness as the ability to effectively handle a large number of domains with a minimum of boilerplate".

I take issue with that statement. I find it harmful on so many levels that it would take a long time to just list them all. Your definition is the sort of thing that gave us Perl and Ada, and I don't think the world needs any more of that.

They are

Paul: expressiveness as the ability to effectively handle a large number of domains with a minimum of boilerplate
tkatchev: Your definition is the sort of thing that gave us Perl and Ada
For me, Paul's definition evoked images of Scheme code, not Perl.

But psychotherapy considerations aside, it's accepted practice on forums to either retract a claim or back it with arguments -- or face a huge loss of credibility.

Scheme is what I thought of as well

But mostly because the paper is written by noted Schemer Doctor Matthias Felleisen.

Hm

What people need in a programming language can be described in two short phrases: a) predictable algorithmic properties (not just CPU cycles, but also things like memory cost, response time, concurrency, scalability, etc.) and b) enforcement of programming logic (i.e. detecting bugs).

Anything else which does not fall into one of the above is outright harmful and deserves a place in the obfuscated code tarpit.

Unless you like writing obfuscated expressive code for fun. (I do, too -- but, in my opinion, this sort of thing has no place in the real world.)

-ilities?

Anything else which does not fall into one of the above is outright harmful and deserves a place in the obfuscated code tarpit.

What about productivity? Both when writing and reading, but also when refactoring the code? When interacting with customers and establishing a mapping between the problem domain and the code? Metaprogramming? Conceptual integrity? Learnability/teachability?

Not to be rude or anything...

...but I have a conviction that none of these really matter from any practical perspective.

"Productivity" -- measured as lines of code -- has next to none real-world value. Unless we are talking about code quality, but then we get back to the two points I listed.

"Customer interaction" and "problem domains" is a whole different topic that really belongs with the business and management people, not the programmers.

Teachability -- does being easy-to-learn really matter if you cannot produce workable code in the end?

Um.

solving lots of problems is being highly productive.

Did you really mean to say what you said? Because solving problems that don't need to be solved is anything but productive work.

To suggest that languages suited to the domain at hand (i.e. one that makes it *easier* to solve your problem)

A language suited to the problem domain is a language that a) has predictable algorithmic properties and b) detects logic inconsistencies in the program. I know, I'm repeating myself. :)

Finally, does producing correct code really matter if no-one uses the language?

Yes. In fact, it's the only thing that matters.

OK, OK, you win.

You're smarter than me, if it makes you feel better.

Enforcement of programming logic

Depends how you approach that. It's very hard to prove the correctness of imperative programs. And structured programming, even when imposed by fiat as in Pascal, its derivatives, and the whole Algol family tree, is little help. Bugs, bugs, everywhere.
Let me quote physics professor Julian Noble on the use of Forth (from the book Scientific Forth):

FORTH programs are automatically structured because word definitions are nothing but subroutine calls.

I think this applies to functional composition in FP.
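
The parallel can be made concrete with a minimal sketch in Python (the compose helper is invented for illustration): a composed function, like a Forth word, is nothing but its constituents run in sequence, with no control flow of its own to get wrong:

from functools import reduce

def compose(*fns):
    # Right-to-left composition: compose(f, g)(x) == f(g(x))
    return lambda x: reduce(lambda acc, f: f(acc), reversed(fns), x)

inc = lambda n: n + 1
double = lambda n: n * 2

inc_then_double = compose(double, inc)
print(inc_then_double(3))   # (3 + 1) * 2 == 8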

Yes.

The halting problem shows that correctness in the general case cannot be proven.

However, that does not mean that certain specific solutions cannot be proven correct -- in fact, correctness proof is the whole point of having datatypes in programming languages.

I guess the way to approach this task is to find some simple and well-defined class of algorithms and then proceed to figure out what specific logical properties programs implementing such algorithms might have.

(Thus, for example, functional programming -- i.e. algorithms without state.)

Expressiveness is not always a *good* thing

For example, the introduction of State into a programming language allows the construction of a more expressive PL than either the Declarative or Functional model. Yet, the introduction of State multiplies the complexity involved in reasoning about programs. Anyhow, I don't think it unreasonable to postulate that expressiveness of a language can be measured. The larger issue though is whether high levels of expressiveness come at a cost in terms of other issues - both quantitative and qualitative.

What Metrics Cover vs. What They Don't

Chris Rathman: The larger issue though is whether high levels of expressiveness come at a cost in terms of other issues - both quantitative and qualitative.

I think this is where the real challenges lie: I don't think anyone would deny that some kind of "approachability" to a language is a desirable thing. But it's hard to define how you would make, e.g. Haskell, more "approachable" without impinging upon its semantics, let alone how you would measure success. That doesn't mean we shouldn't try. It just means that tkatchev has at least one point: maximizing semantic expressiveness at all costs isn't necessarily a win.

It's also true that expressiveness can come at a real cost in understandability. I think that point comes across extremely clearly in CTM. I'd go so far as to say that it's one of CTM's master strokes, which has two parts:

  1. You don't need state as much as you think you do. The first five chapters of CTM avoid it, and you do real work without it.
  2. State adds clear expressive power but does damage to the clean algebraic reasoning you can do in its absence.
That's great because now we can make an informed choice as to what to use when. Even if you're working in C, it's helpful to know these principles.

Suppose one can have state bu

Suppose one can have state but can also express the fact that a given piece of code doesn't use it? Would that count as a net gain?

Would that count as a net gain?

Absolutely. And that's what the designs of "pure" functional programming languages like Haskell, Clean, etc. -- that is, the full languages as opposed to their stateless subsets -- are trying to do. It's not clear that we've found the best ways of doing so yet (apart from monads and uniqueness types, there are also auditors, effect systems, and several other type-based approaches).

A previous brief LtU discussion is here.

Personally I'm quite aware th

Personally I'm quite aware this is possible - Haskell's my day-to-day language :-) Thanks for the links though, looks like there's some stuff I've not run across there.

Disjoint Sets

tkatchev: I take issue with that statement. I find it harmful on so many levels that it would take a long time to just list them all. Your definition is the sort of thing that gave us Perl and Ada, and I don't think the world needs any more of that.

OK, well, perhaps you'd humor us by enumerating, say, the top five.

And the definition that I find in the paper in no way gave us languages like Perl or Ada, neither of which went through a formal design effort with expressiveness relative to other languages as a goal. Perl is the explicit, deliberate result of an attempt at codifying informality. It was designed by a linguist, not a computer scientist, and it shows, to both good effect (it really is an excellent tool for chewing on text) and bad (there really is no such thing as non-idiosyncratic Perl code). Ada was designed to address essentially the same design space as Pascal or PL/I, but with essentially no ambiguity, being one of the few languages to even possess a formal semantics—at all. It turns out there are some weaknesses in the type system, cf. Ariane 5, but Ada was also designed before some of the more recent results in type theory.

So it would seem that we're still waiting for an approachable-syntax language with something like the type system of an Epigram or Aldor or GHC 6.4 or... and if your point is that we aren't there yet, I agree wholeheartedly. However, if your point is that we'll get there without having formal means of comparing the expressive power of programming languages, relying instead on personal aesthetic judgment, I disagree wholeheartedly.

I'll just list one point.

I already stated it above, but to reiterate -- the real world does not need "expressive" code. The real world needs code that works well (i.e. has algorithmic scalability) and has no bugs (i.e. has a clearly specified logic behind the algorithms).

Anything that does not move us towards one of those goals is harmful. Moreover, many features which add "expressiveness" also degrade code quality in the above sense.

Name one.

Name one.

I'll give an example instead.

Python generators are a feature that lowers "expressiveness" (as opposed to using lists, iterators and operator overloading, for example) while greatly improving code quality. (By placing some strict restrictions on the space and time complexity of the respective algorithms.)

In short, the world needs more discipline. :)
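
To make the generator point concrete, a minimal sketch (the function is invented for illustration): the generator produces one value at a time in constant space, while the eager list materializes every element before sum ever sees one:

def squares(n):
    for i in range(n):
        yield i * i              # suspends between items: O(1) space

total = sum(squares(10**6))      # never holds a million values at once

# Semantically the same total, but the list version holds every
# element in memory before sum runs:
total_eager = sum([i * i for i in range(10**6)])
assert total == total_eager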

Perhaps being able to express

Perhaps being able to express the discipline you intend to follow is useful?

Of course it is.

The idea boils down to being able to verify program logic, I think.

Which requires a whole class

Which requires a whole class of expressiveness that many languages lack. Certainly it's a useful concept when it comes to a language's type system, no?

Thanks for the Example!

tkatchev: Python generators are a feature that lowers "expressiveness" (as opposed to using lists, iterators and operator overloading, for example) while greatly improving code quality. (By placing some strict restrictions on the space and time complexity of the respective algorithms.)

OK, now we have something to work with. By what measure do generators lower expressiveness? It sounds to me (of course I could be mistaken) that what you mean is that they are more concise than the semantically-equivalent use of lists, iterators, and operator overloading. But I believe that if you were to apply the framework from Felleisen's paper, you'd find that Python generators don't reduce expressive power (since they support the same semantics as the more verbose alternatives). So instead we see a win-win: a retention of expressive power coupled with an increase in safety and, for many programmers, in legibility.

So with respect to increased discipline, I have to say that we're in vehement agreement. Where I think we part ways is in our belief in the possibility, let alone the desirability, of being able to measure expressive power. Your claim that Python generators reduce expressive power in Python reinforces my concern that in the absence of a formal effort to measure expressive power, all you're left with is opinions that might not even be coherent.

O'Caml doesn't have side-effects?

I thought O'Caml let you perform IO and update variable references pretty much anywhere. Or are you referring to something else?

If one wishes.

OCaml provides lots of nifty tools for verifying program logic automatically. You can write OCaml without side-effects, if you wish to do so; moreover, it makes very evident where exactly the side-effects are. (Contrast with Python.)

No.

The point of functional programming is to do your programming using... functions? Just because most functional languages have static type systems doesn't mean that's the point of functional programming, any more than garbage collection is the point of object oriented programming.

extensibility with multimethods

Myths: OOP provides for better extensibility
True only on one dimension: you can easily add new cases/types to an existing design. However, it is much more difficult than in other paradigms to extend along the operations axis: you have to touch all classes in a hierarchy to add an operation over all cases (and if it implements a non-trivial algorithm, scattering it over the whole program decreases readability and maintainability).

In languages with multimethods, you do not have to touch all classes to add new operations. A set of methods for a generic function can be written in one place.
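
For illustration, here's a toy registry-based multimethod in Python (all names invented; real systems such as CLOS also dispatch through the inheritance lattice, which this sketch deliberately omits). Note that the methods of the generic function collide are added in one place, outside any class:

_methods = {}

def defmethod(*types):
    # Register an implementation under the tuple of argument classes.
    def register(fn):
        _methods[types] = fn
        return fn
    return register

def collide(a, b):
    # Dispatch on the classes of *both* arguments, not just a receiver.
    return _methods[(type(a), type(b))](a, b)

class Asteroid: pass
class Ship: pass

@defmethod(Asteroid, Ship)
def _(a, s): return "ship destroyed"

@defmethod(Asteroid, Asteroid)
def _(a, b): return "asteroids merge"

print(collide(Asteroid(), Ship()))   # dispatches on both argument types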

Rebuttal

FP is slow



Obsolete argument. Several FP languages nowadays have compilers that are on par with your average C++ most of the time (occasionally even better), and are likely to beat the hell out of Java.

Which is fine when all you need to do is be as fast as average C++. But the fact remains that FPs aren't going to be allowing inline assembly any time soon.

Not to mention that the higher level of functional programming usually allows algorithmic tuning beyond the capabilities of mainstream languages, at least for more complex algorithms.

Unfortunately, many of these optimizations are more theoretical than actual.

GC is slow



For all but hard real-time applications GC is likely to perform better than manual memory management. This is because it is usually highly optimized as an integral part of the runtime system, can use smarter global approaches, and can exploit low-level knowledge not available on the language level.

This reference seems to be relevant.

Note in particular that reference counting, which is the usual fall-back in GC-less languages, is the least efficient form of managing memory.

Reference? I mean, at the least, you have completely failed to indicate whether you mean time efficiency, space efficiency, or both.

Perceived slowness of languages is usually due to other factors than GC itself, as counter examples show (see the previous point).

So you're saying the perceived slowness of LISP is not at all due to GC? That's interesting, considering the reference I gave mentions work by Steele showing that LISP spends 30% of its time in the collector.

FP can be done in any language



One characteristic of FP, if not the central one, is non-trivial [on edit: and frequent] use of first-class functions and closures. This practically demands GC. Any claim that FP can be done in something like C++ because of the possibility of occasional maps or folds holds no more water than claiming that the occasional use of a function put in a record is OOP.

I both agree and disagree with you. It is certainly a bit disingenuous to claim that you can do full-blown runtime FP in C++. Hopefully, nobody claims that. However, the template engine in C++ is essentially a very minimal FP language, but one which "executes" at *compile time*. As such, it enjoys a form of GC, and it does have first-class "functions" (if you are willing to accept D. Abrahams and A. Gurtovoy's definition of "C++ metafunction"). I suppose one could argue that it has closures as well, but I'm not going to push that claim personally.

Tragically, few mainstream languages have true closures. As a secondary requirement, many FP techniques also rely on a lean syntax for creating functions/closures on the fly, in order to keep code readable. Even fewer mainstream languages meet this requirement.

This is one of the main reasons I don't include FP as one of the paradigms that C++ supports (except at the meta level).

OOP provides for better extensibility



True only on one dimension: you can easily add new cases/types to an existing design. However, it is much more difficult than in other paradigms to extend along the operations axis: you have to touch all classes in a hierarchy to add an operation over all cases (and if it implements a non-trivial algorithm, scattering it over the whole program decreases readability and maintainability).

I mainly agree. I think what OOP delivers is reusability, but not necessarily in an extensible sense. I think that in general, extensibility is a very hard problem and CS in general doesn't have the theory all worked out yet. However, I believe that patterns like policy-based design are a pretty good way forward.

Let's hope that ASM inlining stays out of FPs

Which is fine when all you need to do is be as fast as average C++. But the fact remains that FPs aren't going to be allowing inline assembly any time soon.

Been nigh on 15 years since I last felt the urge to do any inlining of assembly. In those instances, I found it ultimately more beneficial to write the functions in C and then compile them to ASM. From there, I could handcraft the assembly code to squeeze out every fractional microsecond possible. I learned not to get carried away with such things (ASM code is harder to modify). But when you need performance, and only the absolute fastest method would suffice, specific hotspot functions could be converted over without affecting the higher level constructs.

Anyhow, I'd just like to weigh in that inlined assembly is likely to be rarely used. I would hope that no modern programming language would need to resort to such low-level measures. But I suppose there are various domains that still require getting down to the metal. My opinion would be that good design is by far the best way to get the performance that programmers need.


(Guess I should also throw in that newer CPUs are making it harder and harder to handcraft code with predictable performance characteristics.)

Is Assembly Any Good?

But when you need performance, and only the absolute fastest method would suffice, specific hotspot functions could be converted over without affecting the higher level constructs.

And it's much easier to interface your low-level languages from something closer to them than from something far away. With C++, you have nearly source-level compatibility with C, whereas with Java you have JNI (ugh!).

Anyhow, I'd just like to weigh in that inlined assembly is likely to be rarely used. I would hope that no modern programming language would need to resort to such low-level measures. But I suppose there are various domains that still require getting down to the metal. My opinion would be that good design is by far the best way to get the performance that programmers need.

In general, I agree with you. In practice, no compiler generates all the instructions available on a given CPU. When you want to use some of those special instructions, you have no recourse but assembly. Most people don't need to use special processor instructions in production code, but it's very handy to be able to experiment with new techniques when you have access to the raw CPU (such as exploring lock-free programming strategies).

Also, I can imagine that embedded programmers might have more desire to hit raw assembly when they want to take advantage of special features of their platform. People complain that things like unions and bit fields break C++'s type system or look like ugly kludges. But the fact is that interfacing specialized hardware is often eased by such backdoor mechanisms. Yes, it's a hole in type safety, but it's a manhole, and people use it to go places they need to. FP langs don't like manholes. ;)

I no longer do embedded work

Well, I do, but only in the moonlighting sense. For most of my other work, what you are talking about is a recipe for catastrophe. For one, I don't want to have to debug my code on every CPU variation that I am likely to encounter. I want platform independence in the programs that I write - at least as much as can possibly be delivered. I have a hard enough time writing this stuff once. I don't want to get into the business of writing it dozens and dozens of times.

For another, I'm afraid the liability issues could swamp any work that I do. If I start writing native apps for all the various platforms, then I am bound to run into many scenarios where the end user's desktop or server is suspect. Now whether or not my software is the culprit for the demise of a machine, just being in the proximity means that I can waste countless hours trying to figure out what went wrong. I've found that many users have computers that are hanging on to life by a thread - what with all the virii, spyware, adware, you name it. I really don't like being in the business of having to cope with these machines that are on the brink of death and dismemberment. Given the constraint of requiring a sandbox for security and sanity, C and C++ are the slowest languages on the planet for my domains - software that can't run at all is a lot slower than the slowest Java app ever written.

In the spirit that 99% of all statistics are made up on the fly, I would offer that the overall speed of most software applications would benefit much more from optimizing their algorithms. Even when I was working on embedded apps, really only a small percentage of the code (about 5% in the software that I was involved in) was "hard real time". And that 5% wasn't usually the culprit behind slow throughput - in fact you pay so much attention to those spots that invariably they run circles around the rest of the software. Embedded systems aren't just about bit twiddling or FFTs. There's a whole layer of software at a much higher level of abstraction encased in most embedded apps. C and C++ can be really nice for the 5% where performance requirements are insane. But you also end up using the language for parts of the software where it's not well suited. And those parts where it is not well suited are where you end up spending a lot of time.

Platform Independence

is one thing, but there are plenty of systems that are effectively closed (like hand-held bar code scanners). There are no portability concerns, code size and memory consumption are an issue, and you don't want your program to be perceived as slow by the end user. One of the observations cited in the GC article I referenced elsewhere is that GC is equal to or faster than manual allocation when the available memory is at least 5X the amount used by the program!! When you reduce that amount, it harms GC performance, such that in the limit, you can spend the majority of your time in the GC (because it is swapping pages out to disk). When you are working with a barcode scanner with 640 KB of memory holding an inventory database of, say, 50K products, that doesn't leave you a lot of room for data collection or programs. The code has to be tight and the data has to be tight. And this is hardly a unique scenario.

If you're coding for mobile phones or PDAs, by all means use Java or your favorite FP. But there's lots of much smaller systems where memory is not infinite and you don't have the luxury of tweaking the processor clock.

And there's lots of embedded devices...

...where even the memory footprint of C++ comes into question. And the typical timeframe of design-to-market-to-obsolescence for devices can be measured in months. Which means you had best use a language where the transitions are not painful. Yes, this can be done in C++. But it requires a lot of intense learning and misfires to come to grips with a complex language. Good work for those craftsmen that can hone their skills. But I daresay it says more about the craftsmanship of the practitioner than it does about the quality of their tools.

Anyhow, as I indicated in another post, the world of software applications is very large and multifaceted. The issues that you identify are important for certain domains (not to say that these are insignificant markets, as there's still a lot of money involved in the embedded arena). But languages like Ada and Forth are the more natural competitors in these domains, rather than Java, C# or VB.

My take on embedded C++, at least with what little exposure I have these days, is that C proper is still the dominant language. A C++ compiler is used in most instances, but only as a better C. Most of the vendors I've been exposed to still find it frustrating that OOP and templates are not exploited to their full potential. Of course, there are software engineers who live, eat, and breathe all the subtleties of C++ and who have mastered the language. Mostly they'll just say that the language's bad rep comes from those who try to get by without really knowing it.

You can climb out of that hole now...

Also, I can imagine that embedded programmers might have more desire to hit raw assembly when they want to take advantage of special features of their platform. People complain that things like unions and bit fields break C++'s type system or look like ugly kludges. But the fact is that interfacing specialized hardware is often eased by such backdoor mechanisms. Yes, it's a hole in type safety, but it's a manhole, and people use it to go places they need to. FP langs don't like manholes. ;)
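To make that concrete, a typical manhole looks something like the sketch below. The register layout, field names, and value are invented for illustration (and bit-field ordering is implementation-defined, so real code would follow the part's datasheet), but the shape is familiar from many device drivers:

#include <cstdint>

// Invented layout of a memory-mapped device status register.
struct StatusReg {
    std::uint32_t ready    : 1;   // device-ready flag
    std::uint32_t error    : 3;   // error code
    std::uint32_t fifo_len : 8;   // bytes waiting in the FIFO
    std::uint32_t reserved : 20;
};

// The union is the manhole: the same 32 bits viewed whole or field-by-field.
// Reading one member after writing the other is type punning - a hole in
// type safety, but exactly the direct hardware access being discussed.
union StatusView {
    std::uint32_t raw;
    StatusReg     bits;
};

int main() {
    // A real driver would map this view over the device's register address
    // (e.g. via mmap); here we just fake a value the hardware might write.
    volatile StatusView status;
    status.raw = 0x00000501u;
    bool ready = status.bits.ready;   // cheap, direct, and type-unsafe
    (void)ready;
}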

I think the manhole analogy is a good one, but the way I would put it is that mainstream languages tend to look to manholes as a solution before they've properly explored the alternatives. Of course, that's usually done for reasons of expedience, whether justified or not.

Unions and bit fields, though, are areas that don't actually need manholes, as has been demonstrated in some of the MLs and in Erlang, for example.

BTW, you should be careful about using Java as a proxy for more advanced languages. Java welds a C++-style type system to a somewhat compromised Smalltalk-style object model, and the result is nothing like what you can get from languages with more principled designs.

[Edit: the last paragraph was a response to "whereas with Java you have JNI (ugh!)", plus some similar mentions of Java as a point of comparison in other threads.]

Inline assembly is still highly useful...

...in systems programming (OS kernels, device drivers, even language runtimes) when you need to tweak the CPU in a fashion that HLLs don't support directly, either with language constructs or library routines.

Things like configuring MMUs, flushing caches, messing with process tables, issuing atomic/reserved memory transactions, etc.

Externally-written .asm functions that are callable by your favorite HLL (or by C) are often a better alternative than inline assembly--but there are many cases where the inline assembly is an instruction or two.
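For instance, here's what a one-instruction case looks like in GCC's inline assembly syntax - reading the x86 time-stamp counter. This is only a sketch, and x86-specific; the point is how little assembly is actually involved:

#include <cstdint>
#include <cstdio>

// rdtsc leaves the 64-bit counter split across EDX:EAX; the "=a" and "=d"
// constraints map those registers onto the two output variables.
static inline std::uint64_t rdtsc() {
    std::uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}

int main() {
    std::printf("tsc = %llu\n", static_cast<unsigned long long>(rdtsc()));
}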

Outside this particular problem domain (and a few others, such as numerically-intensive "DSP" code, deeply-embedded code that's highly memory/CPU constrained, etc.) I'm much more inclined to agree with you.

But the fact remains that F

But the fact remains that FPs aren't going to be allowing inline assembly any time soon.

Whew, yes, hopefully not :-) - I prefer to leave that task to a good FFI.

Unfortunately, many of these optimizations are more theoretical than actual.

I wasn't speaking about optimizations (by the compiler), but algorithmic improvements done by the programmer. I believe this is quite real, but admittedly I have no hard references to back this up. On comp.lang.functional, Jon Harrop recently gave an interesting case study with his ray tracer demo, though.

This reference seems to be relevant.

I wasn't aware of this paper, but yes. Its main conclusion is that GC is bad when memory is scarce, but that is no surprise. Not sure how these measurements, and numbers from Java in general, translate to other languages, though (note that generational GC is more expensive for imperative languages, because mutation costs extra checks).

Reference? I mean, at the least, you have completely failed to indicate whether you mean time efficiency, space efficiency, or both.

Time. Obviously, GC is a method that trades space for time.

Reference counting is expensive because it implies overhead for potentially every access to an object. GC only costs on each collection, when it touches every live object once (short-lived objects that are dead before the next collection cost virtually nothing). No references at hand at the moment, sorry, but I can try to dig them up.

So you're saying the perceived slowness of LISP is not at all due to GC? That's interesting, considering the reference I gave mentions work by Steele showing that LISP spends 30% of its time in the collector.

Such numbers very much depend on the application, of course. But I would not be surprised to see similar numbers for similar problems in C/C++, if there just were a way to measure how much time is spent in new, delete, destructors, and "smart pointer" management (it surely is a large portion of development and debugging time...).

Reference counting...

Reference counting is expensive because it implies overhead for potentially every access to an object. GC only costs on each collection, when it touches every live object once (short-lived objects that are dead before the next collection cost virtually nothing). No references at hand at the moment, sorry, but I can try to dig them up.

Well, I have to be picky here because I spent a lot of time working on and with smart pointers. Reference counting implies overhead for potentially every *creation or destruction of a reference to an object*. It is certainly possible to create ref-counted smart pointers in which *access* is assembly-equivalent to a raw pointer. The advantage of a smart pointer is that you can decorate accesses with things like null checks, if you want or need that level of safety (such as testing vs. production).
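A bare-bones sketch of such a pointer (invented here for illustration, not production code): the count is touched only when a reference is created, copied, or destroyed, while access compiles down to a raw pointer dereference, optionally decorated with a null check for test builds:

#include <cassert>

template <typename T>
class RcPtr {
    T*   ptr_;
    int* count_;
public:
    explicit RcPtr(T* p) : ptr_(p), count_(new int(1)) {}
    RcPtr(const RcPtr& o) : ptr_(o.ptr_), count_(o.count_) { ++*count_; }  // copy: increment
    ~RcPtr() { if (--*count_ == 0) { delete ptr_; delete count_; } }       // destroy: decrement
    RcPtr& operator=(const RcPtr&) = delete;  // kept minimal on purpose

    T* operator->() const {
        assert(ptr_ != nullptr);  // the decorated access; compiled out under NDEBUG
        return ptr_;              // otherwise identical to a raw pointer
    }
};

struct Widget { int n = 0; };

int main() {
    RcPtr<Widget> w(new Widget);
    RcPtr<Widget> w2 = w;   // count goes to 2; no per-access cost afterwards
    w2->n = 42;
}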

However, the "overhead" for reference counting isn't terribly high. Just a compare and increment or compare and decrement once the object is first created. There are definitely some fat reference counting techniques (such as supporting weak references, threading, etc.). But there is a whole continuum of price/performance tradeoffs from which you can select. The advantage of refcounting is deterministic destruction. In a language that supports RAII, like C++ (meaning, do cleanup things in the destructor), that means you can allocate a shared resource (like a file, socket, etc.) with refcounted pointers, share it for some unknown length of time, and have the resource released as soon as the last client is done with it. I'm curious to know what the best practice is for this pattern within the stateless langs.

So you're saying the perceived slowness of LISP is not at all due to GC? That's interesting, considering the reference I gave mentions work by Steele showing that LISP spends 30% of its time in the collector.

Such numbers very much depend on the application, of course. But I would not be surprised to see similar numbers for similar problems in C/C++, if there just were a way to measure how much time is spent in new, delete, destructors, and "smart pointer" management (it surely is a large portion of development and debugging time...).

But it is quite easy to measure how much time is spent in those functions. That's what profilers are for. And having written a fair number of C/C++ programs and profiling some of them, I can pretty much guarantee you that the typical C/C++ program does not spend anywhere near 30% of its time doing memory management. I suspect that's largely because of the tendency to do as much work on the stack as possible in those languages.

I wasn't aware of this paper, but yes. Its main conclusion is that GC is bad when memory is scarce, but that is no surprise.

Heh. That's quite an understatement. The observation is that you need 400-500% of the program's actual memory footprint to get competitive performance! When you are running one or two small to medium-sized programs on a workstation with lots of RAM, that's not an issue. But when you are running a database server, you certainly don't have the luxury of letting the program use only 20%-25% of available RAM so that your GC performs well, because there is *never* "enough RAM" for medium to large databases.

"Perceived slowness"? Perceiv

"Perceived slowness"? Perceived by whom - people who have used Common Lisp, or people who have heard rumours based on the fact that an interpreter in 1965 was not as fast as compiled FORTRAN?

On a more serious point - any

On a more serious point - any language that uses GC could use reference counting instead, but typically does not, both because tracing GC provides superior performance in practice for its users and because reference counting is susceptible to memory leaks from orphaned cyclic data structures.

Refcounting is not a panacea

I agree that refcounting is not appropriate in all cases. Cycles are a problem, but weak references solve it (though not automagically). In C++, for instance, there are many times when I need heap allocation, but I know that it will be non-shared. Then I spend no time with refcounting. If a language with automatic memory management were to naively use refcounting in those instances, then I would expect the language to be slower. But the idea that refcounting is always slower is as wrong as the idea that GC is always slower.
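To illustrate both halves of that point with today's standard smart pointers (the Node type is invented, and the same mechanics apply to any refcounting scheme):

#include <memory>

struct Node {
    std::shared_ptr<Node> next;   // strong reference: keeps its target alive
    std::weak_ptr<Node>   prev;   // weak reference: observes without owning
};

int main() {
    auto a = std::make_shared<Node>();
    auto b = std::make_shared<Node>();
    a->next = b;   // a owns b
    b->prev = a;   // b merely observes a, so no ownership cycle forms
    // Had prev been a shared_ptr, a and b would hold each other's counts
    // above zero forever and both would leak. With weak_ptr, both nodes
    // are destroyed normally when a and b go out of scope here.
}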

Sure, it's not cut-and-dried-and-universal

Sure, it's not cut-and-dried-and-universal, but the current fashion for garbage collection does indicate that the majority of code being written is in domains where it is cheaper to require faster hardware than to spend more programmer time optimising memory management as part of the programme logic. High-level languages which don't give programmers explicit control of memory management also provide the flexibility to select different garbage-collection strategies. The applicability of such languages goes even further if equipped with mechanisms to interact with memory management, such as enabling and disabling garbage-collection.

A final comment, on edit: C++ for "fast applications" is often used by programmers who don't know about the speed cost of memory management, and don't know that real time doesn't mean "processing a thousand items per second". These guys would probably benefit from transparent memory management, because they probably don't even know that they could count references and use smart pointers.

Several FP languages nowaday

Several FP languages nowadays have compilers that are on par with your average C++ most of the time

What about sorting, for example?

occasionally even better

I recently came across a site with comparisons between Java and C++. The site was about Java, and of course the benchmarks all showed Java to be faster than C++. The most hilarious was the method call test: Java was 10 times faster than C++! How was that possible? Well, the actual test involved newing and deleting C++ objects, calling their methods, and returning results by value.

Perceived slowness of languages is usually due to other factors than GC itself, as counter examples show (see the previous point)

It depends on the application. If there are few threads, and only one of them abuses the GC, then an application with GC can indeed be faster than one without. But if there are multiple threads constantly creating new objects, the page file and the CPU's cache are continuously filled with new data. Some algorithms can be done with pre-allocated objects anyway (see the sketch below), so the actual performance benefit of GC is debatable. Where GC shines is development speed.
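A minimal sketch of that pre-allocation approach, assuming an invented fixed-size pool type: all storage is grabbed up front, and the hot path hands objects out and takes them back without ever touching the allocator (or giving a GC anything to do):

#include <array>
#include <cstddef>
#include <vector>

template <typename T, std::size_t N>
class Pool {
    std::array<T, N> slots_{};   // all storage allocated up front
    std::vector<T*>  free_;      // simple free list
public:
    Pool() { for (auto& s : slots_) free_.push_back(&s); }
    T* acquire() {                          // O(1), no allocation
        if (free_.empty()) return nullptr;  // pool exhausted
        T* p = free_.back();
        free_.pop_back();
        return p;
    }
    void release(T* p) { free_.push_back(p); }  // O(1), no deallocation
};

int main() {
    Pool<int, 1024> pool;
    if (int* x = pool.acquire()) {
        *x = 42;
        pool.release(x);
    }
}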

I agree with all the rest you said, especially with the last point, which is a major development problem with OOP.

Useless anecdotes are the best kind of evidence... right?

Several FP languages nowadays have compilers that are on par with your average C++ most of the time
What about sorting, for example?

In a more or less useless exercise that may or may not illuminate something (but which did keep me pleasantly occupied for 15 minutes or so), a quick check on my machine has OCaml sorting an array of 4194304 random integers using the standard library's Array.fast_sort function in ~7.77 seconds, while sorting a vector of those same 4194304 random integers using std::sort from gcc 4.0 took ~6.89 seconds.

Is that "on par" with C++? Beats me... I guess for an appropriate definition of "on par" it is, and for some other equally justifiable definition it's not.

Now, again, this isn't really telling us a lot, as Array.fast_sort uses a fairly basic merge sort to do the work, while std::sort uses ... well, I don't really know what it uses, but I believe someone mentioned in this thread that it was a fairly well tweaked hybrid sorting algorithm. And of course both of these would likely be beaten by a well written radix-style sort, or some other sort that's tuned to the data being sorted... Still, I'm far too lazy to code up my own apples/apples algorithm, and that becomes more a point of discussing algorithmics than PLT anyway, so feel free to draw your own conclusion from this single faulty data point. In conclusion, I sincerely hope that I have not made the lot of you any dumber for having read this post.

-30-

It's Helpful

I assume that Array.fast_sort is written in OCaml? Some tests that I would find even more helpful are:

1. Use strings of varying length instead of integers.

2. Use record types instead of integers, but use an integer or string field as the sort key.

3. Use a user-defined function as the sort criterion (such as a function that merely reverses the sort).

4. Is it possible for you to see how much memory each process is using during the run?

Sort of stringy...

I assume that Array.fast_sort is written in OCaml?

Correct.

Some tests that I would find even more helpful are:
1. Use strings of varying length instead of integers.

OK, simply switching over to a set of 4194303 random strings with random lengths between 1 and 50 characters, I got the following times:
OCaml: ~21.54 seconds, max memory*: 157.12 MB
C++: ~33.22 seconds, max memory*: 182.74 MB

Note that the C++ code just used the standard string class.

* This was determined by looking at the stats supplied by OS X's Application Monitor... I have no idea if this is at all accurate.

2. Use record types instead of integers, but use an integer or string field as the sort key.
3. Use a user-defined function as the sort criterion (such as a function that merely reverses the sort).

I'll try to look at that this weekend.

4. Is it possible for you to see how much memory each process is using during the run?

I'll also have to check on this, with a tool that I can feel is accurate.

Also, on a lark, I re-ran the integer version of the test, replacing Array.fast_sort with a naive (possibly buggy) 11 line OCaml version of an integer radix sort which performs the sort on the numbers referenced above in just under 6.2 seconds, a bit faster than std::sort. Interestingly, this sort is almost 3 times faster than Array.fast_sort when working in the OCaml interactive toplevel (code below):

let int_radix_sort a =
  (* Drain the buckets back into arr, walking them from the highest index
     down and filling arr from the end; this keeps the sort stable and
     leaves the array in ascending order for the byte just processed. *)
  let rec refill tab arr i tabdex l =
    if i >= 0 then
      match l with
      | [] -> refill tab arr i (pred tabdex) tab.(pred tabdex)
      | h::t -> arr.(i) <- h; refill tab arr (pred i) tabdex t in
  (* Distribute each value into one of 256 buckets keyed on the given byte,
     then pull everything back out. *)
  let radix_fill byte tab arr =
    let shift = byte lsl 3 in
    Array.iter (fun v -> let idx = ((v lsr shift) land 0xff) in tab.(idx) <- v::tab.(idx)) arr;
    refill tab arr (Array.length arr - 1) (Array.length tab - 1) tab.(Array.length tab - 1) in
  (* One pass per byte, least significant first. *)
  for i=0 to 3 do radix_fill i (Array.make 256 []) a done;;
-30-

Note that the C++ code just u

Note that the C++ code just used the standard string class.

Which means that strings were copied around... I am sure O'Caml does not do that, as its strings are managed by reference internally, aren't they? I wonder what the results would be if the C++ strings were managed by reference.

a bit faster than std::sort

I think that std::sort uses an in-place quicksort algorithm.

excessive copying

Which means that strings were copied around...

No, it doesn't. The ISO standard doesn't say whether a string assignment does a deep copy. In fact, the standard has been crafted to allow copy-on-write implementations of std::string. (To the best of my knowledge, GCC's is the only implementation in existence that uses COW.)

Whatever that means in practice, one has to ask why the strings are copied in C++ and not in O'caml. It might be because the latter isn't all that slow and the comparison isn't all that unfair after all.

On second thought... quicksort does not copy anything; it only swaps. And std::swap of two std::strings is cheap, as they are managed by reference in C++, too. I smell an argument crumbling under its own weight...

Actually...

No, it doesn't. The ISO standard doesn't say whether a string assignment does a deep copy.

Technically, it does, but only indirectly.

In fact, the standard has been crafted to allow copy-on-write implementations of std::string. (To the best of my knowledge only a single implementation in existence uses COW, which is GCC.)

The Rogue Wave library does so as well. I don't recall whether the latest libstdc++ uses COW strings or not. Basically, there are some esoteric requirements regarding the proxies that COW strings return that actually forbid COW implementations. But this wasn't discovered until well after the standard was drafted.

Whatever that means in practice, one has to ask why the strings are copied in C++ and not in O'caml. It might be because the latter isn't all that slow and the comparison isn't all that unfair after all.

Well, it isn't slow once you write the important routines in an imperative style. ;>

On second thought... quick sort does not copy anything, it only swaps. std::swap of two std::strings is cheap, as they are managed by reference in C++, too. I smell an argument crumbling under its own weight...

Yes, that is a good observation. However, I disagree with the conclusion. My guess is that the program was not compiled with optimizations turned on. I'm guessing that with -O3 we will see much better times for C++ (not sure about what's available for OCaml). If that still fails to account for the difference, I would probably be interested in profiling the test program to see where it's slow. And I'm sure both the C++ and the O'Caml communities would be interested in the results as well (since, believe it or not, *both* communities must fight perceptions of slowness, though C++ perhaps enjoys less resistance on that front).

Thanks!

Would it be hard to post the implementation of Array.fast_sort? I'm curious to see how big of a beast it is.

1. Use strings of varying length instead of integers.

OK, simply switching over to a set of 4194303 random strings with random lengths between 1 and 50 characters, I got the following times:
OCaml: ~21.54 seconds, max memory*: 157.12 MB
C++: ~33.22 seconds, max memory*: 182.74 MB

Note that the C++ code just used the standard string class.

That's certainly informative. My guess is that C++ is going slow because of the copying in std::string. If you had a [non-conforming] refcounted implementation like Rogue Wave's, it would probably be more competitive. Of course, raw char* would probably beat the pants off the std::string version, but then it isn't as close to an apples-to-apples comparison.

I assume you are only timing the sort functions and not the whole programs, right? For reference, could you post your sources somewhere?

* This was determined by looking at the stats supplied by OS X's Application Monitor... I have no idea if this is at all accurate.

Well, it's accurate enough from the pragmatic perspective of that's how much memory the OS *thinks* the programs need.

Also, on a lark, I re-ran the integer version of the test, replacing Array.fast_sort with a naive (possibly buggy) 11 line OCaml version of an integer radix sort which performs the sort on the numbers referenced above in just under 6.2 seconds, a bit faster than std::sort. Interestingly, this sort is almost 3 times faster than Array.fast_sort when working in the OCaml interactive toplevel (code below):

Well, I would hope that a radix sort can beat a comparison sort. ;>

Minutia...

Would it be hard to post the implementation of Array.fast_sort? I'm curious to see how big of a beast it is.

It's the last function on this page. Well, second last; fast_sort is just an alias for stable_sort, defined right above it.

My guess is that C++ is going slow because of the copying in std::string. If you had a [non-conforming] refcounted implementation like Rogue Wave's, it would probably be more competitive.

Probably. I only touch C++ when I need to these days, and have forgotten much of what there once was in my head, so I just tossed together a quick 'n' dirty string + vector version... I'm sure tweaking in both languages could be done. But at the very least, it's naive stdlib v. naive stdlib.

I assume you are only timing the sort functions and not the whole programs, right? For reference, could you post your sources somewhere?

Yeah, I just wrapped the sort calls with a pair of calls to gettimeofday and reported the diff. If you want the sources, you can find them here. But you have to promise not to laugh at my weak tea C++...
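In outline, the harness was something like this sketch (a reconstruction for illustration, not the posted source - and the integer version, at that):

#include <sys/time.h>
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <vector>

int main() {
    std::vector<int> v(4194304);
    for (std::size_t i = 0; i < v.size(); ++i) v[i] = std::rand();

    // Bracket just the sort call with a pair of gettimeofday calls
    // and report the difference.
    timeval t0, t1;
    gettimeofday(&t0, 0);
    std::sort(v.begin(), v.end());
    gettimeofday(&t1, 0);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
    std::printf("sort took %.2f s\n", secs);
}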

Well, I would hope that a radix sort can beat a comparison sort.

Sure, as I said above, for the appropriate data type, amount, algorithm, etc., that's what you'd expect. It was more of a "look for algorithmic improvements before other optimizations" kind of note, I guess. Though I was a bit surprised at how compact the radix sort was...

-30-

Is it just me...

Or does the implementation of Array.fast_sort look very imperative? Maybe I just don't understand *ML enough...

The ref's are the giveaway

ML is not pure in terms of side effects. Many of the data structures have both a purely functional implementation and a stateful implementation. State must be used with care if you are writing concurrent applications.

To the best of my knowledge, the O'Caml for loops are syntactic sugar for inlining the function calls. SML doesn't have built-in for loops, but they are easy enough to mimic:

   fun for a b s f =
      let
         (* count up while the step is positive *)
         fun loopup c = if c <= b then (f c; loopup (c + s)) else ()
         (* count down when the step is negative *)
         fun loopdown c = if c >= b then (f c; loopdown (c + s)) else ()
      in
         if s > 0 then loopup a
         else if s < 0 then loopdown a
         else ()
      end