2008 In Review - What Happened with Programming Languages?

With 2008 winding to a close, here's a question for you: what was noteworthy about 2008 as far as programming languages were concerned? To paraphrase Ehud, on topic are (1) notable news about PLT research (direction, fads, major results), (2) notable news about programming languages (whether about specific languages, families, etc.), and (3) notable news about industrial use of languages and language-inspired techniques (adoption, popularity).

While we're at it, let's score the predictions made at the beginning of the year, and either laugh at how young and naive we once were or at least make excuses about why our foretelling didn't quite pan out.

Great topic! I eagerly (!)

Great topic! I eagerly (!) wait to see the responses.

Biggest story? Perhaps Clojure

I'd guess that Clojure was the biggest story in PLs this year. Barely on the radar at the end of last year, it appears to have generated a lot of momentum for a young PL.

Uhm, how would Clojure

Uhm, how would Clojure compare to Scala in terms of exposure growth in 2008? Not trying to rain on anybody's parade, but as far as specific languages go 2008 was a good year for Scala. Other interesting developments in 2008 I can think of:

  • The return of C with a vengeance. Various forms of C are used to program GPU hardware (HLSL, CUDA, OpenCL...). This is where performance-based concurrency is heading, and the future is only partially functional in the form of restricted memory models. Hopefully, more functional languages can get in on the party as a better higher-level alternative (e.g., Vertigo, PyFX, ...).
  • The Java death watch continues. Its future is tied up with Sun, which continues not to make money, and in this economy... JavaFX was late and didn't make the splash it needed to make. Can Scala (or Clojure I guess) save the JVM? And who would take over the Java mantle if Sun imploded, IBM?
  • Objective-C is finally cool. Objective-C is the real hot language of 2008, thanks to the iPhone SDK. Will Apple ever adopt managed code?

clojure and javafx

The Onion once had a great article titled "Sociologist Considers Own Behavior Indicative Of Larger Trends."

I'll follow that tradition and claim that 2009 will be a year of Clojure rather than Scala. These two languages seem to be at opposite ends of the JVM language spectrum. Clojure's syntax is obviously very minimal, whereas Scala's syntax seems overly complicated (I've felt this since I started looking at the language, and I've since heard others complain as well). Scala's type system looks very complex compared to Clojure's (again, obvious). Scala's compiler doesn't look production-ready yet. Scala comes out of a research lab, whereas Clojure's stated goal is to be a pragmatic programming language designed for real-world projects. Again, I'll admit these are personal perceptions, but I've noticed similar criticism from others as well.

JavaFX would be more interesting if it weren't so haphazard. I suspect that when more GUI components are released for it (such as a JTable equivalent), programmers will take another look at it. The language is apparently designed to target 'designers,' which seems to amount to naming language constructs things like "Stage" and animation, etc. JavaFX's interesting programming language features (such as bind) will be overshadowed by Sun's marketing silliness.

Outside the JVM, F# looks very interesting and 2009 should finally bring it to the attention of mainstream .NET developers.

I'd argue that you want your

I'd argue that you want your compiler and type system doing heavy lifting for programmers, rather than a language that does less and is super easy to implement (e.g., very little syntax, dynamic typing). Ah, but that is again the static vs. dynamic languages argument, and who wants to go down that rat hole? Of course, you can easily find those who agree with your criticisms, just as you can find those who disagree.

Agreed on JavaFX, except the designer part: designers are more likely to use a visual tool like Blend rather than code in a textual language directly (e.g., XAML, JavaFX). JavaFX is aimed at the ActionScript segment of the developer community, whose members often border on being designers themselves. Also, WPF already has everything that JavaFX has (bind, animation) without the language constructs; perhaps 2009 will finally be the year when animated UIs take off?

Clojure easy to implement?

The basic language is easy to implement (and that doesn't make it less powerful), but the library (very fast persistent vectors/maps) and transactions? I'd say that there's more heavy lifting in Clojure than in Scala.
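
To make the "heavy lifting in the library" point concrete, here is a minimal sketch (in Scala, purely for illustration) of the path-copying idea behind persistent vectors: an update copies only the nodes on the path to the changed slot and shares everything else. Clojure's real implementation is far more involved (32-way tries, tails, transients); this toy version branches two ways and is only meant to show the technique.

```scala
object PersistentVectorSketch {
  sealed trait PVec
  case class Leaf(value: Int) extends PVec
  case class Branch(size: Int, left: PVec, right: PVec) extends PVec

  def get(t: PVec, i: Int): Int = t match {
    case Leaf(v)         => v
    case Branch(n, l, r) => if (i < n / 2) get(l, i) else get(r, i - n / 2)
  }

  // Updating copies only the root-to-leaf path; untouched subtrees are shared
  // between the old and new versions.
  def updated(t: PVec, i: Int, v: Int): PVec = t match {
    case Leaf(_)                      => Leaf(v)
    case Branch(n, l, r) if i < n / 2 => Branch(n, updated(l, i, v), r)          // share r
    case Branch(n, l, r)              => Branch(n, l, updated(r, i - n / 2, v))  // share l
  }
}
```

Both the old and new versions stay usable after an update, which is exactly what makes such structures suitable for Clojure's STM.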

Heavy lifting

Much of the heavy lifting in Scala has gone into its type checker - a problem that Clojure obviously doesn't have to face.

Rich has done some terrific stuff with his library, and the Clojure community is beginning to contribute. But the Scala team and community are aware of the limitations of the current library and are working on it. For instance, the release a few months ago included new user-contributed map implementations similar to the map and vector implementations in Clojure. For the next release they've set the expectation of a complete revamp of the collections library based on higher-kinded types, which should mean a very rich set of reusable abstractions that are, along some dimensions, difficult to express in untyped languages. I expect great things from both teams and both communities in the coming year, but this isn't a thread about predictions.
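
For readers unfamiliar with the term, here is a minimal sketch of what higher-kinded typing buys a collections library. This is an illustration only, not the actual Scala collections redesign: the type constructor C[_] is abstracted over, so one function works for any collection shape that supplies an instance.

```scala
import scala.language.higherKinds

object HigherKindedSketch {
  // A type class abstracted over the collection's type constructor C[_].
  trait Mappable[C[_]] {
    def map[A, B](c: C[A])(f: A => B): C[B]
  }

  implicit val listMappable: Mappable[List] = new Mappable[List] {
    def map[A, B](c: List[A])(f: A => B): List[B] = c.map(f)
  }
  implicit val optionMappable: Mappable[Option] = new Mappable[Option] {
    def map[A, B](c: Option[A])(f: A => B): Option[B] = c.map(f)
  }

  // One function works for every collection shape that has an instance, and the
  // result type tracks the input shape.
  def doubleAll[C[_]](c: C[Int])(implicit m: Mappable[C]): C[Int] = m.map(c)(_ * 2)

  // doubleAll(List(1, 2, 3)) == List(2, 4, 6)
  // doubleAll(Option(21))    == Some(42)
}
```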

There is a lot of work on

There is a lot of work on the Scala library also (actors, transactions, high performance collections, ...), so I don't see your point. Basically, Scala's library is incredibly rich (and functional), and takes advantage of Scala's advanced features.

There will always be people who prefer static or dynamic, and I guess this boils down to that. But I really have little sympathy for anyone who prefers dynamic languages because they are easier to implement (no static types! no real parsing if Lisp!). The whole point of building a compiler is to make it easier for users, not for the implementor. There are plenty of things you can do in a dynamic language that are hard to implement; e.g., an advanced dynamic type system that makes rich type information inspectable at runtime (I think this is coming to Scala soon), or even a full backtracking dynamic type system that takes advantage of the fact that it doesn't have to be statically decidable.

Back to the topic. I predict in 2009 that no progress will be made on the static/dynamic debate.

Re "no real parsing":

"But I really have little sympathy for anyone who prefers dynamic languages because they are easier to implement (no static types! no real parsing if Lisp!)."

Granted, for whatever reason, static typers often seem to be simultaneously fans of horrible complicated syntaxes, but there's no particular reason you can't have a statically typed language with a pleasantly lispy (or other simple) syntax rather than some baroque thing like typical static languages. It just seems that the kind of language implementors who like static typing also tend to like twiddly, annoying syntaxes - even Qi, which is after all just layered on top of Lisp, added a load of syntax.

Some people actually seem to view the complex syntaxes as a feature rather than an annoyance, shrug. People are different.

Sometimes a more complicated

Sometimes a more complicated syntax is needed to guide the syntax-directed type inference.

Syntax-directed type inference?

I kinda think I know what you mean by that, but I'm not sure. Are you referring to syntactic clues used to identify the kind of a particular expression (is this thing a type or a term), or to things like +. and the other floating-point operators in ML?

Well, a little of both given

Well, a little of both given your example. Special symbols are used to distinguish different terms, like records, variants, and tuples; symbols appearing in certain positions mark special calls (like infix calls, e.g., +.); and special symbols are reserved for declaring types and kinds, etc. Statically typed languages have a richer structure, so the syntax is more complicated and symbol-rich, to disambiguate meaning while retaining succinctness.

I'm not arguing that LISP

I'm not arguing that LISP syntax is good or bad. It's just really easy to implement. The appeal of a language (static or dynamic) should be biased towards users, not implementors. The fact that a language is easy to implement or not is not important to the user, and could be a negative, as users have to work around what the implementor didn't implement. Simplicity has its merits, but taken to its minimalistic extreme it becomes absurd and unnatural: human understanding benefits from moderate doses of complexity (e.g., syntax in human language).

disagree with implementor v. user

You say that the "fact that a language is easy to implement or not is not important to the user".

Wow!

I strongly disagree. That's one of the main reasons that I avoid C++ when I can. It's also one of the main reasons that, in spite of being raised a lisper, in my mature years I'm finding myself increasingly skeptical of graph-tracing GC.

The ease with which a language is implemented has a lot to do with how easy it is to understand and manipulate. It has a lot to do with economic dependencies (as of users upon compiler vendors). It has a lot to do with the cost of testing or otherwise validating software.

Syntax in human language is not a very good comparison. We don't much "pay extra" to teach people most natural language syntax: they pick it up. (We may "pay extra" to teach particular, artificial grammatical dialects but that's a side show compared to the main fact that for most people, natural language comes pretty close to "for free".)

Natural language syntactic complexity takes advantage of a natural phenomenon. Programming language syntactic complexity comes at the cost of thousands or hundreds of thousands of lines of code.

That said, I'd agree that rich (and therefore, in some sense, complex) syntaxes for programming languages can be a big "human factors" win but.... the grail should be syntaxes that can be built up and customized in simple, robust steps, on the fly, as needed. E.g., I'm just fine with the notion of something like Fortress where, presumably, I'm eventually using a full blown 2-D equation editor for some parts of the code but... I hope for a software stack in which that quickly and simply reduces to some lower-level syntax like s-exps. Looking at the nice displayed equation, so to speak, I'd want programmers to be able to sleepily read off the corresponding s-exp-ish thing and vice versa.

Big, hairy, monolithic syntaxes are a burden. Someone's gotta maintain that mess of code, you know. :-)

-t

more disagreement

This:

the main fact that for most people, natural language comes pretty close to "for free".

is false, and this:

I'm finding myself increasingly skeptical of graph-tracing GC

is just downright crazy. Easy to implement and easy to implement well are not the same thing. Plus, there are other virtues besides ease of implementation.

nat. lang syntax and GC

I'm not sure what you think is wrong about my claim that people pick up natural language syntax "for free". I'm thinking along the lines of Chomsky's "universal grammar" or Pinker's work. Programming language syntaxes are something we have to implement, sometimes in very tedious detail. In contrast, we don't so much "program" syntax into children - we live in environments that help children strengthen and stylize innate syntactic capacities. Language is an epiphenomenon resting on, among other things, those innate capacities.

On GC, without wishing to drag things wildly off topic, I'll just try to clarify a little bit:

I am not against ever using graph tracing resource management. I agree if you want to say that it's an important technique and that for some needs there is simply no substitute.

The question arises, however, whether graph tracing GC should be "built in" to a language and made the default behavior of all types. We can look at the consequences of GC and the alternatives.

The consequences, if GC is done "well", include considerable code complexity and fragility along with a performance landscape that is always strewn with land-mines. High-performance programming in GC-centric systems often seems to involve careful steps taken to work around GC. Usually, those careful steps are not usefully portable (won't have the same performance characteristics) across multiple GC implementations. GC is very nice, and very powerful, in this view, but also untrustworthy - full of "gotchas". It's hairy and flaky.

Alternatives? Well, combinator calculus shows us that, in theory at least, we don't need GC for a broad range of computations.

When I say I'm skeptical of graph-tracing GC I mean I'm skeptical of it being a "built-in", essential part of a language. I'm skeptical of language design "moves" like in Scheme: step 1) assume GC; step 2) ok, let's define "closure" (as these complex, often circular data structures). The designer who makes that move has already lost me at "step 1" -- I don't have any reason to accept that premise and he's got no way to compel me. I can point to plenty of useful, even high-level systems that don't "assume GC". I can point to the simplified implementation and reliable performance characteristics of those systems. I can even point out how if I really do need GC for some particular data structure, I can certainly implement it as a library for limited purpose use. Given the hair and the flakiness, I can't any longer get behind "step 1) assume GC".

(I used to feel the opposite way. For the longest time it seemed like "a really good GC" was always just around the corner - just a small matter of hacking - a couple years away, at most. I felt that way for something around two decades. Lately, I'm changing my mind.)

-t

'for free'

On natural language, the differing complexity of natural languages has an impact on literacy rates. Chinese is a good example here.

As for GC, what precisely do you think is shown by the combinator calculus? That GC is unnecessary if you have no dynamic allocation? You also seem to be comparing the challenges of writing extremely fast programs in a garbage-collected language with the 'challenge' of writing anything at all in a non-GC'ed language.

Finally, "really good GC" is here, and has been here for years. You sound like someone who's waiting to retire assembly until they get a "really good compiler".

On natural language,

On natural language, the differing complexity of natural languages has an impact on literacy rates. Chinese is a good example here.

I'd suspect that economic factors would have a larger impact on literacy rates than does language complexity.

I'd opine that natural language has three components - in increasing order of difficulty: Speaking, Reading, and Writing. I'd also suggest that programming languages have no real spoken counterpart and that writing is actually easier than reading. And that writing is the most onerous in terms of mastering syntax for both natural and programming languages.

Natural Language Acquisition/Dyslexia

This is partially off topic for LtU, but whatever. Reading and writing are a whole different ballgame than speech/listening when it comes to acquisition. Put kids in a room together and you will get a new language. The standard linguistic example is sign language: frequently, when you put large groups of deaf children together, they will develop a new sign language if they are not taught one. There is a Nicaraguan school where this happened.

Reading and writing do not come for free. The main reason we know this is the various types of learning disabilities that arise. A child who is dyslexic in English would not necessarily be dyslexic in Chinese, as different parts of the brain are used. This has to do with whether the language is logographic (Chinese), syllabary-based (Japanese, Indian languages), or alphabet-based (English, etc.). They all use symbols that map to something spoken, or to a concept or idea, in different ways. The running theory is that written language hasn't been around for very long, so our brains aren't used to this sort of mapping process. Not to mention that writing often has slightly different grammatical and syntactic rules than spoken speech. Do you have commas in your speech?

As a bit of speculation on my part: the difficulty of picking up a programming language or a written natural language depends on how irregular it is. Is one able to apply a pattern the same way in all cases, or are there cases where this fails? I usually cite C++ operator overloading as a place where one can create conceptual irregularity in what + stands for. I think people in general underestimate what a difficult task it is to pick up written language, since most of us did it as children, just as programmers underestimate the task facing people who are not programming-literate. There are very difficult concepts that need to be understood before one can make sense of what a piece of code is actually doing.

where are we?

Thank you. We are at risk of straying, and yet this is very interesting, and it certainly won't hurt people thinking about programming language design (and could help). You've given me a (fuzzy) idea for an experiment:

Let's take a pool of working programmers, randomly. Not luminaries. Not experts. Just a random sample of people hired off the job boards last year, say.

Show them various bits of code in languages they use.

Ask them to answer questions that test their knowledge of the meaning of those programs.

I'll bet we'd get interesting results and interesting correlations with the economics of the industry. (My guess? Most working programmers don't understand programming very well at all.)

-t

Depends

[[Finally, "really good GC" is here, and has been here for years. You sound like someone who's waiting to retire assembly until they get a "really good compiler".]]

Depends on the properties you want from a "good" GC. AFAIK, there is no GC which has all of the following properties:
1- realtime 'friendly' (for interactive applications and games)
2- scales with SMP
3- is robust against memory 'leaks' under virtual memory (which is the case for hand-allocated memory, since leaked memory is often swapped out to disk and uses only swap space, not real memory).

endless features

If we define 'good' to mean 'has all features anyone wants', then nothing is 'good'. Are there any languages that are (1) high-level and (2) allow control over memory layout? Are there any languages that (1) scale to millions of threads and (2) are widely deployed on end-user boxes? I could obviously go on here, but I hope you see the point.

what's the minimum then?

If you consider the features I listed superfluous, then your definition of 'good' is maybe too lax:
- the first feature allows the use of GC in interactive applications without annoying the user with pauses due to GC collection.
I've read at least one article which lists, as a reason why Lisp lost to C, the pauses induced by the GC, which made a bad impression on users in demos.
- the second feature is superfluous, you're right: threading isn't necessary to use multiprocessors efficiently, but it's still quite fashionable these days, so its absence will be perceived as a big restriction.
- the third feature is about making sure that applications run well for a long time even in the case of a very common programming error (a memory leak). Given that this is already the case for manual memory allocation with virtual memory, using a GC which makes performance suck in this case is quite a serious regression, so I wouldn't call such a GC 'good'.

As for your questions about the languages, Ada and C++ allow control over memory layout; whether you consider them high-level or not is another question.

As for your latest point, why would millions of threads be necessary for a language to be called good? I provided sensible reasons for the constraints I put on the GC.

minimums

GCs without big pause times: there's a ton of work on this. It may be that pause times killed lisp in the early 80s, but that was a long time ago.

GCs that use multiple threads: lots of these too. The standard JVM collector from Sun, for example.

Your memory leak problem I don't understand. Is this about GCs that don't collect some portion of the garbage? Or where data is live that shouldn't be? And what does it have to do with VM?

Joe Armstrong can give you lots of reasons why you need millions of threads. My point was just that it's easy to give a list of features that no one can meet.

The devil is in the 'and'

For the point about a GC robust to memory leaks, see http://lambda-the-ultimate.org/node/2391
It's about GCs which don't try to access unused data: otherwise the GC has bad locality properties, which interact badly with an OS's Virtual Memory Manager (I should have said VMM, not VM, sorry).

?

The link you provide is about how to solve your second problem. I still don't understand the 'memory leak' issue.

Fixing GC memory leaks.

Thanks

This is indeed the kind of memory leak that I was talking about.

A note about the 'Virtual Memory Manager aware' GC discussed in the LtU link I gave above: it requires changes in the OS's VMM, so as far as I know this good GC is only a research tool.

just like C

Note that this paper is not on fixing problems with GC, but, as the paper states in the introduction, correcting *more* errors that a programmer might make with explicit memory management.

gc (and lang.)

On the language part, I'm skeptical of your ability to define "complexity" and trace cause and effect well enough to back up your point. My point is just about syntax: we have a lot of innate, "proto-syntactic" capabilities and exposure to language and nurture develops those: our physiology already "wrote much of our parser and generator for us," so to speak.

On GC and combinator calculus: I think "yes" is the least misleading answer to your question but I'd put the question differently. Combinator calculus does involve dynamic allocation (e.g., a reduction step can increase the size of the expression being reduced). What combinators show is that, at least where the only side effect is the successive steps of reduction, you don't have to have any cycles: state is entirely tree-like. You don't need graph tracing to manage trees. (Less esoterically: AWK is a fine example of a very practical language, *with* a broader range of side-effects, but still a tree-like state that does not need built-in graph-tracing.)

If "really good GC" is here I haven't found it on my system. Problems range from imprecision (the various "conservative" collectors) to under-specification (defining a "live value" in Scheme) to un-even performance (pauses) to memory consumption (in copying collectors) to the mine-field of unexpected, idiosyncratic "heat death" scenarios (collectors that work "well" for 90% of real problems in general and work poorly for 90% of the problems you yourself need to solve).

I'd like allocation steps to be fast and have even, predictable performance and I'd like the locality issues that can lead to heat death to be easy to think about - all of which leads me more towards variations on good 'ol reference counting. From a language design perspective, I'm therefore interested in languages in which there are no circular structures as far as the collector is concerned.

I should qualify that a bit. There are a lot of styles of programming (e.g., some kinds of "exploratory programming") and specific classes of algorithms where, yes, not only do I want graph-tracing GC for everything but I want a program notation that reflects that. I happen to like lisp, for example. I just no longer believe that graph tracing is appropriate for a general purpose, "systems" language and I'd rather see something like lisp done as a DSL on top of a more general, high-level, systems language that among other things lacks graph-tracing GC.
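
As a concrete illustration of the trade being proposed, here is a minimal sketch (a hypothetical API, with Scala used only as pseudocode for the idea) of reference counting over data that is guaranteed acyclic: because a child can never point back to an ancestor, dropping the last reference to a node reclaims its whole subtree, and no graph tracing is ever needed.

```scala
// Manual reference counting over an acyclic tree (a toy sketch, not a library).
// Constructing a node takes ownership of one reference to each child, so when a
// node's count reaches zero its children are released too; because the data can
// never contain a cycle, a zero count is a complete liveness answer and no
// tracing pass (or pause proportional to the heap) is ever needed.
final class Node(val payload: String, val children: List[Node]) {
  private var refCount = 1                    // the reference returned to the creator

  def retain(): Node = { refCount += 1; this }

  def release(): Unit = {
    refCount -= 1
    if (refCount == 0) children.foreach(_.release())  // reclaim the whole subtree
  }
}
```

The cost being argued about in this thread is visible right in the sketch: every retain/release touches memory, but reclamation is incremental and its timing is easy to reason about.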

As for assembly: in point of fact, no. Unless you are doing something wildly outside of the normal calling conventions or are using some specialized machine instructions that the compilers don't know much about, there is no need for assembly. It *has* been true for quite a while, the popular opinion is right, that you'll have a heck of a hard time beating a good C compiler, for example. Especially on modern processors. So, I might be some kind of luddite or might not but in any event I'm not *that* kind of luddite.

-t

all of which leads me more

all of which leads me more towards variations on good 'ol reference counting. From a language design perspective, I'm therefore interested in languages in which there are no circular structures as far as the collector is concerned.

No need. There are good concurrent reference counting collectors available, with very low pause times and decent throughput. They're just not used.

And if you're only concerned with tree-structured allocation patterns, then region-based memory management fits the bill.
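
For readers who haven't run into it, here is a minimal sketch of the region idea (a hypothetical API, nothing JVM-specific implied): allocations are tied to a region, and the whole region is reclaimed at once when it goes out of scope, which fits tree-shaped, non-escaping data very well.

```scala
import scala.collection.mutable.ArrayBuffer

// A region collects everything allocated within it and releases it all at once,
// in reverse allocation order, when the region ends.
final class Region {
  private val cleanups = ArrayBuffer.empty[() => Unit]

  // Allocate a value inside this region, registering how to release it later.
  def alloc[A](make: => A)(free: A => Unit): A = {
    val a = make
    cleanups += (() => free(a))
    a
  }

  def close(): Unit = { cleanups.reverseIterator.foreach(f => f()); cleanups.clear() }
}

object Region {
  // Lexically scoped regions: everything allocated in `body` dies when it returns.
  def withRegion[A](body: Region => A): A = {
    val r = new Region
    try body(r) finally r.close()
  }
}
```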

thanks and gc performance

Thanks for the link.

Many such papers begin from a premise - a kind of problem statement - that says 1) we know we want graph tracing; 2) we want low pause times and perhaps thread pauses in different threads at different times; 3) we want low overhead.

Then they add in constraints: a) can they use VM hardware directly? b) via the OS? c) how big is RAM expected to be compared to the working set? d) are read and / or write barriers available and with what constraints? etc.

An algorithm / data structure is then motivated and developed, analyzed in relation to completeness and performance characteristics, and measured.

That's all well and good but they lose me at the premises: that we actually want graph-tracing in the first place (and therefore are willing to pay for it with pauses, overhead, etc.)

I take as a non-scientific premise (a premise based on hunch and experience) that:

(1) reference counting is easier to implement naively, but well enough to use for many practical purposes;
(2) in almost every environment, X units of hard work improving reference counting will yield usefully better performance improvements than X units of hard work improving graph tracing;
(3) to achieve comparable performance at the highest practical levels of optimization, graph tracing requires that more be spent on hardware and power consumption;
(4) the semantics of reference counting and concurrent reclamation of dead objects allow useful control (e.g. varying the relative rate of reclamation and pause times explicitly, on the fly) in conceptually simple ways, while graph-tracing collection cannot achieve that degree of conceptual simplicity (by far);
(5) a natively reference-counting environment and memory model can host a graph-tracing sub-environment while imposing only negligible (implementation) complexity and performance overheads, far more easily than the converse case;
(6) high-level programming languages and environments can be created which (6a) involve no circular structures - assume a reference-counted memory model; and (6b) are expressively and practically comparable and competitive with high-level languages that assume graph tracing, yet are far simpler to implement to comparable levels of robustness and performance.

Some of those claims should be uncontroversial but of course some of them, such as (6), are surely controversial.

In any event: a paper about graph-tracing (or otherwise cycle-collecting) GC would most directly apply to my hypothesis if it showed "0-overhead" cycle collection relative to a cycle-free reference-counting system.

-t

Some of your conjectures are

Some of your conjectures are borne out in Bacon's reference counting papers. IIRC, cyclical garbage which triggered the cycle collector (or was collected by a backup tracing GC), was between 5 and 25% of all garbage, for the real-world Java programs they measured. Bacon also proved that all GC algorithms are hybrids of reference counting for incrementality, and tracing for throughput (excluding concurrent GC algorithms). So ref counting does have advantages, but if you're going to avoid circular structures, then it's probably worth looking into region-based management. It will be far more efficient.

O(size of the term)

Normally, GC will need to visit the whole heap once in a while to reclaim garbage. In general, this will always add O(size of the term) to any computation; that's just the price you pay. There are ways of making this order manageable, but there is no way around it.

Btw. reference counting is for a lot of applications a lot more expensive than graph tracing, and if you do that, the cost of tracing cyclic structures is often negligible.

[This follows pretty naturally: doing a small amount of work for every reference created/destructed is worse than tracing (just) all the live references after a large period. There also is no mathematical complexity difference between tracing a DAG or a graph structure, though it is easier to implement strategies which will reduce the amount of cache misses in the former. Unfortunately, it still is hard to beat a Cheney collector for all but non-trivial graphs.]

apples, oranges

Btw. reference counting is for a lot of applications a lot more expensive than graph tracing, and if you do that, the cost of tracing cyclic structures is often negligible.

I don't buy it. I mean: I do recognize that naive ref-counting-semantics collection generates a lot of memory traffic and overhead on updates and that naive graph-tracing can beat that. Crudely speaking, if I can get by with either, sure - gimme Hans Boehm's GC over ++foo->ref_cnt any day. I'm not naive about the performance and ease of use comparisons there.

Ultimately, though, I want to be rid of virtual memory; I'll take back some of those gates to get hardware assist on refcounting, and we'll meet up in that neck of the woods.

-t

Virtual memory hardware is

Virtual memory hardware is such low overhead now (with respect to everything else) that getting those few gates back wouldn't help very much. Hardware GC is a problem because the overhead is in the tagging (or ref counts), which are distributed in memory. The computations performed aren't really an issue compared to the fetches and cache destruction of the GC process; i.e., in ++foo->ref_cnt, the big overhead is the load and the store, not the add.

VM is actually simple (address translation, pages) and very efficient. It's also mostly orthogonal to GC.

memory

I don't just mean address translation. I want a much smaller address space.

-t

Not sure I understand. If

Not sure I understand. If you want a smaller address space, just use an 8- or 16-bit CPU. You won't get a smaller address space efficiently on a 32-bit CPU: it natively handles 32-bit pointers, and going down to 16-bit pointers would require lots of extra work and inefficiency (remember the good old days of thunking!).

The only disadvantage of a larger address space is your pointers are a bit bigger, but everything else is good.

Not sure this has much to do with PLT anyway

While greater cooperation between an OS's memory subsystem and a PL runtime is often helpful, the whole memory hierarchy has little to do with programming languages.

Certainly there are things hardware could do to assist GC, such as providing tag bits for pointers, but hardware doesn't do these things.

Given the relative speeds of on-the-same-die silicon vs dedicated wires across a circuit board vs CPU busses vs magnetic disks, I doubt the memory hierarchy is going away any time soon.

joy, cat / mem. model / addr. space size

I like a wide data path - wider than 32 would be peachy. I only care to directly address a particularly small amount of memory, though - maybe around the size of a typical L1 cache. Having lots more physical RAM than that is a good idea, sure, but the interface to that can be more explicitly like the interface to a disk. I can save a lot of power that way so I'll take some of that power back and spend it on smaller, specialized, ancillary processors that can perform simple operations on the "big, not directly addressable RAM" asynchronously.

Generally speaking, I don't want the "big indirect RAM" to be mutable, from a semantic point of view. It should contain strictly "write once" objects, acyclic and reference counted. Object identifiers (in lieu of "addresses" for the "big RAM") don't need to be stable. The "big RAM" represents a forest of trees and updates consist of dropping nodes and cons-ing nodes; the contents of nodes can be copied to (or constructed from) the small amount of directly accessible memory. Ancillary processes can also cons and drop tree nodes in the "big RAM". Hardware / firmware (for the "big RAM" device) can assist with reference counting, structure sharing, and optimizing the case of linear updates ("drop node X but construct Y which is X with a slight change").

I am pretty sure we can make very fast, efficient machines that implement that model. That model has a lot of pseudo-theoretical appeal (e.g., it *is* a kind of combinator engine). That model trivially yields transactional memory and gives all kinds of plausible approaches to multi-core computing. That model implies a butt-simple instruction set.
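
To make the model a bit more concrete, here is a minimal sketch of the interface such a "big, not directly addressable RAM" might present to software. Every name here is hypothetical, invented for illustration rather than taken from any real hardware.

```scala
// Write-once, acyclic node store: software never mutates the big RAM, it only
// conses new nodes (whose children must already exist, so no cycles can form)
// and drops references; the device refcounts and shares structure internally.
final case class NodeId(raw: Long)

trait BigRam {
  def cons(payload: Array[Byte], children: Seq[NodeId]): NodeId  // allocate an immutable node
  def read(id: NodeId): (Array[Byte], Seq[NodeId])               // copy a node into local memory
  def retain(id: NodeId): Unit                                   // bump its reference count
  def drop(id: NodeId): Unit                                     // last drop reclaims the subtree
  // the "linear update" fast path: x with one child replaced, sharing the rest
  def adjust(x: NodeId, childIndex: Int, replacement: NodeId): NodeId
}
```

Because children must exist before their parent is consed, the store is acyclic by construction, which is what makes the reference-counting story above workable.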

The programming language design problem is: how does one program it?

I'm pretty interested, these days, in Joy and Cat, as at least a place to start.

-t

Cell is the first half of your dream

The hardware you're describing sounds a lot like the Cell architecture, as seen in the PS3 and IBM's QS20/22 blades, except with the roles of the PPE and SPE reversed. SPEs, the asynchronously executing co-processors that do the bulk of computation, have a local store architecture and a fast DMA engine to main memory, rather than a caching hierarchy. It would still need a handful of new instructions to do everything you're asking for, though.

Unfortunately, the economics of things like LISP machines didn't seem to work out.

lisp machine economics

Unfortunately, the economics of things like LISP machines didn't seem to work out.

That's part of my thinking, too. No, it didn't work out but, hang on, something has changed. Lisp machines didn't work out for lots of reasons but we could simplify and say they didn't work out because if, at the time, you had two equivalent teams of chip designers and told team A to optimize for C-like languages and team B to optimize for Lisp, your ROI would be higher for A. Nowadays, team A would have a harder time making much further improvement and, meanwhile, more and more computing systems are being built mainly out of high level languages. So, it's time to revisit the question.

-t

For what it's worth

This was all predicted by the architect of the REKURSIV object-oriented processor in his book describing his design and rationale (in 1988). This is easily the best book on object-oriented programming in my collection. It also predicted the need for concepts like functional reactive programming and functional languages using software transactional memory.

REKURSIV

Agreed. It's also interesting to look back at his earlier book about Poly, and compare some of the ideas with those in currently popular languages.

I think he was just a decade or two ahead of his time ("was" in the Erdos sense; last I heard he's still alive in meat space). I was very sad to see David go into popular writing (though I enjoy his writing) after beating his head against the industry.

-- MarkusQ (aka the guy who played the "CISC"-foil in REKURSIV)

I think he meant that he

I think he meant that he wants a much smaller working set. Large address spaces or not, there are many advantages to having a smaller working set.

I don't think virtual memory is

I don't think virtual memory is orthogonal to GC, especially when the size of the heap approaches the OS's estimation of the application's working set. Then paging effects become huge in proportion. See Matthew Hertz, Yi Feng, Emery D. Berger: Garbage collection without paging, PLDI 2005 and Ting Yang, Emery D. Berger, Scott F. Kaplan, J. Eliot B. Moss: CRAMM: Virtual Memory Support for Garbage-Collected Applications, OSDI 2006 for some recent work. There was a lot of exploration on this topic in the 80s as well.

Sure, a page-table based virtual memory system seems pretty cheap at the software level. It's easy to imagine what a naive implementation in hardware is like. But fast forward to now, and making address translation fast in today's processors is a huge undertaking. It impacts everything from the pipeline structure to the cache organization. It typically means adding a cycle or two latency to the L0 cache in the "hit" case, because if the cache is indexed by physical address, address translation must happen first. Considering that this is the most important fast path in the memory hierarchy, this is a huge loss. I talked to the chief architect of the Pentium 4 and he told me flat out that eliminating virtual memory would be an enormous win.

Intel experimented with virtually indexed caches in the Pentium III line, so that L0 lookups could happen in parallel with address translation. In order to avoid the inevitable aliasing issues and make the cache still transparent, some nasty logic was required. On MIPS, all the caches were virtually indexed and therefore had to be managed by the OS.

Even the TLB is a bottleneck. It is super hot, eats a ton of power (relative to its size), and never seems to be big enough. Why do you think 4MB pages were introduced in the Pentium Pro and 2MB pages on SPARC?

Sure, on IA32, the page tables were 2 levels, but on AMD64, the page tables are 5 levels deep. That's a pretty slow slow path.

There are still architectures around (SPARC) that don't use page tables, but TLB misses are faults and go directly to the kernel, which implements a TLB replacement policy in software. That routine is a super bitch to write and still super slow. That hurts a lot.

I'm with the GP on this one; we need to get rid of virtual memory ASAP.

And replace it with what?

Terabytes of RAM would be nice, but too expensive for most applications.

Of course, many OS's will happily let you dial the swapfile size down to zero. That doesn't completely eliminate the VM subsystem; said OS's still do address translation to allow each process to have a logical contiguous address space (and to permit read-only pages to be more easily shared) but it's interesting to run that way.

Terabytes would be nice...

I think the conceptual roles of the GC and virtual memory are nearly identical from a single application's point of view (i.e., provide me an abstraction of memory and manage it efficiently). Virtual memory, of course, is good at providing the illusion of a larger amount of memory than is available by using a larger, slower storage space and relying on temporal locality to hide its latency. While GC today, at the application level, focuses simply on reclaiming unused storage, I have always thought it was perhaps the right place to provide the benefits of virtual memory as well. A GC has more information about the dynamic behavior of a program through its allocation behavior and subsequent GC cycles, and if armed with the same sorts of information that a VMM gets, it could probably accomplish that task too by moving infrequently accessed objects to the slower space.

The cost is that everything in the system has to be managed code, and so far, only prototypes have been built. I am not an expert in the field of GC, especially GC for persistent systems, but a lot of good work on persistent object systems has gone by the wayside.

From the view of the system software (e.g. OS kernel), virtual memory provides each application with its own logical address space. Perhaps this is the wrong interface to provide to applications. After all, why would they care about the actual addresses of their objects if that level of abstraction was not only unavailable, but unnecessary to their purpose? Perhaps virtual memory in that sense is an answer to an interface question that was gotten wrong a long time ago....

Goto BLAS as a counterexample

Virtual memory hardware is such low overhead now (with respect to everything else) that getting those few gates back wouldn't help very much.

Our approach is fundamentally different. It starts by observing that for current generation architectures, much of the overhead comes from Translation Look-aside Buffer (TLB) table misses.

I think it's a little too early to call MMU overhead negligible, at least from an HPC viewpoint.

Sidenote on Boehm

Small addendum. It's the Boehm, or Boehm-Demers-Weiser, collector.

I found it very 'amusing' to see that Boehm, in some of his presentations, very carefully disregards traditional copying collectors.

In general, if you have a very high allocation rate of garbage, most copying collectors will beat the conservative non-moving Boehm collector, and likely most reference counting collectors.

That is not to say that it is not an excellent collector.

Copying collectors are not

Copying collectors are not generally appropriate for the Boehm collector's problem domain, which is to work with mostly oblivious languages. For such languages, the closest you can get is probably Bartlett's work on mostly copying collection (and derivatives), and the results there just haven't been that positive (so far).

Ok

I know Boehm works well with most C/C++/Java/C# like languages. Is that what you mean with oblivious languages? [Uh, and can you explain why it is called oblivious? I am not a native speaker, so sometimes I'll miss stuff.]

C/C++ good examples

The Boehm collector was designed for C/C++. These languages don't have provisions for precisely distinguishing pointers from integers. Structures like 'union' make it pretty clear that the language is not designed with garbage collection in mind.

'Oblivious' here describes languages that are unaware of, or have no design provisions to support, garbage collection. Java/C# would not qualify, though obviously the Boehm collector may also be applied to such languages (and readily modified for precision using a GC-visitor to return refs from objects and structures).

Copying collection requires the ability to precisely distinguish pointers from other data, and thus simply isn't an option in the target domain of the Boehm collector.
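
A minimal sketch of the "GC-visitor" idea mentioned above (a hypothetical interface, not the Boehm API): if every object can enumerate its own outgoing references, the collector can mark precisely instead of guessing which words might be pointers.

```scala
import scala.collection.mutable

// Objects that can report their outgoing references to the collector.
trait Traceable {
  def refs: Iterable[Traceable]
}

object PreciseMark {
  // Precise mark phase: everything reachable from the roots is marked live.
  // (Identity-based, assuming objects don't override equals/hashCode.)
  def mark(roots: Iterable[Traceable]): mutable.Set[Traceable] = {
    val live = mutable.Set.empty[Traceable]
    def visit(o: Traceable): Unit =
      if (live.add(o)) o.refs.foreach(visit)   // add returns false if already marked
    roots.foreach(visit)
    live
  }
}
```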

Combinator calculus

With regard to a cycle free term tree, is SKI reduction very different from a lambda calculus evaluator implemented by term copying? You can get the pure fragment of LISP (minus referential equality and maybe some other features) with term copying and you won't ever have any cycles. My guess is that performance would be totally unacceptable, and that's why this isn't done.

SKI

In the translation of a lambda term to SKI I think it grows at least linearly for every 'look-up' of a bound variable. [That's the cost of SKI] Hypercombinators, or smarter translations, can be used to get rid of this blow-up, just don't restrict yourself to SKI.

[Sorry, pressed the submit button too soon.]

Copying is no alternative: if you copy after each reduction, well, wow, that would really be slow. And if you don't copy, reductions will introduce gaps in the stack/heap, since references to subterms will be lost, so you're stuck with performing GC once in a while.

Lastly, no there is no real difference between SKI or lambda evaluators except that you need to keep a local environment in the latter. But they both are term rewrite systems, so implementation-wise you're stuck with the same strategies.

[Lastly, it seems, without any proof, that it is faster to implement an evaluator which performs evaluation with destructive updates on a graph, than to translate to a DAG rewriter (for instance with a CPS transform). That's what I did, anyway.]
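
A minimal sketch of the term-rewriting view being discussed: SKI reduction as pure rewriting on trees (Scala here, purely for illustration). Note how the S rule duplicates its argument, which is the size blow-up mentioned above, and how every step builds fresh nodes rather than mutating, so no cycles ever appear.

```scala
object SKI {
  sealed trait Term
  case object S extends Term
  case object K extends Term
  case object I extends Term
  case class Ap(f: Term, x: Term) extends Term

  // One outermost reduction step, if any redex exists; note that the S rule
  // duplicates x, so a single step can grow the term.
  def step(t: Term): Option[Term] = t match {
    case Ap(I, x)               => Some(x)                       // I x     -> x
    case Ap(Ap(K, x), _)        => Some(x)                       // K x y   -> x
    case Ap(Ap(Ap(S, f), g), x) => Some(Ap(Ap(f, x), Ap(g, x)))  // S f g x -> f x (g x)
    case Ap(f, x)               => step(f).map(Ap(_, x)).orElse(step(x).map(Ap(f, _)))
    case _                      => None
  }

  @annotation.tailrec
  def normalize(t: Term): Term = step(t) match {
    case Some(next) => normalize(next)
    case None       => t
  }

  // Example: S K K behaves like I, so normalize(Ap(Ap(Ap(S, K), K), I)) == I.
}
```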

final point

So, it seems from this post, and the others on this thread, that you're so unhappy with GC that you would like to throw away circular structures, closures, the memory abstraction that we've had since the Von Neumann machine, etc. I think this requires *way* more evidence than some vague unhappiness with the current state-of-the-art in GC technology (which is very good).

Further, I think my assembly analogy is exactly on point. You're not happy with the current compilers [collectors] you have, so you think we should all write in a language that doesn't require a compiler [collector]. You recognize the craziness of the argument for assembly, but it's just the same for GC.

re "final point"

So, it seems from this post, and the others on this thread, that you're so unhappy with GC that you would like to throw away circular structures, closures, the memory abstraction that we've had since the Von Neumann machine, etc.

Yes. Big, flat memory (the illusion of infinite relative to most problem sizes) is a poor model of physical reality and a lot of hardware and power goes to sustaining that illusion.

We already have, in effect, a tiny (say, L0 or L1-size) directly addressable memory and lots of complicated hair to swap things in and out of it. Currently, we mostly direct our "tricks" there to create the illusion of big, flat address spaces. So, the question is, what might be a better semantic model for an efficient memory hierarchy? And how do we program it?

I think this requires *way* more evidence than some vague unhappiness with the current state-of-the-art in GC technology

Is that "requires" as in "requires before speaking of the idea" or "requires" as in "requires before budgeting a 5 year, 100-man chip-design project"?

-t

So you would be happy with

So you would be happy with something like pixel shader programming? I'm sure that doesn't involve address translation (textures are flat).

backwards

So, the question is, what might be a better semantic model for an efficient memory hierarchy? And how do we program it?

I think this is exactly backwards. We should think about how we want to program, and then figure out how to implement that as efficiently as possible. Things like throwing out closures because our GCs haven't impressed you is just nuts - people come before chips.

design constraints help

We should think about how we want to program, and then figure out how to implement that as efficiently as possible.

If we do things in that order, what is to stop us from dreaming up the wish for an impossible programming language, and then wasting more time trying to figure out how to implement it? Surely you agree it is best if we dig from both ends of the problem, thinking both bottom up about what we know how to make efficient and top down about what we might wish could be done efficiently.

Things like throwing out closures because our GCs haven't impressed you is just nuts - people come before chips.

I am putting people first, not least myself. The main selfish element is, I guess you could say, an attempt at virtuous laziness:

As a user of GC I find that I often have to reason about how the GC will perform under various complex usage scenarios but that that reasoning is comparatively very hard. As a user of GC I want very precise semantics about the lifetimes of various values and yet language specifications generally either let me down or give very complicated, hard to use answers.

As an implementor of GC I find it adds too much code and complicates too much code and/or wants too much hardware.

In short, GC is a sea of constant sorrow.

That, to me, sounds like an interesting constraint to explore: what can we do without cycles and hence without GC? A photographer might get fed up with the complexity of color and spend a year taking only black and white shots. A high-level language designer might spend a year playing with cycle-free run-time systems. The constraints are a way to focus attention in novel ways and a way to discover expressive possibilities we might otherwise overlook.

In various replies on this topic in these sub-threads I've laid out a plausible case, I think: excuse enough to believe it plausible that the no-cycles constraint is worth exploring.

I don't see how you get from that to "putting chips before people"!

-t

"sounds like an interesting constraint to explore"

good on you for being willing to think the transgressive thoughts -- maybe something will, in fact, come of it.

"Illusion" = abstraction

I agree with Sam. This backward approach is what left us with all these less than decent low-level languages still dominating the mainstream. Low-level languages (done properly) have their niche, but most of today's programming is not in there.

Also, what you call an "illusion" is what I call an "abstraction". And programming is all about defining proper abstractions to manage complexity.

On die memory

I am not an expert in this field, but I wonder how much you would gain if all memory resided on the chip? Wouldn't that make the L0/L1 caches redundant?

Something very close to that

Something very close to that is now possible with chip stacking. Part of the current problem is that the best process technology for CPUs is not best for memories. When you stack multiple wafers, each having a chemistry appropriate to its role, you can get some very very interesting outcomes.

It won't change the story for L1 caches, but it might well be possible to reduce L2 latencies into the 5 cycle range, and main memory latencies similarly...

The implied premise is faulty

Granted, for whatever reason, static typers often seem to be simultaneously fans of horrible complicated syntaxes

A gazillion lines of Perl argue that the opposite correlation for "dynamic typers" can't be very strong.

Clojure...I don't get it.

Clojure's been on my things-to-look-at list for a while, and your post pushed me past the procrastination point. So being a little too pooped to code tonight I decided to spend a while reading what I could find on it.

And I don't get it.

I've already established that I'm not at my brightest, but as I was paging through the documentation it just seems like a decades-late entry in the scheme v. lisp wars.

It basically looks like what you'd get if you put (lisp + clos) + scheme + STM on the JVM and trimmed off the redundant / conflicting parts (e.g. nil v. (), #t, etc.).

What am I missing?

--MarkusQ

Clojure

it just seems like a decades-late entry in the scheme v. lisp wars

The fact that Clojure was able to gain traction, given the history of Lisp, should be telling on why Clojure is the surprise story of the year.

It basically looks like what you'd get if you put (lisp + clos) + scheme + STM on the JVM and trimmed off the redundant / conflicting parts (e.g. nil v. (), #t, etc.).

Rich Hickey's first talk in the Clojure video talks addresses the language from the perspective of Lisp programmers. That's probably the best place to start for those that know Lisp and want to know where Clojure is coming from.

Beyond that, I think Clojure is an opinionated Lisp where most of the choices and tradeoffs have been solid and well thought out. I don't think Clojure is aimed at converting Scheme or Common Lisp programmers, though I'm sure their blessing is probably appreciated. Clojure is more aimed at being a scripting language for the JVM, so the audience is the mass of programmers using Java.

As for what probably makes it special, I wasn't necessarily trying to promote Clojure - I was just opining that it was the big story in PLs for 2008. A couple of guesses off the top of my head: (a) integration with the Java APIs - in my limited foray, it appears to be seamless; (b) sequences - standardization and convenience of collections is very important in scripting language design; (c) functional programming - the facilities for blending pure and impure code are a nice tradeoff; (d) concurrency - the one area I am not knowledgeable about but it seems to somewhat impress others; (e) simplicity and ease of use - it is an easy language to learn and use.

I don't think Clojure should be looked at in terms of breaking new ground in PL research. However, it probably can be used as a good case study of how to go about integrating existing concepts into a programming language.

On a more awake reading...

Beyond that, I think Clojure is an opinionated Lisp where most of the choices and tradeoffs have been solid and well thought out.

Sure, I'll buy that.

integration with the Java APIs - in my limited foray, it appears to be seamless;

Not much of a plus for me, but I can see how it could be for some.

sequences - standardization and convenience of collections is very important in scripting language design

Definitely. And not just for scripting languages. Clean, uniform abstraction is a Good Thing.

(c) functional programming - the facilities for blending pure and impure code are a nice tradeoff; (d) concurrency - the one area I am not knowledgeable about but it seems to somewhat impress others

I'll have to think about this. It seemed to me that they'd compromised the former just enough to make anything but explicit concurrency just as much of a mire as ever.

I don't think Clojure should be looked at in terms of breaking new ground in PL research. However, it probably can be used as a good case study of how to go about integrating existing concepts into a programming language.

Certainly. Some of the other mentions I've come across have made it sound much more innovative, and I think I was expecting something with a higher "Wow, how'd they think of that?!" factor.

-- MarkusQ

can't comment on the F#

can't comment on the F# uptake, but Haskell's professional star did appear to rise a little. Ultimately, though, it's Erlang's game to lose. +1 Haskell

Garbage collection in C++ (1?)0x has been remanded to a tr... Managed memory machismo? RT vendor cock-block? atomic shared ptr fallout? Hard to say exactly, but, considering the proposed GC implementation, a definite loss to the standard. -1 GC'd C++

OK, I'll take my chances

OK, I'll take my chances here and make some specific predictions, so you can all have a good laugh in 2010...

  • The new C++ standard will become official. (Yeah, this one is my freebie.) The two major compilers (Microsoft and GNU) already implement significant parts of it, and will have most of it by the end of the year. It will take a few years for many of the improvements to percolate out into widespread real-world usage, though, just as it did with earlier C++ features.
  • Concurrency, including GPGPU applications, will continue its rise in importance. To get specific:
    • OpenCL will make a big splash once it becomes widely available, probably with the release of Snow Leopard. This may give Apple's fortunes a significant boost (or possibly not, at least in 2009, given how long GPGPU applications are likely to take to reach the mainstream).
    • Microsoft will come up with an incompatible competitor to OpenCL instead of implementing it themselves, repeating the DirectX vs OpenGL competition. The outcome will be different because Microsoft, while still dominant, don't wield quite the absolute power they had a few years ago. This one will still only be beginning to play out by the end of 2009.
    • Erlang will have its moment in the spotlight, but it isn't well enough supported to become a major language. Instead, people will learn from its design and start adding Erlang-like features to existing major languages.
  • The wreck of the economy will have a big effect on the computing world. This is relevant to the PL sphere because I think Sun is going to be one of the casualties. Sun was born into the pre-PC world; they managed to survive the mainframe-PC transition and the PC-internet transition, but this time I think they're (like Microsoft) big enough to be too rigid to adapt but (unlike Microsoft) not big enough to survive anyway. I think Sun will be widely recognised as doomed, if it isn't already dead, by the end of 2009, and they'll take Java (and the JVM) down with them.
  • JavaScript's star will continue to rise. Someone (probably either Mozilla or Apple) will release a good non-browser-embedded, command-line implementation that will turn it into a popular scripting language (i.e. a direct competitor to Python and Perl for general-purpose coding, not just web applications). One of the major JS libraries (my guess would be JQuery) will be adopted widely enough that people start to think of it as part of the language, which eventually it will be (not in 2009 though). As part of this process, JSON will make more inroads against XML.
  • Functional programming will be more widely recognised as a useful technique, but it will enter the mainstream by being incorporated into pre-existing procedural and OO languages. (The aforementioned JavaScript already has a good head start there.) Primarily functional languages like F# will be investigated and experimented with (see also Erlang, above), but won't be serious competitors to the likes of C++, C#, JavaScript, Python, etc. The really aggressive functional languages (Haskell, Ocaml, etc) will never be more than academic curiosities.
    • In particular, FP will not be a significant player in the rising GPGPU field. The optimisation deficiencies of current implementations compared to C/C++ will ensure that nobody takes FP seriously in a field whose whole raison d'être is extreme runtime speed.

Your last point is a bit

Your last point is a bit off: FP could be a player in the rising GPGPU field. You just have to step back and use code staging: you write a program in an FP language that generates a program to run on the GPU. The GPU doesn't run garbage collection or manage objects; that only happens while you are generating the GPU program.

I think most of the FP/"general purpose language" on GPU projects work through code generation right now, though none of them are very popular yet. It does pose a challenge for FP, though: what languages are best for generating the simple programs that will run on GPUs? Haskell has some advantages: combinators/monads are good for code generation, and performance isn't such a big deal when generating a program.
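
To make the staging idea concrete, here is a minimal, purely illustrative sketch (in Python): a tiny expression type in the host language is "compiled" down to an OpenCL-style C kernel string. The names (Var, BinOp, gen_kernel) are invented for this example; real projects in this space are far more elaborate.

```python
# Toy code staging: an expression tree built in the host language is
# "compiled" into OpenCL-style C kernel source. The host program can be as
# abstract as it likes; the generated kernel stays flat and GC-free.

class Var:
    def __init__(self, name):
        self.name = name
    def emit(self):
        return self.name

class BinOp:
    def __init__(self, op, lhs, rhs):
        self.op, self.lhs, self.rhs = op, lhs, rhs
    def emit(self):
        return "(%s %s %s)" % (self.lhs.emit(), self.op, self.rhs.emit())

def saxpy(a, x, y):
    # a*x + y, built as an expression tree rather than evaluated directly
    return BinOp("+", BinOp("*", a, x), y)

def gen_kernel(name, body):
    # Emit a kernel that applies `body` element-wise.
    return ("__kernel void %s(__global const float *x,\n"
            "                 __global const float *y,\n"
            "                 __global float *out, const float a) {\n"
            "    int i = get_global_id(0);\n"
            "    out[i] = %s;\n"
            "}\n") % (name, body.emit())

if __name__ == "__main__":
    expr = saxpy(Var("a"), Var("x[i]"), Var("y[i]"))
    print(gen_kernel("saxpy", expr))  # hand this string to an OpenCL runtime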

Coming full circle

..from the days of compiled bitmaps and other uses of self-modifying code, used to speed up videogames 15-20 years ago. :)

It is weird. Maybe this is

It is weird. Maybe this is the year that we realize that making compilers smarter is simply futile, and that most software-based performance improvements in the future will come from going meta...

That still involves compiling though

Just done at a different place.

Surely, but the days of

Surely, but the days of transparent compiler optimizations might be numbered (at least in terms of advancement). Honestly, have we seen any new static optimizations in the last 10 years that are very notable? JIT + faster machines has succeeded somewhat in making managed languages usable, but has basically failed at its lofty dynamic re-compilation goal of reaching parity with or surpassing statically compiled languages. EPIC was an all-out failure for Intel; it turns out compilers just aren't very smart. Ironically enough, we've come back to specialized hardware and custom code.

Transparent is the only form of optimization

I think your comment is off the mark. Compilation is a mature field with solid core technologies and well-defined problems. Fix the input and fix the output and with a good knowledge of the traditional methods, an architecture and skeletal design reveal themselves. However, building a really good compiler involves a lot of cranking the handle and some clever tricks (even these days with fast machines).

The advancements these days in compilers aren't so much about coming up with some new whiz-bang optimization (there are already literally dozens of optimizations that target various components of performance, with less than a dozen very important ones). Instead, compilation nowadays, especially in a mature, industrial-strength compiler, has become a combinatorial problem of putting together the many dozens of optimizations into the right set and right order, for the right scope, with the right heuristics. Machine learning has gone a long way towards squeezing the last little bit of juice out of that.

I think you're being overly categorical when you say "turns out that compilers just aren't that smart" and give the example of Intel's (and HP's) Itanium foray. There are many reasons that line of reasoning (and that line of products in particular) wasn't ultimately as successful as those two had dreamed. EPIC has succeeded in other areas, particularly the embedded space, where compilers are very smart and profitable enough to remain a self-sustaining business. I'd also like to point out that Intel does still produce Itanium processors, and though niche, they are the undisputed per-core floating point CPU king.

I'm not really arguing with

I'm not really arguing with the ability of a compiler to optimize for a single thread on a single core. But a compiler cannot automatically parallelize a program to take advantage of multiple cores or massive vector processing (e.g., a GPU). At least, a compiler can't translate a single-threaded program into a multi-threaded/vectorized performance beast. And this is where the real performance gains are this year.

EPIC compilers might be OK, but they are not smart enough to justify the architecture. On a cost/performance basis, it makes much more sense to use a bunch of NVIDIA GPUs vs. Itanium CPUs. VLIW (extreme RISC) pushes work from hardware to software, but whatever is gained in the compiler is tempered by cheaper hardware logic that can optimize the instruction stream dynamically (e.g., dynamic vs. static branch prediction).

Granted, auto-vectorization

Granted, auto-vectorization wasn't that successful in the late 80s and early 90s when vector machines seemed like the new high-performance computing kings. But then again they were much more of a niche then than GPUs are today, meaning the topic will likely be reinvestigated in the compiler community in the next few years, perhaps with a different approach and with newer analysis methods and computational power employed. I'm not an expert in that field so I honestly don't know how well it will pan out.

I'm a bit of a skeptic in a couple of respects about this multi-core craze, I have to admit. For one, I don't think a radically new programming paradigm (or a language employing an older concurrency paradigm) is likely to appear that will make parallel programming much easier or more correct. Secondly, I'm skeptical about how many people are champing at the bit to speed their programs up by going massively parallel. I suspect that the actual number of applications that really need all that computational power is pretty low.

I'd rather see more effort invested in making single-threaded programming more intuitive and correct....god, it's embarrassing that we as a community can't even do that very well....

F# blessed by Microsoft

F# got its own CTP in September, and is now slated for a supported release as part of Visual Studio 2010, alongside C# and VB. To my ears, this sounds like: "You win, functional programming is now mainstream."

That's not to say the buck stops at Microsoft, but few organizations have as much influence among 9-to-5 programmers, so I think it's the greatest news I've heard in a long time.

Java going down?

Seems like two posts above suggest Java is going down. Let's hope it will be a fast and final death...

In order for this to happen faster, I wonder if anyone has good links to mainstream sites/research firms making this pronouncement?

Be careful of what you hope

Be careful of what you hope for. If Java goes down and takes the JVM with it, the only mainstream managed code solution left is the CLR. Will somebody at least acquire the JVM from Sun, or will a project like Mono try to fill the void for a multi-platform managed code solution? There is a lot of Java code out there that won't go away over night, but will the JVM just freeze in place and no longer evolve? Or will we regress into web 2.0/3.0 solutions like Air/Flex or Silverlight? The devil you know is often better than the devil you don't know.

Evidence: we just hear things in the industry. You could also just take a look at Sun's stock performance this year on Google: their stock price lost a third of its value. You could expect Sun to limp by, but the coming year will be very difficult.

JVM & Java will still be open source

One thing about Sun going down is that the heavyweight JCP (the process for changing the JVM and the language) would also go away. Maybe the JVM could finally become an agile project and add all those extra features for supporting non-Java languages efficiently.

I'm not optimistic about the

I'm not optimistic about the OSS community coming to the rescue. Sun has devoted real engineering resources to Java that the OSS community probably couldn't bear. One could imagine the only viable option would be something like Eclipse where a big company foots most of the bill.

Wishing for the death of a

Wishing for the death of a language is [ Edit: I rather not be in this thread. Nothing to see here.. ]

New features in the JVM

There was recently a JVM Language Summit at Sun, where people building languages other than Java on the JVM discussed, among other things, improvements they'd like to see in the JVM. Some talks were videotaped and can be seen here, including a talk by Rich Hickey on Clojure. Rich told me that one of the things many language implementors asked for is tail recursion.
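
For readers wondering why implementors care: without tail-call elimination in the VM, languages have to fall back on workarounds such as Clojure's recur and trampoline. Here is a toy sketch of the trampoline idiom in Python (which also lacks tail-call elimination); the function names are mine and purely illustrative.

```python
import sys

# Without tail-call elimination, a deep chain of tail calls overflows the
# stack. A "trampoline" works around this: the function returns a thunk
# instead of calling itself, and a driver loop keeps invoking thunks.

def countdown(n):
    if n == 0:
        return 0
    return lambda: countdown(n - 1)   # return a thunk, don't recurse directly

def trampoline(thunk):
    result = thunk
    while callable(result):
        result = result()             # constant stack depth
    return result

if __name__ == "__main__":
    # Direct recursion at this depth would blow past the recursion limit;
    # the trampoline runs in constant stack space.
    print(trampoline(lambda: countdown(10 * sys.getrecursionlimit())))
```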

Man, wouldn't that be

Man, wouldn't that be great....all those extra features might finally weigh it down to the point it collapses in a heap of rubble and dies! Then something else a little better thought out might have a chance.

When I joined Sun I thought I could launch a guerilla war on Java from the inside. Then I got here and realized that I'm not the only one....

How depressing is that?

Is the internal guerilla war

Is the internal guerilla war on Java or the JVM?

Both

Both are ongoing. Clearly the language war has caused the most carnage. The JVM implementors have played pretty good defense so far, but sometimes that's a bit of an illusion because of how tangled the JDK is with the JVM. Internal JDK changes often cause havoc for the JVM internals and sometimes new whiz-bang features at the VM level are exposed in dark corners of the JDK.

Luckily or unluckily, the bytecode set has changed not one bit, though the classfile format has accreted some extra garbage. I was shocked to find that all this hubbub about "invokedynamic" doesn't actually involve adding a new bytecode at all... instead there are marker interfaces and calls to special classes... why why why?

sunk costs

"....why why why?"

The sunk costs of Java vendors and their customers. On the vendor side, all of that work on JIT and, to a lesser extent GC. On the customer side, workforce investments and infrastructure investments premised on the possibility of an HLL platform without vendor lock-in.

Why would "invokedynamic" not involve a new bytecode? Hmm. Just a guess but perhaps a large influence is the goal of adding the new feature without necessitating changes to JITs and code validators?

A more tasteful, more general VM that isn't compatible would mostly be seen as a sideways move by the stakeholders. The case for how it would pay off has to be really strong.

I think to really fix problems like this, and avoid them in the future, takes changes in how we build systems: a return to an emphasis on tiny, orthogonal, do-one-thing-well tools rather than big monolithic systems. For example, the decision to retain or move away from Java in a big shop is likely to be more of an "all or nothing" decision and less than likely to be a change that people can make in one small piece after another. That monolithic quality of current architectures is why it was possible to raise so much money for things like JVM JIT projects. It influences everything from low-level hardware design to high-level IT-strategizing. It's very hard to break up and rationalize because the nature of the way we do business in this industry tends to encourage and reward doing software architecture for empire building rather than for making robust, serviceable systems with promising futures built from commodity, standard parts.

And, yes, this is even relevant to language design and PLT. These economic factors play out as judgments about what's "interesting" in language design and PLT.

-t

Is that a man in a gorilla suit walking across the room?

afaict there are many organizations, like IBM, with large investments in Java technology - I don't see why any of them would choose to discard those investments in the near future.

We might even suppose that IBM's own Java technology offers them and their customers a kind of insurance policy.

Old languages never die

I don't really see any signs that Java is losing popularity. And even if Java ceased to be the "hot language du jour", it's used in so many places now that we're very unlikely to see it disappear. For example, FORTRAN is still being used very heavily, in parts of the world that many of us rarely see. Java has reached that point; it's here to stay.

Can die for certain niches

Perhaps being involved in academia, there is a certain wish that it would not be so prevalent as a teaching language?

Taking the Java reins

I agree that Sun's (apparent) demise doesn't look that great for Java/JVM. It seems many people are suggesting that IBM might be the one to step in and "take care" of the JVM in this scenario, but I'm surprised no one has yet mentioned Google.

Aside from Java being one of the four (?) "blessed" languages allowed in Google's walled garden, they also chose it(*) to be the language of choice for their Android platform. Being that they've invested a lot in Android, and assuming that the JVM would need such a "corporate steward," I feel like they're just as likely a candidate to step in and "save the day" as IBM, no?

(*) Java/Dalvik, whatever ... :-)

Java is dead, long live Java!

I wonder if anyone has good links to mainstream sites/research firms making this pronouncement

You can find plenty of mainstream links saying that. The one that gets the most attention perhaps comes from Bruce Eckel, author of Thinking in Java. He doesn't really say Java is dead, but he does say it's an evolutionary dead end.

As others have articulated, Java is too big to actually die in the next few decades. It is still, and will remain for quite a while, one of a handful of "safe" choices for an IT manager. To paraphrase an old chestnut, "nobody ever got fired for choosing Java." If nothing else, other languages have to contend with the common (and, IMHO, mostly wrong-headed) belief that "a bunch of programmers have Java on their resume, therefore it'll be easy to staff my projects with effective teams if we standardize on Java." Eventually, of course, something will replace Java in that role, but even once it stops being the tool of choice for new projects, the billions of existing lines of Java code will still need maintaining and cajoling for a long time to come.

Once again, Java is the new COBOL

Still gazillions of lines of COBOL in production.

So what will be the new Java? The .Net languages are obvious candidates on the Windows platform, but nowadays that platform doesn't have the same leverage that it used to.

There isn't one

Scala? I think it'll get some more uptake, but it won't make it into industry as a Java replacement

JRuby, Jython? Dynamically typed and the tooling will never be as good as Java programmers have come to expect

Clojure? Very slick Lisp on the JVM, but it suffers from similar problems to Scala.

Java itself? It looks like Java 7 (not scheduled for release until 2010) is starting to wind down into maintenance mode.

I still think that Boo would make a wonderful JVM language (concise, extremely readable, very extensible via macros, and it gives you duck typing when you want it).

The .Net languages are obvious candidates on the Windows platform, but nowadays that platform doesn't have the same leverage that it used to.

Interestingly, there's probably more C#/Mono code on the Linux desktop than Java.

F# does seem to be gaining

F# does seem to be gaining steam, as I see it popping up all over.

My hope for the future is that Modula-3 becomes more popular. I thought it was dead, personally, but after using it a little, I found its community (albeit small) is active, and the CM3 implementation is still updated. Also, it's just as modern as any other language.

JVM languages coming into their own

There has been a lot going on with the JVM languages: between Clojure, JRuby, Scala, and others, there has been a lot of activity in this space.

Russ's point above also strikes me as a good one: "Erlang will have its moment in the spotlight, but it isn't well enough supported to become a major language. Instead, people will learn from its design and start adding Erlang-like features to existing major languages."

Erlang's nice, but I see other languages picking it over for good ideas (concurrency), and wedging them in.

JavaFX is too new to really be able to judge it, but my instincts say "dud". But then again, these are the same instincts that preferred Gopher to the WWW, so I'm not claiming to have a good track record with these things.

Scoring myself

I made a bunch of Scala predictions here.

1) Book came out (check)
2) Not 2 but 3 IDEs (check)
3) Blogosphere buzz, positive and negative (check)
4) Moderately high profile open source projects mix Scala in with Java (check); moderately high profile proprietary code mixes in Scala (check)
5) A bunch of lower profile projects are entirely in Scala (check)
6) Google announces Scala plans (bzzzt!)

One more prediction

One more prediction I forgot to mention earlier (surprisingly, because this is something that's been on my mind lately): some currently popular languages, most prominently Python, will start to bleed users specifically because of their lack of good support for concurrency. (C++ grew a standard thread library just in the nick of time.)

Maelstrom of the blogowikiredditsphere

You might take the blogowikiredditsphere gossip too seriously and confuse it with "the user"? There isn't any noticeable downtrend for scripting languages precisely because they never actually peaked in enterprise scale applications but live somewhere in the infrastructure. Moreover I would say that LtU readers find the claim rather dubious that multi-threading with fine grained locks is "good support for concurrency" from a language design point of view. At least I hope so.

Expectations for 2009

  1. According to Tiobe C++ was "the language of the year" as predicted by me in January 2008. This makes all cynics among us glad. In 2009 there will be still more buzz explaining the C++09 features to a wider audience. No downtrend in sight.
  2. Adobe will still expand. They will finally get a working Flash 10 player on mobile devices and Flash will become the X-platform application toolkit of choice for mobile applications. Finally Flex will peak when Adobe releases their Catalyst CMS.
  3. JavaScript will not expand beyond its niche. However, solutions like GWT or Pyjamas won't be widely adopted either. A GWT-like code generator for Ruby will be developed.
  4. People will continue predicting a bright future for functional programming languages, and this prediction will be repeated at the end of 2009. The same goes for DSLs as the next big thing in software engineering, after a slightly declining interest in Go4-style design patterns.
  5. One of the great-future-of-programming hopes like Perl 6, Rubinius or PyPy finally reaches a state where programmers other than core developers start to show interest.
  6. I'll take a closer look at Clojure.

Actually C was the Tiobe Language of the year

Actually C was the Tiobe Language of the year

Assuming there wouldn't be

Assuming there were no GIL, would Python with concurrency on current 2 to 8 core machines really be faster than, or at least as fast as, non-concurrent Java or C# on the problems you solve every day? If not, would writing concurrent Python code be much easier than writing non-concurrent Java or C#?

I don't buy it. If I need speed I would resort to languages which are known to have fast compilers; if speed doesn't matter as much, I wouldn't go to the trouble of parallelizing things.

Productivity often wins. For

Productivity often wins. For example, I prefer to do EC2 code (or most networking, really) with python, and, depending on what I'm running on, would like to use the local cores available. Writing it all in C++ would be relatively painful just to make some computations better. An FFI is somewhat more reasonable, but I'd still rather not check in code like that.

I get bitten by the Python GIL pretty much every time I use Python nowadays (otherwise I'm in JavaScript land with the obnoxious and buggy shared-nothing worker thread system, or limping along with C++ :)).

Current != end of 2009

Two points in reply:

First, "current 2 to 8 core machines" are not going to still be the state of the art 12 months from now. If you buy a good workstation today, you're probably getting a Core i7 with 8 hardware threads. A year from now I expect the typical new workstation to have one or (at least as often) two next-gen chips with 16-32 hardware threads between them. The more threads your hardware has, the more you lose if your software can't match it, and I think a lot of people are seriously underestimating just how fast the move to large-scale parallel architecture is coming at us.

(NB. I prefer to use the term "hardware threads" instead of "cores" to avoid "but Hyperthreading doesn't count as another core" arguments. It counts as a hardware thread, and counts against you if you don't have a matching software thread.)

Second, I belong to the school of thought that believes "the trouble of parallelizing things" is an artifact of current mainstream language design; it is not intrinsically particularly difficult. I'm aware that I'm in a distinct minority on this point. (Those who disagree should probably take a good look at Erlang.)
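
For the curious, the Erlang style can be faked in most languages. Here is a toy, share-nothing "actor" sketched in Python with a thread and a mailbox queue; all names here are hypothetical, and real Erlang processes are far cheaper and come with supervision and distribution on top.

```python
import threading
import queue

# Toy "actor": a thread that owns its state and communicates only through
# its mailbox, Erlang-style. No shared mutable state, so no locks in user code.

class Counter(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.mailbox = queue.Queue()
        self.count = 0                  # private to this thread

    def run(self):
        while True:
            msg, reply_to = self.mailbox.get()
            if msg == "incr":
                self.count += 1
            elif msg == "get":
                reply_to.put(self.count)
            elif msg == "stop":
                return

if __name__ == "__main__":
    c = Counter()
    c.start()
    for _ in range(1000):
        c.mailbox.put(("incr", None))
    reply = queue.Queue()
    c.mailbox.put(("get", reply))
    print(reply.get())                  # prints 1000
    c.mailbox.put(("stop", None))
    c.join()
```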

Python

I think Python may bleed users, but for a slightly different reason. I originally thought that the Python 2.x -> 3.0 transition was done rather well, certainly hundreds of times better than the Perl 6 catastrophe. However, looking at the technical details more closely, I cannot see Python 3 being adopted very soon, because it involves either abandoning all your 2.x users or creating two parallel versions of your code. That's not good and creates uncertainty. The right way to have done it would be for Python 3 to have a backwards compatibility flag / mode, but it doesn't have that, forcing libraries and users to make a hard choice.

(Disclaimer: I'm no fan of Python)

Python and concurrency

Actually, Python supports two different models of concurrency: multithreading and the (new) multiprocessing module. They are designed to have quite similar APIs (and the multithreading is based on Java's threading model).

What you are probably referring to is the fact that one implementation of Python (CPython, the one most everyone uses), cannot take advantage of multiple processors when multithreading (same as Ruby).

However, many times you want a multiprocessing model anyway (don't have to worry about deadlocking much if you don't share memory).
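
As a rough sketch of that process-based model, here is what fanning a CPU-bound function out over worker processes looks like with the standard multiprocessing module; the function and the numbers are invented for illustration.

```python
from multiprocessing import Pool

# CPU-bound toy function; with processes (unlike CPython threads) the work
# really can spread across cores, at the cost of copying arguments and results.
def count_primes(limit):
    found = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            found += 1
    return found

if __name__ == "__main__":              # guard needed where processes are spawned
    pool = Pool(processes=4)
    results = pool.map(count_primes, [20000, 30000, 40000, 50000])
    pool.close()
    pool.join()
    print(results)
```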

The bigger deal with Python tho, is that this year they released 3.0, and broke (in somewhat minor ways) backwards compatibility. This is a pretty bold move, we'll have to see how it turns out.

A couple of things that were added were a Scheme-like numeric type hierarchy, annotations of function arguments and return values, and assigning values to variables in outer scopes (of nested functions).

I like the changes, but they were mostly not in an FP direction.

The bigger deal with Python

The bigger deal with Python tho, is that this year they released 3.0, and broke (in somewhat minor ways) backwards compatibility. This is a pretty bold move, we'll have to see how it turns out.

So what's your prediction? Thumbs up or thumbs down for Python 3?

If the multiprocessing stuff

If the multiprocessing stuff is really like the older pyprocessing library, it defeats a lot of the point of a dynamically typed scripting language -- to share memory, you have to go through too many hoops for casual use. There's a pragmatic sweet spot for shared-memory threads between the GIL and essentially isolated processes. The processing libraries aren't significantly enabling over what could have already been done with little effort; eliminating the GIL would have scratched a much bigger real itch.

Python multiprocessing

I haven't had a chance to play with it yet, but it looks like they allow Queues of dynamically typed objects between the processes. I'm guessing they serialize (pickle?) the data, so it will be obnoxious for passing large pieces of data around. However, that's not an awful model, and it will probably keep a lot of programmers out of trouble. (Note, I'm most definitely NOT one of those people who thinks writing multi-threaded programs is impossible or even all that difficult...)
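
If it does work like pyprocessing, a minimal exchange would look something like the following sketch (a hypothetical example; the point is that whatever goes on the Queue gets pickled on one side and unpickled on the other, which is fine for small messages and obnoxious for big ones).

```python
from multiprocessing import Process, Queue

# Objects put on a multiprocessing.Queue are serialized (pickled) in one
# process and deserialized in the other: convenient for small messages,
# costly for large pieces of data.

def worker(q):
    q.put({"status": "done", "payload": list(range(5))})

if __name__ == "__main__":
    q = Queue()
    p = Process(target=worker, args=(q,))
    p.start()
    print(q.get())      # {'status': 'done', 'payload': [0, 1, 2, 3, 4]}
    p.join()
```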

Getting back on topic, I'm predicting Py3k won't be a big hit because they didn't fix enough problems, and they broke a lot of code. I don't see myself even wanting to memorize the new changes until it's the only option included with RedHat or MacOS. (bytes, buffer, memoryview, /sigh)

I predict JavaScript will eventually become the popular server and application scripting language - mostly because of its various JIT-compiling implementations. Someone just needs to start bundling it with server-side libraries...

(Personally, I wish Scheme or Clojure would become popular enough that I could use them professionally...)

concurrency

Why do people consider concurrency so important? We have this meme going around, and it goes something like this:

Applications have gotten faster due to Moore's law. But now Moore's law is dead and if we want apps to keep getting faster on multiprocessor machines then we need to start coding applications to run concurrently.

I think this whole idea is completely wrong. There are many problems with it. Firstly, applications will get faster on multiple processors without additional coding, because of process-level parallelism and the fact that people run many applications simultaneously. In fact, using threads could even make desktop applications run more slowly because of additional thread context switching. Secondly, is CPU usage the real speed problem? Many applications are slow for other reasons, like memory latency or hard disk access. I almost never see the CPU usage on my computer go to even 50%. So why is CPU usage considered such a big problem? Thirdly, is speed even a problem? It's rare that speed is a problem for me. As for web apps, I think multiple processes are a simpler and better way to scale than multiple threads. That's what Rails uses.

And if CPU usage is the problem, why is concurrency the solution? We have these special tools called profilers. It seems like nobody uses them. I use them all the time, and I often get speed improvements from eliminating bottlenecks that dwarf anything I could get out of concurrency. For instance, I did a project where I had to write a financial risk system. There were 8 other groups doing the same thing. All of the applications were running very, very slowly. I profiled, found that my bottleneck was due to a single line of code, optimized, and improved the run time of my program by a factor of 100. I have often seen people take unprofiled applications and add threading. I don't understand. It seems like people are too infatuated with threads and 'scaling' to even bother profiling their applications. It's strange to me, the enormous effort people go through to improve speed without having any clue where their bottlenecks are. In one instance I saw a very smart engineer state that he thought the speed problem in a large Java application was due to passing around very large objects. I was skeptical, but thought his answer was plausible until I took 5 minutes to think about his theory and realized it was completely wrong, because Java passes objects by copying references, not objects. He came from a C++ background, and I guess he assumed that since Java wasn't using pointers it must just be copying objects.

My rule: You don't know where the bottleneck is no matter how much you think you do. So before you do any optimizations RUN A PROFILER.
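
For anyone who hasn't tried it, running a profiler takes only a few lines. A minimal sketch with Python's standard cProfile and pstats modules (the workload here is invented purely to have something to measure):

```python
import cProfile
import pstats

# Measure before optimizing: let the profiler say where the time actually goes.

def slow_sum(n):
    total = 0
    for i in range(n):
        total += i * i
    return total

def main():
    for _ in range(50):
        slow_sum(100000)

if __name__ == "__main__":
    profiler = cProfile.Profile()
    profiler.enable()
    main()
    profiler.disable()
    stats = pstats.Stats(profiler)
    stats.sort_stats("cumulative").print_stats(10)   # top 10 offenders
```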

Agreed, but.

Much of what you say is absolutely true; modern multi-processor operating systems (any recent variety of Windows, MacOS, or Linux) are perfectly able to take advantage of multiple CPUs, and for most things the bottleneck is I/O and not the processor.

But... there are lots of applications that do tax the CPU (videogames--even if only generating instructions for a GPU--MPEG encoding, etc.); these things don't partition well across multiple processes, meaning you need threads (or something finer grained, though most OSes won't support finer-grained scheduling) to take advantage of a multiprocessor. And the shared-memory multithreading model is widely considered broken.

Making languages which support safe, automatic parallelism is just one of many ways to better support the demands of today's computing applications.

Uh?

[[these things don't partition well across multiple processes, meaning you need threads]]

Uh? For me the main difference between threads and processes is the default memory-sharing policy, so could you explain why you think that using threads is different from using processes (except when you're using a lot of threads)?

Actually, the current

Actually, the current general-purpose OS's stop scaling well somewhere between 4 and 16 cores, depending on your application. The challenge for multicore isn't multiprocessing. It's language-level multithreading in a very very fine grain way.

Marketing

Why do people consider concurrency so important? We have this meme going around, and it goes something like this:

Applications have gotten faster due to Moore's law. But now Moore's law is dead and if we want apps to keep getting faster on multiprocessor machines then we need to start coding applications to run concurrently.

There is a simple answer to this: the CPU manufacturers want you to buy their latest products, and they are spending a lot of money to make sure you hear about it.

Having said this, the switch from ramping up the clock rate to putting multiple cores on a single die has sound reasons, the most important of which stem from power usage and production yield considerations. If you want to be able to afford a machine with a high speed CPU, you had better get used to multi-core variants. :)

This is a classic red

This is a classic red herring: applications are often slow for other reasons, THEREFORE concurrency is unimportant. Of course Amdahl's law is alive and well: your net performance depends on your bottlenecks, and adding concurrency is only useful once raw processing is your bottleneck. Fair enough.

The problem being solved by concurrency is hardware utilization where additional hardware resources are increasingly based on some sort of concurrency. Now most applications can take advantage of ~30% of these resources (assuming dual core processor + a good GPU), in the future it might be ~10% or ~5%. See the problem? Now, assuming your single threaded/non-GPU application can only take advantage of 5% of your computer's processing power, after you eliminate all the bottlenecks through profiling...you still only have 5%.
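
For readers who want to see Amdahl's law in numbers, here is a quick back-of-the-envelope sketch; the fractions are illustrative, not measurements from any real application.

```python
# Amdahl's law: if a fraction p of the work is parallelizable across n
# workers, the best possible overall speedup is 1 / ((1 - p) + p / n).

def amdahl_speedup(p, n):
    return 1.0 / ((1.0 - p) + p / n)

if __name__ == "__main__":
    for p in (0.05, 0.30, 0.95):
        cases = ["%d cores: %.1fx" % (n, amdahl_speedup(p, n)) for n in (2, 8, 64)]
        print("parallelizable fraction %.0f%%  ->  %s" % (p * 100, ", ".join(cases)))
```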

1. The software (bad CPU

1. The software (bad CPU usage) is becoming the bottleneck. Here's something I'm working on. Think handhelds. Ever used an iPhone? Even on wifi/3G, the web browsing blows. Say your prayers if JavaScript is on a page. We're hitting the wall with sequential optimizations, but it's clear that we will have handhelds with more CPU power than popular laptops -- except code will have to be parallel to use it.

Another example is features that are dropped or weakened. Local search, better visualizations, preloading, etc. Current apps already *should* be faster from a user-experience perspective, and popular app. design is already limited by CPU time. You don't put in features that slow you down. Many features you might suggest to be offloaded to a server farm.. but that's still a form of concurrency.

2a. Concurrency does not imply parallelism. Using multiple processes helps with switching between tasks and provides low-level isolation guarantees (without high-level separation, but that's my pet peeve against Chrome ;-)). However, when we talk about parallel code, we mean getting the same amount of work done more quickly, which means useful work has to be distributable, and in a way that isn't more expensive than the normal way.

2b. Concurrency is the way to go. Two basic reasons are power and productivity, and natural modeling. You might be able to get the right performance with hand-tuned assembly, but that's not likely going to happen. Instead, you'll want to write higher-level code and do minimal tuning. You'll have extra cores sitting around -- exploiting them via high-level code with language support might give more impact for your time. The second reason, domain modeling, is again about productivity: many systems (e.g., reactive systems like GUIs, or distributed ones like web apps) are notoriously buggy, so cutting down the cost of development is important.

It is important to realize that exploiting parallel hardware and better domain modeling are two different motivations for languages supporting concurrency. Unfortunately, as is publicly noted, folks often conflate the two. More subtle is the matching of concurrent languages to effective parallel evaluation. E.g., someone on LtU this week suggested parallel programming with Erlang, yet Erlang would not work for my earlier example of speeding up web browsers on handhelds. Even Cilk++ isn't enough for sufficient scaling!

More on concurrency

Some comments to several of the above, combined into one post for convenience.

renox wrote:

Uh? For me the main difference between threads and processes is the default memory sharing policy, so could you explain why you think that using thread is different from using processes (except when you're using a lot of threads)?

That's exactly the difference; some parallel partitionings of a problem require a LOT of communication, in some cases sufficiently so that any non-shared-memory form of concurrency (even separate processes on the same machine) is a non-starter. Shared-memory concurrency gives you communication "for free", as a particular thread's local copy (which it needs anyway) can be made available to another thread just by passing a pointer.

When mutable state is involved, shared memory multiprocessing also gives you reflection of updates "for free"--no need to worry about cached copies of objects that are out of sync with "the" object.

Of course, the drawbacks to this model are obvious and well-documented; and many experienced programmers have been burned many times by the horrible synchronization issues (and the lack of scalability to distributed systems, where true shared memory is not possible).

Regarding someone else's question about the role of concurrency in an I/O-bound environment: it depends on the reason the processor is waiting for I/O. If bandwidth is the bottleneck, concurrency probably won't help you--you need a fatter pipe, not faster (or more) CPUs.

But if latency is the problem and the channel is not otherwise saturated, concurrency CAN help you if you can stop serializing round trips. Why does AJAX make the web experience so much more pleasant? Not because it uses multithreading to better utilize the local CPU, but because it overlaps a gazillion HTTP requests--a synchronous application protocol running on top of TCP and all its overhead.

Of course, running I/O transactions in parallel doesn't require multiple CPUs--one can do this with continuation-based parallelism ("cooperative multitasking") well enough, on a single processor. But many current languages don't natively support good asynch I/O, relying on libraries and system services for this.
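
As a minimal illustration of that continuation-based style, here is a toy round-robin scheduler over Python generators; there is no real I/O, just the scheduling idea, and all the names are made up for this sketch.

```python
from collections import deque

# Minimal cooperative scheduler: each "task" is a generator that yields
# whenever it would otherwise block (e.g., waiting on a round trip).
# One thread, one CPU, yet round trips are no longer serialized against
# each other.

def fetch(name, steps):
    for step in range(steps):
        # Pretend each yield is "request sent, now waiting for the reply".
        print("%s: waiting on round trip %d" % (name, step + 1))
        yield
    print("%s: done" % name)

def run(tasks):
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # run until the task yields (or finishes)
            ready.append(task)  # still alive: reschedule it
        except StopIteration:
            pass

if __name__ == "__main__":
    run([fetch("page-a", 3), fetch("page-b", 2), fetch("page-c", 1)])
```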

Per-chip Licensing

If software suppliers find it hard to support today's multiple cores, then maybe they should switch to per-core licensing for their software to give users better use of their hardware. A significant amount of software is run multiple times with the results of single runs aggregated; or, instead of buying two licenses for two people to use on separate machines, those two users could access one licensed copy of the software running on a single machine.

- Paddy.

A bit off topic for LtU

but people can do that today (run multiple instances of some program installed on a server) without multiple cores. And software vendors are well aware of all the licensing tricks they might employ to maximize revenue.

Astrology

It is often noticed that each year astrologers are asked for predictions for the following year, yet rarely do people check to see if last year's predictions turned out correct. This thread, I take it, was supposed to be about last year's predictions, but instead it seems most posters wish to predict what will happen next year!

Predictions are hard

even when they are not about the future. Here is an incomplete and random list extracted from the "2008 Predictions" discussion:

The theme of the year will be concurrency.

Not really. If there was a fad in 2008, then it was called 'cloud computing', but this had little to do with computing at all.

F# will be apprehensively adopted for production use in some .NET-heavy companies.

Wrong

Scala will finally get a mature .NET backend and become the new language of choice for both the JVM and .NET

LOL

No functional language will become significantly popular (i.e. F#, Scala, Haskell etc)

Right

Java will become more entrenched

Right

C++ with its new ISO standard for 2009 will become the language-of-the-year in 2008

Correct according to Tiobe. But has this anything to do with C++09?

No C++ third party library will use garbage collection.

Right

Continuous growth of JavaScript

O.K. but no one predicted new hot VMs.

Arc may see the light of day this year.

Gold right. But it was Clojure, not Arc, that was the newcomer of the year in 2008. Arc, together with JavaFX, was among the great disappointments of 2008.

Model driven architecture will start to gain mass market appeal as open-source model compilers for UML become freely available.

LOL

Well done (in both senses)

Not really. If there was a fad in 2008, then it was called 'cloud computing', but this had little to do with computing at all.

Go Kay! I leave it to you to make sense of cultural trends and amuse us as all at once. You are correct here on both counts, and correct on the rest as well.

I predict that for 2009 Kay Schluehr will make a number of funny-because-it's-true posts, and I'll be left wondering whether I'm the only one who finds him spot-on and hilarious. If my prediction is self-fulfilling, so much the better!

Aside: Is it strange that I take a break from my New Year's party to post on LtU? Never mind, don't answer.

Statically typed functional languages

No functional language will become significantly popular (i.e. F#, Scala, Haskell etc)

What's "significantly popular" mean? Certainly none are at Java or even Ruby levels, but all 3 are far more popular than they were at the beginning of '08. I think it's fair to say that even people who don't use such programming languages are far more aware of them than has ever been the case in the past.

Up until the last couple of years, the few programmers that were aware of any statically typed functional language were mostly advanced CS degree holders, and they're a tiny fraction of the programming population. Yet ML is 3 and a half decades old. So the fact that languages like these popped up so brightly on the mainstream radar last year is not just significant, but huge.

I like the idea of turning

I like the idea of turning something into the mainstream just by writing a book about it. Once people are reading and discussing the content of the book, it changes the world. This has a magical touch.

Maybe the future will accelerate even more, and opinions mediated in the wikiredditblogosphere will materialize immediately in social behavior, like a superfluid that flows without any resistance. At least the wikiredditblogosphere is not infected by postmodern ennui and lives up to the newest new.

Clever

Clever. But you've missed the point. Sure, the fact that such books exist means nothing. But the fact that such books are mentioned in the same context as "Java Power Tools" and "High Performance MySQL" is something worth noting.

F# will be apprehensively

F# will be apprehensively adopted for production use in some .NET-heavy companies.

Wrong

Hang on, how could you possibly know that? I want the mysterious all-knowing crystal ball you're using!

DSLs from a crab eye

Crabs (almost) bite with their claws, not jaws; they have no jaws. I am not going to say anything explicit, just an 'air inference'!

DSLs are going to be more generalized into layers and scripting is going to be more abstracted into layers.

One thing I think is going to happen is finding new approaches to employing scripting; Lua is a fairly successful example. One can imagine a core part - developed in, say, Haskell or C - and a liquid part, scripted in Lua, which changes day by day and 'melts down' into the core - into C - as it becomes stable in structure, thanks to Lua's easy integration.

It's funny that I mentioned Haskell for the core, because since the core code we actually use (Apache, for instance) is not pure, one can reverse the arrangement and employ purity in the scripting part - as with Clojure (Clojure is not a scripting language, for sure; this is a reflection/analogy).

We can put it simple: a planet with an atmosphere!

I think we will see more and more of 'planet with atmosphere' in 2009. In fact it is a down-sized, pragmatic version of 'framework with code libraries'.

And this lets us bite the problem 'at hand' and chew and digest it over time, like crabs!

predictions

I don't see Java losing popularity. In fact I think it's going to get more popular. I see it being used a lot more heavily for desktop applications. I think JavaFX will have problems, mainly because Sun chose to introduce a new scripting language. In fact, long term I see Java/C# becoming the new C/C++: Java will be the normal language for developing apps, for both open source and proprietary purposes, outside of the C# world.

As for concurrency, I think it's a fad. It will gradually assume less importance and people will just avoid it, as they should.

my score and new predictions

In last year's thread I predicted better progress than was seen for Scheme implementations. That was wishful thinking, apparently.

Last year I also predicted large growth for XQuery. That part seems to be roughly true. I base this claim by looking at various sites that graph or chart job posting statistics. This was a year of decline for C++, for example. Python and Ruby grew (% of job postings) and, drumroll, XQuery grew by a comparable amount.

For next year, I reiterate my XQuery prediction: big growth. One of the drivers in 2009 will be the new administration in Washington and a stepping up the pace of various government "sunshine" efforts. Congress and various agencies of government will be publishing larger amounts of data in XML form and there awaits a large crowd of people who want to build interfaces for exploring and analyzing that data.

-t

C++

This was a year of decline for C++

That's debatable.

C++ v. debatable

I agree, of course, and not ultimately a very interesting debate, no? Well, ok, a little:

The TIOBE numbers you're looking at apparently are a measure of search engine results. So, for example, the number of pages talking about "C++ programming" went up (according to their methodology).

The numbers I was playing with were from various (easily found) job market analysis sites. So, for example, I saw that the number of job listings calling for C++ fell (per the studies I sampled) while listings for Python and XQuery grew by comparable percentages (and both by quite a lot).

There isn't obviously any contradiction between the two sets of numbers.

The numbers, in general, are interesting, but a structural analysis of the technology and its economics also helps. My prediction of XQuery growth last year, and continuing this year, is, in my opinion, a boringly conservative, somewhat obvious prediction based on structural analysis. The eXist database, Mark Logic, Berkeley DB XML, the XQuery features in DB2, etc. are all showing signs of active development. The technology is at a stage of maturity which is a sweet spot for growth: a lot of the systems are quite stable, and yet there is a lot of head-room for low-cost (of implementation) performance and feature improvement, particularly in the systems with open source licenses. Meanwhile, as the example of government data shows, there's an increasingly large amount of interesting data becoming available in XML form.

That was all true last year, as well, and (imo) helped drive XQuery growth. The job market statistics, as imperfect as they surely are, hint at some confirmation.

-t

I'm not entirely surprised

I'm not entirely surprised you are praising the alleged success of XQuery. No one is called upon to be unbiased here. But to live up to the high standards of the discussion, it would be kind if you presented numbers and statistics, just like James did.

numbers and stats

I'm sorry. I wasn't trying to be evasive. None of these numbers are worth much more than a "hint", I think, and they're easy enough to find. I didn't link to them because I didn't want to endorse any of them as all that important. However, for your convenience, here are some:

From indeed.com comes a graph comparing relative changes in job listings for XQuery, Python, Ruby, and C++.

From simplyhired.com is a graph and a table that says:

  • XQuery jobs increased 45%
  • Python jobs increased 31%
  • Ruby jobs increased 47%
  • C++ jobs decreased 9%

Mind you, in absolute numbers XQuery is minuscule compared to the others. I'm looking at the trends, not the absolute numbers.

-t

[edit:] p.s.: My Flower project is, indeed, XQuery-based so I suppose that, yes, I have a "conflict of interest" if my prediction is interpreted as advice. Good point.

My advice in that case is that while LtU is a good place to seek information, putting too much weight on these "prediction" topics is foolhardy regardless of whose bias is on display.

Flower itself is, so to speak, back in the shop for some re-tooling. People found it too hard to understand so I need to make it simpler and more obvious.

lua and jquery

On that score, look at the results for Lua and jQuery.

The graphs don't support conclusion

Both Simply Hired and Indeed show some growth in C++ over '08.

damn lies and....

You're right that I made a muddle of reading the graphs and that C++ wasn't down in '08. The growth rate for C++ was apparently very slightly better than zero in '08. The growth rate for XQuery climbed in '08 from 150% YoY to over 400%. Or something like that.

I don't think your correction undermines my conclusion that XQuery is taking off or that C++ is, well, not.

-t

Anecdotally

This year it is likely that I will be using Javascript extensively as a scripting language embedded in a Java application, using Rhino. I will be on the lookout for other examples of Javascript being used outside the browser, and will probably conclude from the examples I find that "this year lots of people are using Javascript as a general-purpose scripting language". This may or may not be an example of selective perception.

Similarly, my interest in F# may cause me to mildly or severely overrate its importance and/or "enterprise penetration". F# isn't particularly interesting as a language - we already have O'Caml, Haskell, etc. - but as a new-ish addition to the language ecology in search of a niche, its progress may be worth tracking. I wouldn't bet on its not getting into production use somewhere in finance. And then being abandoned in 2010 in favour of C++.

I will be surprised if Scala really takes off, but disappointed if there isn't solid IDE support by the end of the year. Either way, Clojure will get more of the buzz. "Dynamic languages on the JVM" will continue to be a hot topic, but nothing concrete (tail-call elimination etc.) will be delivered by Sun. Maybe in 2010. There's an outside chance of a forked JVM getting some attention, maybe incorporating work from the Da Vinci Machine project.