What do you believe about Programming Languages (that you can't prove (yet))?

The Edge "World Question Center asks the thought provoking, but overly general for this forum, question "WHAT DO YOU BELIEVE IS TRUE EVEN THOUGH YOU CANNOT PROVE IT?"

So let's tailor it a little for LtU...

What do you believe about Programming Languages (that you can't prove (yet))?

Massive Concurrency

Languages that don't scale well to massively multi-threaded concurrency are doomed to the scrap heap.

CPUs are all going multi-core because it is no longer feasible to keep increasing the clock rate of a single core to gain performance. Languages that can't make effective use of these multiple cores will fall by the wayside, having finally run into the von Neumann bottleneck.

Great news for modern languages like AliceML, Oz, and parallel versions of Haskell. To me, this is the killer ability that can propel such languages into the mainstream. I don't see anything else that can take them out of their current niche.

Even better news...

It's even better news for Erlang, which is the only industry-proven, mostly functional (and implicitly parallel) language I know of. I consider your other examples to be research languages that haven't yet had a killer application.

http://erlang.org/

Still better news...

It's even better news for Ada, which has been deployed in mission-critical (and by that I mean either it works or people die) parallel systems for even longer than Erlang has been around.

Eiffel, SCOOP

Incidentally, the only imperative language I know to have given concurrency serious thought is Eiffel, with SCOOP.

There must be others though that have recognised the problems of massive concurrency relying on primitive locks. I know database vendors have examined this. What else is out there?

Occam

Well, occam for one. It was explicitly designed for massive concurrency, and intended to run on networks of transputers. Transputers, in turn, were developed to harness the power of multi-core systems (well, multi-processor really), because the transputer designers believed that single-core systems simply wouldn't advance much further and the future lay in parallelism.

Unfortunately, with the death of the transputer (primarily due to an extremely delayed release of a promised next-gen transputer - at least that's what I've heard), occam fell into disuse, and many of its neat features were forgotten. Nor has occam evolved much to keep up with modern trends in software development. There is some work going on at the University of Kent to add new features to occam, but there are only a handful of people doing that work.

Ada (see above).

The need for full lifecycle languages

I believe that the next great productivity gains will be made by languages with explicit support for activities normally associated with the analysis, design, testing, integration, deployment, and monitoring phases of program construction, and that there are no large productivity gains to be found by further rejiggerings of the coding phase(*).

*except for safe large-scale concurrency support.

languages with explicit

languages with explicit support for activities normally associated with the analysis, design...

Since we are dealing with things we can't prove, let me just say that I think (and hope) that this is the wrong approach...

I mean especially analysis and design. Am I the only one who smells UML hiding around the corner?!

UML

Since I also believe, but cannot prove, that visual languages are a complete waste of effort, you can be sure I didn't mean UML.

Besides, there's barely any support for design information in UML, beyond that provided by standard OO languages.

Since I also believe, but

Since I also believe, but cannot prove, that visual languages are a complete waste of effort, you can be sure I didn't mean UML.

Glad to hear it.

I believe, but can't prove

Since I also believe, but cannot prove, that visual languages are a complete waste of effort

I believe, but can't prove, that "modelling", as the word is used in the UML circles, is metaphysics. Taxonomic reasoning is metaphysics, and the correspondence between "real world entities" and code classes is metaphysics. Since there can't be a physical bridge between the metaphysical realm of objective contents and the physical realm of subjective programs, modelling is an activity forever removed from actual concretion in code. The leap between modelling and programming requires a leap of faith, irreducible and un-analyzable, opening up the door to byzantine discussions about the nature and accidentals of the angels, I mean, "real world objects". This is to me a throwback to primitive, animistic reasoning akin to the scholastic method of the Middle Ages.

Object-Orientation and Metaphysics

Very well said, and I also remember a quote by one of the founders of object-oriented programming (Nygaard or Dahl, I think) that goes something like this (paraphrased from memory!): "The meaning of 'object-oriented' is precisely this: the objects of a program model the physical objects of the program domain." Good thing they were able to figure out a true model of physical reality, something (meta)physicists have not been able to do for centuries! (I would appreciate someone telling me the correct quote; I cannot remember where I read it.)

Probably not the quote you're looking for...

...but the following from an article on Smalltalk expresses the idea:

object-oriented languages make it much easier to build dynamic systems, because they map concepts much more directly into software. There is a one-to-one mapping between the model (Class) and the phenomena being modelled. Instead of "programming", you simply "model".

Hybrid languages

Since I also believe, but cannot prove, that visual languages are a complete waste of effort, you can be sure I didn't mean UML.

I believe, but cannot prove, that there is room for what I'll call "hybrid" languages, which mix visual and textual forms of specification. Some things are just easier to represent visually (e.g. complex component interconnections), while others lend themselves better to text (e.g. declarative specifications, or complex behavior descriptions). The introduction of OCL into UML is a step in this direction (please do not interpret this statement as an endorsement of UML). Outside of the software world, the Simulink toolset provided as part of Matlab is a good example of a language that permits a mix of visual and textual descriptions of components and behavior.

Java + IntelliJ = Hybrid language ?

Since I also believe, but cannot prove, that visual languages are a complete waste of effort

I agree, programmers are comfortable with text based representation of programming languages because they already have the natural language skills necessary to represent complex ideas as text.

I believe, but cannot prove, that there is room for what I'll call "hybrid" languages, which mix visual and textual forms of specification

Would you consider the merger of a text based language, like Java, and a good IDE, like IntelliJ, to be a hybrid language? For example IntelliJ will display an icon next to an override, or a method being overridden, to visually indicate program structure. This eliminates the need for verbose keywords like "overrides" and "overridable".

re: Java + IntelliJ = Hybrid language ?

I agree, programmers are comfortable with text based representation of programming languages because they already have the natural language skills necessary to represent complex ideas as text.

Actually, I think it goes beyond that. Some ideas are just much more easily and clearly expressed in a linear textual form (while others are better suited to some kind of non-linear graphical notation). Note that doesn't preclude the use of graphical symbols in linear text, to provide more compact expressions (conventional mathematical notation, with its mix of letters, numbers, greek letters, and custom symbols such as the integral sign or fraction notation, is a good example). But if you've ever tried to build a complex graphical model (using, e.g. Simulink) you rapidly find that in some cases while you could define a particular construct using the graphical notation, a textual representation is substantially more clear. I suspect that many of the people who push the idea of purely graphical notations for programming have never actually tried to build a large graphical model.

If you have ACM access (sorry, I looked but I can't find a free version) you might be interested in Marian Petre's paper Why looking isn't always seeing: readership skills and graphical programming.

Would you consider the merger of a text based language, like Java, and a good IDE, like IntelliJ, to be a hybrid language?

Well, yes and no. Not having actually used the product in question I couldn't say for sure. But based on your description, it sounds more like a case of linear text with graphical symbols for compactness of expression. The same kind of thing can be found in mathematical notation. The hybrid languages I have in mind are instead languages that permit a mix of linear text and non-linear graphics. For example, one might use a graphical notation to define the interconnections between different components of a distributed system, and a textual language to define the behaviour of the graphical blocks that represent the components. In other words, I want the capability to use both kinds of notation, and choose between them based on which kind of notation is best suited for expressing whichever aspect of the problem I'm dealing with at the time.

Hybrid languages

I'd certainly add Modelica to this category, and a few other components-with-enclosed-scripts languages that I've seen come and go, which aren't worth naming, including 'wiring' style Java bean editors.

"override" is not useless

Though I agree with your general point that the language is changed by the representation, the "override" keyword in C# is not useless. The "override" keyword says "I mean to override something in a super class." IntelliJ can't automatically figure out whether you meant to override a method. Since accidental overriding is something that can happen quite easily, I think the extra keyword is useful.

However, there are other cases where being explicit about intent may not be worth the effort. IntelliJ also highlights references to static fields, which diminishes the utility of prefixing instance variable names with "m_".

Visual whitespace

Hmmm, I'd argue that languages with syntactically-significant whitespace, such as Haskell or Python, already have a mix of textual and visual forms of specification. Even languages like C that don't have significant whitespace seem to have a visual form of specification when common indentation styles are followed.

Functional and rule systems

Functional and rule systems often have a structure that leads to nice visual representations. Here is one I discovered recently.

Room for improvement

I think things are moving in that direction, driven not by PLs but by integrated environments like Eclipse. Marrying code and a task list (or bug tracker) is already common, and then you can of course marry tasks with requirements, with schedules, with design docs, etc. Testing is already integrated to some extent, with unit-test attributes built into some PLs.

Post-deployment is something that seems neglected. Sure, you can press a button and deploy a web site on some IDEs, but it doesn't annotate the code with stack traces that occur in production and tie them to a customer ticket.

As far as analysis and design ... tools that allow better high-level views of an architecture would be nice. But, it seems these phases are driven more by politics than PL limitations.

toward full lifecyle languages

Dave Griffith: ...next great productivity gains will be made by languages with explicit support for activities normally associated with the analysis, design, testing, integration, deployment...

That's basically what I've been thinking for a while. Now what to do about this is what occupies my design cycles during Copious Free Time. But I might add an item not noted in your list above: explicit support for learning. That one's tricky to explain, but it basically means bounded documentation: enough docs to explain what you find, and few enough docs (small enough language) that you can plan to get your head around it in a time you can plan.

Typically, languages don't treat keeping their documentation bounded as a goal. But to me this is as helpful as a line in a proof that says there are no more steps.

Okay, now back to your list of things needing explicit support. Productivity gains would derive from saving time on tasks that take too long. Today just hacking out a first draft of code takes a small part of total time consumed by trying to ship a new product, like a complex server. Just to pick a single thing on the list, it takes at least as much time to analyze what a system is doing under debug. In other words, after you write the code, you spend at least as much time wondering what the system is actually doing, even if the code is nearly all correct.

Starting about three years ago, I started telling folks you not only had to write code that runs correctly, you also must provide some kind of support to see whether code is running correctly. Even if there are no bugs, some emergent effect at runtime will puzzle you, and a long time can be consumed by tracing the cause and effect back to appropriate roots. When you work on a team with several other folks, after the first agonizing bug shows up, you can spend a lot of time proving your part of the system does exactly what was expected, and didn't play a role in the observed chaos.

Lately I seem to spend most of my time in forensic system analysis as a local coroner for other folks' code fatalities. I never had an ambition to be a code coroner.

It would be enormously helpful if a language -- and not just one particular development environment -- had explicit support for generating evidence of what and how a system actually got from one point to another. At minimum, a count of every method called would be terribly useful for debugging and testing with a bit of scientific method. (Test A should fire method B between M and N times.)
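To make that concrete, here is a minimal sketch (in Python, with hypothetical names; not any particular framework) of per-method call counts as runtime breadcrumbs, so a test really can assert that method B fired between M and N times:

    # Hypothetical sketch: count every call to a decorated method.
    import collections, functools

    CALL_COUNTS = collections.Counter()

    def counted(fn):
        """Wrap fn so each call bumps a counter keyed by the method's name."""
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            CALL_COUNTS[fn.__qualname__] += 1
            return fn(*args, **kwargs)
        return wrapper

    class Cache:
        @counted
        def lookup(self, key):
            return None          # placeholder body

    # "Test A should fire method B between M and N times."
    cache = Cache()
    for k in range(5):
        cache.lookup(k)
    assert 1 <= CALL_COUNTS["Cache.lookup"] <= 10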

Basically, you'd like to reify a lot of things about the runtime environment, so you can minimally evaluate a running system. Better yet, it would be nice to be able to generate code that automates your evaluation in various ways.

(Generating runtime breadcrumbs can be expensive, but in practice I see this expense being paid anyway, and sometimes in grotesquely ad hoc and inconclusive fashions. If a language supported such features at the conceptual model level, the cost ought to be less than otherwise. And real deployed systems typically want to have logging and auditing of some kind turned on anyway, if only to bill the folks using a system with some backup proof that the bill is not made up.)

These days during my Copious Free Time, I think about doing this kind of thing using a hybrid of Smalltalk and Scheme and XML, where all syntaxes parse to variations on the same parse trees and compile to the same runtime. You'd also want to implement Python with the same runtime. Using existing syntaxes -- especially very simple ones -- would reduce the size of the docs. (You'd need to cheat a little to avoid making syntax for primitive data types vary too much between one code style and another.)

The goal would be a Lisp/Scheme dialect (with full continuations) that also supported a Smalltalk dialect and an XML dialect, and also had namespaces for traditional Scheme and Smalltalk bindings, plus a lot of code for converting between language flavors (and sure, Python too, why not) and an expressly supported native C++ underpinning for writing primitives. I basically like Patrick Logan's "shared nothing" model, which resembles some parts of designs I did a few years ago, so I'd lean in that direction for the network I/O module and server support. I think a web server should be built into the language, so you can ask the runtime for a summary of current statistics. Everyone who writes a server wants a web interface, and having the language understand the idea would help. A hello world program ought to generate a web page or some XML someone wants to consume.

I like the idea of being able to nest XML inside of Lisp inside of Smalltalk in whatever order I like. But that doesn't have anything to do with the semantics of explicit support for all the things in Dave Griffith's list. I don't want to make up a new language to get the desired support, though mixing existing languages might seem like that. Because rules for symbols in Lisp and Smalltalk are fairly simple, it would be easy to have a namespace system that allowed one to name any particular runtime value you wanted to sample, as a first class reified object. You kinda want syntax styles agnostic about symbols so the namespace architecture can be beefed up to refer to everything in the conceptual model providing all that extra support.

Um, I am so having diarrhea of the mouth. Sorry -- gotta cut myself off.

DrScheme?

I don't know about strict "language-level" support for learning, but I think DrScheme does the next best thing. Like you said, it provides varying levels of documentation based on the environment you specify--for example, there are several environments defined for various Scheme textbooks. It also loads up code for interactive evaluation from those same textbooks. Finally, it provides some tools based purely on the restricted nature of the "beginner" environment so that students can actually inspect what's happening, such as an evaluation stepper and a code-coverage checker.

So, if you're a student working through HtDP using this environment, you've got not only all of the documentation and code from the book at your disposal, but you also have some tools to help you work out exactly what's going on.
(edit 5-3-06 fixed link to point off-site)

Beats the Alternative!

Rys: Um, I am so having diarrhea of the mouth. Sorry -- gotta cut myself off.

If the alternative is not to hear from you but once every few months or so, please keep typing. :-) Naturally, you know that I'm a big fan of your ideas. How are things in life with respect to work and CFT? I know things have been tough for a while.

typed continuations

Paul Snively: Naturally, you know that I'm a big fan of your ideas. How are things in life with respect to work and CFT? I know things have been tough for a while.

It's easy to assume folks stop being interested in one's ideas after personal trouble; I'm glad you were not merely a fair-weather friend. But since life in work and CFT is off topic here, I should keep it really short, like a movie title. So instead of Sleepless in Seattle I might sign myself Insolvent in Palo Alto. (If I still signed myself David McCusker I might see more greetings -- maybe I should use that name again professionally.) Work could be better, but staying in one place has been my goal for a while now, so I'm fourteen months into a startup that plans to drop stealth mode during the next month or two. However, I plan to make few comments on work even then. Maybe after they announce I'll consider moving on. I should resume a blog to chat in a better place.

In this forum, hobby coding in my vanishing free time remains on topic when it's in the interesting programming language vein. I guess I'm gearing up my mind for more implementation. I added a Smalltalk object system to a Lisp dialect once around 13 years ago, and doing it again would be really easy, and this time my product would be enormously better in many ways. Only this time I won't be grafting Smalltalk onto Lisp; a generic VM suiting either equally well is better. Keeping the C++ at the bottom small and understandable is what slows my hand, since I don't want to just spew out an extra thirty thousand lines just because it's easy.

At the moment I plan a first draft in trampolined dispatch style to prevent C stack winding, with an interesting representation of continuations to simplify a mix of code in many formats including interpreted parse trees, VM pcode, and C++ primitives, etc. The idea is the same one I had about eight years ago, and was planning for Mithril, where basically code pointers (a "return address" or a "continuation") are typed, so calling or jumping to executable code is generic. While that slows dispatch with indirection through a how-to-dispatch step, it allows you to mix everything and have reified stack backtraces with return addresses of all flavors. And nothing stops you from having C calling into Lisp calling into assembler calling Smalltalk calling into Python, etc.
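As a tiny illustration of the trampolined-dispatch idea (an editorial Python toy, not the design described above): each step returns a tagged "what to run next" instead of calling it directly, and one flat loop dispatches on the tag, so the host stack never winds and differently-flavored code pointers could be mixed behind the same dispatch step.

    # Toy trampoline: steps return ("kind", payload) continuations.
    def countdown(n):
        if n == 0:
            return ("done", "lift-off")
        return ("thunk", lambda: countdown(n - 1))   # typed "return address"

    def trampoline(step):
        while True:
            kind, payload = step
            if kind == "done":
                return payload
            elif kind == "thunk":   # one flavor of code pointer; more kinds could be added
                step = payload()
            else:
                raise ValueError("unknown code kind: %r" % kind)

    print(trampoline(countdown(100000)))   # runs without winding the host stack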

When dispatch is not antagonistic to high level tools like debuggers, it becomes incredibly easy to write them, and incremental just-in-time compilers, or whatever. And you can just experiment with code research you find amusing. Like you could just add a 68K emulator to use as pcode (as a random example) and it would plug-and-play with the other code types.

I'm late and really must stop right this second. :-) My girlfriend interrupted the second paragraph above to have a discussion about moving.

If I still signed myself

If I still signed myself David McCusker I might see more greetings

I know it's you Rys, and I am glad to see you are ok. The LtU forum isn't really a usual place for personal conversations, which may explain the lack of greetings. I for one am always happy to hear from you, here and elsewhere.

+1

At the risk of getting scolded for being offtopic: +1.

Re: forensics

Check out a replay debugger tool.

More speculation

I believe, but cannot prove, that the languages which support these features will need to be not higher-level, but wider. Something like C++/Boost, Lisp/Scheme macros and Template Haskell but with closer ties to the type system are probably where you need to look here.

Goal orientation

Backtracking control will solve problems related to understanding and representing behavior. Why? Because behavior is goal oriented. Goal orientation always involves back-up logic: what to do if this doesn't work. There are ways to program goals in "all" languages, but I believe that the clearest way is to use rules and either forward or backward chaining. For me, backward chaining works better. Goals also naturally relate to concurrency.
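For readers who haven't played with this style, here is a toy backward chainer (an illustrative Python sketch, not any particular rule language): a goal is proved either because it is a known fact, or by proving the body of some rule whose head matches, with alternative rule bodies supplying the back-up logic.

    # Toy backward chaining over Horn-style rules.
    FACTS = {"battery_ok", "fuel_present"}
    RULES = {
        "engine_starts": [["battery_ok", "fuel_present"]],
        "car_moves":     [["engine_starts", "brake_released"],
                          ["pushed_by_friends"]],          # back-up alternative
    }

    def prove(goal):
        if goal in FACTS:
            return True
        for body in RULES.get(goal, []):     # try each rule; fall back on failure
            if all(prove(subgoal) for subgoal in body):
                return True
        return False

    print(prove("engine_starts"))   # True
    print(prove("car_moves"))       # False: neither alternative can be proved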

Turing Incomplete languages

I always believed that there was a way to design a just-almost-Turing-complete language that could do 99% of programming tasks. I also believed that if this language existed, you could optimize the bejeezus out of it. I mean, isn't a finite strip of tape good enough? :)

Already been done, yes?

That was the whole point behind SQL and its various competitors, back in the day. Of course, once it solved 9X% of the programming tasks of the time, we then promptly grew a whole new set of programming tasks...

Isn't that the point with SPARK

As far as I know, programs in SPARK Ada have to be bounded in space, since neither recursion nor dynamic allocation is allowed. I believe (but cannot prove) that this is what makes the SPARK approach of partial correctness proofs feasible.

Wouldn't SPARK be an example of an almost-complete language such as you mention? I'm sure the SPARK marketing department considers it appropriate for 99% of all programming tasks, but how fun is it to get a message from your compiler saying that it can't handle programs this big, and that you need to get a version of the compiler compiled with higher static bounds? (The SPARK compiler is written in SPARK itself...)

Spark is well-suited to its target applications

Spark is intended for safety-critical applications where the restrictions you mention are the norm (in some cases they are legally mandated.) AFAIK, Spark programs are compiled using existing Ada compilers. The Spark Examiner is available in different sizes to accommodate different programs.

You're right about compiling,

I should have said the Spark tools.

SQL is the beginning

SQL was partially what I had in mind, but consider that as you add complexity you eventually need to add a Turing-complete language (e.g. a stored-procedure language like PL/SQL) to get some things done. I am wondering how far a language can approach Turing-completeness and still be useful. I should probably reference some other threads on LtU ... though I've read the threads on Charity :)

Since I asked the question....

...you may have guessed I had a few answers myself but wanted to see what else I could shake loose from you guys first....

Moore's Law and Feeping Creaturism imply ever larger and more complex systems. The solution to which is not ever larger and more complex languages, but simpler languages, both syntactically and semantically, that permit tools to assist more in...
* Refactoring.
* Testing.
* Code Generation.
* Reengineering, understanding and visualization of very large systems.

Thus
* The ease of analysis, rewriting, and refactoring of the Joy language will make it the "parent/inspiration" language for the next generations of languages.
* Syntax and semantics will become simpler, i.e. move towards a simpler version of Lisp/Scheme or Forth.

Some other beliefs I have stemming from Moore's Law and creeping featurism driving the exponential growth in software complexity...
* Linear logic will supplant GC.
* Lazy evaluation is not an optimization (or a hard-to-implement nice-to-have). It's mathematically the only way to have a consistent, analyzable, and manipulable system.
* Single assignment, or Joy/Forth-like no assignment at all, will become the norm.
* Cyclic dependencies at one or more levels (package / file / class / ...?) will be explicitly disallowed by the tools.
* Static typing will lose to dynamic typing, and dynamic typing will lose to compiler-inferred static duck typing.
* Languages will provide a transparent "each object has one thread inside it, every method invocation is just another event in that object's event queue" view to the programmer (see the sketch after this list).
* OO hierarchies are too reminiscent of pre-Codd hierarchical databases. And SQL ignores Codd on too many points. I expect to see a language that...
** Is fully relational, à la Codd's laws.
** Has a basic object model that is also in at least 4th normal form.
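Since the "one thread per object, method invocations are queued events" bullet may read as hand-waving, here is a bare-bones sketch of that view (illustrative Python only; a real language would make this transparent): callers enqueue invocations and a single worker thread drains the queue, so the object's state is only ever touched from one thread.

    import queue, threading

    class ActiveObject:
        """Each instance owns one thread; method calls become queued events."""
        def __init__(self):
            self._inbox = queue.Queue()
            threading.Thread(target=self._run, daemon=True).start()

        def _run(self):
            while True:
                method, args = self._inbox.get()
                method(*args)                 # state is touched by this thread only

        def send(self, method, *args):
            self._inbox.put((method, args))   # a "method invocation" is just an event

    class Counter(ActiveObject):
        def __init__(self):
            self.n = 0
            super().__init__()
        def increment(self, amount):
            self.n += amount

    c = Counter()
    for _ in range(1000):
        c.send(c.increment, 1)
    # (a real version would also provide a way to await results)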

Can I prove any of this?

No. Not now. Not yet.

But it's what I keep my mind occupied with when I'm not earning my bread.

Linear Logic vs GC

Would you happen to have any pointers to information on Linear Logic?

I am always interested in the problem of garbage collection, and alternatives to it.

Linear Logic vs GC

A good starting point might be Henry Baker's Look Ma, No Garbage and Use Once Variables and Linear Objects. Also, search his paper archive page for the word "linear". Also see Wadler's Linear Logic page. Note that systems based on linear logic can obviate garbage collection, but depending on the properties of the system in question, they don't necessarily do so entirely. As usual, a benefit such as "no GC whatsoever" comes with a cost.

Is it a cost though?

First let's deal with the "don't do so entirely". Since any Turing-complete language can emulate any other, any linear-logic-based (or, for that matter, any GC-based) language can, say, create an interpreter for standard C. I.e. no matter how smart our GC, we can _always_ do something sufficiently stupid to get us into as bad a state, if not a worse one.

However, the real benefit I see in linear logic is the way it matches so closely our natural perception of the world as ingrained from preschool days.

"The ball is in my Hand now. You take it from me. The ball is in your hand. The ball is NOT in my hand now. WAAAHHH!!!"

Compare this to the C pointer: the ball is in my hand, it's in your hand, it's also in the cupboard, and Sam playing outside has popped it, so you're going to fall flat on your back when you try to kick it.
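To make the ball analogy concrete, here is a toy "use once" handle (an illustrative Python sketch; a real linear type system enforces this statically at compile time rather than with a runtime check): once the value is taken, the original handle no longer holds it.

    class Linear:
        """Toy use-once handle: the value can be taken exactly once."""
        def __init__(self, value):
            self._value = value
            self._spent = False

        def take(self):
            if self._spent:
                raise RuntimeError("value already consumed")
            self._spent = True
            value, self._value = self._value, None
            return value

    my_hand = Linear("ball")
    your_hand = my_hand.take()      # the ball is in your hand now
    try:
        my_hand.take()              # the ball is NOT in my hand. WAAAHHH!
    except RuntimeError as err:
        print(err)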

Is it a cost?

The cost is at the source code level: just compare code in a linear language to equivalent code in a non-linear one, and the cost should be evident. The advantage of GC is that it usually requires no explicit attention in code whatsoever - its costs are all hidden and paid automatically, so the net effect of GC is that it removes complexity from code, reducing the effort required to write and reason about the code. This is not the case for code based on linear logic, no matter how intuitive a micro-example may seem to a preschooler.

Codd

I agree with your point about relational vs. current object systems. Multiple inheritance and Java-esque interfaces try to mitigate normal object-oriented limitations. The folks over at dbdebunk.com try to preach and defend the relational model; it is unfortunate that sometimes they get downright nasty and rude.

I expect to see a language

I expect to see a language that...
** Is fully relational, à la Codd's laws.
** Has a basic object model that is also in at least 4th normal form.

I predict that the language will die in obscurity, because the SQL people will see it coming, and add more cruft to SQL to implement object models a half-arsed way. "Cool", database vendors will say. "Guess we can stick with SQL."

Linear logic will supplant

Linear logic will supplant GC.

Um... unfortunately, no: this thesis shows what linear logic buys you, and single-usage is not one of those.

I believe that the next best

I believe that the next best PL will support both staged programming and dependent types. I also believe that every useful invention in PLT is about introducing a new form of abstraction so we can write more generic programs, but that it will eventually lead to a place where the programs we write are so generic that nobody actually understands them. Then the golden age of DSELs will begin.

Ahh...

The Epigram koolaid. :) I too am waiting for the real version of that... should be simple, right? Just add some IO and support for general recursion... :p

Seriously though, I would be overjoyed to see that as a useable style of programming in my lifetime (and I'm only 19...) -- ala what Haskell is nowadays.

A wonderfully formulated question

Some answers from me... May be a bit controversial.

1. I believe that large-scale (component, class, big subroutine) reuse fundamentally doesn't work and is doomed. I believe that the language should and will eventually contain all useful reusable components, just like in natural languages - or in DSLs like SQL or CSS.

2. Because of point 1, I believe that programmers will split (are already splitting) into "paper/canvas manufacturers" and "artists". The market for the first kind of programmer will radically shrink; the second kind will be in huge demand, but very fragmented. Say TeX, Excel, HTML, Flash, etc. - all tools of great depth, but this depth is not in the least correlated with "computer science interesting-ness".

3. I believe that programming at the systems level will become more static. Not in the typing sense; more in the "prefer isomorphisms to transforms" sense. Programs should and will become not sequences of actions, but compositions of isomorphisms and invariants into larger, more useful isomorphisms and invariants. (Example: a unit-testing system should be truly continuous - I change a line of code, the lamp turns red, I change another line, the lamp turns green. Ditto for conflict detection in version control systems, etc.)

Doomed at what level...

I agree with you that reuse, within the current paradigm, is doomed. But I'm a tad more optimistic than you. I believe reuse fails because...
* People fail to write reusable code.
* The "not invented here" syndrome.
both of which may change under a paradigm shift forced by the sheer weight of code we have to maintain.

You seem to be of the gloomy view that only tools may be (partially) reusable, code never. (Ah but then all my Russian friends seem to be innately depressed and pessimistic so maybe that has something to do with it...:-))

With respect to your point 3, are you aware of the existence (and failure) of the (attempted) move to syntax-aware editors a decade or so ago?

It's optimism!

Yeah. Tools and languages are reusable and widely reused now; code isn't. I believe this isn't going to change, for sociological reasons like the ones you cited :-)

Re the paradigm shift: we've already seen it. VisiCalc killed a whole market of specialized financial software. The same thing is going to happen with more and more segments of the software market; this is "progress in CS" in the most real sense. I think it's very good - a future with opportunities for everyone, not just wizard Haskell programmers.

Re syntax editors: when I code Java, I use IntelliJ IDEA. It's very syntax-aware, and does a whole lot of wonderful stuff. In Ruby, however, a simple text editor serves me just fine... But it would be even nicer to have a separate syntax checker that would watch the filesystem; I save the file in my favorite editor, and a separate window shows me the list of syntax errors. And test results. :-)

You Asked For It...

Vladimir Slepnev: But it would be even nicer to have a separate syntax checker that would watch the filesystem; I save the file in my favorite editor, and a separate window shows me the list of syntax errors. And test results. :-)

That would be omake -P.

I've said it before: the combination of O'Caml, GNU EMACS + Tuareg Mode, and omake, in which I can try things out interactively, put them in an actual source file when they work, and keep my build up-to-date at all times is priceless.

Wow!

That's really it, thanks! Will try it out.

For the record

Just for the record, I think I see a wonderful use for omake. It's called (giggle) Macromedia Flash.

We've had the MTASC compiler for ActionScript for quite some time. It is Open Source and lightning fast (and, incidentally, written in OCaml too). Compile times are almost unnoticeable. After we add something like omake to the mix, rich GUI or game development may look like this: you edit a source file, hit save, refresh the browser and see the results immediately. Faster than AJAX, nicer-looking than most graphics toolkits, integrated with sound and network, providing an advanced language for development (the MTASC brand of ActionScript has GC and type inference), and 100% portable across all browsers in the world.

There's a Flash plugin for

There's a Flash plugin for Lynx?!

Caught me

OK, I admit the 100% sounds a bit rude. I heard the actual figure is something like 98%. Not bad either :-)

no lynx flash plugin yet, but ...

maybe it could be created if you start with the technology used here (from the mashup experts at Poly9).

on MTASC

Hi Vladimir, did you also have a look at haXe? http://haxe.org. It might be a lot more appealing to the audience here than learning ActionScript 2 :)

Yes, I too believe that using the wide distribution of Flash makes sense for some applications, although you can't do everything with it, and in particular the current VM is pretty slow (this is fixed in the next version - Flash Player 9 will have a JIT and should be available on Linux as well).

Player availability is still one of the big problems with Flash: it's a closed-source player which is not available on all platforms (and of course not on Lynx ;). Some people are working on an open-source Flash Player (an FSF project named Gnash). Let's wish them good luck!

haXe

Yeah, I'm aware of haXe.

Problem is, ActionScript skills are "portable"; haXe skills aren't. Which means that if I use MTASC, I can (kinda sorta) do one-off consulting work on other people's Flash projects, claim Flash/ActionScript experience on my resume, etc.; if I choose haXe, I'm pretty much limited to living in a cave.

While we're at it, thanks for your wonderful work!

OMake

I finally made sense of OMake. It wasn't easy - I guess the authors didn't really think about migration from "make".

1) Where there was one Makefile, I have 10 files now: OMakefile, OMakeroot, OMakefile.omc, OMakeroot.omc, .omakedb, .omakedb.lock, src/OMakefile, src/OMakefile.omc, res/OMakefile, res/OMakefile.omc.

2) I can't use wildcards in dependencies. This might make sense for C or OCaml, but is utterly dumb for ActionScript (or Java), where you typically compile a directory of related files, and refactor stuff into new files a lot.

3) I can't use the "cd" command in the build process. (This is why I had to add OMakefiles in subdirectories.) For that matter, I can't use "del" - I'm on Windows, and "nmake" understands it fine, but OMake has its own shell interpreter and chokes.

4) I have to specify .PHONY for clean. Otherwise it just plain doesn't work. No idea why; good old make did fine without .PHONY.

This was just the issues encountered in migrating a really small project with one output file and six source files.

OTOH, after all this pain, the filesystem-monitoring part works great. It really is the productivity boost I had imagined. Now I want a 3-position graphical indicator: build succeeded, build in progress, build failed :-)

Bottom line: Certainly not a drop-in replacement for make, as I had hoped. Very good functionality, but will take half a day to set up, and the docs are mostly irrelevant if you aren't working in C or OCaml.

It Sure Isn't Make!

Vladimir Slepnev: I finally made sense of OMake. It wasn't easy - I guess the authors didn't really think about migration from "make".

That's a fair point. I should have pointed that out.

Vladimir Slepnev: 1) Where there was one Makefile, I have 10 files now: OMakefile, OMakeroot, OMakefile.omc, OMakeroot.omc, .omakedb, .omakedb.lock, src/OMakefile, src/OMakefile.omc, res/OMakefile, res/OMakefile.omc.

In your case, you might be able to get away with just OMakefile and OMakeroot at your top directory by using a .SUBDIRS rule with a body in your top OMakefile. A .SUBDIRS with a body means that the body will be applied in the subdirs, so you don't need separate OMakefiles in them, as documented here. The .omc and .omakedb* files are, of course, generated; you may wish to tell your version control system to exclude them from consideration, if necessary.

Vladimir Slepnev: 2) I can't use wildcards in dependencies. This might make sense for C or OCaml, but is utterly dumb for ActionScript (or Java), where you typically compile a directory of related files, and refactor stuff into new files a lot.

I don't know what gives you that idea; see here regarding "implicit rules" and wildcards.

Vladimir Slepnev: 3) I can't use the "cd" command in the build process. (This is why I had to add OMakefiles in subdirectories.) For that matter, I can't use "del" - I'm on Windows, and "nmake" understands it fine, but OMake has its own shell interpreter and chokes.

Here's the cd documentation, and here's the unlink/rm/rmdir documentation. The nice thing about having these built-in functions is that they work on either UNIX or Windows systems.

Vladimir Slepnev: 4) I have to specify .PHONY for clean. Otherwise it just plain doesn't work. No idea why; good old make did fine without .PHONY.

I suspect that OMake works by literally treating rules as functions taking files to files, so .PHONY serves the same purpose as "void" in C/C++/Java. The .PHONY docs imply as much: "A 'phony' target is a target that is not a real file, but exists to collect a set of dependencies." They use the example of the "install" target. It's easy to see that "install" and "clean" both fall into the category of "not a real file," and therefore are "phony" targets.

So I think points 1-3 are easily addressed with a bit more familiarity with omake, and point 4 is correct as stated. Regardless of that, your conclusion is perfectly sound: it isn't a drop-in replacement for make (it's quite a lot more powerful than make). Your first time out with it might take half a day, but future OMakefile/OMakeroot development can be much, much faster. Finally, I think most of the docs are perfectly applicable, but no doubt writing a good set of rules for languages other than C/C++/O'Caml will pose some initial challenges also.

Thanks!

Good points all. I'll take note :-)

Re wildcards: as far as I can see, I can't have a dependency like "a.foo: *.bar". Implicit rules don't help, unless I'm being dumb. But this restriction may actually be a good thing - not sure yet.

Yeah, I eventually figured out cd and rm. Not so hard, just a bit different.

.SUBDIRS with a body sounds like a really good idea. Will do that.

Regardless of the problems, OMake will definitely replace make in my personal ActionScript projects. I can't yet push it at work - the current project takes a night to build (C++ with heavy Boost), and the number of Makefiles is in the hundreds or even thousands, I haven't counted.

Thanks!

Wildcards

Re wildcards: as far as I can see, I can't have a dependency like "a.foo: *.bar".

Assuming *.bar refers to source files, not generated files, then a.foo: $(glob *.bar) should work.

Thanks!

Will try.

Huh?

With respect to your point 3, are you aware of the existence (and failure) of the (attempted) move to syntax-aware editors a decade or so ago?

Syntax-aware editors are in common use by virtually all Java developers, and have been for about five years. The editors track keystrokes, and interactively calculate compilation errors, syntax highlighting, style issues, and possible bugs, updating all of that in sync as the user types, as well as offering various semantic code restructurings. Syntax-restrictive editors, which prevent all invalid entry by requiring only validity preserving operations, failed due to usability issues.

On the continuous testing front, there exist systems that will automatically retest on user entry, but they are somewhat clunky due to the need to rerun all tests. Commercial systems which will 1) allow the developer to only run tests that could fail (based on previous test runs, awareness of changed code, and code coverage data) and 2) allow developers to checkpoint their codebase and run full build and tests on a remote server, reporting results back when done, should be available in July (unless we slip, which we won't). Between those, it's only a small step to practical continuous testing. I expect IDEs with continuous testing to be commercially available in the next three years in the Java space, and shortly thereafter for .Net developers.
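The test-selection half of this is easy to sketch (hypothetical data shapes in Python; not any vendor's actual implementation): keep a map from each test to the files it executed on the previous run, and when some files change, run only the tests whose recorded coverage touches a changed file.

    # Hypothetical coverage map from the previous run: test name -> files it executed.
    coverage = {
        "test_login":    {"auth.py", "session.py"},
        "test_checkout": {"cart.py", "payment.py"},
        "test_profile":  {"auth.py", "profile.py"},
    }

    changed_files = {"auth.py"}

    def tests_to_run(coverage, changed_files):
        """Select only the tests that could be affected by the changed files."""
        return sorted(name for name, files in coverage.items()
                      if files & changed_files)

    print(tests_to_run(coverage, changed_files))   # ['test_login', 'test_profile']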

What tool?

Commercial systems which will 1) allow the developer to only run tests that could fail (based on previous test runs, awareness of changed code, and code coverage data) and 2) allow developers to checkpoint their codebase and run full build and tests on a remote server, reporting results back when done, should be available in July (unless we slip, which we won't).

Just curious, what tool/company is this?

IntelliJ IDEA 6.0

from JetBrains. Here's the roadmap http://www.jetbrains.net/confluence/display/IDEADEV/Demetra+Roadmap .

Disclaimer: I work with but not for JetBrains, and do not speak for them.

Regarding reuse

Do you believe in the reuse of combinator libraries (DSELs)?

Yes

Guess yeah. The success of Rails shows that metaprogramming libraries can be successful. But whether something is a combinator library or a full-blown language doesn't matter to the user one whit :-)

"Do you believe that a HyperCard stack can be a top-selling game?" Yes, the example being Myst. But few Myst fans ever heard about its HyperCard origins.

What the business world REALLY yearns for is a "business DSL". Rails is a "Web 2.0 business DSL" - the next best thing :-)

Richard Gabriel agrees with points 1 and 2

...and does a great job explaining them in the first half of Patterns of Software.

Abstraction is overrated

I believe that programs should be written using a small number of abstractions. Two or three new abstractions are enough for most medium-sized programs. Programming languages should provide basic datatypes that you can build your programs on without creating unnecessary abstractions of your own. Compare pattern-matchable tuples/lists/structs/etc to object-oriented classes.
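For concreteness, here is a sketch of that contrast (illustrative Python 3.10+ using structural pattern matching): plain tuples that any function can pattern-match, with no new class hierarchy in sight.

    # Shapes as plain, pattern-matchable tuples rather than a class hierarchy.
    def describe(shape):
        match shape:
            case ("circle", radius):
                return f"circle of radius {radius}"
            case ("rect", width, height):
                return f"{width} x {height} rectangle"
            case _:
                return "unknown shape"

    print(describe(("circle", 2.0)))   # circle of radius 2.0
    print(describe(("rect", 3, 4)))    # 3 x 4 rectangle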

Examples?

Two or three new abstractions are enough for most medium-sized programs.

Can you give an example? Perhaps we are using the word "abstraction" to mean different things?

For me, every good routine is an abstraction, so I can't see how you can reduce the number of abstraction to two or three...

Well..

You know how there are some big abstractions that introduce new vocabularies: POSIX files, Emacs buffers, BSD sockets, Enterprise Java Beans, and so on. I like to have a minimal set of these abstractions and to use them very directly in programs. If a program is supposed to copy one file to another then I would like to see calls to open(), read(), write(), and close() all sitting there together near the top level.
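For instance, here is such a file copy written directly against those calls (a Python sketch using the os module's thin wrappers over the POSIX calls), with the whole system-call story visible near the top level:

    import os

    def copy_file(src_path, dst_path, chunk_size=65536):
        # The program's system-call story, in plain sight: open, read, write, close.
        src = os.open(src_path, os.O_RDONLY)
        dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        while True:
            chunk = os.read(src, chunk_size)
            if not chunk:
                break
            os.write(dst, chunk)
        os.close(src)
        os.close(dst)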

The good thing about these big and common abstractions is that they give you a framework to think in. For example, you can design a sane Unix program by imagining which systems calls it should make (how it ought to look in strace) and then writing the code that "fills in the gaps" in the simplest way. The same works for imagining how client/server programs should look in Ethereal, what processes and tables to have in Erlang programs, and what buffers and text properties to have in Emacs programs. If you get the initial sketch right then you'll probably end up with a simple program that won't go far wrong.

If your head is too high in the clouds then you can go really far wrong without even noticing. This happens more and more in proportion to the amount of "black box" code between the application programmer and the ground-level primitives that need to be used in a sensible way.

Really The Art of Unix Programming says it all much better than I can.

P.S. "Abstraction is overrated" is what I always think after my annual chat with CLOS hackers. :-)

I think I understand what

I think I understand what you have in mind, but I am not sure the best way to put it is to say "abstraction is overrated"...

First, it seems you favour big abstractions - yet these are built on top of lower and simpler abstractions. You don't like the semantic gap this creates (the distance between the abstraction and what really goes on). This is a well known issue, of course. I think that by defining good abstractions, with clear semantics, this problem can be mitigated.

Second, it seems you like application frameworks, rather than ADTs and such. It is important to note that these two classes of abstractions solve different problems (e.g., STL vs. MVC or Rails). More interesting is the fact that not long ago frameworks were considered to be doomed, since it seemed that most good frameworks solve only one thing: i.e., they were basically MVC frameworks. Nobody managed to show a wide variety of frameworks. Things have changed, as is witnessed by things from Rails to Spring...

Finally, you like languages that provide a flexible and very useful set of built-in abstractions (e.g., processes in Erlang, lists in Scheme/Lisp). This is one reason why you don't find a great need for abstraction facilities (e.g., ADTs). But, on the other hand, it just means that you favour another *kind* of abstraction. Thus, you are a fan of programming languages as abstraction mechanisms. I agree, of course. No wonder we are having this discussion on LtU...

Yeah

Thank you for clarifying my thoughts!

The picture you paint sounds right, except that I do like ADTs (like 'file', 'buffer', and 'socket') and I don't like frameworks (assuming framework = library + indirection). I wonder how I gave the opposite impression? I do like to be economical with ADTs, e.g. to use a plain list of strings instead of inventing a DirectoryFileList.

"Any problem in computer science can be solved by one more level of indirection." That philosophy is actually what I am objecting to. On reflection I think the problem is that it takes a lot of work to make a really good abstraction and most frameworks / metaobject hacks / code-walkers are too ambitious. They hide things that you actually should know about.

Catty thought for the Day...

Really The Art of Unix Programming says it all much better than I can.

There has long been the saying that those who don't understand Unix are doomed to reinvent it. Poorly. My catty thought for the day was "Embedded Systems Programming is the last refuge of those who don't understand Unix."

Which probably wins me The Sugar Coated Sandbox Award.

Well...

I hear what you're saying, but it's also true that Unix doesn't understand modern embedded systems programming. Thankfully, Unix, in all its incarnations, is adapting, as it always will. We can profitably put a Unix-alike in an MRI scanner, but we can't put it in a pacemaker yet.

My cynical thought for the day is that I currently work as an embedded systems developer. My main qualification: "Had an 8-bit micro as a teenager."

Re: Abstraction is overrated

This sounds like you don't believe in Abstract Data Types and encapsulation. That every function should know the details of the data structure it is working with. Is that about right?

I don't think this works very well as systems get larger.

That is about right

That is about right. Unix and Emacs suggest to me that it works exceptionally well for some very large systems. I wonder what's wrong with other large systems that prevents it from working for them too?

Large systems

My opinion is that you can write a large system in just about anything, and still have it work. Unix is built on C, but I don't think you would advocate C over a language without all the razors and barbwire.

I'm curious, what do you think of Lisp's macro facility? Are you totally against it?

What you say reminds me of

What you say reminds me of my comment that good API design is like finding a basis for a vector space. You have to find a minimum set of abstractions that spans the whole space of problems to be solved, constructed such that no single element can be expressed as a trivial combination of the other elements. So if abstractions are overrated because of this, then vectors are overrated too. :)

If you can't create your own abstractions

... you'll end up overlaying them via convention. It's not that they aren't there. They just aren't being checked.

{:name ["Jim" "Kirk"] :rank :captain} is still a struct.

Definitive languages

I believe that a small set of "definitive languages" will eventually exist. By "definitive" I mean that they provide good enough solutions so that most computer scientists doing language research today will move on to other problems. I give some arguments for one possible definitive language in my FLOPS 2006 talk.

Interesting paper

Thanks Peter! That was an interesting paper. Although I'm not really all that familiar with Eiffel's SCOOP concurrency model, based on what I do know it sounds like Eiffel would fit into your proposed schema as well (there seems to be a lot of similarity between the E and Eiffel approaches).

Prototypes not Classes

I (almost) believe that Prototypes are better than Classes; but I just haven't seen enough evidence.

Are prototypes a more natural way of thinking about objects

Classes in languages like C# and Java have always seemed too rigid to me. I've always been partial to OO in languages like Dylan/Common Lisp (CLOS) and open classes in Ruby.

But prototype-based OO seems to be even more unconstrained than static structures.

Here is an interesting paper on the psychological aspects of prototypes versus class-based OO. The author's contention seems to be that prototypes are a more natural way of thinking about objects.
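For readers who haven't used a prototype-based language, here is a minimal sketch of the idea (illustrative Python; Self and JavaScript provide this natively): an object is just a bag of slots that delegates missing lookups to a parent object, and a "new object" is made by cloning or extending an existing one rather than by instantiating a class.

    class Proto:
        """Toy prototype object: local slots plus delegation to a parent."""
        def __init__(self, parent=None, **slots):
            self.parent = parent
            self.slots = dict(slots)

        def get(self, name):
            if name in self.slots:
                return self.slots[name]
            if self.parent is not None:
                return self.parent.get(name)        # delegate up the chain
            raise AttributeError(name)

        def clone(self, **overrides):
            return Proto(parent=self, **overrides)  # "new object" = extend this one

    account = Proto(balance=0, currency="USD")
    savings = account.clone(balance=100, rate=0.02)

    print(savings.get("currency"))   # "USD", found by delegation
    print(savings.get("balance"))    # 100, overridden locally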

What is real?

I have similar misgivings about objects and prototypes. To me the reality of a thing or group of things is a method of "parsing" or recognizing that thing. This usually amounts to a set of actions and the results they produce. The results can be described as information and stand for the thing. Not surprisingly, such a representation often looks like an object definition, but it puts the emphasis in the right place. I would also define recognition as an event. A formalization of these ideas is found in Syntactic Pattern Representation.

Crystal ball moment...

The author's contention seems to be that prototypes are a more natural way of thinking about objects.

I believe, but can't prove, that to a first approximation, exactly zero programming language issues will be resolved by appealing to how "natural" a given solution is.

Classes considered harmful

I believe that classes have too many responsibilities: defining types, defining behavior, and creating objects. All three features are desirable, but one hammer to nail them all is awful. A better idea is to have separate features for each, with decent combinators for mixing them. An object model I think is great is this, but E's syntax could be a bit terser IMHO.
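As a rough rendering of that separation (an illustrative Python sketch only; it is not E's object model): a Protocol for the type, a free function for the behavior, and a factory for object creation, combined explicitly instead of being bundled into one class.

    from typing import Protocol

    class Greeter(Protocol):                    # 1. the type: just an interface
        def greet(self, name: str) -> str: ...

    def polite_greet(self, name: str) -> str:   # 2. the behavior: a plain function
        return f"Good day, {name}."

    def make_greeter(greet_impl):               # 3. creation: a factory combinator
        return type("AnonGreeter", (), {"greet": greet_impl})()

    g: Greeter = make_greeter(polite_greet)
    print(g.greet("Ada"))                       # Good day, Ada.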

I believe, but cannot prove:

The essence of good programming is clarity. The key quality of a programmer is eloquence. The best thing a programming language can do is disappear.

I have mixed feelings on this one...

Part of me agrees wholeheartedly.

So let's try pick a language based on it. On one end we have something like C++, and at another something like Common Lisp.

C++ is horrible: it's in your face, and Scott Meyers has written books and books on how to write C++ without it upping and biting you in the bum in some very, very obscure corner cases.

Common Lisp, on the other hand, disappears. You can write anything and everything in Lisp anyhow. And in the hands of an eloquent Lisp guru it is everything your comment could wish for.

However the thought of Common Lisp in the hands of a very large team of so-so programmers generating mega slocs of code adapting to shifting goal posts while under huge schedule pressure gives me the cold shudders.

Common Lisp doesn't really disappear, it slides under the surface requiring deep insight and understanding of what it is and how it does it.

So some measure of Bondage and Discipline, some measure of idiot proofing, because there are days when I'm an idiot. And under shifting goal posts and schedule pressure we are all less bright than we wish to see ourselves.

So I agree with your post, but I'm not sure I can convert the warm fuzzy feelings it generates in me into programming language design criteria.

I disagree with that. A

I disagree with that. A good programming language focuses your thinking without dominating it.

Examples

Exactly. It's impossible for a programming language to disappear; it'll always be there influencing the way you think about problems.

For example, a very lispy way of thinking about a lot of problems is to simply shoehorn your data into some list form, and then you'll have a lot of nice tools for dealing with the data. It may not always be elegant, but it's remarkably easy to fire off some deeply indented algorithm involving REMOVE-IF and N-MOST-EXTREME and a few lambdas.

In contrast, Haskell seems to encourage programmers to define their data structures properly, then make functions that deal with those data structures, then eventually get around to writing the main program logic. The lazy evaluation semantics also encourage a different way of thinking about a lot of problems. Suppose, for example, that you wanted to find the N smallest elements of a list of type "Ord a => [a]". In most programming languages, the natural thing would be to define a function specifically for this purpose. But in Haskell, it's actually a valid option to say "take the first N elements of the list, sorted using this selection sort that I already have".

In Java, objects dominate your thinking even when they shouldn't. I've seen APIs for downloading a file which actually require you to instantiate an HTTP downloader object, call the setUrl() method, and then control it via other methods. This is fine for heavyweight control of the downloading process, but for most simple cases it's object-oriented overkill. Deep object hierarchies are encouraged (though I'm not sure if it's the language or the people who encourage this).

I believe, but cannot prove, that no single programming language can encompass all the different ways of thinking about programming. I also believe that many programming languages will still try to do so, and that the Lisp family of languages will cling tenaciously to its current undead status because its easy quick-and-dirty malleability makes it a prime candidate to assimilate new styles of programming, albeit not as elegantly as a language designed from the ground up for the purpose.

Mundane predictions

I believe, but can't prove, that there are all kinds of important questions in PL design that could be approached scientifically, but nobody wants to. I think someone will eventually build a large, scientifically gathered statistical sample (a corpus) of source code "from the wild".* I think it'll be profoundly useful.

I suspect "the next Java" will have some notion of effects in its static type system (like the Haskell IO monad, not like C++ const; but very different in feel from both). I think a lot of people who like that idea now will discover they hate it in practice.

* Has this already happened? I wouldn't know.

Code from the wild

I noticed in the recent Ginseng paper discussed on Lambda that they used real-world, open source projects to apply their ideas. I think this is great, and hopefully more papers will take this approach. However, I think this approach will remain ad hoc, and not be the "scientifically gathered statistical sample" you are looking for.

Interesting, but the data is too noisy...

The big trouble with that one is that, yes, academics do these studies, but on small teams writing Mickey Mouse systems.

It's way too expensive to have a large team write a very large system... while another two teams try other process/language variants and a control team does just the usual thing.

But unfortunately all the problems that really matter now are at the "largish team (5+) / large system (200,000+ SLOC)" scale.

Once you get to that scale it's just too noisy: too many variables, etc.

Even at the small-scale end it's hard to get a good experimental design, since the primary source of variability (as Alistair Cockburn points out) is the human one. Which human is doing the programming? How is he or she feeling today? Was his wife nice to him last night?

My absolute favourite paper in all the CS literature is..

Characterizing People as Non-Linear, First-Order Components in Software Development
by Alistair Cockburn.

My absolute favourite paper

My absolute favourite paper in all the CS literature is..

Characterizing People as Non-Linear, First-Order Components in Software Development by Alistair Cockburn.

Having just read this, while I probably wouldn't say it is my favorite paper in all CS literature (perhaps in SE literature?), this is an extremely good read.

There is no future in natural language programming.

Apart from certain small niches like writing interactive fiction, maybe, and I look forward to seeing how that works out. (One day, in the far distant future, when machines really do understand language, natural language programming may eventually become useful, but not in my lifetime.)

I don't know if this is a programming language issue so much as a programming paradigm one: there is no future in general-purpose quantum computing. Again, I think there may be some niches, such as quantum encryption, where it's useful.

There may be a future in quasi natural language programming.

As I have noted elsewhere, I strongly believe in Quasi Natural Language as a viable notation system for non-math geeks.

I know I can communicate with fellow programmers in *unambiguous* natural language.

I believe that subset of natural language to be parsable with a Parsing Expression Grammar. I believe that any potential ambiguity expressible in a QNL grammar can be avoided by providing the user with negative examples of what pathological constructions should not be used, or dealt with by an IDE -- particularly if we move to a conversational interface where information extraction techniques can be combined with a restatement of the system's understanding or a query for clarification.

However, I also believe that virtually all PL researchers have written off NLP approaches, or just aren't familiar or comfortable with the current state of the art in this domain, which creates a self-fulfilling prophecy that we will never see a serious NL-based programming language. Likewise, NLP researchers might benefit from delving into conventional PL research, but have probably written off our work the second they saw our notation conventions, which in all candor often take a non-trivial amount of mental overhead to transliterate into English on an initial reading.

Mainly, I know that novice End Users, no matter how motivated they might be to take up some measure of programming to accomplish whatever they have been tasked with on their Day Job, will not embrace seas of nested parentheses, braces, C type definitions, or whatever other terse syntax might seem most "natural" to us after years of immersion in "the literature". I also know that I can and have explained the underlying ideas behind mundane programs to them once I cut through the syntax.

I also believe that our notation choices make Computer Science look scary and intimidating to women and minorities, which is not a good thing.

If you find yourself in agreement with any of this and would like to help us pursue QNL End User Programming, drop us a line at The Institute for End User Computing, Inc.

CS as a tool of intellectual oppression

I also believe that our notation choices make Computer Science look scary and intimidating to women and minorities...

Er... I suspect they make Computer Science look "scary and intimidating" to a lot of men (of whatever race happens to be the "majority" in your locality) too. And to people in general, for that matter. God knows they intimidate me at times.

Alas, I fear we're going the other way....

Yes, I agree. The current computer languages and interfaces are far less End User Friendly than is possible.

But, there are deep mathematical reasons why the heart of programming language design will actually go in the opposite direction.

As a counterexample I hold up the work of boost.org and Alexandrescu in C++ template design and metaprogramming.

The syntax of the C++ template metaprogramming language is an absolute horror, and as unfriendly as you can get. Yet even if you were to devise the most pleasant and courteous face for it...

  • What the template metaprogramming guys are doing will remain deeply subtle, requiring minds of profound genius to progress, probably quite beyond the Joe Average programmer of today. (I'll admit what they are doing strains my brain to the max.)
  • Since the easy computing tasks have been done... the next generation of real-world tasks will require these tools.

Juristic texts prove that

If you want to write something down very concisely and try to catch a lot of possible misunderstandings, you probably end up with something like juristic prose -- not that readable, at least if you are not used to it. It is still my hope that we don't need to understand public static void main() to figure out the functionality of a piece of code.
So I believe, but can't prove, that it is possible to get rid of all the technical artefacts a programming system brings with it, and thus make it easier to understand a program.

Cheers,
Ralf.

Programs are data structures, not text strings

I believe that encoding programs as text strings will eventually be seen as a dead end. Programs will express their semantics as literally as do data structures today. Program generation, reflection, and self-adaptation will be ubiquitous.

And visualization comes with that

And the same piece of program can be viewed as graphics, strings or whatever depending on the problem at hand.

Lots of Little Languages

Instead of building large systems all in one language, a number of different languages following different paradigms will be used for different parts of the system. This is due to the (obvious?) fact that some paradigms are better suited to some parts of a system than others.

Need to coordinate an overall system, with IO? Use an imperative language like C# or Java.
Need to do data access? Use SQL.
Need to do some logic checks? Use a rules/logic language.

I believe the approach in Java and C# of attempting to build the entire system in one language and providing support for all the relevant programming paradigms in that language will just doom the afflicted language to unusable complexity. C++ anyone?

Of course, this requires a lot of effort in making sure the languages work well together. The programmer needs to be able to jump across from one language to another as easily as calling a function.

I'm not thinking of the DSL approach, where a miniature language is invented to match the problem domain; rather, these languages will be targeted at parts of a system, irrespective of the problem domain.

DSL Skeptic Here...

Having written some DSLs myself, and used several others, I'm finding more and more that I just provide or download a domain-specific framework for my language of choice and use that. Better, faster, more powerful.

When stuck in a DSL I find I have to learn yet another syntax, and I have lost all the power of my language of choice.

Personally, I'm starting to think DSLs are so popular because language tools have become so powerful that DSLs are very easy to create. So, oo looky, I have created my own language. In the meantime it would have been easier, faster, and more powerful just to use a scripting language plus a framework as the DSL.

I didn't say prove... but can you defend this assertion a little?

Actually, you said can't prove

Actually, you said _can't_ prove ;)

I don't mean DSLs, to me a DSL is a language specific for some problem domain "out there"; like a language for performing matrix arithmetic, or a language for the AI in a game. These sorts of things were not what I meant.

In my opinion, some programming paradigms are better at certain parts of systems than others. Imperative is good for the top-level co-ordination and IO of a system; declarative is good for data access; and logic programming is good for business rules and conditionals.

Instead of trying to beat one language into accepting multiple paradigms, or tweak the paradigm until it matches the language how about just using different languages for these separate parts? The Haskell IO Monad could be regarded as a case of trying to force incompatible paradigms together.

How about just putting the IO and interface management in an imperative language, and once the data is gathered the imperative part of the program calls some Haskell functions? The imperative part would then also be free to use a dynamic or declarative language to assist with its UI as well.

Achieving this would require much better integration between languages. Possibly when designing a language people should think not about how an entire program could be written in this language but how could a small chunk of logic implemented in this language be integrated into a larger program written in another language?

"How about just putting the

"How about just putting the IO and interface management in an imperative language, and once the data is gathered the imperative part of the program calls some Haskell functions?"

That's more or less what the IO monad does. It just treats the imperative code as yet another Haskell value while it's at it.

So long as it is first and foremost designed as a language.

I like programming in R (a stats DSL). I don't mind Octave (a matrix DSL). But I hate hate hate hate GNU linker script.

I do embedded systems development for my bread, and linker script is a classic DSL gone wrong.

It's obscure, and its errors and warnings are very weak by the standards of modern languages.

But what gets to me is that as soon as you take the tiniest step beyond the specific domain (assigning ELF segments to address regions) it does nothing. It can't even add two numbers; there isn't even a sane conditional.

The net result is that you tend to generate linker script either via the C preprocessor or via a scripting language like Ruby.

Moral of the story: it would all be so, so much easier if, instead of linker script, a general-purpose scripting language such as Ruby / Guile / ... had been used.

Of course, historically it all happened the other way round, in that linker script probably evolved long before the scripting languages.

I believe...

  • Partial evaluation will start to make commercial inroads and become the winner in the 'how to meta-program' debate. Also, since it is the 'slider' between static and dynamic typing, it may cool down the needless heat over that issue.
  • From the ashes of the JVM and .NET CLI will emerge a system (or language-as-OS) that can help languages from all sides communicate and be implemented efficiently on most systems. I think this is necessary for any "lots of little languages" approach to realistically happen.
  • C will outlive us all.
  • Eventually the tide will turn on the current fashion for scripting languages. They'll change or die, and most likely they have too much inertia to change.
  • IDEs, and by extension PLs, will change from a file-based to a knowledge-base/database paradigm.
  • Something more malleable than GC will come about, most likely from some sort of logic turned type system, e.g. linear, separation, or uniqueness logic.
  • Concurrency and imperative languages need to start working better together. Once multi-core CPUs become the norm, this'll work itself out one way or another.
  • As much as we'd like it not to happen, monads will remain mostly of academic utility and not take over the world.

Good one

Partial evaluation was one thing I wanted to write about, but forgot. It sounds like a really good idea, but not widely implemented. Any pointers to useful work?

Book about PE.

A good book to start with about partial evaluation is Partial Evaluation and Automatic Program Generation.

.

Now reading. Thanks!

Here's a good place to look

Here's a good place to look partial-eval.org.

Some Good Notions here...

* Partial evaluation...
Excellent! I Agree.

* From the ashes of JVM, .NET CLI ...
You aren't working on Parrot, are you? Hmm. I'm prepared to "keep an eye on the field", but I wouldn't bet on this one.

* C will outlive us all.
Can I quietly go and shoot myself now. Please no, no, NO!!!

* Eventually the tide will turn on the current fashion for scripting languages..
Perhaps, but I don't believe that tide will return to C/C++/Java.

* Concurrency.
I'm going to be a heretic here and say that 99.9% of concurrency problems are in the category "Well, don't do that then", and the only language support needed is tools to detect foolishness and a larger clue bat to use on the nidjits who do it anyway.

I'm not working on Parrot

I'm not working on Parrot (or any other VM/JIT Compiler) and know little about it. AFAIK it's a dynamically typed VM which does have its uses but I doubt it will take over the world.

* Eventually the tide will turn on the current fashion for scripting languages..
Perhaps, but I don't believe that tide will return to C/C++/Java.

I'd agree. I should also have mentioned that, now that Ruby, Python et al. have become popular, many of their Lispy and functional features will be expected in newer languages.

that's just what C is counting on

"* C will outlive us all.
Can I quietly go and shoot myself now. Please no, no, NO!!!"

Playing right into C's pointer: if everyone who doesn't want C to survive kills themselves at the prospect of C's survival, then C will survive.

Considers the matter for a few minutes... goes and kills self.

Partial evaluation will

Partial evaluation will start to make commercial inroads...

Do you know of any commercial organizations currently doing research into PE?

Supercompilers LLC

Supercompilers LLC seems to be doing supercompilers for Java. It's not exactly partial evaluation, but they want to accomplish the same things.

Coroutines

I believe that coroutines will turn out to be crucial for combining:

* Continuations
* Inverted control (e.g. in GUIs, web browsers)
* Partial evaluation
* Programmer-directed optimization
* Deforestation
* Fold/Unfold frobbing
* Loop fusion

And, no, not generators -- coroutines.

Now that's a weird one from left field... Tell us more!

I really, really wouldn't have picked that one. But that could well be because I'm ignorant, so please tell me more.

I have always regarded coroutines as "like threads, but less efficient, since the only way of getting the other routine going is to do a context switch, and context switches are very, very costly."

Coroutines

I think it's generally the opposite of that. Coroutines don't require context switches because they aren't threads. The coroutines are in the same code space, and can be implemented via simple function calls with heap-allocated stack frames.

Although there are different ways of implementing them, coroutines are best understood in the context of continuations. You can easily implement coroutines in terms of continuations -- resuming a coroutine is just resuming the continuation of another piece of code.

Perhaps the reason you believe that coroutines require a context switch is that a naive implementation of continuations might involve stack-copying or stack-switching, which is something like a user-space thread package.

In a lot of ways, coroutines are like more deterministic threads. You can create pipelines made of multiple filter functions, each of which reads from an input stream and writes to an output stream -- much like unix processes communicating through pipes, or threads communicating through queues/streams. However, with coroutines, this kind of program can run without buffering, without context switching, and without any kind of scheduler. It can be perfectly deterministic. And, if you mutually inline the routines, you can get away without having to move the data between the filters much at all.

I suspect that coroutines have been neglected because they impose a strange and much more complex function-calling discipline on the programmer; it's something like using goto, and can easily turn your code into spaghetti. But this is only because of the lack of abstraction; if you hide the coroutines inside libraries, then you can get the benefit without it infecting the rest of your program with spaghetti. Put another way, you can use just enough spaghetti to get something useful done, and hide it inside an object or function, and get the benefits without the complexity.

For example, consider generators. A generator is something like a function call -- perhaps you have one that, when called, returns the next item from a stream of prime numbers. But that prime-generating routine doesn't actually return, it suspends itself, so that when it is invoked again, it's still got all its local variables, and it's still inside whatever loops or if-statements or whatever. In other words, it gets to keep its state in the cleanest way -- implicitly.
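
Here is a rough sketch of that in Python (an illustration only, not tied to any particular system mentioned here): the generator keeps its loop variables alive across every yield.

def primes():
    # Naive prime stream; the local state (found, n) survives every yield.
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n          # suspend here; resume later with state intact
        n += 1

gen = primes()
print([next(gen) for _ in range(5)])   # [2, 3, 5, 7, 11]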

The problem with generators is that, in most implementations, they aren't quite first-class. Python generators are this way, I believe, although Stackless Python addressed this. And sometimes they are global, meaning that there is only one active instance of any particular generator. Which is about as flexible as only allowing one active instance of a function (otherwise known as Not Having Recursion).

Deforestation, as described by Wadler and others, can be seen as the inlining of coroutines, although I suspect that they would probably say that defining deforestation in terms of coroutines is exactly the wrong way around.

What I like about coroutines is that they give you some really high-level ways of cleaning up the interaction between functions and modules, in a way that works well in classic imperative languages. With coroutines, you no longer have to decide if a function is going to push or pull data from some other function. (You might not think of yourself as having to make this decision very often, but it's actually something you decide implicitly every time you have one bunch of functions call another bunch of functions -- your call tree imposes a calling hierarchy.)

With coroutines, you don't have to decide one way or the other, because you can have it both ways, and let glue code take care of any "impedance mismatch" between them. Again, I believe that with suitable analysis and inlining, this glue code can be eliminated. And since the functions you are connecting are cleaner (and have more localized side effects), they can probably be better analyzed and optimized.

But, like I say, I can't prove this. Yet.

I have implemented this, just barely. I was able to write code that looked like it was made of filters reading from source streams and writing to destination streams, and I was able to chain them together and pump data through. And the code doesn't have any buffering, except for a couple of implicit one-character buffers in the form of local variables in the filters. And, best of all, you can pull output characters from the end of the pipeline, or push characters into the beginning of the pipeline, without having to change anything about the filters. The glue code is designed to work either way.

I admit that it's out of left field -- saying "Coroutines are the answer to everything!" is like saying "Zinc is the answer to everything!" It's kind of crackpot.

But we all know that continuations are a great way of thinking about programs, even for languages that don't implement first-class continuations.

We know that a 'return' statement is like a function call to an unnamed function pointer living on the stack, so we can then say "It's all function calls!" Likewise, if we acknowledge that first-class continuations remove the restriction of using a stack for everything, then we can further say "It's all continuations!"

I am merely suggesting we take another step and imagine we use continuations more, and allow ourselves some stranger user-defined control structures. We then hide them inside interesting control-structure libraries and use these to combine pieces of code. Then we might say "It's all coroutines!"

So I guess when I figure out how to demonstrate this, I'll have to write a paper called "Lambda: the Ultimate Coroutine".

a paper called "Lambda: the Ultimate Coroutine"

No, instead you'll pick another greek letter (to represent coroutines as opposed to procedures) and say "Tau: the Ultimate Imperative" and so forth.

Maybe what's needed is a better way to describe how coroutines running in tandem communicate? What sort of syntax for coroutines already exists?

Coroutines

It's certainly tempting to claim a new greek letter for one's new idea, but I don't think coroutines are all that radical, so I think I'll stay with lambda.

It's true that arranging the communication between coroutines is important. The syntax I'm using actually doesn't need anything more complicated than call/cc, but, as I said, I hide this code inside some library code, so from the end-user's point of view, there's nothing funny going on.

Here's a typical filter from my prototype:

void filter( source, dest )
{
  while (1) {
    c = source.get();   /* pull the next item from upstream */
    dest.put( c );      /* hand it straight downstream */
  }
}

(This just passes the data through; in practice, you would do more than that.)

This filter seems like something that might be in a unix pipeline, with source/dest taking the place of stdin/stdout. The glue code constructs source and dest objects that handle the communication between the filters. If you call source.get(), it resumes the coroutine that feeds this filter, which might be an input file, or perhaps another filter.

That source filter will call dest.put() on its own dest object; this call passes the character right into the original filter. In other words:

(Assume filter A is feeding filter B)
* filter B calls source.get()
* filter A is resumed
* filter A does some calculations or whatever
* filter A calls dest.put()
* the call to put() resumes directly into B's call to source.get()

This is my favorite part -- passing a value to A's dest.put() is the exact same operation as having a value returned from B's call to source.get(). This isn't a pair of operations that happen one right after the other; it's one operation.

Think of this in the context of C. When you say "return 37", you are really saying "pass 37 to an unnamed function of one variable, whose pointer is right there on the stack". That one action leaves one function and returns to another. In a well optimized implementation, with stuff passed in registers, this can just be a jump; with inlining, it might not even require that.

The glue code sets up the continuations so that the source and dest objects behave this way. As an added bonus, they are smart enough that you can have B pull from A, or you can have A push to B. Either way, the code for A and B is identical; it's the glue code that figures out what to do.
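
As a rough illustration (not the prototype's actual code, which is in an unspecified language), here is what the glue can look like in Python using the third-party greenlet library, wired up for the pull direction only; the names (Pipe, upcase, chars) are invented, and the filters still see nothing but source/dest objects.

from greenlet import greenlet, getcurrent

class Pipe:
    # Unbuffered rendezvous between one producer and one consumer.
    def __init__(self):
        self.producer = None        # greenlet that calls put()
        self.consumer = None        # greenlet that calls get()

    def get(self):
        # Remember who is asking, then run the producer until it put()s.
        self.consumer = getcurrent()
        return self.producer.switch()

    def put(self, item):
        # Hand the item straight to the waiting get() -- one operation.
        self.consumer.switch(item)

def upcase(source, dest):
    # A filter knows nothing about greenlets, only source/dest.
    while True:
        dest.put(source.get().upper())

def chars(text, dest):
    for c in text:
        dest.put(c)

# Glue: build the pipeline  chars -> upcase -> main program.
a, b = Pipe(), Pipe()
a.producer = greenlet(lambda: chars("hello", a))
b.producer = greenlet(lambda: upcase(a, b))
print(''.join(b.get() for _ in range(5)))   # HELLO

There is no buffering and no scheduler here; each get() runs the upstream coroutine just far enough to produce one item.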

With proper inlining, I believe these filters can be merged into something really tight, with very little glue waste. If you are willing to unroll the code substantially, the state variables of the different filters can all be partially evaluated. This would produce a ton of code, but it would be really fast.

Also, the state of each filter is nicely confined to the filter; it would be easy to do effect analysis on code like this, and embed it in a pure, lazy, functional language with monadic state. (I hope.)

As I mentioned earlier, this has a lot of the properties of generators, but some kinds of generators don't allow recursive coroutines, or aren't first-class, or otherwise have some restrictions on how they communicate with each other. But coroutines in general can easily be implemented with call/cc, and call/cc is about as flexible as you can ever want.

So, to get back to your initial question, the communication between the coroutines is described, or rather defined, by the glue code. Sure, this code might be hairy (as all code with call/cc can be), but, as with all hairy code, you can use abstraction to make it clearer. Don't thank coroutines for all this -- thank continuations.

Lua

Have you looked at Lua's coroutines? I think you'd find them very sympathetic to this approach...

I'm not sure I believe this is as fundamental as you do, but it's certainly a better way to express a lot of programs. In PVR's recent talk, he identifies deterministic concurrency as a possible layer in a "definitive" language. Maybe this is a good candidate for that niche?

Nice

I took a quick look at the design of Lua coroutines -- that looks pretty nice.

I notice, though, that most of the example code explicitly uses coroutine-specific methods like yield() and resume(). Of course these methods are essential, but I think that coroutines are best used when they are hidden behind abstraction layers. Like in the src/dest example I gave earlier, in which the individual filters only know about src/dest objects, which are very much like stdin and stdout. You use them for I/O, but you don't know how they work.

Only the glue code deals with continuations, yielding, resuming, and so on.

semicoroutines

well, shell pipelines are a syntax for monodirectional data flow, but allow bidirectional control flow. As an example, the essence of a "99 bottles of beer" program may be expressed as:

countdown $beers | bottle | singverses

along with the appropriate definitions...

Coroutines as thunks?

At the low level, a coroutine is just a code address combined with its call stack. This makes it similar to a closure, but more powerful: a closure usually consists of a code address combined with a small, finite environment calculated at compile time.

Another concept that appears to be much more similar is a lazy thunk. Since a thunk can contain an arbitrarily complex unevaluated expression, it should be possible to represent coroutine continuations as thunks. After all, a pure generator is isomorphic to a lazy list, and it's not difficult to figure out different lazy data structures for various other forms of coroutine. For example, the producer-consumer pair could be represented by types

data    Producer a = Produce a (Producer a)
newtype Consumer a = Consume (a -> Consumer a)

I don't see any way to generalize this to represent arbitrary coroutines as lazy functions, though. Does anyone know of any attempts to, say, extend Haskell's State or IO monad and add the coroutining behavior?

Coroutines via lazy functions

There has definitely been work on monadic continuations, such as
A Monadic Framework for Delimited Continuations, which I accept as valid without fully understanding it. And continuations give you coroutines pretty easily.

But you make a good point -- the similarity between coroutines and lazy lists is closer than monadic continuations might suggest.

I definitely want to re-implement my coroutine implementation using explicit state monads on top of a pure, functional core, and I expect that I'll then be able to examine this issue more directly.

With coroutines, you don't

With coroutines, you don't have to decide one way or the other, because you can have it both ways, and let glue code take care of any "impedance mismatch" between them. Again, I believe that with suitable analysis and inlining, this glue code can be eliminated. And since the functions you are connecting are cleaner (and have more localized side effects), they can probably be better analyzed and optimized.

Continuations and transducer composition by Olin Shivers seems to describe what you want.

Transducers

I think I read that paper a while back -- I'll have to look at it again. After a quick look, I can see that it definitely does what I want with regard to merging loops and otherwise removing the overhead; but it still doesn't cover it all with a nice abstraction layer.

But you can definitely do it. I first learned coroutines when I was taking first-year scheme, and that's why I know they can be useful in practical situations.

Hmm. Disagree on first paragraph...

I think it's generally the opposite of that. Coroutines don't require context switches because they aren't threads. The coroutines are in the same code space, and can be implemented via simple function calls with heap-allocated stack frames.

Although there are different ways of implementing them, coroutines are best understood in the context of continuations. You can easily implement coroutines in terms of continuations -- resuming a coroutine is just resuming the continuation of another piece of code.

Hmm. Contrariwise. They don't need the VM space to be switched, but they do still need the lexical scope switched in and out (that's what call/cc does), or at the very least, at the very lowest machine-code level, they need...
* All the registers switched.
* The stack pointer switched to another stack. (I.e. you also need a chunk of RAM larger than your greatest possible stack depth per coroutine.)

While the scheduling is simpler and faster than with multiple threads (if you're not running this one, run that one), the naive usage (without buffering) is to do one fine-grained chunk of work, pass to the other coroutine, consume that, then pass back to the producer routine. I.e. basically one context switch per item processed; in other words, inefficient threading.

Of course the solution is to buffer 'em up and switch on larger chunks, but then that loses the "I do one, you do one" synchronicity of coroutines.

And yes, you do get thread races in coroutines....

def routine_a
  n = myarray.size            # remember an index into the array
  yield to b                  # suspend; b runs and mutates the array
  use myarray[n] and die!     # n is now stale -- a classic race
end

def routine_b
  myarray.erase_all_and_everything
  yield to a
end

... it's just easy to get them into a mare's nest of threads.

The last time I implemented something using coroutines via call/cc in Ruby, I backed it out and redid it using threads and buffers. Way, way, way faster.

Efficiency of coroutine switching

Of course the solution is to buffer 'em up and switch on larger chunks, but then that loses the "I do one, you do one" synchronicity of coroutines.

The paper by Shivers et al. that was mentioned elsewhere in this thread presents a CPS transformation that enables the compiler to do this kind of chunking for coroutine pipelines. In their framework the chunking optimization you refer to would correspond to unrolling and inlining a mutual recursion by a few levels. An advantage of this approach is that it doesn't do any buffering and thus doesn't change the semantics of the pipeline in the case of side-effectful coroutines.

Coroutines vs. Threads


Hmm. Contrariwise. They don't need the VM space to be switched, but they do still need the lexical scope switched in and out (that's what call/cc does), or at the very least, at the very lowest machine-code level, they need...
* All the registers switched.
* The stack pointer switched to another stack. (I.e. you also need a chunk of RAM larger than your greatest possible stack depth per coroutine.)

You can get around these issues by using heap-allocated stack frames, and in-depth interprocedural analysis. Not that these are easy to do -- heap-allocated stack frames are slow, and interprocedural analysis is hard.

I read Appel's Compiling with Continuations a long time ago. It really blew my mind, because it demonstrated that you can really tightly optimize high-level languages, sometimes to the point where you can implement function calls with many arguments without having to move anything at all, because you can sometimes just have things in the correct registers.

Otherwise, you get the same old overhead of moving arguments around for every call, whether it be a regular function call or a continuation call.

I know that some implementations of coroutines (or near-coroutines) use a regular stack, which means that they have to maintain multiple stacks; on the other hand, I also know that many Scheme implementations do heap allocation all the time, or when necessary, to avoid this. Scheme does this to encourage you to use closures/continuations, because they are good things.

On the third hand, I further know there has been lots of research about knowing when you can stack-allocate your stack frames, and when you have to heap-allocate them. And research into hybrid systems, where you allocate frames in stack chunks, so that if you have to copy, you only have to copy a little bit. I believe Chez Scheme does some of this (because I believe R. Kent Dybvig has done research on this.)

To summarize, if you optimize for continuations (at the cost of all function calls being a bit slower), you can get coroutines without the kinds of overhead you normally have with threads. I suspect there is a continuum of implementations, based on whether your coroutines are more like functions, or more like threads (whatever that means).


While the scheduling is simpler and faster than with multiple threads (if you're not running this one, run that one), the naive usage (without buffering) is to do one fine-grained chunk of work, pass to the other coroutine, consume that, then pass back to the producer routine. I.e. basically one context switch per item processed; in other words, inefficient threading.

Definitely. This is just like the case you get when you do a function call for each pixel of an image that you render; it's much slower than a loop. But I'm assuming that I want to write my code in a high-level way (with lots of function calls), and get the optimizer to remove the functions and create lots of loops (c.f. the venerable Lambda the Ultimate Blah papers). And if you can optimize function calls really well, perhaps you can do the same thing for continuations and coroutines.

A caveat, though: flow analysis in the presence of first-class functions is much harder, although Olin Shivers has done great work in this area. I bet it would be harder with coroutines, if only because they not only use continuations, they use them densely and strangely. But I don't know.

Coroutines (at a guess)

I think you have it backwards. Threads (native threads anyway) require context switches. Coroutines live entirely in userspace and do not (necessarily) require any scheduling, so they can be much faster.

The way I understand it (and I could definitely be wrong) coroutines can be seen as something along the spectrum between procedures and threads. A procedure is something you call; while it's running you block; and when it's done it returns and is no more. A thread is something you kick off and then it runs completely independently, so communicating with it is complicated, and race conditions are possible.

Coroutines are (I think) like procedures, but instead of just calling them once and getting a single return value, you can have a conversation with a coroutine. Good for two processes that have to run in tandem and communicate with one another, one would imagine. Like lexing and parsing. Or a REPL and the GUI that presents it to the user. Etc.

Coroutines

Exactly. Lexing and parsing is a great example of coroutines in action; the wonder that is lex/yacc manages to generate C code that implements coroutines for lexing and parsing.

If you know the internals of this stuff, you know that it has to maintain a special stack for handling the communication between the two, and the maintenance of state when switching from one to the other. It's kind of like a domain-specific call stack, and because it's under the control of the C code, you can go beyond the function-call paradigm into the world of coroutines.

And the REPL/GUI example is also a great one. The reason GUIs (and browsers and REPLs) are hard is that they are programs where there are two different entities in control.

In a regular program, say, for generating primes, the program has total control of the process of generation, and the order of actions, and the hierarchy of function calls needed. (Not that you would really need function calls for that, but whatever.)

In a GUI program, you have the program, which is dealing with its own concerns, including rendering the GUI. But you also have the user directing things. It's difficult to deal with the fact that there are two loci of control. The result is usually some kind of event-based callback system; the downside of this is that the callbacks don't get to maintain state between calls.

Coroutines alleviate this by putting stateful control in each place where it is needed. The code that catches input events is one coroutine; the rest of the program (including GUI updates) is the other one. But is this any easier? I believe it is. Not only can the main program routines keep state more easily, but you can use abstraction to make it easier.
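
A small, hypothetical Python sketch of the idea (not taken from any real GUI toolkit): the handler below is a coroutine that remembers the press position across events, instead of stashing it in callback-visible globals.

def drag_handler():
    # State (the press position) lives in locals across events.
    while True:
        x0, y0 = yield                 # wait for a mouse-down event
        while True:
            kind, x, y = yield         # wait for moves / the release
            if kind == "up":
                print("dragged from", (x0, y0), "to", (x, y))
                break

h = drag_handler()
next(h)                                # prime the coroutine
h.send((0, 0))                         # mouse-down at (0, 0)
h.send(("move", 3, 4))
h.send(("up", 10, 12))                 # prints the completed drag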

This is often called "inversion of control"; it's been studied in the last 10 years or so by people working on GUI code in purely functional languages, and also by people trying to find better ways of implementing web sites.

I've been working mostly in web stuff for the last 10 years, and I really can't stand the whole cgi-script-as-stateless-callback thing. I've tried a thousand ways of making it cleaner, but you really just can't do it right without continuations, or something equivalent. There are definitely problems to be solved before this kind of stuff could be used commercially....

I can provide references to research about functional GUIs and inversion of control....

Manually-written RD parsers equal to table-driven parsers

I believe that there is an algorithm that can make manually written recursive-descent parsers have performance and behaviour equal to table-driven parsers.

If such an algorithm existed, one could write a parser using techniques like those in Boost::Spirit, and avoid the problems of recursive descent parsers (stack overflow, backtracking, left recursion).

Packrat Parsing

Packrat Parsing (The parsing technique formerly known as memoized recursive descent) is amazingly good. I've implemented it in both PHP and C, and it was nice and fast. And it's really easy to implement. And you don't have to have a separate language for lexing and parsing -- you express them both in the same recursive grammar.

It's not the most original idea, since it's just memoization, but it was really smart of Bryan Ford to think of trying it.

I don't remember the hierarchy of parser power (LR, LALR, etc.), but I figure Recursive Descent has to be the most general. And when you use Packrat Parsing for the kinds of grammars we usually use, it's just as fast as a table-driven parser. (At least in theory.)

It does use a ton of memory, but Bryan Ford's argument is that if you are implementing a compiler, it's quite likely that the stages that come after parsing are going to use tons of memory, so there's no reason to avoid using tons of memory for parsing.

And I suspect that if you do some kind of grammar profiling, you could select only some terms to memoize, saving the tons of memory without sacrificing much speed.

Packrat Parsing in PHP

Hi Greg,

Is your PHP Packrat Parser Code public? If so, I think many of us would love to make use of it.

public code

I'll ask my boss. I wrote it for work, and tried it out, but it never got used by our code, so maybe they won't mind.

can't do it

I can't release the code, but I can give a quick explanation of how to do it. It's pretty easy.
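
In that spirit, here is a minimal sketch in Python rather than PHP, so it is not the code in question: each rule is an ordinary recursive-descent function from an input position to the position after a match (or None on failure), and the memoization that makes it "packrat" is just a cache keyed on position.

from functools import lru_cache

text = "1+2+3"

@lru_cache(maxsize=None)
def expr(pos):                  # Expr <- Num ('+' Expr)?
    p = num(pos)
    if p is not None and p < len(text) and text[p] == '+':
        q = expr(p + 1)
        if q is not None:
            return q
    return p

@lru_cache(maxsize=None)
def num(pos):                   # Num <- [0-9]
    if pos < len(text) and text[pos].isdigit():
        return pos + 1
    return None

print(expr(0) == len(text))     # True: the whole input parses

A real parser would return a parse tree along with the position; since lru_cache keeps a separate table per decorated function, the memo is effectively keyed on (rule, position).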

Not quite comparable

I don't remember the hierarchy of parser power (LR, LALR, etc.), but I figure Recursive Descent has to be the most general. And when you use Packrat Parsing for the kinds of grammars we usually use, it's just as fast as a table-driven parser. (At least in theory.)

Backtracking recursive descent is as general as you can get with context-free grammars (but quite a bit slower than Earley or GLR). Packrat parsing is not just memoization in a backtracking recursive descent parser; you might run into big surprises if you think of it that way: not all backtracking points are considered.

It is wiser to see it as a different way of expressing a formal syntax, with some big advantages: it is quite modular, and you have conjunction and negation operators. On the downside, it is not as intuitive as you might expect from the resemblance to context-free grammars.

memoized parsers

Can you explain what you said about missing backtracking points? I implemented my parser as a regular old recursive-descent thing, where I do a depth-first search and stop searching a subtree after the first success. Then I memoize it, and it's nice and fast. (Good cache-hit rate.) I definitely don't enumerate all possible parsings.

Also, can you give a one-sentence description of Earley parsing?

Backtracking points in packrat parsing

Here is an example (copied from Aho & Ullman, 1972, page 458) of a packrat parser that will not give the result one might expect: the PEG is

  S ← A c
  A ← a / a b

and the input to parse is

  abc .

Function S is called with abc as input, and in turn it calls A with the same input. Using the first alternate, A matches the first a successfully: we return in S with bc as remaining input. But at this point, S expects to see a c and reports a failure on abc. Since A reported success on its first call, it will not be called to try the second alternate: this is what makes a packrat parser different from a backtracking recursive descent parser. The language described by the PEG is { ac }, whereas the language described by the corresponding CFG would have been { ac, abc }. A PEG describing the latter language would simply be

  S ← A c
  A ← a b / a .

The bottom line here is that the / operator is quite different from a regular alternation operator.
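
A quick way to check this (a toy, hand-rolled encoding rather than a general PEG engine): write each rule as a function returning the new position or None, and note that S never re-enters A to try its second alternative once A has succeeded.

def parse_S(text, A):                  # S <- A c
    p = A(text, 0)
    if p is not None and p < len(text) and text[p] == 'c':
        return p + 1
    return None                        # no backtracking into A here

def A_original(text, pos):             # A <- a / a b  ('a' wins, 'a b' never tried)
    if text[pos:pos+1] == 'a':
        return pos + 1
    if text[pos:pos+2] == 'ab':
        return pos + 2
    return None

def A_swapped(text, pos):              # A <- a b / a
    if text[pos:pos+2] == 'ab':
        return pos + 2
    if text[pos:pos+1] == 'a':
        return pos + 1
    return None

print(parse_S("abc", A_original))      # None: abc is rejected, as described
print(parse_S("abc", A_swapped))       # 3: accepted with the reordered PEG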

About Earley (1970) parsing: it is a general parsing method with O(n³) time complexity. It uses item sets and will look familiar if you already know LR(0). The best thing to do is perhaps to read the appropriate presentation in Grune & Jacobs, 1990.

incremental parsers

I think we really need to be focusing on incremental parsers in IDEs. Expressing a batch-style parser is a solved problem, but expressing a parser that can run incrementally is still an open issue.

Isomorphisms, constraints, and nouns

I believe, but cannot prove, that the most productive way to write programs involves:

1) High-level projections and isomorphisms between data representations. The projections are necessary before final output to display/file/database/network/etc, but within the program itself they should always be reversible. Debugging can be made easier by simply removing some of the projections. The isomorphisms should be reactive (aka incremental, aka event-driven) as much as possible, to allow immediate feedback. Because of the bidirectionality, the program can always talk about "the foo that this bar corresponds to", without having to pass around a (foo,bar) pair.

2) Declarative constraints on the implementation. For example, I should be able to say that I want a program that is observationally equivalent to a given program, but where a particular data structure is persistent. Or I might write a bubblesort algorithm, but then state that I want the actual implementation to work in O(n log n) time, and be tuned for 10-20 million items. If the computer can't figure out how to achieve these constraints, then it reports failure and asks for a hint.

3) A very strong focus on concrete data, as opposed to code. I believe that humans have trouble understanding the effect of a string of verbs (i.e. code) when they can't actually visualize the nouns (i.e. data). Single assignment is just one example of giving a name to the effect of a verb -- but we can go much further. The clearest example I've seen of this difference in style comes from Freek Wiedijk's comparison of theorem provers: compare Mizar (strongly noun-oriented) versus Coq and Nuprl (both strongly verb-oriented).

Re: Isos, etc.

I am interested to know how you came to your belief about isos and projections.

visualization, search engines, stream processing

The most immediate source for this idea came from working with graphics and visualization. There, the correspondence between data structures and pixels on the screen is very strong. At around the same time, I was also working on business applications, where there is a clear correspondence between items in a list and records in a relational table, or the fields of a dialog box and the columns of a particular record. The need for reactive, bi-directional dataflow was obvious in that domain.

After that, I worked on search engines. The general architecture of a search engine is to have lots of machines that store disjoint subsets of data, and then other machines that aggregate these partial datasets into a complete view of the entire dataset. The main isomorphism I had to deal with was between the data structures of the first and second types of machine. In general these are optimized for different purposes, but they usually store the same information, and keeping the two synchronized required a lot of busy work. It was while working on search engines that point 2 of my ideas really started growing, although I admit it's by far the least practical of the three.

Most recently, I have been working on stream processing. This domain is more about transforms than about isomorphisms, which means that debugging is a tedious affair, since the connection between input and output messages, or between messages and persistent records, is completely lost.

losing debugging info

(was just looking at Freek's Formalizing 100 Theorems and looking to see if it has been mentioned much on LtU and thus read this thread.)

Yup, even in good old event based systems you lose debugging info; when you get a message there is generally no nice way you can easily follow back to find out who even really generated the message or why. So this seems to be a general problem: how can you keep the universe of history which might come in really handy? Do you log the daylights out of everything? That doesn't generally equate with having a nice source line debugger experience. There's the Omniscient Debugger for (some value of) Java, and O'Caml I guess has something along those lines. When and how do they fall down; what could one do to make them work?

Debugging is hard; one has to consider the gamut of resource constraints different people might be laboring under.

Execution trace junk

Not to forget that there is much "junk" with respect to debugging in execution traces. Debugging e.g. parsers is never a nice experience even if you are omniscient.

Debugging is about localization of causes. But since everyone in CS seems to work on the static aspects of PLs, it might be up to us programmers to provide innovation here (= moving beyond both single-step execution tracing and printf debugging).

Yup, even in good old event

Yup, even in good old event based systems you lose debugging info; when you get a message there is generally no nice way you can easily follow back to find out who even really generated the message or why.

E seems to have a pretty good distributed debugger.

debugging event context

rauold : Yup, even in good old event based systems you lose debugging info; when you get a message there is generally no nice way you can easily follow back to find out who even really generated the message or why.

It's unnecessary for event-based systems to be that way.

You put your finger on just the complaint leveled by me and my coworkers in the past when we used some event-based system.

That's why I designed in debugging support for events in a system I describe at http://www.briarpig.com/aelig.html, where an async event is rendered as a "call frame" in a spaghetti stack continuation scheme.

The main problem with events (as I tell it to others in each story) occurs when an error happens when processing an event. Who sent the event? What other circumstances are involved? What's the context?

I decided to keep context in an "event stack" so you have just as much debugging context as you would in a normal stack-based synchronous language. In a normal stack based dispatch regime, each new call frame occurs in the context of an existing frame.

So to have the same debugging info for events, you want to chain events together when possible. Each new event frame should point at an earlier frame that was the cause or calling context. The receiver of an event gets the entire stack of frames (so an event is really a stack and not just one frame).
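
A bare-bones sketch of that structure in Python (the names are invented; the real design is the one described at the link above). Note the toy leans on Python's garbage collector, which is exactly where the refcounting question below comes in.

class EventFrame:
    # One node of the spaghetti stack: an event plus the frame that caused it.
    def __init__(self, label, cause=None):
        self.label = label
        self.cause = cause

    def backtrace(self):
        frame, trace = self, []
        while frame is not None:
            trace.append(frame.label)
            frame = frame.cause
        return trace                   # newest first, like a call stack

click = EventFrame("button clicked")
request = EventFrame("request sent", cause=click)
failure = EventFrame("timeout", cause=request)
print(failure.backtrace())             # ['timeout', 'request sent', 'button clicked']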

This requires refcounting of frames (or some other kind of GC), which is resisted vigorously by others I work with, who wriggle frantically like fish trying to get off a hook. They want to go back to stackless, mysterious debugging because they want to ignore memory management discipline. (Actually, a chain of dependencies following from refcounting event frames causes usage conventions detested by typical programmers, because they don't want to see any sign that refcounting is involved; but with C APIs it's hard not to see every nuance.)

However, it's not hard to do functional programming with events using immutable spaghetti stacks of frames that are threadsafe when they don't change. So you can cook up Erlang style immutable messaging based on such events -- or something similar.

Sounds good to me

ship it! (what are the resource issues like? bloat, chance of leaks.)

busy, busy

rauold : ship it! (what are the resource issues like? bloat, chance of leaks.)

I'm up to my eyeballs in work, so it'll be a long time before more appears than that, maybe as much as a year.

I'll write something about resources soon, though. That's a good question. I do resources, scaling, threads, and interactions of those basic things with the complex stuff. I can give you a short reply in advance of a good answer later. You can consume all available resources if you make a mistake. You can usually do this in a Turing complete language that lets you consume resources. Some profiler must tell you where resources went (and you also need some report about what happened in absence of failure too).

I'm a real pessimist about refcounting unless you do a draconian form of auditing I'll describe in detail later. (I'll post a link after.) I tell every employer I've never seen a refcounting system work that didn't have aggressive auditing. I tell them it will always be wrong if left to human hands to qualify.

If I add refcounting to a system but haven't done automated auditing yet, I tell folks it's not expected to work, and that until auditing occurs we can expect memory corruption. I treat it like a binary condition: broken until thoroughly shown correct. You can't be a little bit pregnant, but that's what folks hope with refcounting (that it will only be a little bit broken).

You need to capture a backtrace every time a resource is acquired, keying a stats value counting ups and downs (and whatever else you want to count). A reference to a counted object should be a "handle" that is a counted pointer plus an optional backtrace (if auditing is enabled), or a backtrace hash if you want to save a 64-bit hash of the backtrace instead of a pointer. (Backtraces would be interned anyway, so a pointer is fine.)

Everyone detests the handles. In C++ you can hide them. But in C it looks like a proliferation of entities. Oops, I told you more than 50% of the long answer. If the inner tree of the event-frame spaghetti call stacks is not big compared to the leaves, the space overhead isn't significant in the absence of leaks by design or error.
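
A toy Python rendering of the auditing idea (hypothetical names; the real thing would live in C or C++ and intern the backtraces): every acquire is keyed by the backtrace at the call site, and a leak shows up as a nonzero counter.

import traceback
from collections import Counter

audit = Counter()                       # backtrace -> outstanding acquires

class Handle:
    def __init__(self, obj):
        self.obj = obj
        self.site = ''.join(traceback.format_stack(limit=3))
        audit[self.site] += 1           # count the acquire

    def release(self):
        audit[self.site] -= 1           # count the matching release

h = Handle(object())
h.release()
leaks = {site: n for site, n in audit.items() if n != 0}
print(leaks or "no leaks")              # empty means every acquire was released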

Isos and debugging

I agree with your #1 -- I also believe that visualization of intermediate steps in a function is a better form of debugging than the traditional step-through-the-loops form.

Are there any examples of languages which allow bi-directional projections? I don't quite know the term for this concept ... but for example: I transform a string into a numeric matrix, update a value in the matrix, and the value propagates back into the original string.

Concurrency

I believe, but cannot currently prove, that we do not have a good formalism for concurrency in programming languages. My evidence for this is that we don't have a Curry-Howard-like isomorphism for it. This sounds terribly Wadlerian, but when we find the "right" formalism for concurrency, we'll know it, because it will have a logic that comes with it.

What I suspect but truly cannot prove, is that no such formalism will satisfy all concurrent programmers. We may have to be satisfied with a bunch of them, all of which deal with limited cases.

I also believe, but cannot prove, that true multi-language development will never happen. As with CLR systems, many languages which participate will have to be crippled in order to work correctly. (See F# for details.)

Pi calculus?

I'm not familiar with it myself, but isn't pi calculus supposed to be a logic for modeling concurrency? What's wrong with it?

Not pi calculus... yet

Pi calculus is like lambda calculus, in that it's a programming language for modelling concurrency.

Just about every programming language "issue", be it parametric polymorphism, object-orientedness, call by name etc, has a programming language model that also has a Curry-Howard-like isomorphism with a logic calculus. The pi calculus, which is certainly IMO the best formal model of programming language concurrency that we have so far, doesn't have a CH-like isomorphism.

This may seem like a weak objection, but I think it's much more important than a lot of people realise. (I apologise for sounding like Phil Wadler here, but bear with me.) Typed lambda calculus has a legitimate claim to be the most "natural" way to program. Not in the sense of being easy for humans, but in the sense of being a part of nature. It's as natural as modus ponens. (After all, modus ponens is one of the typing rules.)

Similarly, extensions to lambda calculus have been found to have isomorphisms with fairly natural logics. Pi calculus is the odd one out. In that sense, it has no claim to be the "right" way. Either we need to find a CH-like isomorphism for pi calculus, or we need to find another model for concurrency. If the former, I believe (but can't prove) that pi calculus would be the "best" model for concurrency. If the latter, I believe (but can't prove) that the other model will be even "better" than pi calculus.

Incidentally, if someone wants to use this as a research topic, I think that what isn't fully appreciated is the connection between Curry-Howard-like isomorphisms and category theory. Curry-Howard connects simply-typed lambda calculus with the intuitionistic calculus of sequents. In fact, they're both also equivalent to Cartesian-closed categories. I believe (but can't prove) that all important CH-like isomorphisms also have useful connections with important families of categories. Perhaps it would be easier to find the family of categories first, and then build the programming language/logic from it.

Action Calculi?

Are you sure the relationships between pi-calculus, logic, type theory, and category theory don't already exist? It seems to me, from just a glance (I haven't studied action calculi yet, but I am interested in them -- I'm still working on the pi-calculus), that these relationships are explained through work that is being done now on action calculi.

Furthermore, it seems to me (and I could be wrong) that if we can come up with a theory of typed pi-calculus (which has already happened in various ways), then just the existence of such a thing implies an underlying logic, by the work already done on the Curry-Howard isomorphism. Also, the fact that the pi-calculus can embed the lambda-calculus seems to imply that such an isomorphism already exists, or at least a more general one.

Finally, it seems there has been some work on correlating action calculi with linear logic, which, if nothing else, seems to imply the foundational correspondence you seek.

But I don't know much of the formal details of this stuff yet. Then again, seems fitting to comment on it given the nature of this thread :)

Pi-calculus and logic

You may be interested in taking a look at the existing (if somewhat old) work on the pi-calculus and linear logic (which was apparently initiated by Samson Abramsky in the early 1990's, and considers "proofs as processes").

More recently, there has been a lot of interest in applying "spatial logic" to bigraph structures (and by extension to the pi-calculus). One example of this work is a paper by Luís Caires on Behavioral and Spatial Observations in a Logic for the π-Calculus. A more general look at spatial logics and bigraphs can be found here.

Actually, it may be bigraphs (rather than the pi-calculus) that will form the concurrency model you are looking for, since bigraphs are a generalization of the pi-calculus, Petri-nets, and several other concurrency models.

[Edit: see also the work at University of Copenhagen on Bigraphical programming languages]

CLF

Just to toss it out, the "implementation" of the π-calculus in CLF (or LolliMon if you want something more tangible) is pretty sleek and natural and gives a good way to see the correspondence between linear logic constructs and the π-calculus. And, in general, CLF and LolliMon are really cool.

Categorical Logic

Many logics have been given categorical models; in fact, I'm not sure which ones have not. And, e.g., Bart Jacobs' thesis shows a method that leads to a whole family of categorical models and associated logics. With models for the substructural logics, I don't see too much difficulty in deriving the categorical semantics from the logic. Incidentally, occasionally the Curry-Howard correspondence is referred to as the Curry-Howard-Lawvere correspondence, viewing Lawvere's work on categorical logic and the relationship to CCCs as being the nail in the coffin on the issue. As a tangent to the tangent, Cartesian closed categories don't exactly match the simply typed lambda calculus. They require the structure of finite products that the typed lambda calculus does not. Getting a "tighter" association is a bit more complicated.

Going the other way can also be fun. Put some category together with the appropriate pieces and see what logic and programming constructs pop out. E.g. the thesis Presheaf models of concurrency can be read backwards this way.

Cool

That's very cool stuff. Way outside my area of expertise, of course, but that's why we all love LtU!

multi language development

I was only going to comment on the multi-language part of your post, then it seemed unfair to ignore the part on concurrency. So I added random thoughts there.

Pseudonym: I believe [...] that we do not have a good formalism for concurrency in programming languages.

There might be a goal conflict whenever the idea is to avoid defining an order in which two events A and B occur, so happening concurrently is an option enabled by lack of definition. Is it just me, or is it often hard to state unknowns explicitly in many formal systems?

One might need a formalism describing the relation between two other formal systems -- one for a language and one for a scheduler -- but that just sounds confusing to me, and perhaps over-constraining, though it would let you describe a user's expectation for the effects of a given scheduler design.

Pseudonym: I also believe, but cannot prove, that true multi-language development will never happen. Like with CLR systems, many languages which participate will have to be crippled in order to work correctly.

Well, I want to both agree and disagree. The less alike the languages are, the more they must be crippled (i.e., arbitrarily constrained) to work together nicely. I agree the most when you mean a system that intends to knit all languages together regardless of difference. As you increase the number of simultaneous constraints a system must meet, the smaller the simple solution set becomes.

But for a few similar languages, you could make a system that treats them all fairly equally. I'd still expect them all to be tweaked slightly, though. Even so, I'd think crippling would be absent and true multi-language development would be present in this particular context, with these languages tweaked to co-exist.

Objective-C is kinda like C with Smalltalk embedded inside it (within the square brackets), with a few weird things done to make Smalltalk work with C.

Personally, I'd rather cripple the C-like languages and keep the Smalltalk and Lisp style languages mostly unchanged. (Let's hypothesize a system with codename assignments: Gum=XML, Gab=Smalltalk, Gyp=Scheme, and Pyg=Python, where the last two are notably each other's reverse.) Python would fit in without conflict, though you'd be able to do some things in Pyg not allowed in Python when those things are allowed in Smalltalk or Scheme.

Note the term "language" tends to encompass far too much territory, and is typically understood as including not only syntax, grammar, and semantics of basic primitives, but also such extended things as libraries, associated frameworks, and canonical development environments. The more things included, the harder it is to get them working together. A hypothetical system like Gum/Gab/Gyp/Pyg might aim for differences in syntax and grammar only (with a means of embedding any inside another), with the rest in common.

To convert code written in Scheme for execution in a standard Python environment, you might try a code translation path something like Scheme -> Gyp -> Pyg -> Python, with the first and last steps burdened with attempts to map environmental complexities, and with presumably only approximate results needing hand tweaks.

To me this approach makes far more sense (and seems far easier) than grand unification in one context like CLR. And it works better in a real world where things have not been unified. You can convert your code to run in the local preferred language, if it's one supported, without being co-opted by one language and associated community.

So basically I'm interested in a translation-oriented world, with data in heterogeneous formats consumed by transducers that fold, spindle, and mutilate as necessary to drive the requested end results, whether they be web sites or any other sort of client/server agents. This is a federation model instead of a grand unification model. Is it still multi-language development?

being visionary

I believe, but cannot prove yet, that there is a serious impedance mismatch between programming languages and human cognition. The need for more powerful programming languages (e.g. in order to deliver certified mobile code) will eventually make this impedance mismatch larger. Looking at type systems we see the emergence of dependently typed programming languages, such as Epigram. Essentially programming becomes theorem proving, which is not for the everyday mind. I believe programming will become an interactive game between a human and a theorem prover, with a program as its result; a program which is not necessarily human readable (for the majority of programmers), but which lends itself to multiple views (either visual or textual) on specific aspects of the program. I believe the current range of theorem provers is too much geared towards the professional mathematician and as such doesn't lend itself to programming in the large.
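A very small taste of the direction meant here (not Epigram itself, just a hedged Haskell/GADT sketch in which a proof obligation is pushed into the type):

    {-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

    data Nat = Z | S Nat

    -- Length-indexed vectors: the length is part of the type.
    data Vec (n :: Nat) a where
      VNil  :: Vec 'Z a
      VCons :: a -> Vec n a -> Vec ('S n) a

    -- The type alone proves we never take the head of an empty vector;
    -- no runtime check and no partial function.
    vhead :: Vec ('S n) a -> a
    vhead (VCons x _) = x

    main :: IO ()
    main = print (vhead (VCons (1 :: Int) VNil))  -- 1; vhead VNil won't typecheck

The more such invariants the types carry, the more "programming" turns into convincing the checker, which is exactly the theorem-proving flavour described above.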

Go softcomputing

An ungrounded speculation of mine is that future systems will be guided by test cases in a more active way. They feed empirical data into the program, which are used to train the system, calibrate it, cut down the configuration space, deselect alternatives and optimize pathways. Not the tests themselves, of course, but a program that evaluates the data traces appearing in the system under test. Systems are expected to behave in the future as they did in the cases where they were tested. The additional structures that emerge during this process can be assigned as "guards" or "types" - why not? Although programs are not structured like neural networks, language designers share connectionist assumptions about supervised learning and apply "softcomputing" heuristics.

Thanks...

...for the reminder that I need to learn IBAL!

Hey, this looks interesting!

Hey, this looks interesting!

"A Mutation-based Framework for Automated Testing of Timeliness"

http://ise.gmu.edu/~ofut/rsrch/robertnilsson-diss.pdf

Not exactly your vision but pointing that way.

Formal Grammars, Luc Steels, Natural Language & AI

In reference to

"There is no future in natural language programming."
- sigfpe
"There may be a future in quasi natural language programming."
- Peter J. Wasilko
and to Niels Hoogeveen's comments on the human-computer impedance mismatch

I believe we are on the verge of some tremendous breakthroughs and possibly of general AI itself, through the extension of formal grammars into the real world.

I believe, but cannot prove, that Luc Steels' experimental work on how language arises and evolves in a society of robots will supplant the more limited Chomskyan view of languages as formal grammars. Among the results will be a first operational definition of linguistic intelligence and, possibly, of general intelligence.

Steels' work provides a framework for the introduction of new symbols into a language. In a controlled environment AIBO robots develop their own words and grammars for objects in their environment. All aspects of human language development are mirrored in these experiments: words compete for acceptance in the population, new words are created, and grammatical structures arise spontaneously. Steels' work also addresses the idea of a "robot culture", since it is in the context of a population of cooperating agents that language becomes most useful.

Sources:

A favorite, How to do Experiments in Artificial Language Evolution and Why.

Unexplored Genres, Natural Language, Efficiency, Self-Compilation

I believe, but cannot prove, that there are not one, but several genres of general PLs (functional, imperative, declarative, etc.) that have so far been basically unexplored. These may have been touched upon in DSLs, but not from a general PL viewpoint.

I believe, but cannot prove, that the more powerful a language is, the less like natural language it is. This is not due to any true problem with natural languages (besides ambiguity), but more due to the fact that the most powerful languages are also the simplest, with all their power built upon very simple semantics.

I believe, but cannot prove, that a more powerful language does not strictly imply a less efficient language, though such a powerful, efficient language will not appear for a long time. At some time in the far future, a language will be devised that not only has macros, but implements them by means of a much lower-level, more general mechanism; this mechanism will work on such a low level that it will be almost indistinguishable from plain assembly. This will be one of the ugliest languages ever made.

I believe, but cannot prove, that a language can be devised in which a program really is its own compiler, and the PL's compiler itself is much closer to an interpreter for the program; the compiler/interpreter simply runs the program, which generates machine code. This could turn out to be fairly powerful and efficient. It may even be the same language I mentioned just previously with the new, near-assembly-level semantic.

[Edit: I believe but cannot prove that as I progress, I will look back at my predictions and realize, one by one, that they are all naive in some way or another.]

Measuring work done ...

I believe there exists an algorithm that can take a program and calculate a single number in the range [0,1] that indicates the fraction of its performed computations that's actually doing useful work.

I'd love to have this tool to be able to prove how much useless code is typically generated in large projects that do "Russian doll programming" - generating wrappers around wrappers to wrapped wrappers of functions ... usually in "object oriented style".

1. The first stage of this evaluation is to throw away all code that does not run even once. This stage can have its own score.

2. The second stage is the real evaluation that assigns work done at the instruction level. For example, copying one memory location to another (if it's a pure memcpy and not a screen blit) is assigned a "work done" figure of exactly zero. In general, it looks like almost all information-preserving transformations can be assigned work done = 0 (see the sketch after this list). Gone will be all the XSLT->SOAP->Marshalling->DCOM->COM->local copies and all such pointless gunk.

3. Map the actual work back to the source code doing it and identify all those who wrote the other bits. You know what to do to them :) You also, I presume, know to give a raise to those who wrote the code that actually does the work!
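A toy sketch of stage 2, assuming a hypothetical instruction trace that has already been classified into pure data shuffling versus real computation (that classification is, of course, the hard and unproven part):

    -- Hypothetical classification of a trace; real traces would be far richer.
    data Instr = Copy      -- information-preserving move: work done = 0
               | Compute   -- anything that actually produces new information

    workScore :: Instr -> Double
    workScore Copy    = 0
    workScore Compute = 1

    -- Fraction of a trace doing useful work, in [0,1].
    usefulFraction :: [Instr] -> Double
    usefulFraction []    = 0
    usefulFraction trace =
      sum (map workScore trace) / fromIntegral (length trace)

    main :: IO ()
    main = print (usefulFraction [Copy, Copy, Compute, Copy])  -- 0.25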

Computational modelling

I believe, in the mysterious future, programmers shall directly manipulate constructs that we might now naively call ASTs, but they shall have more power and expressiveness—and they won't look like lisp. (I believe endless arguments will erupt as to whether this is a good or a bad thing.)

As a side note, I believe the current hooplah about concurrency ('Oh my God! The cores are coming!') will be laughed at in this mysterious future once a simple paradigm, perhaps in the form of a message-passing operator such as in Erlang, becomes a well-established meme.
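For what it's worth, the message-passing style is already easy to sketch outside Erlang. A hedged Haskell version, using Control.Concurrent.Chan as a stand-in for an Erlang mailbox (the Msg type and worker are invented for the example):

    import Control.Concurrent (forkIO)
    import Control.Concurrent.Chan

    -- A request carrying its own reply channel, Erlang-mailbox style.
    data Msg = Square Int (Chan Int)

    -- The worker owns its inbox and reacts to messages, one at a time.
    worker :: Chan Msg -> IO ()
    worker inbox = do
      Square n reply <- readChan inbox
      writeChan reply (n * n)
      worker inbox

    main :: IO ()
    main = do
      inbox <- newChan
      reply <- newChan
      _ <- forkIO (worker inbox)
      writeChan inbox (Square 7 reply)
      readChan reply >>= print   -- 49

No shared mutable state is touched outside the channels, which is the property that makes the paradigm feel simple.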

I can prove none of this, except for the arguments about lisp, which (in deference to an earlier post) not only shall outlive us all, but are already older than C itself...

Embedded

I believe, but cannot prove, that over time language syntax will adapt to the programmer not the other way around. That is, text editors will reformulate the language in a syntax the programmer is familiar with.

I believe, but cannot prove, that we will see more and more embedded devices, and computers as such will adorn only the desks of programmers and researchers. That may have the side effect that current efforts to make better use of desktop systems will have marginal influence on the future.

I believe, but cannot prove, that someone should really tell us about cool stuff happening now and still in research stage rather than throwing in speculations about the mysterious future. Shoot me.

Apple chic and Prêt-à-porter

PLs will stay communication media and not intellectual masturbation devices for syntactically challenged people who need special editors for reading their teammates' code and are handicapped in collaborating at the same screen. Hence your assertion might be disproved by plain reason. Same for grammars of natural languages that adapt to anyone's speech defects and are balanced by "universal translators".

However, your assumption about digital chic is interesting. Maybe some years from now Apple will make prêt-à-porter, shown first in Milan, Paris or New York?

-------

Unfortunately there is not much cool stuff happening right now, just business as usual and "normal science" exploiting the paradigms known from the 60s, plus some crossovers. We see just more of the same and decreasing returns for intellectual heroes. So, you have to kill yourself.

... So, you have to kill

... So, you have to kill yourself.

Put down the pipe and slowly step away ;)

Intellectual masturbation devices

PLs will stay communication media and not intellectual masturbation devices for syntactically challenged people who need special editors for reading their teammates' code and are handicapped in collaborating at the same screen. Hence your assertion might be disproved by plain reason.

Was that some stylistic rebuttal?

Anyway, concerning the need to collaborate around the same screen with teammates: so you need to share the same view? How is that different from today's situation, where all the coders for one application use the same programming language?

Thank you for pushing me gently towards the gun. At the moment, I will take a step aside.

Was that some stylistic

Was that some stylistic rebuttal?

"Skinnable syntax" is an old, libertarian theme in IT pop culture: the language that suits me. It is mostly a social and aesthetic fantasy not a technical obstacle and I treat it as this: with some fierce polemics. Hope only that children do not read my postings and listen to gangsta rap or Marilyn Manson thereafter...

LtU Policies

Please refrain from posting such abusive language, as it's against LtU's civility policy.

The Future

I believe that Microsoft and Apple will create software construction applications that will eventually all-but-eradicate the need for programming your average information management program.

I believe that in the face of this, the computer programmer's industry will downsize significantly, the same way robots replaced workers in the auto industry.

I believe that if constraint solving ever becomes a practical tool, it will automate much of programming--type inference, assembly optimization, memory management/garbage collection--all the stuff that a programmer shouldn't have to do. Massively parallel computers will help usher in this development.

I believe that if everything is allowed to run its course, eventually, all tedious, detail-oriented tasks will be handed down to constraint solvers, and programming will more and more become a creative process, and less and less a complex process.

I believe that the evolution of computer science will never end until it meets its limitations--hardware and virtuality.

I believe that some day, different paradigms will no longer be looked at as religions, but as different solutions to different problems--all paradigms will be used only as abstraction tools, and have nothing to do with machine design--each paradigm will be used to model the natural way we think in terms of the problems we model.

I believe that metaprogramming will just become another feature addition to multiparadigm languages, and isn't the end solution. Some languages just don't inter-transform straightforwardly. Just try to solve the problem of using 100% metaprogramming to implement new paradigms, language-wide garbage collection, type inference, or code optimization!

I believe that generic programming would make another great feature addition to general-purpose multiparadigm languages.

I believe that DSLs will not be used widely, since many IDE vendors already create editors for the many kinds of content that would otherwise lend themselves well to DSLs.

Close

Of all the visions so far yours is the closest to mine. I would add to it that I believe concurrency will be dealt with in a message-passing model, not via threads. My MPI codes run 90% efficient without modification on a multi-core MacBook under OpenMPI. What is the reason to bother with threads when message passing scales so well?

The other thing I believe is that people will start using multiple languages more. As of late I have been developing in Ruby, profiling, then re-writing the unacceptable parts in C. It saves a lot of development time and is way more maintainable.

There will always be work for those who ...

... solve people's needs and problems. The current trend (from the point of view of my clients) is that there is much being foisted on them that appears to be "pretty" (my word) but doesn't actually live up to the image.

.... create software construction applications that will eventually all-but-eradicate the need for programming your average information management program.

They haven't managed to do this yet and from the directions that they have taken over many years any system that they obtain and then market will have all the good stuff removed. It may all look pretty and easy to do things (and may prove so for very simple things) but in the long run will cause more frustration for non-programmers. Their stuff causes enough frustration now.

There will always be a need for people who can talk with clients, determine what is needed, and then deliver applications that do what the client wants and needs.

...., the computer programmer's industry will downsize significantly, ....

It already has done so with the off-shoring of many jobs. But as for wholesale replacement: the industry is about making things more complicated, not easier, so that they can sell more of ...

... all tedious, detail-oriented tasks will be handed down to constraint solvers, and programming will more and more become a creative process, and less and less a complex process.

One could only wish, but it aint goner appen.

... different paradigms will no longer be looked at as religions, but as different solutions to different problems ...

What makes this field an engineering discipline? (Other comments and thoughts better left unsaid).

My belief

I believe that in a few years' time, functional programming will be taken seriously, and those who can do it will be highly regarded rather than being considered ivory-tower academics. This is a variant of the "No technology exists until Microsoft officially invents it" idea, as my reason for this is based on the incorporation of LINQ and functional-programming technology in the next version of the .NET framework.

My two cents

My hope:

* The distinctions between the various paradigms will continue to blur, programmers will quit worrying about how "pure" their language is with respect to a particular flavor of the month, and we'll just create good languages.

My fear:

* We'll still be coding in Java. Or C. Or SQL. (My long-term hope is that better languages will replace these in domains where they are relevant).

My hope:

* Academia, industry, and hackerdom will find more respect for one another's contributions. Hackers and industrial programmers will find relevance in the scientific discovery of the academy; and the academy will continue to remember that software engineering is a distinct discipline from, say, type theory. :)

My fear:

* Academia will continue to regard industry as incompetent; industry will continue to regard the academy as irrelevant.

metaprogramming and functional languages

I believe, but can't prove, that languages that don't support some kind of compile-time metaprogramming (or even those that don't support it well) will fade away. This is the DRY principle: repeated code is evil. Compile-time metaprogramming gives you a better way to avoid rewriting the same code over and over. The fact that languages like ML and Haskell have both acquired metaprogramming support lends credence to this idea.
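As a concrete illustration of the DRY win meant here, a small (hedged) Template Haskell sketch that generates tuple projections instead of hand-writing a family of fst/snd-like functions; the name proj is invented for the example:

    {-# LANGUAGE TemplateHaskell #-}
    module Proj where

    import Language.Haskell.TH

    -- Generate the i-th (0-based) projection out of an n-tuple at compile
    -- time. Used from another module as, e.g., $(proj 3 1) :: (a, b, c) -> b
    proj :: Int -> Int -> Q Exp
    proj n i = do
      names <- mapM (\k -> newName ("x" ++ show k)) [1 .. n]
      lamE [tupP (map varP names)] (varE (names !! i))

One definition replaces an open-ended family of boilerplate functions, which is exactly the kind of repetition the comment is arguing against.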

I also believe that in five years or so (if not earlier), functional languages will hit the mainstream in a big way (particularly statically-typed FLs). The advantages of the type systems, the strong type checking, and the freedom from side effects will be too compelling to resist, though non-functional languages will still retain a (shrinking) niche. Someone also pointed out above that FLs will scale better to parallelism, which I agree will be another driving factor to FL adoption.

The solution to the concurrency problem...

Will be a two-language solution:

Language A:
will be a sequential language designed from the ground up to be easily parallelizable at the meta level. It will not be a parallel language with parallel programming constructs, neither message passing nor multi-threading nor explicit effects. It will have a lot of restrictions that may seem strange to the sequential programmer but they will all be there for a good reason which is...

Language B: a declarative meta-language which will allow a different programmer to examine and reason about both language A and its potential run-time environment, and then re-factor its code to run in parallel on the chosen platform(s). At first most programs in language B will be heuristic and specialized for particular applications, but as more experience is gained with particular algorithms (in language A), and with particular hardware configurations, patterns will begin to emerge and language B will be able to support automatic parallelization across certain categories of algorithms on certain categories of hardware. Language B will have all the nice syntactic sugar parallel programmers love, whether it be in the language of threading, effects, messages or something else.

Just found this on LtU a few

Just found this on LtU a few minutes ago:

A Topos Foundation for Theories of Physics

Seems relevant to what you're suggesting!

language b

patterns will begin to emerge

like skeletons?

CPS == parallel dataflow

(but I sure as H@ll can't prove it)

Live Object Layer

I believe that separate compilation, plugin architecture, singletons and such will be replaced by explicit representation and linking of 'live' objects in a global namespace. Static programs will be designed as configurations that plug into and access live systems in a global namespace. Smalltalk already embodies this concept to some degree, albeit without 'compilation'.

This architecture will offer greater support for optimization, better security, better modularity, and more support for sharing resources. It also allows flexible combination of live-programming and static programming.

For object capability and its benefits to security, confined testing, etc., provision of the 'live names' will be forbidden in static source code. Compile-time access to live objects will instead be routed through a context object introduced by the programmer or IDE to the build or compilation process, with various forms of service matchmaking and plugin architecture being provided through the context object.
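A hedged sketch of the context-object idea in Haskell; all the names here (Context, Logger, Clock, reportTime) are invented for illustration. The point is that a component receives exactly the capabilities it has been granted, rather than reaching for ambient globals:

    -- Hypothetical capability record handed to a component at build/startup;
    -- nothing here is globally reachable.
    data Logger  = Logger  { logLine :: String -> IO () }
    data Clock   = Clock   { now     :: IO Integer }
    data Context = Context { ctxLogger :: Logger, ctxClock :: Clock }

    -- The component can only do what its context allows.
    reportTime :: Context -> IO ()
    reportTime ctx = do
      t <- now (ctxClock ctx)
      logLine (ctxLogger ctx) ("tick: " ++ show t)

    main :: IO ()
    main = do
      let ctx = Context (Logger putStrLn) (Clock (pure 0))  -- stub clock for testing
      reportTime ctx

Swapping in a stub clock for confined testing, as above, is the same mechanism that lets the IDE or build process decide which live objects a static program may touch.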

Stigmergy in Programming

I believe, but cannot prove, that future programming languages and IDEs will be designed in Web 2.0 style, perhaps based on Wiki. Programs will be mashups of original content and links to both live services and other content.

I believe, but cannot prove, that the languages will need to be designed around this concept to survive, with design considerations towards social engineering and towards safe composition and clueless programming. Composable concurrency control primitives - transactions - will be critical. Guaranteed termination for certain operations will be critical. Deadlock resistance and effective deadlock handling will be critical. Emergent and composable security, likely achieved via a combination of capability and trust models, will be critical. Standard libraries designed for robust, persistent, secure, shared, multi-user GUIs, data-distribution, databases, etc. will be critical.
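Composable transactions of this kind already exist in at least one place, Haskell's STM. A minimal sketch of the usual account-transfer illustration, where two transfers compose into a single atomic action (something lock-based code cannot do safely):

    import Control.Concurrent.STM

    transfer :: TVar Int -> TVar Int -> Int -> STM ()
    transfer from to amount = do
      balance <- readTVar from
      check (balance >= amount)        -- retry until funds are available
      writeTVar from (balance - amount)
      modifyTVar' to (+ amount)

    main :: IO ()
    main = do
      a <- newTVarIO 100
      b <- newTVarIO 0
      -- Two transfers composed into one atomic block.
      atomically (transfer a b 40 >> transfer b a 10)
      readTVarIO a >>= print   -- 70
      readTVarIO b >>= print   -- 30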

I believe, but cannot prove, that while design for the 'psychology' of the 'individual programmer' will be given short shrift in favor of social engineering of the masses, individual programmers will ultimately be empowered by use of search and the ability to effectively, safely, and almost cluelessly mash up new programs by composition of existing programs.

I believe, but cannot prove, that attention to performance is absolutely important to stigmergic processes. In particular, programmers at the individual level are concerned about performance, and thus will invariably seek to achieve it. If performance and composition are at odds, then programmers will resist composition, and by extension resist mashups. If a programming language is to engender or engineer efficient stigmergic processes, it must offer a great deal of optimization that crosses layers of composition as effectively as the programmer could ever hope to do it. As part of this, separate compilation at the source-code level shall be tossed aside in favor of a Live Object Layer, though such a layer will also be necessary to support integrating mashups of live services.

I believe, but cannot prove, that embedded or integrated DSLs and syntax extensions will thrive in such an environment. They'll simply be given de facto standardization, and there will be community resistance against individual programmers tweaking the syntax for his or her own psychology.

I love to read this... live

I love to read this... live objects, mashups, web 2.0 and beyond.

Right now I want to state a simple observation: there are no command shells embedded in websites. Command shells aren't particularly sophisticated, and it wouldn't require some sort of mission-to-Mars technology to enable them, just a Flash app and a web server. The limitations are socio-technical. They are not yet ripe for the public space. I find this so interesting because it is an easy and obvious indicator for the state of socio-technical networks.

You mean something like this?

You mean something like this?

A command line interface:

A command line interface: the JavaScript shell.

JavaScript before Orc?

Surprised you mentioned that before Orc (shell). You seem to bring it up often enough. ^_^

Have I been type-cast as the

Have I been type-cast as the Orc guy? Man, you bring up a topic once or twice and they never let you live it down. :-)

Actually, I had forgotten about the online Orc shell, so thanks for the reminder. I hadn't had much chance to play with it, so I wasn't sure whether it integrated with the browser DOM. I had thought that it was a neat way to invoke web services, similar to the way Waterken and Tahoe provide JavaScript latency-aware interfaces to their server-side objects.

To make it a true "web shell", some form of DOM would seem to be required, so the JavaScript shell is a start.

Please...

...stop posting this stuff. Yes, I've also seen "Try Ruby" and it has even a tasteful UI. This is all insignificant. It doesn't imply that the web has become a grid instead of being a multimedia library with some additional chats, blogs, forums and virtual shopping carts.

What stuff? You mentioned a

What stuff? You mentioned a console interface to websites, and I posted a relevant link that provides a JavaScript shell with full DOM access, including remote calls. I don't see why what I posted is so irrelevant.

Non-Web2.0 Stuff(?)

Kay is (I suspect) looking for something closer to the Web 2.0 profile. A JavaScript shell doesn't create persistent services that other people may then access and compose.

There is already significant progress in that direction, with such things as IBM's Lotus Mashups (previously known as QEDWiki).

On the 'language' front, you also have Smalltalk IDEs and the Mozart Programming System, and other live and potentially distributed environments, but those seem to be lacking in terms of security, the safeties necessary for clueless composition, robustness to partial network failure and disruption, and various other design considerations for this purpose. Honestly, even the debugger needs to be subject to capability security...