Will data-intensive computing revolutionize programming languages?

The EAPLS (European Association for Programming Languages and Systems) is looking for new board members. I have placed my candidacy with the following statement:

The field of computer science is changing profoundly. Computing systems now have the processing power, storage capacity, and networking ability to easily handle enormous data sets. Data-intensive computing using large-scale distributed algorithms is realizing one by one the old dreams of artificial intelligence. Traditional research in programming languages and systems has solved most of the problems of computing with small data sets. It must now grow up and address computing with large data sets, for which programming languages at a much higher level of abstraction are needed. This is already happening: data-intensive language abstractions (of which map-reduce and its relatives are just the tip of the iceberg) are already catalyzing the new computer science. As a member of the EAPLS board, I will encourage programming language research to move in this direction, as a key part of the new computer science.

I happen to believe that most of the important problems of programming languages "in the small" are solved and that we are on the brink of a new revolution that will have a profound effect on programming languages. I would very much like to hear what LtU members think about this!

In agreement

I generally agree with your position, but at least in the short term there is still value in small-scale optimisation. If you can get your program running twice as fast, you cut the cost of your cluster roughly in half. This is a significant saving in today's world, and it is why playing around with things like memory layout can still be worthwhile. You might argue that the cost of computing will drop far enough to make this level of optimisation not worthwhile in the medium term, and history suggests you'd be correct.

Lost in translation

Good luck with your candidacy, Peter! They would be lucky to have you.

I agree that distributed computation is the next frontier for PL innovation.

In particular, I've been thinking about distributed type systems: would it be possible to make some kind of type system guarantees in a distributed, heterogeneous language environment, perhaps by expanding on the concept of "blame" or by applying distributed local logics?

For an idea of the challenges of distributed multi-lingual environments, there is an apropos quote from the "Heisenberg" paper:

We predict that in the next ten years, voice communication will do real-time language translation (we suspect that Google is already hard at work implementing it!). You will be able to phone a correspondent in China and speak and hear English, and the correspondent will hear you in Chinese and reply in Chinese. Today's conflicts between world languages will become irrelevant.

At least now I know how World War III will start. ;-)

Natural language translation

We predict that in the next ten years, voice communication will do real-time language translation

LOL. I predict that it will take at least another 50 years until computers can even do reliable off-line translations of non-trivial written text -- if it ever happens at all.

Huge amounts of data

Huge amounts of data together with stochastic algorithms are able to do amazing things... though 10 years sounds pretty optimistic to me, too :-) But these ideas seem to be very much en vogue now; there is even a TV show ("Caprica") whose heroine is an AI created from all the data flowing around the net about a specific person...

Unnatural language translation

I predict that it will take at least another 50 years until computers can even do reliable off-line translations of non-trivial written text -- if it ever happens at all.

Yeah, machine translation is one of those things that has been "just about to happen" for many decades now. Though there have been remarkable improvements in the "brute force" parts of the field in recent years, I long ago realized that, given that real translation is hard even for humans, a true MT system would require the equivalent of strong AI.

Then again some of the Singularity crowd think that strong AI is only ten years away too. ;-)

Fortunately, the outlook is better for PL interoperability, since the semantics are more constrained.

PL interoperability

Fortunately, the outlook is better for PL interoperability, since the semantics are more constrained.

I daresay machine translation is doomed, then ;-)

Re: Natural language translation

LOL. I predict that it will take at least another 50 years until computers can even do reliable off-line translations of non-trivial written text -- if it ever happens at all.

I agree with this prediction.

Of course, in 50 years I'm going to guess that (barring a worldwide disaster of some kind) we will have approached the fundamental physical limits of integrated circuit design. So if off-line translation isn't working reliably within a relatively short period of time after that, I predict it will never happen.

But who knows? Quantum Computing just might materialize in the meantime, and that's a whole new ballgame.

Issues more Algorithmic than Mechanical

You suggest that the physical limitations of our computations are the basic deciding factor for "reliable translations of non-trivial written text".

I believe most issues in AI are far more related to how we approach the problems, and how we frame them. Throwing a lot of compute cycles at a problem can sometimes allow us to brute-force ineffective approaches, but it does not preclude the possibility of taking better approaches to the same problems.

Mathematics has repeatedly proven that new notations and new ways of thinking about problems can greatly improve both our performance and precision. There is no reason to believe software is any different in that regard. Even if we reach limitations of hardware, I suspect humanity will have centuries to continue advancing the math and software (though there may be some point at which software begins advancing itself faster than humanity can keep up).

This makes me wonder. How

This makes me wonder. How could we improve programmer productivity if computers were a trillion times faster?

A trillion times faster...

What would that mean, anyway? Chances are that not every aspect of computers (memory, bandwidth, network latency, heat dissipation, processors, etc.) would be a trillion times faster... so I suspect that a lot of performance issues would simply be shifted about.

I also suspect that Parkinson's Law would apply. Judging from historical precedent, computer startup times would likely require three-trillion times as much work...

I think a better question is: how can we improve productivity today in a way that can effectively utilize greater performance, should it become available?

Job of Linkers in context of Distributed Data and Services

Imagine programming, at least in part, in terms of externally provided service requirements... where you can describe these requirements in code, i.e. in terms of which unit and integration tests they must pass, pre-conditions, post-conditions, invariants, accreditations, CPU/memory/monetary/licensing costs and heuristics, and so on. Other requirements may be inferred. Registries and markets for these services would exist, and smart linkers ('match-makers') would be able to search to find a solution with the right price, or may even have factories or AI to compose the less complicated services.

Developer productivity would be improved to the degree that developers can write a portion of their code in terms of declared requirements. It would also be improved to the degree that they spend less effort binding a new program to any particular library. A match-maker could do much of the messy work, though a developer could be part of those smarts if he so chooses. While a developer might need to support the match-maker, a match-maker integrated with a services market would also allow third-party developers to chip in if they feel they can meet the contracts and accreditations exposed in the requirements.

Security would be achieved in part by making the match-makers themselves into first-class capabilities, such that each service for a project can be met by a different match-maker if so desired. The match-maker selected would determine exactly how visible your linking requirements are to external systems.

Ideally the binding to a given service is loose and stateless, such that one can fall back to alternative services after a partial failure, or upgrade to alternative services as better options become available. Live programming should allow one to tweak both the service requirements and the match-makers on the fly.

This design would benefit from certain language features. Support for transactions would better enable auditing and testing: you can run tests whose side effects would matter if committed, and then simply not commit them. Language support for runtime code distribution would greatly reduce the need for authorities to be explicitly provided to untrusted code: instead of linking 'libraries', one links 'unum presences' that may be internally stateful and carry the authorities needed. Support for capability security enables one to host untrusted code, and is thus useful for runtime code distribution.
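As a rough illustration of the transactions-for-testing point, here is a toy sketch in Python (the Transaction, Account, and transfer names are invented for this example; a real system would also provide isolation, not just an undo log):

    class Account:
        def __init__(self, balance):
            self.balance = balance

    class Transaction:
        """Applies writes immediately but logs old values so everything can be rolled back."""
        def __init__(self):
            self._log = []   # (object, attribute, old_value) entries for rollback

        def write(self, obj, attr, value):
            self._log.append((obj, attr, getattr(obj, attr)))
            setattr(obj, attr, value)

        def rollback(self):
            for obj, attr, old in reversed(self._log):
                setattr(obj, attr, old)
            self._log.clear()

    def transfer(tx, src, dst, amount):
        tx.write(src, 'balance', src.balance - amount)
        tx.write(dst, 'balance', dst.balance + amount)

    # An auditing test: exercise the side-effecting operation, check invariants,
    # then roll back instead of committing, so the real state is untouched.
    a, b = Account(100), Account(0)
    tx = Transaction()
    transfer(tx, a, b, 30)
    assert a.balance == 70 and b.balance == 30   # observable inside the transaction
    tx.rollback()                                # never committed
    assert a.balance == 100 and b.balance == 0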

And if computers do become a trillion times faster, the match-makers could process much larger databases, browse much larger markets, and 'solve' somewhat more complex requirements. Thus, developers could be somewhat more productive... though, chances are, coming up with better ways to express and solve requirements would do us more good than twelve orders of magnitude of brute force.

How would these match makers

How would these match makers search the database of libraries/code and hook them up correctly? Wouldn't that require either human level AI or very standardized interfaces for each type of library? If we can do the former we're done already, and the latter wouldn't be widely applicable (you could have a standardized interface for hash tables, but I'm not seeing how this is general enough to be interesting). Probably I'm not getting what you are envisioning. Can you give an example of requirements that define a problem that an automated tool could solve (or partially solve)?

Match-makers under the hood.

A match-maker, at the interface, may be queried with a requirements specification and will reply with a service description that, presumably, provides the requested service. (For live programming and live queries and runtime upgrade, one adjusts the above to: the query is a reactive expression describing requirements, and the reply is a reactive expression describing a solution for those requirements.)

At its most complex, a requirements specification could be an elaborate, partially inferred structure consisting of types and behavioral contracts; at its simplest, it could be a simple URI or an opaque string. A service description could consist of values, such as functions and procedures, or capabilities for external objects.

Match-makers could be implemented, under the hood, in any manner. You could have specialized match-makers that, for example, only know how to search for font libraries from an online resource. You could have generic match-makers that hook into a human services market and allow humans to auction their services. You could have registry-based match-makers to which more specialized match-makers may be registered, and which communicate with those other match-makers to find the best match for a service. I suspect you'd also have a lot of 'adaptor' match-makers that know how to take the output of specific other match-makers and adapt it to specific interfaces.

As noted above, for security it is ideal if the developer can specify which match-makers are used for which tasks. This means that the human developer's job largely consists of: (a) specifying the requirements, (b) specifying the match-makers, and (c) writing glue code and hooking in other services via distributed object capabilities.
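To make the shape of that interface concrete, here is a minimal sketch in Python (every name here -- Requirements, ServiceDescription, MatchMaker, RegistryMatchMaker -- is invented for illustration; this is not an existing API):

    from dataclasses import dataclass, field
    from typing import Any, Optional

    @dataclass
    class Requirements:
        interface: str                                   # e.g. "font-library", "traffic-camera"
        tests: list = field(default_factory=list)        # unit/integration tests the service must pass
        constraints: dict = field(default_factory=dict)  # cost, latency, accreditation, ...

    @dataclass
    class ServiceDescription:
        provider: str
        capability: Any    # function, object, or remote capability implementing the interface

    class MatchMaker:
        """Queried with a requirements spec; replies with a service description, or None."""
        def resolve(self, req: Requirements) -> Optional[ServiceDescription]:
            raise NotImplementedError

    class RegistryMatchMaker(MatchMaker):
        """Registry-based match-maker that delegates to more specialized match-makers."""
        def __init__(self, children):
            self.children = children

        def resolve(self, req):
            for child in self.children:
                candidate = child.resolve(req)
                if candidate is not None and all(test(candidate.capability) for test in req.tests):
                    return candidate
            return None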

the latter wouldn't be widely applicable (you could have a standardized interface for hash tables, but I'm not seeing how this is general enough to be interesting)

The problem is your example. A hashtable isn't an especially interesting service. Rarely do specializations of a hashtable offer more than a slight difference in performance, reentrancy, and thread safety. Hashtables don't embody any particularly useful capabilities. Besides, hashtables are intensely stateful services... they should be avoided, on principle, because (among myriad other reasons) you cannot easily upgrade or fall back from semantically stateful services at runtime.

Consider instead standardized interfaces to font libraries, webcams, sound-systems, GUI display systems, search services, clock services, logging systems, consoles, news events, stock tickers, and so on. Databases of adaptor code would allow one to corral various systems into common interfaces.

There are plenty of interesting classes of service to which one might wish to attach. You just need to look for service classes that: (a) embody capabilities to the external world (i.e. sensors, actuators, user interfaces, and services that use these indirectly, such as stock tickers), or (b) can be very heavily specialized in interesting manners (as with fonts, which specialize artistically and in terms of scalability).

This is widely applicable, even with standardized interfaces.

Wouldn't that require either human level AI or very standardized interfaces?

If the interface were part of the requirements specification along with pre-conditions and post-conditions and such, one would probably need a good AI to solve the problem... or a services market that could involve other humans. This is a worthwhile endeavor, as it allows a great deal of high-level constraint programming. But one doesn't need to start with that level of feature just to make match-makers (distributed systems 'smart linkers') worthwhile.

Probably I'm not getting what you are envisioning. Can you give an example of requirements that define a problem that an automated tool could solve (or partially solve)?

"give me a traffic camera that can see the corner of [street address], feed me MJPEG".

"find me a fire-station close to [location], and give me capability to voice communications with the men stationed there". An automated tool might reply with a capability wrapping a phone number through a VoIP service.

"give me a service that can find me a best route between points A and B through the building".

These requirements would need to be written for the language, of course.
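For instance, the fire-station example might look roughly like this in Python (a toy sketch: the class names, the directory, and the phone number are all made up, and the requirement is expressed in the simplest 'opaque string' form mentioned above):

    class VoipChannel:
        """Toy capability wrapping a phone number behind a VoIP service."""
        def __init__(self, number):
            self.number = number

        def call(self):
            print(f"dialing {self.number} over VoIP")

    class FireStationMatchMaker:
        """Specialized match-maker: only resolves voice channels to fire stations."""
        DIRECTORY = {"fire-station near [location]": "+1-555-0100"}  # stand-in for a real directory

        def resolve(self, requirement):
            number = self.DIRECTORY.get(requirement)
            return VoipChannel(number) if number else None

    # The developer states the requirement; the chosen match-maker replies with a capability.
    channel = FireStationMatchMaker().resolve("fire-station near [location]")
    if channel:
        channel.call()   # talk to the crew stationed there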

Interesting. I thought your

Interesting. I thought your idea was that the libraries from the market could be automatically combined to form complete solutions, but now I see how it could be useful long before we have AI good enough to do this.

Consider instead standardized interfaces to font libraries, webcams, sound-systems, GUI display systems, search services, clock services, logging systems, consoles, news events, stock tickers, and so on.

Most of these are available today (e.g. Qt), or would be easy to build on top of the various available web service APIs.

A trillion is quite a lot.

A trillion is quite a lot. I am not sure by what factor computer speed has improved since the '70s, but my guess is: much less than a trillion (does anyone here have the real number?).

With that kind of speed-up, the experience of interactive theorem proving should have changed completely; an interactive theorem prover should then combine the automatic reasoning capabilities of about 1 or 2 Fields Medalists and 10 Turing Award winners put together. What that means is that every creative and competent person out there could create rock-solid, world-class designs of just about anything virtual. That should change the computing landscape quite a bit.

A trillion is quite a lot, but....

The original PC ran at 4.77 MHz. Today's cores run at around 4.77 GHz. That's a factor of a million, not a trillion.

Allowing for the difference between 8-bit and 64-bit computing, multiply by another factor of 4. Allowing for multi-core CPUs, multiply by another factor of 4 for general-purpose CPUs, or 200 or so for DSPs. Pipelining and streaming get another factor of 4 in regular CPUs, or another 20 in DSPs. (My figures for digital signal processors are probably inaccurate, but I know that some are on the order of 4000 times as fast as a CPU; a specialist in the field could probably explain better exactly why.)

So, by my wild-assed guesses, general-purpose CPUs are somewhere around 64 million times faster than the IBM PC, and the best DSPs probably somewhere around 16 billion times faster. We're still far short of a trillion times as fast, but if we take the logarithmic scale of Moore's law seriously, we're over halfway there.

The problem is that most of our algorithms don't scale linearly. An N-squared algorithm given a CPU 16 billion times faster, and 127,000 times as much data to work on, will complete no faster. As our CPUs get faster, we work with larger, or more detailed, data sets at a rate that compensates for the speed increase. And larger data sets become available at the rate that providers can correlate and organize them -- meaning the rate at which our computers can run basic sorting and indexing algorithms limits the aggregation of data sets to work with.

I see no reason to suspect that this is going to change.

In fact, there might be a high-end market for special-purpose SIMD hardware, and languages to program it, made and optimized just for sorting, indexing, and searching. That's what people who want to provide large data sets need.
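For what it's worth, a quick Python check of the N-squared claim above (the 16 billion figure is the wild guess from this comment, not a measurement):

    import math

    # For an O(N^2) algorithm, a speedup of S lets the input grow by sqrt(S)
    # in the same wall-clock time: time ~ c * N^2, and (sqrt(S) * N)^2 / S = N^2.
    speedup = 16_000_000_000        # the "16 billion times faster" guess
    growth = math.sqrt(speedup)
    print(round(growth))            # ~126,491 -- the "127,000 times as much data" figure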

Mille vs million

The original PC ran at 4.77 MHz. Today's cores run at around 4.77 GHz. That's a factor of a million, not a trillion.

Million would be great already, but in fact it's only a factor of 1000.

The trillion was meant as a

The trillion was meant as a speed-up that is very large but not infinitely large, not as something that is necessarily attainable. I'd like to see what we could do if we didn't have to think about small inefficiencies. For example, we could remove the different data structures that do almost the same thing except for speed (lists, arrays, maps).

An individual processor is not going to be a trillion times faster than today, but the number of processors times the speedup of an individual processor most likely will be.

Would automatic theorem proving really change a lot? Proving time is at least exponential in the "difficulty" (whatever that is) of the theorem, so you quickly run out of a trillion that way.

Here's a graph that shows calculations per second per dollar over time: http://upload.wikimedia.org/wikipedia/commons/c/c5/PPTMooresLawai.jpg. It shows that we've had a more than trillion-fold speedup over the last 100 years.
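For a sense of scale, a trillion-fold improvement over a century works out to a doubling roughly every two and a half years (a quick check in Python):

    import math

    factor = 1e12                  # trillion-fold improvement
    years = 100
    doublings = math.log2(factor)  # about 39.9 doublings
    print(years / doublings)       # roughly 2.5 years per doubling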

And I would say that we have

And I would say that we have quite a visible difference in computing between now and 100 years ago!

As for automatic theorem proving: of course the algorithms need to change, too... But for a lot of applications these brute-force complexities hugely overestimate the practical difficulty of the problem; small to medium-sized hardware designs can already be proven correct fully automatically today.

No need of programmer

How could we improve programmer productivity if computers were a trillion times faster?

The "ultimate" goal of language design is making the computer program itself, so maybe that monster won't need any programmer...

With some bootstrapping, it will create its own languages.

Distributed Data Fusion, Command and Control

I expect that integration with live, persistent, open distributed systems will greatly affect how programming languages develop. Security in language design has become a popular concern in recent years; it allows untrusted programs to be safely distributed and securely upgraded even as they interact with the network.

I don't believe that languages dedicated to large 'data-intensive batch processing' will be particularly relevant. I say we need languages for live data processing and data fusion (i.e. recognizers of high-level events)... which would then allow us to take rapid advantage of that information and of further updates, i.e. to develop automated agents and GUI applications to control the systems. We need these systems to be resilient - self-healing, disruption-tolerant, degrading gracefully - in addition to being secure, efficient, and scalable.

In a sense, I believe that all query replies should be 'live' by default, such that they stay up to date over time. I would include replies from AIs, logic systems, search services, databases, etc. For composition, it is also useful if the queries themselves are live.

Still, any language capable of the above can also take advantage of external, distributed, relatively static data resources. There are optimizations available when the query data resource is fully static. One can always keep a snapshot of a query and take action based on that snapshot.
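A minimal sketch in Python of what a 'live' query reply might look like (all names are invented; real reactive or dataflow systems are far richer):

    class Table:
        """Toy data source that notifies watchers on every insert."""
        def __init__(self):
            self._rows, self._watchers = [], []

        def rows(self):
            return list(self._rows)

        def watch(self, fn):
            self._watchers.append(fn)

        def insert(self, row):
            self._rows.append(row)
            for fn in self._watchers:
                fn()

    class LiveQuery:
        """A query whose reply stays live: subscribers are re-notified when the data changes."""
        def __init__(self, source, predicate):
            self.source, self.predicate = source, predicate
            self.subscribers = []
            source.watch(self._on_change)

        def current(self):
            return [row for row in self.source.rows() if self.predicate(row)]

        def subscribe(self, callback):
            self.subscribers.append(callback)
            callback(self.current())        # initial reply

        def _on_change(self):
            result = self.current()         # re-evaluate and push the updated reply
            for cb in self.subscribers:
                cb(result)

    events = Table()
    alarms = LiveQuery(events, lambda e: e.get("severity") == "high")
    alarms.subscribe(lambda result: print("high-severity events:", result))
    events.insert({"severity": "high", "msg": "sensor offline"})   # subscriber is notified again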

I'm sure I disagree...

I'm sure I disagree with the claim that "most of the important problems of programming languages 'in the small' are solved", but I'm not sure if our disagreement is about a) what the problems are, b) whether or not they've been solved, or c) whether or not they are problems of programming languages vs. some other domain of inquiry. In any case, here are some fundamental problems related to software development in the small (or large) that seem wide open:

P1) Code reuse is the exception rather than the norm. Developers write code for common and extremely similar tasks over and over again because they lack effective mechanisms for automatically extracting knowledge representations, algorithms, and techniques from existing code. One particular case of interest is where newer programming languages have poor library support for common programming tasks because they are not able to easily translate existing, production-quality libraries from other languages. Much the same thing could be said about translation within an existing language but to, say, a different but more or less functionally equivalent windowing toolkit.

P2) Innovations in programming languages haven't usually translated to substantial productivity gains. New software development projects require too many person hours to complete. This is apparently because too much of the required work is often "custom" to the particular problem. But PLT doesn't have much to say about why and how the problems are custom or the extent to which this is fundamental to software development or incidental to our current understanding of software practice.

P3) PLT researchers are often surprised and/or dismayed that both the bulk of experienced professional software developers and their own incoming students don't care to work with the languages that they deem to be of the highest quality. They often attribute these preferences to the naivete/inexperience of the students on the one hand, and the calcified habits of the experienced programmers on the other. This divergence suggests that either many PLT weightings of programming-language merit are pretty subjective, or else they focus on criteria that beg the question rather than being foundational in the broader sense.

I think people are too

I think people are too pessimistic, and overlook the progress that has been made.

Code reuse is the exception rather than the norm

There is far more code reused today than there was 10 or 20 years ago. People build their applications on top of frameworks that do more for them, and are accustomed to using disparate libraries (often open-source) for much if not most of their tasks. Many programmers actually regret this phenomenon; they feel more like plumbers joining bits and pieces together than actual programmers, writing code to solve problems.

Innovations in programming languages haven't usually translated to substantial productivity gains

I don't know how anyone could make this statement and pretend to be serious. Java, the COBOL of our generation, is far better than the C that preceded it because of a handful of things, but most importantly memory safety, garbage collection, a half-usable module format along with generally accepted coding standards that greatly reduce versioning brittleness. People today writing applications in languages like Java and higher-level ones are solving problems that are of a different order of complexity than those that were worked on 10 or 20 years ago. The only reason people moved from languages like C to C++ to Java, to C#, to Python and Ruby, was the undeniable productivity gains.

New developments take more time because they do more; they're more ambitious. 20 years ago you'd have been happy with a green-screen app that did CRUD against a database. Now you have applications that are globally distributed, mobile, offline or occasionally connected, and expected to interact with multiple heterogeneous systems.

PLT researchers are often surprised and/or dismayed that both the bulk of experienced professional software developers and their own incoming students don't care to work with the languages that they deem to be of the highest quality.

The economic benefits of what appears to be the primary focus of academic PLT, namely verification of software via proofs and increased static type checking, are not as high as PLT researchers think. In fact, they're quite low. Software that mostly works but that can be written quickly and easily modified and interacts loosely with third-party libraries trumps the over-specification that static typing, and the proofs that go along with it, tends to encourage.

In fact, they're quite low.

In fact, they're quite low. Software that mostly works but that can be written quickly and easily modified and interacts loosely with third-party libraries trumps the over-specification that static typing, and the proofs that go along with it, tends to encourage.

In fact, I would argue they are quite a bit higher than "experienced practitioners" who spurn typing think, but I agree that a flexible foundation for concurrency, dynamic loading, etc. is just as important.

Responding to Josh Stern

p1) Production libraries usually have performance measurables you cannot simply transliterate; translating a windowing system obviates the need for the language as a tool for implementing windowing systems, since ANY windowing system you design in the new language should make system use cases more robust, not merely n-version programming!

I don't understand your reuse arguments here.

p2) There are ideas for helping complete projects faster, such as George Fairbanks' "design fragments" for better examples of how to use libraries and frameworks. I think the major problem here is that, today, library/framework writers are not responsible for posting their design docs and common use cases/user stories along with the code. In addition, when somebody as a third party finds their use case doesn't perfectly fit the tool, they do not have good feedback channels that (a) communicate the problem directly to the designer, and (b) communicate to others that there are use cases the designer clearly did not think of, helping forewarn that project estimation will not be straightforward. -- These are things many PLT'ers simply do not care about, b/c so much of what they do is arbitrary (pure math).

A real-world example: C# is informally specified, and has warts all over it, but to find out about these warts I've genuinely had to read over 40 books about .NET and thousands of blog posts. And what about comparison to VB.NET? Beginners have no shot at just going to a knowledgeable resource for this errata.

p3) I am not sure what this has to do with programming in the small.

More explanation

p1) Production libraries usually have performance measurables you cannot simply transliterate; translating a windowing system obviates the need for the language as a tool for implementing windowing systems, since ANY windowing system you design in the new language should make system use cases more robust, not merely n-version programming!

I don't understand your reuse arguments here.

The example was about substituting existing 'Windowing Toolkit 2' for existing 'Windowing Toolkit 1', considering both as 3rd party APIs. It wasn't about creating an improved implementation of some given library.

A much simpler example of the same kind would be changing from the use of one API for serialized binary output to another.

There is a long tradition in PLT of working towards derivation of code from axioms. The Ynot project is one current example, and it has been mentioned on LtU.

To the extent that it's possible to automate the derivation of code from a specification, one could change the axioms and automatically derive new code. And if the methodology allowed for using 3rd party libraries, with axioms describing their behavior, then that would be an example of what I am talking about. However, most of those derivation systems emphasize very expressive logics (with weak computational properties), minimalist axiom sets, and verification; they are willing to sacrifice automaticity.

At the other extreme is something like Prolog programs, which also derive existence proofs of simpler claims, but in a limited logic with a lot of ad hoc programmer supplied "axioms"; these sacrifices enable a high degree of automaticity. The kind of metaprogramming I have in mind is closer to the automaticity end of the spectrum.

There are ideas for helping complete projects faster, such as George Fairbanks' "design fragments" for better examples of how to use libraries and frameworks. I think the major problem here is that, today, library/framework writers are not responsible for posting their design docs and common use cases/user stories along with the code. In addition, when somebody as a third party finds their use case doesn't perfectly fit the tool, they do not have good feedback channels that (a) communicate the problem directly to the designer, and (b) communicate to others that there are use cases the designer clearly did not think of, helping forewarn that project estimation will not be straightforward. -- These are things many PLT'ers simply do not care about, b/c so much of what they do is arbitrary (pure math).

There are lots of extra channels that can be used to communicate design, but these suffer from all the usual problems of stuff that falls outside of compilation/automation (not enforced, not synchronized, not automated, vague, etc.)

p3) I am not sure what this has to do with programming in the small.

I was arguing against the idea that PLT had run out of important projects except for those in "programming in the large".

Just to be clear

Peter did not say programming in the small was a solved problem.

Instead, he said programming with small data sets is a solved problem. What he means by that is that the debugging, profiling, and compiling aspects have all been fully researched and we're at a zenith with respect to these technologies. In effect, we've achieved some modest degree of "automatic programming" for crunching small data sets. I disagree with this, because for resource-constrained devices we're still badly crippled, and I think Peter and I have a point/counterpoint going on.

However, it is completely misquoting Peter to say that programming in the small is done with.

Programming in the large has nothing to do with the size of data sets, at least as traditionally defined.

You're not clear

I didn't attribute to PvR the claim that *programming* in the small is done with. I disagreed with his claim that "programming languages and systems has solved most of the problems of computing with small data sets".

My original posting clarified that I wasn't sure if this was a disagreement about the nature of "problems", the nature of "solved", or whether programming languages can be improved to solve them. He didn't clarify, so I'm still not sure what he meant. I do agree with him that research into distributed algorithms has a long way to go and that many mainstream programming languages will probably evolve to make it easier to take advantage of distributed algorithms through things like cloud computing. In any case, I'm sure he can defend himself if he thinks his remarks were mischaracterized, as I am doing here since you mischaracterized my remarks.

To respectfully disagree (for the near future)

As a "layman" practitioner and manager of many projects over a few decades, I've seen the broader programming community soundly reject even the simple s-expr based syntax rationalization that would allow for much great language-level expression potential and consequent productivity. Same goes for type inference, integration of logic programming concepts into mainstream languages, and on and on and on with many noteworthy and otherwise obviously useful PLT/PL innovations.

IMHO, data-intensive, distributed programming will likely be led by low-level languages such as C, C++, Java and their language kinfolk, with access to libraries ("high level" and "low level" both) that crudely (e.g., not safely, etc.) optimize programs for distributed computing. For years or decades, such programming efforts will still be considered highly specialized, even within organizations that might otherwise greatly benefit from more routine use of related "advanced techniques", broadly defined.

I think PLT folk underestimate the minimal training, frankly limited mental acuity, low upper bound on pay scale for superior skills or talent and pervasive low morale of the vast work-a-day programming public.

Even configuring a program, library, or single routine comprising a simple list of stepwise instructions, with some brute-force data retrieval and trivial looping logic thrown in for good measure - maybe nowadays also some badly managed exception handling - is essentially still beyond the competency of most working programmers or groups of their kind, if any kind of elegance, efficiency, safety, documentation/clarity, or nearly any other quality criterion is applied to their efforts.

The failure of clearly superior language tools to produce a LARGE and HIGHLY VISIBLE volume of clearly superior computer applications does NOTHING to help integrate the wonderful innovations of the PLT community into common practice.

Sad to say, but IMHO, a PLT revolution of any kind is not upon us by a long stretch. The PLT community can be of most benefit via stepwise, incremental improvements to existing practice: perhaps devising just the tiny changes to existing languages and tools - and then, only via their MOST POPULAR IMPLEMENTATIONS - necessary to make appropriate new frameworks and libraries (within existing, maybe slightly modified, languages and tools) available to our great programming unwashed, so that they can a) take advantage of new hardware/algorithmic technologies or b) solve new classes of problems.

Beyond the great c*rcle j*rk of papers in academic life, that's the MOST we can realistically hope for in terms of having a meaningful impact on the oodles of lines of code that comprise true computing practice in the greater world around us.

Just my 2 cents.

I think PLT folk

I think PLT folk underestimate the minimal training, frankly limited mental acuity, low upper bound on pay scale for superior skills or talent and pervasive low morale of the vast work-a-day programming public.

I think it's important to emphasize that this is a problem for PLT folk rather than for the "work-a-day programming public". People solving problems with minimal training, "limited mental acuity", low pay, etc. are still generating value. Disparaging their mental acuity is obnoxious; more likely, they are not interested in investing many mental resources in programming as a discipline because it's not worth it. Rather, they know more about, e.g., the business niche they are focused on, or they are spreading their efforts across a wide swathe of different tasks, and can't afford to get bogged down in any single one to analyze it with much formality.

The failure of clearly superior language tools to produce a LARGE and HIGHLY VISIBLE volume of clearly superior computer applications

Modern distributed development - i.e. REST, HTML apps over stateless HTTP - is clearly superior to historical desktop computer applications, and there is a large and highly visible volume of this work, plain as day, on the web. These applications work on different devices, different OSes, mobile or fixed, almost anywhere in the world; they're discoverable via hyperlinks; they're programmable (even if it has to be at the level of web scraping); they're far more accessible to people with disabilities; etc. The simple fact that a hyperlink, persisted as a bookmark, can keep track of your location in an application is almost magic - even when many web apps don't properly implement REST, it works well enough for those that do.

And it wasn't ALGOL, or FORTRAN, or C, or C++ that underpinned this explosion in clearly superior applications. It was languages that supported GC and memory safety and had a healthy circumspection re static typing, and that were amenable to being largely built with code reuse. The languages that helped most were those that were most amenable to code reuse: the plumbing languages like Perl, Python, Ruby, etc.

Progress

I think people are too pessimistic, and overlook the progress that has been made.

Surely there has been *progress*, but my disagreement relates to the claim that most problems have been solved.

,

Code reuse is the exception rather than the norm

There is far more code reused today than there was 10 or 20 years ago. People build their applications on top of frameworks that do more for them, and are accustomed to using disparate libraries (often open-source) for much if not most of their tasks. Many programmers actually regret this phenomenon; they feel more like plumbers joining bits and pieces together than actual programmers, writing code to solve problems.

I agree, but would point out that this type of library and framework usage was already common 10-15 years ago, and reemphasize my earlier point that one example of an unsolved problem is the lack of technology to even semi-automatically move from one framework to another functionally equivalent one. Everyone agrees on the importance of library usage for productivity, but that is fairly different from re-use of application code.

Innovations in programming languages haven't usually translated to substantial productivity gains

I don't know how anyone could make this statement and pretend to be serious. Java, the COBOL of our generation, is far better than the C that preceded it because of a handful of things, but most importantly memory safety, garbage collection, a half-usable module format along with generally accepted coding standards that greatly reduce versioning brittleness.

I agree with most of the above. In particular, I agree that Java programming tends to achieve better productivity and reliability than C and Cobol for application programming and this has a lot to do with its adoption. But Java is a frequent butt of criticism from many (most?) PLT theorists, and it would surprise me if PvR was mainly thinking of Java when making the claim that the problem of languages for programming in the small had been solved. Certainly Java has a lot of limitations.

People today writing applications in languages like Java and higher-level ones are solving problems that are of a different order of complexity than those that were worked on 10 or 20 years ago. The only reason people moved from languages like C to C++ to Java, to C#, to Python and Ruby, was the undeniable productivity gains.

I don't agree so much with this part. My sense is that larger and more complex applications have typically, to date, been written in C/C++ (though it's interesting to see how C# is mixed in with Windows 7). Some of those are certainly bigger now than 10 or 20 years ago, but it's pretty debatable whether this has been mainly enabled by programming language advances vs. increases in CPU, RAM, and disk performance per dollar.

PLT researchers are often surprised and/or dismayed that both the bulk of experienced professional software developers and their own incoming students don't care to work with the languages that they deem to be of the highest quality.

The economic benefits of what appears to be the primary focus of academic PLT, namely verification of software via proofs and increased static type checking, are not as high as PLT researchers think. In fact, they're quite low. Software that mostly works but that can be written quickly and easily modified and interacts loosely with third-party libraries trumps the over-specification that static typing, and the proofs that go along with it, tends to encourage.

I agree with PLT types that increased static checking has a lot of value for medium and large projects. I disagree with PvR that we have gotten as far as we need to go even with static checking, and I agree with you that language adoption is driven by productivity and that there are many other language features that are important to productivity but receive scant attention from the PLT community. Java and C# make progress on some fronts (while taking a step back on others), but IMO a lot more improvement is still possible.

Explorers and Settlers

I'm noticing a theme in some of the responses in this thread.

There seems to be an assumption that a revolution in PLs, such as Peter is proposing, must necessarily have the explicit goal of improving widespread software engineering practice.

At one time in the history of LtU, I was a vocal proponent of a similar view, and so, as penance, I want to counter this assumption. ;-)

The analogy I want to use is explorers and settlers. An explorer's job isn't to find new places for people to live, but to expand boundaries into unknown territory.

Some newly found territories will be good for living in, and settlers will move into them, but that is not the explorer's concern: that's what settlers care about.

Other territories will be inhospitable, but the explorers might find new resources there that allow the invention of totally new ventures. Other inhospitable territories might just be barren, but their very barrenness might provide useful knowledge about what makes territory hospitable in the first place.

At some point, a territory is pretty much explored, and the settlers are going to start squabbling among themselves about how it should be developed. That is the time for the explorers to go off and find the new frontiers that have opened up and see what's there.

Peter is an explorer, and I take the OP to be his cry of "Go West, Young Man!".

Some of us might be settlers and want to remain behind, reworking the same old territory (that has its importance too), but someone has to forge into the new frontiers, whether anyone ends up living there or not.

Dimensions of exploration

The meaning of "Go West, Young Man!" was obvious to people of that day because they understood the fundamental dimensions of territorial exploration provided by geography and existing settlements.

Everyone can agree that innovation is good and that exploration is a necessary part of innovation. But I see the content of the parent piece as making a claim about the geography of the PLT space and about the directions in which exploration is possible. It is saying, to use your analogy, "these dimensions are not only explored but fully developed, and these other dimensions are the main ones that are left unexplored." My disagreement is with that theory of PLT geography, not with the desire to explore.

Settled vs. solved

My disagreement is with that theory of PLT geography, not with the desire to explore.

Part of my analogy was that just because a piece of territory is settled, doesn't mean there aren't still problems there. There will continue to be disputes over zoning bylaws, how much green space to have, etc., but these problems are more political than technical. They're about getting people to agree how to use the well-mapped territory.

Issues like code reuse might benefit from technical innovations, but my opinion and experience is that it is mostly a political issue: sensitive programmers want others to make use of their "awesome code", and IP owners want to see returns on their "capital investment".

The technical solution to code reuse is all the many different forms of abstraction that have been invented (polymorphic types, objects, macros, modules, interfaces, etc.); there is no shortage of options. But the partisans of various approaches can't agree on which ones are good and which ones aren't and in what circumstances. No research paper is going to resolve that dispute definitively.

There may be new forms of abstraction to be invented, but they are more likely to be found by thinking about more complicated, distributed systems, by stretching our models and approaches to handle situations that we currently can't easily wrap our heads around.

Some problems we software engineers just have to solve on our own by coming to a consensus, not by waiting for a deus ex machina from new research.

Technical problems

Issues like code reuse might benefit from technical innovations, but my opinion and experience is that it is mostly a political issue: sensitive programmers want others to make use of their "awesome code", and IP owners want to see returns on their "capital investment".

The technical solution to code reuse is all the many different forms of abstraction that have been invented (polymorphic types, objects, macros, modules, interfaces, etc.); there is no shortage of options. But the partisans of various approaches can't agree on which ones are good and which ones aren't and in what circumstances. No research paper is going to resolve that dispute definitively.

The issues are technical. The problem with the kind of stuff you are talking about is that it currently has to be done by hand rather than by automation, and it mainly relies on anticipating the dimensions of future variation at the time the original code is written. That combination of factors means that the original programmer has to spend extra resources to anticipate future variation (both in development time and often in runtime performance) and will naturally be unable to anticipate many actual future variation needs. All the accounted-for variations that turn out not to be useful are then pure waste, and trying to account for every conceivable future variation will inevitably fail.

I see the required technology as closely analogous to a completely different dimension of compilation that doesn't yet exist. Traditional source code compilation improved on assembly language programming in many, many ways, but one of the most important was automatically working out relative addresses/binary layout. When the source code changes, the compiler automatically figures out how the addresses and binary layouts have to change as well. Of course assembly language programmers could adopt conventions and use macro languages to do some of these things, but nobody would argue that the difference in ease and reliability of doing that is a purely political problem.

Aspect oriented programming talked about some problems of this nature but did not supply much in the way of generically useful solutions.

Political problem

it mainly relies on anticipating the dimensions of future variation at the time the original code is written.

The expectation that today's solution will effortlessly solve tomorrow's completely unpredictable problem is a political (i.e. people) problem.

If a building is built as a warehouse, we accept that it will have to be renovated if we want to use it for offices or apartments. We don't blame it on a failure in the techniques of architecture and construction, or on the available expressiveness of blueprints.

Digital re-use

it mainly relies on anticipating the dimensions of future variation at the time the original code is written.

The expectation that today's solution will effortlessly solve tomorrow's completely unpredictable problem is a political (i.e. people) problem.

If a building is built as a warehouse, we accept that it will have to be renovated if we want to use it for offices or apartments. We don't blame it on a failure in the techniques of architecture and construction, or on the available expressiveness of blueprints.

Now you're just distorting my point with an ill-fitting analogy in order to dismiss it; "effortlessly solving all future problems" is just a silly summary of what went before.

If we take a concrete body of code and tell a programmer to a) make it Unicode compatible, or b) make it suitable for a concurrent context, or c) use Cocoa or Qt instead of MFC, etc., then competent developers know what that means and what counts as a suitable transformation. The ideas are not deep, but there is no means to semi-automate the process using current PLT. These types of changes are common in original development and not just in re-use of someone else's work. But one key difference is that the original developers have a much better idea of what to change without breaking working algorithms - they have a model of the code that isn't explicit to either current tools or other programmers.

Programming tools that better support transformations and refactorings are just one example of a useful programming technology that isn't here yet.

Optimistic transformations

If we take a concrete body of code and tell a programmer to a) make it Unicode compatible, or b) make it suitable for a concurrent context, or c) use Cocoa or Qt instead of MFC, etc., then competent developers know what that means and what counts as a suitable transformation. The ideas are not deep

The three examples you mention are excellent examples of the kinds of changes that sound simple and "not very deep", but many a project has gotten lost in the swamp trying to implement. I don't think any are likely to be automated transformations anytime soon.

Furthermore, there is still an underlying assumption here: that because some aspect of software engineering is inconvenient, that it is PLT's job to fix it.

Some problems are just hard and have to be solved on a case by case basis by skilled programmers. The good news is that we aren't likely to be replaced by machines anytime soon, which is what would happen if you got your wish. ;-)

Depends on the annotation system

If we take a concrete body of code and tell a programmer to a) make it Unicode compatible, or b) make it suitable for a concurrent context, or c) use Cocoa or Qt instead of MFC, etc., then competent developers know what that means and what counts as a suitable transformation. The ideas are not deep

The three examples you mention are excellent examples of the kinds of changes that sound simple and "not very deep", but many a project has gotten lost in the swamp trying to implement. I don't think any are likely to be automated transformations anytime soon.

My goal was to give examples of useful programming language research that hasn't been done yet.

At first I was accused of wanting to remain with the smelly settlers instead of the cool, elite explorers trundling off to California to get rich selling pants. Now you claim I've overshot the explorers and landed on an atoll in the Pacific. :)

The actual difficulty of semi-automating the kinds of things described above depends on the richness of the annotation system used in this imagined, future programming language. My judgment would be that, without any annotations, it is an AI problem similar to building a useful home robot - something we will probably see in our lifetimes. But the more practical case would be if program development proceeded by refinement of a design, yielding a rich system describing programmer intent. In that case the complexity is very similar to textbook AI planning problems that are solved by existing software without too much difficulty - certainly something where semi-automation is practical.

Furthermore, there is still an underlying assumption here: that because some aspect of software engineering is inconvenient, that it is PLT's job to fix it.

Excuse me, I got caught up, and momentarily forgot that PLT's job is only to solve simple problems of memory safety and access control, and then obfuscate those solutions with mathematical overkill.

Some problems are just hard and have to be solved on a case by case basis by skilled programmers. The good news is that we aren't likely to be replaced by machines anytime soon, which is what would happen if you got your wish. ;-)

Basic microeconomic theory suggests that developer salaries would go up if productivity per person hour went up.

Unintended value judgment

At first I was accused of wanting to remain with the smelly settlers instead of the cool, elite explorers trundling off to California to get rich selling pants.

As far as I'm concerned, the settlers and the explorers aren't intended to rank one above the other. Each has their role to play; they're just different.

I think it is equally misguided for the explorers to expect unstinting praise and enthusiasm for what they do from the settlers as it is for the settlers to expect the explorers to only care about their interests.

The two groups do benefit each other, just more indirectly than either would sometimes like. ;-)

Theoretical vs. Applied Math

Earlier in the thread I explained why I didn't think the explorer/settler analogy was a good fit to my criticism of PvR's claim. 'Explorer' seemed like it was being used as a placeholder for pursuit of the new and innovative vs. refinement of the existing, and my claim included disagreement about the boundaries of what should be labeled innovative and novel.

On reflection, though, I think maybe you are arguing something that I hadn't considered. Is it the case that you think PLT should be seen as more akin to Pure rather than Applied mathematics? Do most PLT researchers think their research goals are driven primarily by abstract questions of formal system development rather than by explorations with the ultimate goal of solving practical problems? Are they happy to turn over research with the specific goal of creating useful programming languages to some other area - perhaps Software Engineering Theory? I'm skeptical that this is true, but I'm interested to hear a defense of that claim if you actually believe it.

Practical is hindsight

Is it the case that you think PLT should be seen as more akin to Pure rather than Applied mathematics?

I don't think that distinction is substantive in math, so I'm not crazy about using it here. ;-)

G.H. Hardy is famous for saying that his work was intended to be pure and without practical application, but it turned out he was very wrong. His intent had no effect on the utility of the ideas he generated.

Different researchers have different ideas about the applicability of their research to "practical" matters. My read is that Peter views his research goals as much closer to the "practical" than many others view theirs.

However, the role of research is to explore the implications and effects of ideas; to expect it to only focus on those ideas that people already think are "practical" misunderstands the nature of innovation, which is often the result of serendipity rather than design.

Are they happy to turn over the field of research with the specific goal of creating useful programming languages to some other area - perhaps Software Engineering Theory?

I don't know if you've noticed, but most PLs with large commercial traction have come out of industry, with liberal borrowing of ideas from established research. Solving industry's problems, leaning on ideas from research, is industry's job. If you want to quote economic theory, if there is any money in solving a problem, and the problem can be solved with tools available, someone will solve it to tap that money.

Personally, I prefer that researchers aren't worrying about those questions (unless they are paid handsomely to do it by industry). I want them thinking about ideas, because one of their current "crazy, impractical" ones is the seed for tomorrow's revolution in practice.

Dueling subjective judgments

Earlier you endorsed subjective judgments about what is innovation vs. what is refinement and what is PLT research vs. what is not. So now I understand you to say that you are comfortable making a subjective, categorical judgment for the conjunctive predicate

IS_INNOVATIVE(X) & IS_PLT_RESEARCH(X),

but uncomfortable with subjective judgments about

IS_INNOVATIVE(X) & IS_PLT_RESEARCH(X) & IS_PROBABLY_USEFUL(X) .

Is that right, or did I misread you?

Healthy Diversity

IS_INNOVATIVE(X) & IS_PLT_RESEARCH(X) & IS_PROBABLY_USEFUL(X)

My point is this: the X you will choose to satisfy this predicate probably depends on whether you are a researcher or not.

Moreover, it is not a problem that researchers and non-researchers will choose different Xs; it is a sign of a healthy and developing intellectual environment.

And the Explorers say

And the Explorers say: If only the Settlers had a look at the maps we have drawn so carefully -- instead of building more and more houses in the same old stinky swamp, just because there's already a Saloon nearby.

And the Settlers say: These obscure maps of alien territories the Explorers keep drawing are useless -- they don't indicate where the next Fort is to protect us and get Whiskey from. Also, we heard that the trees have oddly-shaped leaves out there.

And the Explorers say: Obviously, there are no Forts yet, you have to build new ones. But there is plenty of clean water, fresh air, and fertile soil.

And the Settlers say: Are you kidding? We have no interest in building Forts, we just want to harvest enough food for a living and some dough to hang out in the Saloon. Life is too short. Clean water is overrated anyway when you already have Brandy to make the yucky taste go away.

And the Explorers say: But brackish water makes you sick. Life will be so much healthier and easier in the West, once you have settled down.

And the Settlers say: Seriously, have you ever traveled across half a continent with all your bag and baggage? Any idea how long that takes? No point if we starve on the way.

And the Explorers say: Do it for your children then!

And the Settlers say: Look how much they enjoy playing in the mud! Barney, just be a bit careful with swallowing that pointy stick, will you?

And the Explorers didn't

And the Explorers didn't care any more for the Settlers and lived happily ever after in California anyway.

To Peter,

Bet the comments in this thread were not quite what you were expecting!

For me, I don't know what EAPLS really is, apart from trying to read the home page.

Your letter reads like it is addressed to pointy-haired bosses, and not language and systems geeks. If that was your goal, then good job. But then how is it interesting to the LtU audience?

Overall, your over-simplifications need rework for me to take them more seriously. If I were on the board and didn't know your prior work, this letter would not influence me to check it out. For example, you vastly oversimplify what map-reduce is. As Dean rebuts Stonebraker in January's ACM Communications, map-reduce is about two concepts from systems-level thinking: (1) fine-grained fault tolerance for large tasks, and (2) storage-system independence.

So, my push-back to you is: how are you focused on that? Also, a separate issue is how the language targets such a backend. You're sweeping this away by using the annoying buzzword "data-intensive" (which, again, nobody on LtU should give any credence to, IMHO). Another way to state this is that you need concrete detail sentences. Data-intensive, in the Google sense, is about data that exceeds main memory. Language abstractions for data that exceeds main memory are interesting, and that's where you should then point me to work you've done or seen in the past. Hanan Samet, JH Davenport, and PMD Gray are three different researchers who at various points have pursued tools that could be interesting to language design.

Forget your data-intensive buzzwords. State harder goals. State bigger ambitions, and fewer will-o'-the-wisp ones.

Data-intensive computing using large-scale distributed algorithms is realizing one by one the old dreams of artificial intelligence.

I would argue you have it backwards. And it is Moore's Law + changes in other devices (volatile memory -> 3D memories, trust-in-the-rust disks -> solid-state disks) that is doing the realizing. The argument then needs to center around how languages designed with 1970s design guidelines, which include basically all languages today except for oddities like ZPL and Maude, are not really prepared for today's marketplace.

Who are the EAPLS?

Z-Bo writes: For me, I don't know what EAPLS really is, apart from trying to read the home page.

I'm not very familiar with them, except that they took over publication of The Journal of Functional and Logic Programming from MIT Press in 2000. They support, with EATCS (which I know much more about), ETAPS, Europe's biggest programming languages conference series. They aren't big supporters of any other major conferences that I am aware of.

I'd like to hear more, if anyone knows.

1970s languages for the 2010s?

your over-simplifications need rework for me to take them more seriously

I never oversimplify.

you vastly oversimplify what map-reduce is

The simplicity is the whole point: that you don't have to worry about fault tolerance or storage management. Map-reduce removes accidental complexity, leaving only intrinsic complexity (to paraphrase Ross Anderson). If only more distributed programming were like that.
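
To make that concrete, here is a minimal word-count sketch in the map-reduce style -- plain Python, deliberately not the API of Hadoop or Google's MapReduce -- where the programmer writes only two small pure functions, and everything about partitioning the input, re-running failed tasks, and deciding where the data lives is left to the framework:

    # A toy, sequential stand-in for a map-reduce framework. Only mapper() and
    # reducer() are "user code"; a real framework would shard the input across
    # machines, restart failed tasks, and pick the storage system -- none of
    # which shows up in the two functions below.
    from collections import defaultdict

    def mapper(document):
        # Emit (word, 1) for every word in one document.
        for word in document.split():
            yield word, 1

    def reducer(word, counts):
        # Combine all the counts emitted for the same word.
        return word, sum(counts)

    def run_mapreduce(documents, mapper, reducer):
        groups = defaultdict(list)          # the "shuffle" phase
        for doc in documents:
            for key, value in mapper(doc):
                groups[key].append(value)
        return dict(reducer(k, vs) for k, vs in groups.items())

    docs = ["to be or not to be", "to do is to be"]
    print(run_mapreduce(docs, mapper, reducer))
    # {'to': 4, 'be': 3, 'or': 1, 'not': 1, 'do': 1, 'is': 1}

The point of the sketch is what is absent: nothing about faults or storage leaks into the mapper or the reducer.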

the annoying buzzword "data-intensive"

Probably I should have invented a new word. By data-intensive computing I mean computing where most of the added value comes from the use of large data sets. This includes disciplines such as data mining, machine learning, image and language recognition, computer graphics, signal processing, and virtual worlds, but also search and database management. It also includes large parts of distributed algorithmics, such as swarm intelligence, gossip algorithms, and structured overlay networks, because they implicitly use the structure of a large distributed system as their data. All the essential parts of these disciplines will eventually be folded into languages and disappear.

how languages designed with 1970s design guidelines, (...), are not really prepared for today's marketplace.

This is very easy to answer. All the "1970s" languages assume reliable shared memory with instantaneous global coherence. This is very far from what data-intensive computing needs, which is large-scale distributed programming. All the current work in programming large-scale distributed systems is actually language design in disguise. Libraries for such systems provide language primitives at a high level of abstraction. A typical example is the peer-to-peer transactional store that we built in the SELFMAN project (with both Oz and Erlang implementations). But in order to use this high level, we have to make a tedious detour through the low level by doing explicit library calls and writing programs that run on single nodes. At some point, the detour will become too tedious and we will jump to a higher level of abstraction. Chapter 11 of CTM shows one way to do it, namely network-transparent distribution, but this only works up to about 10 nodes. Beyond that, we need new ideas. This is my point.
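
To illustrate the "tedious detour" concretely, here is a toy sketch, again in plain Python. Everything in it is invented for the example -- the Store class and its transaction() call are hypothetical and single-process, not the actual SELFMAN store's interface -- and the only point is how different the application code looks at the two levels:

    from contextlib import contextmanager

    class Store:
        """Toy stand-in for a distributed transactional key/value store."""
        def __init__(self):
            self._data = {}

        @contextmanager
        def transaction(self):
            # A real store would run a distributed commit protocol here.
            working_copy = dict(self._data)
            yield working_copy
            self._data = working_copy       # "commit" by publishing the copy

    store = Store()
    with store.transaction() as txn:
        txn["account:42"] = 1000

    # Level 1: the tedious detour -- explicit library calls on a single node.
    with store.transaction() as txn:
        txn["account:42"] = txn["account:42"] - 100

    # Level 2: roughly what a language-integrated, network-transparent
    # abstraction might feel like: distributed state used as ordinary data.
    class Accounts:
        def __getitem__(self, name):
            with store.transaction() as txn:
                return txn["account:" + name]
        def __setitem__(self, name, value):
            with store.transaction() as txn:
                txn["account:" + name] = value

    accounts = Accounts()
    accounts["42"] -= 100                   # reads 900, writes 800
    # (Note: the read and write above are two separate transactions; a real
    # design would make the whole statement atomic.)

    with store.transaction() as txn:
        print(txn["account:42"])            # 800

Of course the sketch dodges exactly the hard part: a real network-transparent design has to say what happens when nodes fail or partition mid-transaction, which is where the chapter-11 style of distribution stops scaling and where the new ideas are needed.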

A need to go for new ideas, then? Yes, I do think so, too.

However, I wouldn't bet very comfortably on either side of the yes/no answer to the original question "Will data-intensive computing revolutionize programming languages?" -- I'm afraid that, as far as PLs are concerned, it's still rather too early to tell as of 2010 :(

But maybe I can bounce off this, here, with the following take:

Actually, after reading the thread attentively, IMHO everybody in it made a good (or very good) remark at some point or another; I really enjoyed it.

For instance, I totally agree with dmbarbour's point about preferring the proper abstractions and theoretical input available to our knowledge, at the right moment and in the right place, depending on the purposes and goals at hand -- that is, instead of privileging a blind, "brute force" approach to problem solving, where you don't bother to take a step back and look at the bigger picture (again, of the problems you suspect or identify and want to address).

Now, on PLs. Who knows what the future holds for them, at least from the point of view of what we *currently* know about them and their underlying theories.

Seriously. It's already difficult to get everybody to agree on how to give a proper definition of their semantics, even in the general case of, say, general-purpose languages. Never mind (even harder to agree on, even more prone to debate and harsh "religious" views) the day when you'll have to choose THE standardized proof-checking tool suite, to make yourself (and others) happy about the expected understanding of the language and the programs you'll write in it...

And all this, of course, is difficult enough already, even in the simplest case of single-threaded execution.

So, yes, I couldn't agree more with dmbarbour and with this line of yours: if one wants to seek a less guessy, less ad hoc answer to the question that opened the thread, I suspect one needs to step back, pause a bit (a lot?), and first try hard to look at the bigger picture, for a broader scope of vision.

Doing so, you have to be intellectually "honest" with yourself, not take everything you already know well (or very well) for granted, and accept the fact that you may need to rework the definitions of your own vocabulary.

PLs can be, at the same time, a source of food for and a consuming "muscle" of those first, very primitive Turing machines that are put into flesh by the hardware, your processor architecture. Those machines are so primitive that the only logic they know how to execute is one based on the natural integers, and, to add to the "insult", the intrinsic representation of values is in base 2!

From then on, everything else is just a matter of reifying abstractions familiar to us humans, which come either from real-world examples or from self-fed artifacts of PLT (abstract data types, algorithms, etc.).

There is of course no fundamental difference between the machine opcode set of your microprocessor and the assembly language you devise to write phrases of the former in a friendlier way. Both are "languages"; both are also the sources of all their possible "serializations" (some of which are meant, eventually, to be tangible to humans on... a tangible support, mind you!) to which you attach a meaning by projecting them through one, two, or more indirections. Then you devise an even higher-level language, a form of cross-platform assembler -- the C programming language -- to ease the implementation at large of what you call OSes.

Why so, by the way? My own "theory" (I may be totally wrong, obviously) about the historical/semiotic aspects of making such a decision and having such a rationale -- if you even dare to suppose that the presence of the Unix idea wasn't necessary -- goes like this:

While English-like assembly is surely enough for you to write small useful programs that mostly deal with (and profit from) a) the microprocessor, b) the memory, and c) some kind of primitive input/output device, it's a totally bigger ballpark to make a (mini-/micro-)computer appealing enough to end users, with a Good Old Sacred "Operating System" (and there enter the religions, btw...) that is expected to support dozens, hundreds, or thousands of different peripheral hardware devices... Hence the need for C (or something similar) to take OSes beyond the phase of being just an "idea".

But then, while OSes are pretty much "closed" (written once, and hopefully for the long term), the applications invented every new day are not, and you figure that C still lacks the expressiveness to avoid redoing boilerplate code over and over again...

That's when you step back again and you devise, more or less slowly, more or less surely (depending on the useful input/ideas from others), more or less consciously, even... a new paradigm.

A new paradigm... "new" either in the way you intend to express computations in the syntax/semantics (OO vs. functional vs. hybrid, etc.) or in the way you intend to integrate and merge it right on top of your lower reification layers (the platform, OS, runtime environment).

There come things like Java and .NET... with slight variations in their choice of specifics, but fundamentally running after the same idea: since everybody wants and talks about OO types, libraries, etc., why not make them something that comes right out of the box for the development tools and applications which, in turn, will sit on top of them?

And that's it. So, all in all, from what you have already pointed out, and especially since you insisted on the open questions about the impact of distribution and scalability on the future of PLs, I'd say I personally suspect just another side of a paradigm shift that I've also found myself speculating about a bit, but from a slightly different (though related) angle:

http://lambda-the-ultimate.org/node/3895

HTH! :)