large-scale programming systems inspired by ecological and market systems

Gerald P. Weinberg's second law says: "If builders built houses the way programmers build programs, the first woodpecker that came along would destroy civilization." We've all heard that one, right?

Thousands of reams of paper have been spent (wasted?) on "what can we do about it?" And, to date, we don't have really good answers. A single bit going stray usually crashes a program, or causes it to give ridiculously wrong answers. Every program has to have everything it depends on arranged just so, or it ignominiously fails or bails out with an error. And entire industries are built around putting together suites of software that don't have conflicting requirements that prevent them from working together, or babysitting complicated applications that will die horribly if there is so much as a bobble in any of a score of systems that provide it with a score of different things it needs.

What we've seldom done is to look at the relatively robust very large systems that we *have* studied and try to abstract principles that can be applied to programming. Two such systems are ecologies and free markets. Both are characterized by: many relatively independent agents with limited lifespans; agents in proximity, even if previously unknown to each other, are able to perceive each other and interact; competition and cooperation between independent agents; a small number of outliers, atavisms, recessives and contrarians constantly explore non-mainstream strategies for many generations, even when those strategies have failed or had limited success in the past; survival/reproduction advantages given to more efficient (or more correct) agents; diminished returns for doing something when there are many agents which do that same thing, and, crucially, dynamic reconfiguration of dependencies in that when some agent fails (dies or goes out of business) other agents that formerly depended on its output can switch to depending on different agents rather than dying or going out of business themselves.

Can these lessons be taken to heart in a software system or programming paradigm? Should threads "pay" other threads for services, in tokens redeemable for compute time or memory space (and, in finding a "going rate", eliminate inefficient providers?) How would we simulate "ecology" or "marketplace" of services that different threads can provide, to one another and ultimately to end-user programs? Should threads reproduce or "copy successful business strategies" automatically?

I think you'd need to keep track of state pretty carefully. If a subcontractor makes a hash of something, or dies due to an error, you need the power to detect the problem, revert your data, and hire a different subcontractor. So each thread would need data versioning, or something like it; the power to remember the current state of its own local environment, and return to that state if needed. You'd also need to automatically check subcontractors' work against each other from time to time, to detect errors/cheaters.

I think you'd need a "DoOneOf" statement, for trying to do something in each (or all) of many different ways (or via subcontracting to many different threads or spawning many different threads) and proceeding if any one of them is successful.

I think you'd need to automatically make occasional "configuration tests" (maybe whenever launching new threads of the relevant type) even when things were going fine, to find, for example, databases at new locations, new network interfaces, etc. This might be handled via a "DoOneOf" mentioned above - and would also automate load-balancing, since the query to the least-used database or interface would tend to return first and thereby get "paid" for the completion of the task.

I think you'd need capabilities (revocable or failable service hooks via other threads) rather than a simple linked-call interface.

How, in a programming language, would you handle a marketplace of software services, where threads advertise what they do and offer rates for which they'll do it (and the schedules on which they will do it, where appropriate)? Can that be abstracted below programmer notice, or not?

Ray

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Redundancy

The traditional decomposition into the two questions of "Is there a correct specification?" and "Does the program implement the given specification?" have some merit. In most engineering, including civil/structural, it is almost taken for granted that reliable constructions need to be specified with degrees of error tolerance and fail safe backups. One version of applying that to programming to check if a program implemented a spec would mean that functions were first specified and then handed off to different independent teams of programmers to implement independently. CASE tools could then check whether the independent implementations gave identical behavior for all code paths. Independent tests of specification correctness would also be helpful, but harder to automate if the specification isn't formalized in some logically complete language. But perhaps "test oriented specification making" would allow good test coverage of where a spec is approximately correct, and the idea of cross validating spec work from independent teams would again be helpful.

Cost (time, money, and run-time efficiency) is usually a hidden issue in all these discussion; producing large, error filled programs is very expensive, and producing much more reliable programs would be more expensive on one or more cost dimensions. The builder/programmer comparisons tend to be unfair to the programmer in a bunch of ways, starting with the fact that the large program is much, much, bigger than the large building in an information-theoretic sense - i.e. bigger in terms of how many bits are required to specify what matters about that particular program vs. that particular building. Buildings and their construction are more similar from one to the next.

In the winter of 2010 ...

... the so-called free market looks more like an abstraction than an actual functioning system, never mind anything fit to model good engineering on.

Markets as Designs

I've always said that if an engineer designed a system whose behavior was as unstable that exhibited by market economies, he'd never again find work as an engineer.

The difference in instability...

Is that unstable markets continue to function and correct themselves, whereas unstable programs (at least the way we make 'em now) just crash and die.

Conflating word 'stability'

The way economists and statisticians use 'stability' and 'unstable' isn't the same way you see programmers refer to 'unstable'.

Requiring humans to monitor

Requiring humans to monitor the system and adjust its operating parameters is not what I'd call self-correcting.

To be fair, there is still

To be fair, there is still contention in economics about whether humans really do need to adjust operating parameters.

function and correct

hey if somebody have me 800 billion dollars i could probably fix my program to suck less, too.

Most software does not need

Most software does not need any of this: if you examine Martin Rinard's work, you will see experiments to demonstrate that your premise is wrong:

A single bit going stray usually crashes a program, or causes it to give ridiculously wrong answers.

There are many caveats to this -- say hardware becoming more unreliable as it continues to shrink -- but that does not necessarily require a change to how we specify software. A lot of the ideas you mentioned are interesting (e.g., lottery scheduling is a Good Thing and fits into this model), but I suggest focusing more on the problem before picking your solution.

Service Markets

For very large scale programming the use of service markets makes sense. Certainly, achieving this as common in the large scale will resist Denial of Service attacks, Viruses, Worms, and other forms of malice that will suddenly become 'expensive' to create and maintain.

Automating the market can help make it much cheaper and more competitive than it is today, where the need for human intervention has a cost (and effectively causes vendor lock-in). That is, people will be able to compete by offering their own resources and services on the market, build reputations for quality, etc.

That said, I doubt you'll want "threads" to be the elements paid. At the low level, it'd be clouds and independent hosts. At the high level, it'd be services, which could be running any number of threads. The services would pay other services and pay clouds and independent hosts.

How, in a programming

How, in a programming language, would you handle a marketplace of software services, where threads advertise what they do and offer rates for which they'll do it (and the schedules on which they will do it, where appropriate)? Can that be abstracted below programmer notice, or not?

Is there any reason to think this is a programming language issue?

Is there any reason to

Is there any reason to believe it isn't one?

I would say the answer to your question is: yes. Integration with a market would have some very pervasive requirements for composition and expression of code. Further, it's one of those things we would certainly want to get "right" even while handling failures or 'out-of-money' errors and such.

This suggests it will likely become a PL issue, eventually, with integration at both the language and something like a type-system. Though I would suggest we get it working and determine which patterns are the most common before we go so far as to build it into a PL.

Building large-scale robust systems

To build large-scale robust systems you need two things: redundancy and feedback. I don't understand why you limit your inspiration to ecological and market-based systems. A more direct inspiration is (parts of) biological organisms, like the human respiratory system.

In computing, probably the platform that lets you do the most in this regard is Erlang. It supports redundancy (each process or "agent" is independent) and feedback (each internal node in a supervisory tree is a feedback loop guarding part of the system). We have extended these ideas to loosely coupled systems: see The Limits of Network Transparency in a Distributed Programming Language.

Of course

Fault-tolerant circuit design also teaches these lessons in the context of hardware, not software. This is due to the trivial reason that circuits are networks with a closed loop and a return path, and fault tolerance is derived mainly from studying the number of (redundant) input and output pins.

The ideas I've been exploring in programming language design are actually mostly inspired by networking and circuit design.

Redundancy and Feedback

Redundancy and feedback might be enough in a closed system, but you'll note that even the human respiratory system needs some extra facilities to deal with pneumonia or smoke damage.

In open systems you need to imagine there are plenty foreign bodies of code waiting to mess things up. When that happens, it is necessary to know which systems to shut down, and which to repair.

Markets are suitable for dealing with the case where the reason things mess up is competition for limited fungible resources, such as CPU, Memory, Bandwidth, Energy, Power, competition for limited non-fungible resources (such as specific time slots, access to locks, control over a camera), and to help pay for Waste (Memory Garbage or Heat). There are other mechanisms for dealing with other threats.

How, in a programming

How, in a programming language, would you handle a marketplace of software services, where threads advertise what they do and offer rates for which they'll do it (and the schedules on which they will do it, where appropriate)? Can that be abstracted below programmer notice, or not?

I'm fairly skeptical about the general applicability of these ideas.

The reason is that mathematical models of markets (and ecologies) are essentially feedback systems whose semantics are given as the implicit solution to a fixed point equation. E.g., the proof of the existence of Nash equilibria in game theory works by an appeal to Brouwer's fixed point theorem. However, the computational cost of calculating these fixed points is, in general, quite high -- finding Nash equilibria is the archetypal PPAD-complete problem. So this means we face the traditional obstacle to declarative programming: if we view specifying a program as a market-optimization problem as a kind of declarative program, then we face the problem that generic strategies for solving these problems can be very slow in practice.

Now, there are subclasses where efficient solutions are possible: Daskalakis and Papadimitriou have shown that polynomial time approximations are possible for anoynmous games. This may make a reasonable basis for a DSL.

No agent in the market has to "solve" the market.

Your points about the complexity of, eg, finding equilibria are well-taken, but why would any specific part of the system need to do that? Equilibria in markets and biomes are emergent properties. A trader is never thinking about the market as a whole; he is making a decision about the current deal. A woodchuck, similarly, doesn't work out exactly how much oxygen it needs to consume and how much CO2 it needs to produce in order to have the optimum regulatory net effect on the local biome. It just breathes as much as it needs to, thrives and raises young if the biome favors its needs, or has limited success of the biome does not, and the equilibrium in the local biome emerges over woodchuck generations.

Just a note, but the word "equilibrium" itself is something of a misnomer; what markets and biomes find are more properly attractors which affect a dynamic (chaotic) current configuration.

That's the key isn't it?

"Just a note, but the word "equilibrium" itself is something of a misnomer; what markets and biomes find are more properly attractors which affect a dynamic (chaotic) current configuration."

You want to apply chaotic processes modeled after those that occur naturally to what problems? Flying the Space Shuttle? Keeping your credit card balance accurate over time? How is that desirable anyway? Isn't there a field that researches this already? Distributed agents? Network protocols?

This is all just non language related bloviating in my opinion.

Some Previous Work

See "Combining Agoric and Genetic Methods in Stochastic Design" at either http://autogeny.org/chsmith.html

or at it's Google-cached copy

http://74.125.47.132/search?q=cache:rdB35R8s8ocJ:autogeny.org/chsmith.html+autogeny.ORG/chsmith.html&cd=3&hl=en&ct=clnk&gl=us

In particular, scroll down to the section titled "Charles Smith: an Evolutionary Economic Framework".

Even more relevant may be Eric B. Baum's Research at http://www.whatisthought.com/eric.html

Scroll down to "artificial economies of agents that reinforcement learn". Start with Baum's paper "Evolution of Cooperative Problem-Solving in an Artificial Economy" which describes his program Hayek.