Programmer Archeologists

I ran across this while doing some research recently:

In Vernor Vinge’s space opera A Deepness in the Sky, he proposes that one of this future’s most valuable professions is that of Programmer-Archaeologist. Essentially, the layers of accreted software in all large systems are so deep, inter-penetrating, idiosyncratic and inter-dependent that it has become impossible to just re-write them for simplicity’s sake – they genuinely can’t be replaced without wrecking the foundations of civilization. The Programmer-Archaeologist churns through this maddening nest of ancient languages and hidden/forgotten tools to repair existing programs or to find odd things that can be turned to unanticipated uses.

From A Deepness in the Sky by Vernor Vinge:

“The word for all this is ‘mature programming environment.’ Basically, when hardware performance has been pushed to its final limit, and programmers have had several centuries to code, you reach a point where there is far more significant code than can be rationalized. The best you can do is understand the overall layering, and know how to search for the oddball tool that may come in handy -”

Such an interesting possible future. Hopefully we can avoid it with better code data mining techniques and eventually AI. But if not, is it possible that future programmers will be more like scavengers looking for interesting tidbits of code to reuse rather than producing new code from scratch?


Abstraction may alter that scenario

In the long run, it seems the distinction between our overall (heterogeneous) computing environment, and a single (homogeneous) programming language within it, will dissolve.  There are two ways that can happen, though, and I suggest Vinge's "mature" scenario would only happen in one of them.

Long ago, when I first set out to build a theory of abstraction, I drew a basic distinction between radical abstraction, which one has to understand by stepping outside the environment of the previous abstraction layer (e.g., implementing Scheme in C), and incremental abstraction, which can be understood as a change in a programming environment that occurs within the environment (e.g., implementing a new set of classes in Java).  I soon concluded that radical abstraction is neither theoretically nor practically interesting (for abstraction theory).

Theoretically, radical abstraction is uninteresting because —in theory— when presented with a radical abstraction you can always step back to view the larger computing system within which the radical abstraction takes place, and from that more rarefied perspective the abstraction is actually incremental.  Just as we used the source code of some new Java classes to go from (Java without those classes) to (Java with those classes), we use the source code of a Scheme interpreter in C, together with some commands to the operating system, to go from (OS platform with a C compiler but not a Scheme interpreter) to (same OS with a C compiler and also a Scheme interpreter).
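A toy illustration of this "step back and it's incremental" view (my own sketch, not from the comment): embedding a tiny prefix-notation expression language inside Python, standing in for "Scheme in C". From inside the little language this is a radical abstraction; from the perspective of the host system, we have merely gone from (Python without `evaluate`) to (Python with it).

```python
import operator

# The embedded language's primitives, mapped onto host-language operations.
OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul}

def evaluate(expr):
    """Evaluate a nested-tuple expression like ('+', 1, ('*', 2, 3))."""
    if isinstance(expr, tuple):
        op, *args = expr
        return OPS[op](*(evaluate(a) for a in args))
    return expr  # atoms (numbers) evaluate to themselves

print(evaluate(('+', 1, ('*', 2, 3))))  # 7
```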

Practically, it seemed the insights to be had in abstraction theory would come mostly by studying the more limited case of abstraction within a programming language in the conventional sense.  That comes back to Vinge though (so I claim I'm not really off topic).

Within a single programming language, things are quite uniform, and the challenge is to achieve efficient abstraction, so that in successive abstractions one does not accumulate unwanted dependencies on previous layers.  Accumulated dependencies would make each successive abstraction more difficult (equivalently, more brittle) than those before, until eventually it becomes impractical to go further, and we've reached the "radius of abstraction" of the base language.  It seems we can't achieve Vinge's mature programming environment in that scenario because the mature environment is too sprawling to fit within the abstractive radius.  On the other hand, if we could achieve efficient abstraction, making the radius of abstraction effectively infinite (is that possible, and in what sense, are key questions for the theory) — if we could, then with such efficient abstractions, the dependencies between layers might be sparse enough that we could avoid the unmanageability Vinge described.

It seems that in this homogeneous PL environment, we either can't reach the scope of Vinge's mature environment (with low radius), or needn't reach its unmanageability (with high radius).

But, when we reach the abstractive radius of our programming language, and want to go farther, what do we do?  We may build AI-like tools to try to coax more out of the language, but imho this is a losing strategy, in that we can never get more than a (poetically) constant factor increase in abstractiveness that way, whereas improving the abstractive facilities of the language could give us a larger asymptotic class.  What we do instead is invent a new programming language — "radical" abstraction, which takes us back to the heterogeneous OS-level environment.  In principle, our abstractions at that level ought to be really quite efficient:  we should be able to implement Scheme in C or Java or any other general-purpose language and end up with the same Scheme language... which doesn't quite happen because there are quirks of the implementing language that create quirks of the implementation.  A heterogeneous OS platform seems a friendly environment for unmanageable complexity; so I can well believe that if our centuries-future platform evolves from heterogeneous OS platforms instead of homogeneous PL platforms, Vinge's "mature" scenario may indeed come to pass.

You used to have to work

You used to have to work hard to make code portable. Now code is generally portable by default (modulo certain OS abstractions). You used to have to tediously manage all memory resources, and your programs were riddled with resource management logic instead of business logic, but now memory management is all automatic.

If you wanted your program to distribute across a cluster, you used to have to program all the networking manually, along with the distributed consensus protocols, but now all of that can be mostly automated. If you wanted search capabilities, you used to have to manually write some sort of B-tree and choose your search method carefully, but now with constraint systems, this too is becoming declarative and somewhat automated, and I expect it will become part of the next generation of languages (it's already in Oz and Alice ML, and in a limited form in Linq on .NET).
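The declarative style gestured at here can be sketched in miniature (a brute-force stand-in for a real constraint solver, not how Oz or Alice ML implement it): we state the constraints and let the system do the searching.

```python
from itertools import product

# Declarative search: find digits x, y satisfying x + y == 10 and x * y == 21.
# We state *what* we want; the enumeration strategy is the system's concern.
solutions = [(x, y) for x, y in product(range(10), repeat=2)
             if x + y == 10 and x * y == 21]
print(solutions)  # [(3, 7), (7, 3)]
```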

I don't think Vinge's future will come to pass, because the trend is that better abstractions move more code into the standard runtime, which ends up dramatically simplifying programs. Our programs then also become more ambitious, but not much more "complex" than before.

back and forth

I think it is fair to say lower level languages get popular when hardware resources are scarce and higher level when hardware resources are plentiful. The move towards PCs re-popularized lower level languages; the things you mention as improvements would be seen as tremendously low level by a typical COBOL programmer.

I think we are clearly getting better at creating efficient, abstract programming languages and paradigms with time. But our problem set is so much more complex. Think about how many levels of understanding are required just to work through what is actually happening to a .NET program on a CPU.

I think we have already crossed the point where no one person can actually understand how a major application works top to bottom in detail. Now it is just a question of how small the percentage will be.

understanding things top to bottom

(Saw “node 4424” mentioned with reverence by none other than Anton himself. Had to check it out.)

Re: comment 68699:

I think we have already crossed the point where no one person can actually understand how a major application works top to bottom in detail.

Gerald Jay Sussman concurs. That's exactly why they de-Schemed MIT.

Going Low Level

I think it is fair to say lower level languages get popular when hardware resources are scarce and higher level when hardware resources are plentiful.

The fun aspect for me is the idea that there is a natural progression: archaic micro-controllers are paired with low level languages from the bronze age of programming, and as hardware resources increase we also move upward through history towards the contemporary promiscuous and postmodern melange of web programming in the cloud. This is also basically a story in which we express our own subjectivity: a linear order of stages, a progression from lower to higher forms, from the primitive to the complex, but also becoming old and cranky.

At the same time I like C. Doctorow's nice quip that a car has become a computer which also drives. Of course this is meant to indicate a surprising inversion in the order of things, but in terms of programming technology it means the bronze age becoming contemporary again, because of the sort of computer a car is (and a train, a plane, ...). So it is a double inversion. I guess for the years to come the major fun will be to plunder the toolboxes and idea-scape of modern programming languages, throw out some of the OOP and FP idioms, and go low level.

I'm fascinated by the future

I'm fascinated by what the future could be; the actual future is probably more boring than that. I totally believe that we will be able to keep up, at least in our lifetimes.

What if our abstractions eventually don't keep up with the complexity of our technology? This could happen in a couple of ways. First, there could be a singularity where strong AI takes over developing the technology. While we could benefit from the tech, we'd no longer really understand how it works, and then we essentially lose control. We already can see this happening in machine learning: we can train a program to do something, but the structure built up by training is mostly opaque to us, much like the results of evolution and the current state of a dynamic physical system.

Second, if there is no singularity and we somehow stagnate in our technology development. This is more of Vinge's vision. In that case, a lot of useful but not obsolete software would accumulate very quickly. Today's code effectively has a shelf life related to the platform it depends on and the nature of its job. It seems if those shelf lives were extended, the code would have to live longer also (examples today already include many COBOL banking systems).

It is interesting that the timekeeping kernel ...

... uses the Unix epoch, specified as about 15 megaseconds after the first lunar landing. To be precise, Eagle landed at timestamp -14182941.
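The arithmetic checks out; here is a quick verification (the touchdown time 20:17:39 UTC is chosen to match the quoted timestamp; the commonly cited figure is 20:17:40 UTC, one second later):

```python
from datetime import datetime, timezone

# Seconds from Eagle's touchdown to the Unix epoch (1970-01-01 00:00:00 UTC).
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
eagle = datetime(1969, 7, 20, 20, 17, 39, tzinfo=timezone.utc)
print(int((eagle - epoch).total_seconds()))  # -14182941
```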

Replaceable Abstractions

A counter-vision to Vinge involves automating reuse of code, representing maintenance, security, packaging, and upgrade concerns directly in the language and module system. We can envision having a culture-based `software singularity` even without AI. Automated discovery of code effectively enhances the system with the amount of available code, without requiring that code be written by AI. A linker can be a simplistic constraint solver that only uses code written by humans.

Policy to favor one implementation over another can be represented in our code, rather than left implicit. By representing policy, our code becomes very adaptable (configured based on preferences or available implementations) and resilient (fallback implementations can be used if necessary). Fallbacks in turn, with a carefully designed state model, can better enable runtime upgrades - reducing commitment to any given implementation.
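A minimal sketch of what "policy represented in code" with fallbacks could look like (all names here are hypothetical illustrations, not an actual system): implementations register with a preference, and a resolver falls back down the preference order when the favored implementation is unavailable.

```python
# Hypothetical registry: interface name -> list of (preference, implementation).
REGISTRY = {}

def provide(interface, preference):
    """Decorator registering an implementation with an explicit preference."""
    def register(impl):
        REGISTRY.setdefault(interface, []).append((preference, impl))
        return impl
    return register

def resolve(interface):
    """Pick the highest-preference implementation that actually works."""
    candidates = sorted(REGISTRY.get(interface, []),
                        key=lambda t: t[0], reverse=True)
    for _, impl in candidates:
        try:
            return impl()          # an implementation may refuse to load
        except RuntimeError:
            continue               # policy: fall back to the next candidate
    raise LookupError(interface)

@provide('hash', preference=2)
def fast_hash():
    raise RuntimeError('native library unavailable')  # simulated failure

@provide('hash', preference=1)
def portable_hash():
    return lambda s: sum(map(ord, s)) % 2**16

h = resolve('hash')   # silently falls back to portable_hash
print(h('abc'))       # 294
```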

It is this vision that leads me to develop a reactive programming model with controlled access to state, a focus on fusing and orchestrating heterogeneous data models, and a module system that provides a clean separation between interface and implementation.

Is it possible to develop a programming language that can scale to a million developers? In seeking an answer to this, I learned that `composition` is far superior to `abstraction`: ability to represent diverse ideas pales compared to the value of composing and understanding all the different ideas we represent. Automated discovery and upgrade means we cannot poke too far into the implementation, so most valid reasoning is compositional in nature.

My position serves as a counterpoint to John Shutt, who believes that rich abstraction and clean separation of abstraction and performance will somehow solve scaling problems. They do not. `Efficient abstraction` and `scalable abstraction` are distinct concepts, though both properties are valuable. I would constrain developers to abstractions they can reason about at scale and across scalability challenges (e.g. runtime upgrade, network disruption, diverse data models, and heterogeneous authority).

Similar visions include Gilad Bracha's `Serviced Objects`, though I believe those fall short for reasoning about disruption or sensitivity to starting time, and do not help resolve a problem of upgrading (or replacing) stateful objects without losing work.

There are some extremely

There are some extremely interesting issues here, which I fear get severely muddled (even mangled) by a terminology incompatibility. It's seemed to me for quite some time now (and I feel defeated in my efforts to untangle it) that we're using the word "abstraction" in ways incredibly subtly and most profoundly different. I'm put in mind of a running joke in Vernor Vinge's A Fire Upon the Deep (to which A Deepness in the Sky is the prequel) in which someone is commenting on the events of the story from the far side of many layers of automatic translation, and keeps going on about how the key to understanding the situation is the fact that humans are hexapodal (or at least, that's how their commentary comes out once it's passed back through the many layers of automatic translation; I'm recounting from memory, but it ran something like that).

Note, I was not using the term efficient in its performance sense. I'm thinking of something more like resistance in a wire. When a new layer is constructed on top of the existing system (in my terminology, an act of abstraction), the new layer may have unintended dependencies on the pre-existing system, and these unintended dependencies are like heat generated by an electrical current due to resistance in the transmission wire. This line-loss curtails one's ability to transmit electricity, just as the accumulating unintended dependencies limit one's ability to repeatedly extend the system with successive layers. What we really want is a good high-temperature superconductor.

(I actually suspect, but it's only a suspicion, that what you mean by "composition" may be remarkably closely related to, and complicatedly entangled with, what I mean by "abstraction", and that if one could ever manage to grok both what I mean and what you mean at once there might be really marvelous insights to be had... but I'm beginning to wonder if we've managed to produce two big ideas that are impossible to hold in the same mind at the same time.)

What I mean by composition

What I mean by composition refers to having composition operators (e.g. using the Arrows model), a set of operands (universally including results of prior composition), and compositional properties (i.e. that you can inductively reason about based on the operator and properties of operands). I explain composition in an earlier discussion.
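A rough sketch of that idea (my illustration, far simpler than the Arrows formalism): operands carry a compositional property, and each composition operator derives the property of the whole inductively from the properties of its parts, without looking inside them.

```python
class Op:
    """An operand: a function plus a compositional property (latency bound)."""
    def __init__(self, fn, latency):
        self.fn = fn
        self.latency = latency

def seq(f, g):
    """Sequential composition: worst-case latencies add."""
    return Op(lambda x: g.fn(f.fn(x)), f.latency + g.latency)

def par(f, g):
    """Parallel composition: worst-case latency is the max of the branches."""
    return Op(lambda x: (f.fn(x), g.fn(x)), max(f.latency, g.latency))

a = Op(lambda x: x + 1, latency=3)
b = Op(lambda x: x * 2, latency=5)
pipeline = seq(a, par(a, b))
print(pipeline.fn(1), pipeline.latency)  # (3, 4) 8
```

The point is that `pipeline.latency` was reasoned about purely from the operators and the operands' declared properties, never from their internals.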

To achieve a broad set of useful compositional properties requires constraining the set of operands and operators. That is, one is limited in the abstractions one can represent. One can associate `abstraction` (noun) with a named operand or result of composition.

You refer to `layers` of abstraction. This applies just as much in that scenario: developers would be constrained to express layers that can further be composed and layered and maintain compositional properties. Not all abstractions are layered, of course. (The notion of `layering` - more formally `staging` - can itself be represented by a composition operator, e.g. `eval` or ArrowApply.)

I think I understand what you mean by efficiency, though I would have a different terminology for that concept (cf. semantic noise, syntactic noise). But I would still maintain that potential noise - some mismatch between what you can say and what you might want to say - is an acceptable (and apparently essential) tradeoff for compositional properties.

Our ability to "repeatedly extend the system with successive layers" is not necessarily a function of "accumulating unintended dependencies". Hidden dependencies are not a problem if all relevant reasoning about the code is either local (i.e. within one module) or compositional.

I'm slowly forming a theory

I'm slowly forming a theory about how our terminologies are misaligning so painfully.

Forget I ever used the term "layer" with respect to abstraction; it can only be made to work with my notion of abstraction if one already has an unshakable grasp of the notion I'm using, so I should have known better than to use it.

Composition, in the sense you're... I think... using the term, is something that goes on within a language. It's essentially part of the structure of the text. You can compose whole modules with each other, but for the scale I'm working on that's still just fine structure. Modules exist within the medium of discourse.

Abstraction, in the sense I'm using it, isn't in the structure of the text; it happens to the entire medium of discourse as a whole. Composing abstractions makes about as much sense as Chomsky's "Colorless green ideas sleep furiously." Composition in your sense and abstraction in my sense belong (I conjecture) to utterly different worlds. The situation puts me in mind of the point from my monads essay about the 'bigness' of monads: they may look like patterns of typing within the language, but really a monad represents an entire universe of "all possible computations", so vast that entire programming systems are trivia to it.

It's a point that goes by so fast in my abstraction-theory techreport, it's scarcely even possible to discern — that the internal structure of "texts" isn't inherently relevant to the theory. It [i.e., internal structure] comes in through the back door, as classes of properties that are preserved by the families of morphisms one uses to parameterize the expressive and abstractive relations between languages.

Perhaps, abstraction theory has a dual formulation (in the sense of wave-particle duality) wherein the internal structures shift into the foreground. But as yet I haven't worked that out.

External structure

For composition, it is not `internal` structure that is important. That can be hidden; one needs only the summary of compositional properties for further inductive reasoning. Rather, composition is about `external` structure - how each operand (which can hide a whole volume of discourse) is used in context.

When you say "abstraction happens to the entire medium of discourse as a whole", you capture just one point of view - of someone trapped in that `abstraction`. But some developer must construct the interpretive context, and this developer will possess an external point of view - in which your `abstractions` are volumes of discourse with clear spatial and temporal scope. Those volumes may be represented by words or URLs or points in an external `abstraction`.

For scalability, it is valuable that all abstractions support the same compositional properties and decidable analysis thereof. And, thus, the `external` compositional requirements may constrain the abstractions a developer might experience or develop within each volume of discourse.

Internal structure

For composition, it is not `internal` structure that is important. That can be hidden; one needs only the summary of compositional properties for further inductive reasoning. Rather, composition is about `external` structure - how each operand (which can hide a whole volume of discourse) is used in context.

The good news is, the above confirms I'm dead right about where the miscommunication is occurring.  The bad news is, the miscommunication clearly still exists.

I'm not talking about what composition is about, I'm talking about what it is.  You mention operands and contexts; from the perspective I'm using, all of that is internal.

No Absolute Reference

Internal and external are just functions of topology relative to observer. There is no such thing as your `abstraction` that is external to everything else.

entirely different dimension

The words "internal" and "external" in the sense you're using them are along a dimension orthogonal to what I'm talking about. In the dimension you're addressing, it's patently obvious there is no 'greatest element' that would be 'external to everything else', but you're mistaken in suggesting that I've ever claimed there was such a greatest element. From my perspective, the entire dimension you're talking about is what I (but evidently not you) would call 'internal'.

Medium of discourse

You say abstraction refers to the whole medium of discourse (i.e. the language, or some higher level description thereof). What you call `abstractions` - even when you make them big like categories or monads - are still just points in yet larger abstractions. Any objective definition of abstraction must allow abstractions to be objects of discourse.

Compositional properties penetrate all abstraction - both the medium of discourse and the objects of discourse. This penetration makes compositional properties valuable: it enables limited local reasoning about non-local phenomena without hindering modularity. Examples of useful and potentially compositional properties include safety, permission, consistency, latency, progress.

By nature, any compositional property will be an imprecise description of behavior. So compositional reasoning must be accompanied by `local` reasoning (e.g. in terms of interfaces, contracts, denotation or operation of a subprogram). However, imprecise does not mean ineffective.

Compositional properties are vastly more valuable than abstractive power when the goal is a scalable language (where `scalable` includes many developers, many administrations, many generations - not just performance). In practice, I believe a developer's ability to layer, extend, reify, or otherwise transform a `medium of discourse` should be curtailed to just those transforms that preserve rich compositional reasoning. Or put another way: developers shouldn't be allowed to `abstract` themselves into a corner where compositional features are unavailable.

I've repeatedly pointed out

I've repeatedly pointed out that you are completely misunderstanding what I'm talking about. I'm not sure how to say that more clearly; I could easily believe I'm [not] explaining my theoretical ideas nearly as well as they deserve, but I'm having trouble believing that the problem getting across the message "you are misunderstanding me" is in the way I'm saying it.

[edited for really egregious word-omission]

Modus tollens

You frequently discuss the symptoms of what you consider to be idealized abstraction. For example: "In principle, our abstractions at that level ought to be really quite efficient: we should be able to implement Scheme in C or Java or any other general-purpose language and end up with the same Scheme language." I understand these well enough, even if I do not grasp your philosophically refined use of terminology.

And I know that we cannot have your "really quite efficient" abstractions and also have my compositional properties. You allege that your `abstraction` is in a different dimension from my `composition`. I don't need to understand you or your terminology to recognize the conflicts in our claims. Modus tollens - conflicts in conclusions indicate error in premises or logic.

At the moment, I believe you have not sufficiently explored composition or the relationship in your theory between abstraction and composition. Terminology concerns aside, you can take my earlier responses as attempts to describe some of the conflicts and relationship, and why composition isn't a separate or `internal` issue even in your theory.

You can't safely conclude

You can't safely conclude there's a contradiction between our claims if you don't know what my claims are. As an abstract example (i.e., not based on specifics of the situation here): Suppose I say that A is true, and that A implies B. Suppose you misunderstand me to be saying that C is true and C implies D. And suppose you observe that D is clearly false. There is a contradiction here, but it's not a contradiction between your observation and my premises; it's a contradiction between your observation and what you thought I said. To put it another way, the contradiction does imply a false premise somewhere, but some of the premises involved are your premises about what I'm saying.

I'm quite interested in how composition might interact with abstractive power, in the sense I'm exploring. Keeping in mind that my formal definition of abstractive power can be rather mind-bending in its consequences; I've barely scratched the surface.

Suggestion for maybe unifying your definitions

Last time you posted about your abstractive power, I read your paper but didn't really grasp the reasons for the definitions chosen. I'd still like to spend a little time and try to do so, but one thought I had at the time, which may be relevant to your exchange here with David, was to consider modifying your setup to treat code as trees of constructs, where the children of a given construct are unordered, as opposed to a sequence of constructs. This goal is to model parallel development by teams of developers.

It seems to me that your framework would consider "most powerful" a language that allowed each construct to install a new interpreter for the tail of the code. Whereas, as David often argues, constructs that work "out of the box" are more important than the ones that can be bolted on after the fact if bolting them on after the fact requires everyone to agree to use the new feature. I think the change mentioned above does a decent job of modeling that situation.
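The "install a new interpreter for the tail of the code" idea, and the follow-up point that such a construct can also rescind that power, can be made concrete with a toy model (mine, not from the papers under discussion): a program is a sequence of constructs, and executing a construct may replace the interpreter used for everything after it.

```python
def default_step(env, instr):
    """Full-power interpreter: a construct may return a replacement
    interpreter for the remaining constructs."""
    return instr(env)

def locked_step(env, instr):
    """Restricted interpreter: runs constructs but discards any attempt
    to install yet another interpreter."""
    instr(env)
    return None

def run(program, env=None):
    env = {} if env is None else env
    step = default_step
    for instr in program:
        new_step = step(env, instr)
        if callable(new_step):   # the construct installed a new interpreter
            step = new_step
    return env

program = [
    lambda env: env.update(x=1),   # ordinary construct
    lambda env: locked_step,       # install the restricted interpreter
    lambda env: default_step,      # attempt to regain power -- ignored
    lambda env: env.update(y=2),   # still runs, but under the lock
]
print(run(program))  # {'x': 1, 'y': 2}
```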

It seems to me that your

It seems to me that your framework would consider "most powerful" a language that allowed each construct to install a new interpreter for the tail of the code.

Ah! But the fact it doesn't do that is kind of at the heart of what makes it interesting. Instead of favoring languages that give you raw power, it seems to favor languages that give you the ability to decide how much raw power to afford in future over facilities built today. Emphasis on seems, because there's a lot of basic start-up work still to be done.

I don't follow

If each construct can, by default, write an arbitrary interpreter for the tail of the code, then any construct can also choose to rescind or restrict that ability for subsequent constructs. It's just a matter of picking an interpreter that imposes those limits. What am I missing?

Hm. Perhaps I'm not following

Hm. Perhaps I'm not following what you mean. Any time you change a language, no matter how small the change (such as adding a single declaration for an integer variable), that could be implemented by writing a whole new interpreter to behave exactly like the pre-change interpreter except with that one difference. Internally it could be implemented like that; but internal implementation matters not at all to the theory (not unless one deliberately exposes some aspect of it, e.g. performance); it only matters what the programmer enters in order to cause the change to happen.

For the programmer to explicitly write-and-execute a new interpreter is a different matter. Being able to do it may indeed sometimes increase abstractive power (though it may involve a class of languages that one would choose to exclude from one's study for another reason), but being able to do so is different from being able to do so gracefully. The theory is parameterized by a notion of expressiveness, which can discern the difference. For the programmer to write a whole new interpreter is likely to be very clumsy and error-prone (one detail slightly wrong results in something wildly different than expected), and the pattern of expressiveness relations around it will therefore tend to distinguish it from the simple variable declaration — the simple declaration is easy to do, relatively easy to get right, and when one gets it wrong the results aren't nearly as drastic as with the full-blown replacement interpreter.

[edit: missing ")"]


I'll just have to go through it again. This post may be helpful to me when I do, so thanks.

On my agenda is a blog post

On my agenda is a blog post explaining this stuff clearly. The trouble is, I haven't yet figured out how to do that. Moral: I'm "aware of the problem".

Temporal and Spatial

In addition to deciding power afforded to the `future` - i.e. on one linear dimension - it is important also to consider power afforded spatially - i.e. for federated and parallel development, security, modularity.

Matt's explanation was well worded, and I appreciate it.

Anyhow, you seem to want abstractive power in practice - i.e. even in a scenario such as Vinge's mature programming environment. It seems to me that granting older generations the power to curtail what is afforded to the future is directly opposite your practical goals.

Restriction on future abstraction based on past abstraction is one scenario I'm aiming to avoid. Inertia, entanglement, and state are bad enough without help.

"Time" is, by nature,

"Time" is, by nature, a distinguished dimension. Each of those other things you mention would presumably impinge on the theory in a different way, and studying just how they impinge on it offers interesting directions for further study.

The idea of power through selective constraint is indeed a slippery one.

Scenario: I write a function, adding it to the vocabulary of the language and thereby producing a slightly different language. If the program thereafter ("in the future") is able to examine all the internal details of the function, that's a different resulting language than if the program thereafter is prohibited from examining those internals. Yet another language would result if the program thereafter were permitted not only to examine the internals, but also to modify them. (Of course examining and modifying can be done in various ways; I'm just sketching, here.) A language that dictates whether function internals will be examinable, and whether they will be modifiable, is presumably going to be less "powerful" than one that allows the programmer who writes the function to decide what the program thereafter will be allowed to do with it.
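The scenario can be sketched in a toy form (my illustration; real languages make this choice at the language level, not per-call): the author of a function decides whether the program thereafter may examine, or even modify, its internals.

```python
def make_counter(expose_internals=False):
    """The function's author chooses what later code may do with it."""
    state = {'count': 0}
    def step():
        state['count'] += 1
        return state['count']
    if expose_internals:
        step.internals = state   # later code may examine (and modify) state
    return step

open_counter = make_counter(expose_internals=True)
sealed_counter = make_counter()

open_counter(); open_counter()
print(open_counter.internals['count'])       # 2
print(hasattr(sealed_counter, 'internals'))  # False
```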

Judging by sheer expressiveness, the language that allows everything to be examined and modified would seem to be most powerful. However, the ability to regulate which of these things will pertain is another kind of power, and one for which (with some formal definitions) I'm using the name "abstractive power". More subtly, it seems the theory should recognize power to regulate that power of regulation (say, allowing the programmer to arrange that it's easier to define functions with unexaminable internals). There may be still subtler forms of abstractive power whose importance will emerge through deeper study of the dynamics of the theory.

Is raw abstractive power desirable? That's a far, far deeper question. Surely it'd be silly to treat any one property of programming languages as universally more important than all others; one wants to understand the interplay between different such properties, and that again leads to further studying the dynamics of the theory.


I would emphasize some different aspects of your scenario. For example, when you define your function, you do not modify `the` language. You only modify a `spatial volume` of the language - where the function happens to be in scope. There is no universal scope, and scopes are not generally hierarchical. Distributing access to a function, modeling scope, is something that would ideally be performed within the language.

The relevance: composition may cross scopes or volumes with distinct abstraction - thus my references to `composing abstractions`, and reifying them, and why the compositional properties constrain abstraction.
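The "spatial volume" point can be sketched by modeling scope within the language itself, as suggested above. In this illustrative Python fragment (the names `make_volume`, `core`, etc. are invented for the sketch), a definition extends only the environment it is added to; there is no universal scope that the definition alters.

```python
# Scope modeled *within* the language: an environment is just a value,
# and distributing access to a function means passing it between volumes.

def make_volume(parent=None):
    """Create a new scope, optionally inheriting a parent vocabulary."""
    return dict(parent or {})

core = {"double": lambda x: 2 * x}

volume_a = make_volume(core)
volume_a["square"] = lambda x: x * x   # extends only volume_a

volume_b = make_volume(core)           # volume_b's language is unchanged

assert volume_a["square"](5) == 25
assert "square" not in volume_b
```

Nothing forces the volumes into a hierarchy: any volume can be handed any subset of another's vocabulary, which is the sense in which distributing access is itself performed within the language.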

I am certain my emphasis is influenced by my efforts towards a scalable development model.

Regarding `power`, it seems to me that the power is in the meaningful decision. In another language, developers might express an accessibility or modifiability decision by choosing `function` vs. `data` vs. `object` or `service` and so on. What you seek often seems more about uniformity of code than about power.

This seems (knock wood) to be

This seems (knock wood) to be the essential difference of perspective between us, and really ought to be a source of strength —allowing everything to be seen from two different angles, thus understanding it more thoroughly— rather than a source of weakness —causing us to miscommunicate.

The two perspectives here might be called "whole" and "parts".  You're apparently confining yourself to look only at the parts, to the point where you deny that they are part of a whole, using a sort of Zeno's-paradox argument.  I am looking at the whole, and from my perspective all conceivable parts are just that, parts.  I fully expect to be able to study the dynamics of parts through their cumulative impact on the whole, which is a fascination of this approach — trying to see everything through its impact on the grand scheme of things.

It's somewhat (I'm supposing) as if I were looking at an unbounded plane and you were looking at bounded regions within the plane.  You reason that no matter what region I look at, there's always a larger one that contains more than what I was looking at.  I reason that no matter what region you look at, it's still only part of the plane.  But my theoretical approach never does look directly at any bounded region, and your approach never looks at the unbounded plane; only in discussion outside our respective theoretical frameworks can we directly view both the parts and the whole at the same time.


Partial View of the Elephant

I posit that, in a scalable programming system, mine is the only tenable perspective. No matter who you are - programmer, engineer, designer, architect, software agent - you can only observe and influence parts of the system. I don't need a Zeno's Paradox argument. Bounded is the view with which humans will recognize, classify, and perform `abstraction`.

I think you should reconsider the validity of your perspective, at least for an argument regarding Vinge's mature programming environment or other scalable systems.

I do look at properties of `unbounded regions`. Compositional properties offer developers a firm, formal grasp of how each part contributes to an unbounded whole. The cost is constraining local abstraction.

my theoretical approach never does look directly at any bounded region

Neither does my theoretical model. But I do model observation and influence by agents within the system. And the view of any agent is finite and partial.

The whole animal

You've got yourself trapped in a sort of conceptual Klein bottle.  You justify your preferred conceptual framework based on the sort of questions you expect to ask, but (it's clear from outside your framework) you're limiting yourself to only ask questions formulated within your preferred conceptual framework — leading predictably to a perception that your framework is well suited for all relevant questions.  (This is a phenomenon I'm familiar with from quantum mechanics, which also defends itself from any conceptual assault by denying the validity of any question it can't in principle answer.)  It's not at all surprising you don't understand what I'm doing, but a little surprising you'd still be assuming you do (as evidenced by "I do look at properties of 'unbounded regions'").

What my approach is optimized for —what, in fact, led me to the approach, because I found the conventional parts-oriented approach couldn't make the grade— is studying how changes in the shape of a programming language affect the shape of the web of application areas it can address.  The approach might have been chosen specifically for maximal suitability to assessing Vinge's scenario.

No better position

I design a system to aid in answering the questions or enforcing the properties I believe are most valuable and relevant. Developers will bridge gaps as they always do - with discipline, documentation, and abstraction.

Attempting a gestalt view will not, in practice, offer a better position to assess a mature programming environment. One will only be able to formulate questions that cannot be answered due to lack of perspective, information, formalization, commitment, confinement, or authority. A model of the mature programming environment will be just another subsystem with its own scope, abstraction, and partial view of the system.

Can you offer a few sample questions you might ask about `the shape of the web of application areas`? I'm curious how you achieve a question whose answer would be interesting, decidable, and non-trivial in most languages.

Between Turing completeness, Goedel's incompleteness, and Rice's theorem, I believe the best we can generically do is answer questions for which the language is designed, and design said language to aid in answering valuable and relevant questions.

Anyhow, we're repeating an old argument between us that never went anywhere the last time we tried it. I'll stop now.

That you think there's an

That you think there's an old argument here shows it's not me you've been arguing with. Any argument you've had has been with a phantom position you've assigned to me; my efforts have been simply to point out to you that you're profoundly misunderstanding me, and afaics you haven't argued against that at all but instead have ignored it.

I'm very sad to see this thread fail; I'd invested a tremendous amount of time in finding ways to respond politely and constructively to your increasingly... well, to your remarks, and it's a shame to see all that effort come to naught.

You said the same thing in

You said the same thing in our prior argument on this subject. I acknowledge that you believe I misunderstand you.

I've been reminded that LtU is not a forum for arguments. Best describe your position in a blog article.

The future

My guess is that the systems would look a lot like what you see in well-maintained corporate systems today. There you have enormously complex systems that are semi-understood by the employees collectively but poorly understood individually. Large cross-functional teams need to meet even to evaluate the impact of mid-sized changes, and large-scale change is simply ruled out.

New code is still written, but it replaces only small chunks of functionality in small subsystems. Dozens if not hundreds of languages are used by the system as a whole.

The reality of the vision

The reality of the vision depends on a trade-off: is it more effective to take available upgrade and migration paths into modernized runtime environments and frameworks, which come with a couple of "strata" or layers out of the box and are centrally maintained by groups of programmers that are small compared to the overall number, or to maintain legacy systems which run stably and are rarely adapted? So it is not that everyone adapts quickly to the tip of progress where the new things are happening (a modernist, nerdy vision of a long chain of disruptive innovations, and a market vision of fashion and replaceable consumer goods), but that the benefits of progressing are balanced against its costs. The "programmer archeologists" would then be those who move to the places of unused migration paths and century-long inertia. Why not?

Composition, Entanglement, Abstraction

Vinge's vision of a calcified, brittle programming environment depends on developers being unable to replace dependencies, or just afraid to do so without `wrecking the foundations of civilization`.

Compositional properties and reasoning - especially those related to safety, security, consistency, latency, resource management, and failure modes - would offer developers an effective, formal grasp of just what happens to layers of dependents when they replace code. This would greatly enhance the grasp developers normally possess based on module interfaces and contracts. This would alleviate fear of change.
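The interface-and-contract half of that claim can be sketched briefly. In this illustrative Python fragment (the names `Store`, `MemStore`, and the toy contract are invented for the sketch), dependents are written against an interface, so a dependency can be replaced without touching them, and a contract check catches a bad replacement before it propagates to the layers above.

```python
from typing import Protocol

class Store(Protocol):
    """The interface dependents rely on; any replacement must satisfy it."""
    def get(self, key: str) -> str: ...
    def put(self, key: str, value: str) -> None: ...

class MemStore:
    """One interchangeable implementation of the Store interface."""
    def __init__(self) -> None:
        self._d: dict[str, str] = {}
    def get(self, key: str) -> str:
        return self._d[key]
    def put(self, key: str, value: str) -> None:
        self._d[key] = value

def check_contract(store: Store) -> bool:
    """A (toy) contract: what you put is what you get back."""
    store.put("k", "v")
    return store.get("k") == "v"

def dependent(store: Store) -> str:
    """A layer above: written against Store, not any implementation."""
    store.put("greeting", "hello")
    return store.get("greeting")

# Swap in a new implementation: dependent() is untouched so long as
# the contract holds for the replacement.
assert check_contract(MemStore())
assert dependent(MemStore()) == "hello"
```

Compositional properties would go beyond this sketch, covering things interfaces alone do not, such as latency, resource use, and failure modes.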

Of course, composition is insufficient.

Even without fear, developers must also be able to replace dependencies. There are various problems to solve regarding code distribution, packaging, configuration, state, security, namespace entanglement, and other causes of dependency hell.

Abstraction is a tertiary concern. Historically, we can and do build large, calcified, brittle systems with simplistic abstraction. But a language that can be tweaked to a wide variety of different domains and purposes will be more pleasant to use at scale (i.e. less boilerplate, less noise, less pain and pressure to change), and thus can be taken by human developers to larger scales.

why it's called "science fiction"

Economic collapse is more likely than software systems that become that complicated.

Already, for example, overly complicated software stacks are explored not by archaeologists, but by cyber-warriors in international conflicts. Already, retail software products collapse and die in competition against simpler substitutes.

As a fact of life no new

As a fact of life, no new software technology ever entirely replaces an older one, but this doesn't mean that the result is a stack understood by analogy to geological strata, the metaphor that motivates the archaeologist's appearance in fiction or in reality. And it is not the buried legacy that causes the most trouble; in Vinge's vision it can be kept under control by the introduction of the software archaeologist.

Already, retail software products collapse and die in competition against simpler substitutes.

Do you have any specific products in mind which have collapsed under their own weight?

Data archaeologists at Facebook

From a Facebook Engineering post:

Three “data archeologists” wrote the conversion rules.