seL4: Formal Verification of an Operating-System Kernel

In seL4: Formal Verification of an Operating-System Kernel, Communications of the ACM, June, 2010 Klein et al

...report on the formal, machine-checked verification of the seL4 microkernel from an abstract specification down to its C implementation. We assume correctness of compiler, assembly code, hardware, and boot code.

seL4 is a third-generation microkernel of L4 provenance, comprising 8,700 lines of C and 600 lines of assembler. Its performance is comparable to other high-performance L4 kernels.

We prove that the implementation always strictly follows our high-level abstract specification of kernel behaviour. This encompasses traditional design and implementation safety properties such as that the kernel will never crash, and it will never perform an unsafe operation. It also implies much more: we can predict precisely how the kernel will behave in every possible situation.

Overall the paper is more of an experience report than an in depth exploration of the kernel and its proofs but there is a some meat to be found. More information can be found at the sel4 website.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

I'm fascinated by the

I'm fascinated by the iterative abstraction layer methodology; no matter how excellent our proof techniques get, there's always the specification problem looming. I'd like to see more iterated approaches, especially since they seem to help out in that move all the way to a live system, rather than a model.

In that vein, though, stands a problem. I hail from the ACL2 motherland, where the executable code is also a term in the logic. The "pun" provides a strong story about what's actually being reasoned about. In the paper, links between abstract specification, haskell demi-implementation, and C code are of utmost importance for its trusted chain. I haven't had time to dig into the more formal and technical reports, but the intro here glosses right over the subject.

The counter-argument may boil down to "don't care". If this system produces nearly-bug-free software, its logical/mathematical underpinnings may not matter to the end-user. However, I'd like to see what the LtUers can dig up on this subject, as I find it foundational. In the end, we're all trying to reason about program formalisms, so a weakness in the story gnaws at everything from type checking to theorem proving.

Good Question!

I encourage you to read the rest of the report, as it actually does (I think!) answer your question. In particular, they dig into the ways in which their Haskell code and C code are both read into Isabelle (!) and the refinement theorems are proven. As you might expect, they only treat a subset of C, but it's a realistic subset for an OS kernel.

In this respect, they've made Isabelle/Haskell and Isabelle/C-subset a kissing cousin of ACL2: Haskell and C-subset executable code is indeed a term in (a) logic.

More details

I looked on their website for a paper that contained more details, and it seemed that the one called Mind the Gap: A Verification Framework for Low-Level C fit the bill.

Re: Communications of the ACM, May, 2010 Klein et al

The May issue of CACM has hit the newsstands. The seL4 paper did not appear to have made the editorial cut just yet. In light of this, I posted a few thoughts as a reply to one of at least two earlier mentions of this paper on LtU.

The other mention was in a recent thread where shap essentially asked the following question:

If you had about 50 person-years of highly qualified effort at your disposal, would you spend it on seL4 or would you be socially responsible enough to apply it to the more promising task of building a better systems programming language?

As he pointed out elsewhere,

the SEL4 approach [...] does not generalize to any other effort to verify programs in C. Having achieved a verification of the SEL4 kernel, and assuming that all of the verification libraries associated with that result were made public, the advance in the available verification art that can be applied to other programs is very limited. Verifications using C as the source language are expensive, “one off” efforts.

Unfortunately for shap, his question remained unanswered and the thread quickly (d)evolved into a three-party debate between neo-Markovians, neo-Hilbertians, and Atanassowists on the meaning of the Curry-Howard isomorphism. Oh well. It was worth a try.

Pres Kennedy on PL research

Its hazards are hostile to us all. Its conquest deserves the best of all mankind, and its opportunity for peaceful cooperation many never come again. But why, some say, formal verification? Why choose this as our goal? And they may well ask why climb the highest mountain? Why, 35 years ago, fly the Atlantic? Why does Rice play Texas?

We choose to formally verify. We choose to formally verify in this decade and do the other things, not because they are easy, but because they are hard, because that goal will serve to organize and measure the best of our energies and skills, because that challenge is one that we are willing to accept, one we are unwilling to postpone, and one which we intend to win, and the others, too.

...but I believe in secret White House recordings he agreed that verifying programs written in C is a bit silly.

Domain specific languages

I see only one reasonable answer to how to spend 50 person years of highly qualified effort:

Studying the best organic system for a particular domain, available in open source, understanding why its abstraction points (and lack thereof) exist, why its extensibility points (and lack thereof) exist.

Then build a domain-specific language to model variations on that model as easily as possible, to tune things more aggressively.

This is the approach I want to take with a TCP/IP networking stack for an operating system. To discuss what kernel bindings should be exposed, and how to verify the internals meet the exposed interface's contract, seems like asking the wrong question? Forget asking whether to do it in C or not. Once you realize C vs. something else is a silly question, then you can begin asking the right system design questions. Or maybe you've never damaged your brain to the point where you started off asking such wrong questions!

To discuss the operating system as a whole... or even a microkernel or nanokernel or exokernel... seems misguided. We've seen enough designs of TCP/IP networking stacks where it is now time to go the domain specific language route. Portability should be achieved in the classical way: through microcode (intermediate language).

ask not what formal verification can do for you

Unless I'm misremembering my Presidential History 101, Kennedy was instrumental in the development of BitC.

Right about after the speech you quoted, he put a guy named J Strother Moore (along with a bunch of other people) to work on a pretty big and important piece of software at MIT (I think). Dismayed at the rather primitive — some would say, Medieval — methods of development he got to observe first-hand, Moore resolved to do better in the future. His resolution eventually led to ACL2, and the latter, as we all know, played a prominent role in the origins of BitC.

If...

...your aim is full functional correctness of more-or-less first-order imperative programs, the programming language you use is basically irrelevant: there's no great difference in proof difficulty in the band of languages ranging from Java to C. Facing down functional correctness means that you'll get the usual safety properties en passant, as part of the correctness proof.

This is because the shallow properties (such as memory safety) that automation (typechecking, abstract interpretation, model checking, etc) can help with are automatically entailed by the harder ones which require thinking hard. This is why people who develop proof systems for these kinds of languages don't stress out too much about types systems. (The situation changes drastically for functional correctness in the higher-order imperative case, though.)

OTOH, if your goal is to make verification of shallow safety properties very easy, then language design can help a lot. Of course, that design advice will basically boil down to: design a strict version of Haskell or a pure version of ML, plus some deltas for your application domain.

first-order vs. fourth-order

A quick tangential question...

Do you happen to have a definition — formal or otherwise — of the notion of "more or less first-order" in non-λ-calculus-y languages? I've seen you equate LSP-conformant OO programs to first-order programs, and LSP violations to higher-order behavior.

By way of an example, would you call Forth first-order or higher-order?

No...

...I don't have a formal definition of this. Basically, I use it to mean "you can give a fairly reasonable Hoare logic for programs in this subset".

Now, every real language has features which go way beyond the basic formalism of Hoare logic, and so the Hoare logic game is usually to (1) identify a subset of the language where the classical techniques work, and (2) make the minimal extensions necessary to handle the higher-order programming techniques that are most pervasively used in the actual language.

In Java, the two higher-order techniques most commonly used are (1) objects-as-closures, and (2) reflection and dynamic type testing. The LSP is an attempt to tame these two features. Adherence to the LSP is a way of identifying when the use of objects is merely an idiomatic usage of the language, rather than something that essentially uses the power of these features. (My take, shaped by my own research agenda/prejudices, is that very often you really do use the full power of higher-order features in essential ways, and program logics should just bite the bullet on this point.)

In Forth, you see a third form of higher-order feature: a very heavy reliance upon quotation and eval, as even the basic conditional mechanism relies on it. I don't think anyone has given a program logic for Forth itself, but there has been a fair amount of work on program logics for self-modifying code. (At the low level, you need to understand it to verify things like bootloaders, and at the high level you need it to verify things like MVC frameworks.)

Forth

Forth doesn't really have quotation or eval, as I understand the terms; you may be thinking of Factor, Joy, or PostScript here. It just has compile-time evaluation. The basic conditional mechanism for Forth is almost the same as for C, and is generally compiled in the same way, except that it exposes an interface for mucking with the parsing state of the compiler.

ease of proof of correctness

That's fascinating! That explains a lot about Stepanov and Forth people: their goal is fully functionally-correct software, although they are using informal techniques rather than symbolic logic. And they claim that using languages that give them those shallow safety properties doesn't help them noticeably with that. Instead, they want the facilities to build abstractions so that their programs are very short.

It sounds like you're saying that their position is very well founded.

I have to think, though, that software in Java tends to be a lot shorter than in C, and so it would be easier to prove correct. But I don't know much about proofs. Is it?

Not in the May issue....

For what it's worth, it seems to have been published in the June CACM instead.

Re: Not in the May issue….

Re: comment-59024:

The May issue of CACM has hit the newsstands. The seL4 paper did not appear to have made the editorial cut just yet.

Re: comment-60256:

it seems to have been published in the June CACM instead

To boot, it is preceded by an introduction by one K. Rustan M. Leino.

Fixed

Fixed the cite.

Atanassowism

the thread quickly (d)evolved into a three-party debate between neo-Markovians, neo-Hilbertians, and Atanassowists on the meaning of the Curry-Howard isomorphism.

How intriguing. What is an Atanassowist? (And, for that matter, what is a Markovian?)

Re: Atanassowism

comment-60269:

What is an Atanassowist? (And, for that matter, what is a Markovian?)

Caveat: You have to keep in mind that I subscribe strongly to the Humpty-Dumpty theory of meaning.

Neo-Markovian

A neo-Markovian, such as myself, possesses a corpus of previously skimmed texts and generates LtU posts by feeding a key phrase from a parent post to a simple Markov chain. At his most trivial, a neo-Markovian is content to perform simple keyword matching. For instance, I can declare Milawa, like Coq, to be based on the Curry-Howard isomorphism because the words “Curry” and “Howard” are right there in the dissertation!

As a neo-Markovian, I would also conclude that the prominent verificationist John Rushby is in complete agreement with the conclusions of DeMillo, Lipton, and Perlis, because the following utterance can be found in his recent paper:

The axioms and definitions that underpin a verification are generally of two kinds: those that are directly concerned with the subject matter of the system (its requirements, design, assumptions, and so on), and those used in developing the various theories that are required to express this subject matter (e.g., the theories of clocks, synchronous systems, ordinal numbers, and so on). Many of the latter theories will be widely used in other formal developments, so the burden of ensuring their correctness is supported by a broadly shared social process.

An advanced neo-Markovian can also arrive at conclusions by noticing an absence of a keyword match rather than its presence. For example, does Barras say anywhere in his thesis that the logic of a certain well-known theorem prover is inconsistent? No, he doesn't! Therefore, said logic is consistent.

In summary, neo-Markovianism is a powerful analytical technique.

Neo-Hilbertian

A neo-Hilbertian picks up Hilbert's famous idea of substituting “pint” for “point” and runs with it as far as he can. (Hilbert soll einmal gesagt haben, man könne statt „Punkte, Geraden und Ebenen“ jederzeit auch „Tische, Stühle und Bierseidel“ sagen; es komme nur darauf an, dass die Axiome erfüllt sind.) For example, a neo-Hilbertian views any sort of computation, including Perl scripts, as a proof of some sort. (That's the "P" in Perl, in case you were wondering.) This is known as the Extended Curry-Howard Observation or ECHO for short.

Atanassowist

An Atanassowist is the most boring of this bunch. He's a purveyor of Sachlichkeit possessed of an understanding of the subject and wont to characterize concepts like the ECHO chamber as feel-good platitudes devoid of any actual mathematical content, end of discussion. (Now, where is the fun in that?)

Atanassowism has its benefits

The nice thing about Atanassowism is that it makes forum moderation much easier. If a comment is not accompanied by a formal proof, preferably machine-checkable, it should be considered to violate policy.

Making the machine-checkable aspect a requirement also solves the spam problem, at least until spammers learn how to encode their messages in Agda or Coq terms. Come to think of it, spammers have been promising more of the latter for quite some time, although notoriously failing to deliver.

In Certain Circles...

...that product is known as "poof-carrying code."

Come to think of it,

Come to think of it, spammers have been promising more of the latter for quite some time, although notoriously failing to deliver.

True, but when we are coming closer to the singularity only spammers will be left - proving spam de luxe. Yummy!

I know you are joking, but...

I know you are joking, but...

The reason I like people to talk about objective topics on LtU is that, if you want opinions or polemic you can find shovelfuls of it in the blogosphere. The web has developed a weird populist mentality, where people — even people who consider themselves technical workers — now seem to believe truth grows out of consensus. And it's self-perpetuating because the younger people see it and eventually start to think, "Well, I guess this is an acceptable way of reasoning."

IMO, the greatest, most wonderful thing about PLT (and programming) is that we don't need to argue about the facts. Programs aren't (don't need to be) approximations of their designs; they aren't built out of parts that suffer wear or break; they don't spontaneously mutate or rot. They're numbers or partial recursive functions or Turing tapes, and we already have our Grand Unified Theory. Every computer is a perfect laboratory and every result is reproducible. This is the biggest advantage we have over other scientific fields, and the one we use the least. (In fact, some people seem dead-set against anyone using it. They want to muddy everything up so that no one and nothing can ever be found at fault for anything, and so that they can exploit their priestly status to interpret the whims of the gods to laymen.)

Talking factually is not as sexy as rhetoric; you won't gain fame and fortune via Technorati; you can't make as many exciting, sweeping claims; you won't even (sadly) convince as many people. But at least you'll be talking about the truth, instead of just the perception of it. I don't want to waste my life arguing (or reading) about things that people believe about things; I want to learn about the things themselves. Otherwise I would have been a critic or a fashion consultant.

That said, I'm aware that this post itself is rhetoric, so I'll shut up now.

Bit-rot

If you spoke of 'algorithms' rather than of 'programs', perhaps I'd believe you. In practice, programs do spontaneously rot - unless you happen to program a closed system with fixed software and hardware payloads, possessing neither sensors nor effectors.

The view looks a lot different in the composition space, distributed or cloud computing, configuration management and portability, maintenance, tuning, caching, moving goal-posts and productivity, sensors and effectors and drivers and quirks and hardware changes and protocol/API standards committees, UI and end-users, and so on.

Context-rot

programs do spontaneously rot - unless you happen to program a closed system with fixed software

So 2 + 3 = 5 in a closed system, but it has a different value when I evaluate it in a context like 6 * (2 + 3).

No. You said it yourself: "fixed" system. It's the context that changes, not the program.

Define 'Program'

2 + 3 = 5 is not a 'program' by any definition I've ever seen.

A computer program is typically defined as a sequence of a instructions to perform a task. That definition is heavily biased to the imperative programming paradigms, but it could be generalized to 'an organized collection of expressions' to perform a task.

To me, the critical words are "to perform a task". [Alternatively, 'to meet functional requirements'.]

Naturally, the 'context' will play a big part in determining the correctness, robustness, safety and other characteristics of a 'program'. Even if the 'organized collection of expressions' never changes, the fitness of the program for its purpose can change. Its behavior, its results, whether it performs the task can change. Thus, programs may 'rot' if not maintained.

You said of "programs" that "every computer is a perfect laboratory and every result is reproducible". That isn't true - unless you constrain yourself to a trivial class of 'results', a closed and unchanging system, and deterministic expressions. Those constraints are not applicable to a wide variety of useful and marketable programs.

Blah blah blah bit rot blah blah blah

2 + 3 = 5 is not a 'program' by any definition I've ever seen.

Of course the "= 5" part was to show the value of the program. Is that what you mean? Or are you just complaining that it is a toy example?

Naturally, the 'context' will play a big part in determining the correctness, robustness, safety and other characteristics of a 'program'.

Correctness is a relation between a specification and a program. It has nothing to do with the context, except inasmuch as you would devise the specification with the contexts in mind. Similarly for the other things you mention.

Even if the 'organized collection of expressions' never changes, the fitness of the program for its purpose can change.

Well, what changes is your idea of its purpose and consequentially its fitness, yes. Still, the program is not changing.

Its behavior, its results, whether it performs the task can change. Thus, programs may 'rot' if not maintained.

No. Its behavior does not change; what changes is that you might see new outputs because you previously only gave it a certain subset of the possible inputs. Its results do not change; same reason. No, it does not rot.

Look, it's really simple. I say: X does not change; Y does; X+Y is different, but X stays the same. You say: X does not change; Y does; X+Y is different, therefore X changed. No. Incorrect. Contradiction. You violated your hypothesis. Either X changes or not. Make up your mind.

Why you're so dead-set on finding a way to justify the term "bit rot" is beyond me. It's just an expression, and I know what it means, and you know what it means, and interpreting it literally is just silly.

You said of "programs" that "every computer is a perfect laboratory and every result is reproducible". That isn't true - unless you constrain yourself to a trivial class of 'results', a closed and unchanging system, and deterministic expressions. Those constraints are not applicable to a wide variety of useful and marketable programs.

Yes, it is a bit of a simplification, and I made it knowingly, and nondeterminism changes things. However, I think it is also a simplification to call deterministic programs "trivial"; people fuck them up all the time. Furthermore, a computer is still a far better laboratory than any you can find in the real world. It might be hard to pinpoint race conditions, for example, but once you do, if you want to reproduce that particular event, you can always do it. In the real world, though, getting lightning to strike twice in the same spot is a lot harder. Doing an experiment can be very costly both in time and money. Etc.

Specification, Context, Correctness

'2+3' is not a program. Even a 'toy program' needs to perform a toy job. Examples: printing "Hello, World" to the screen. Or sounding an alarm whenever a value T rises above 60. Or giving me an infinite stream of secure-random numbers each between 0 and 255.

Correctness is a relation between a specification and a program.

That is true. Nonetheless, in practice specifications are almost invariably against an open system.

For example, a specification that the program is to print "Hello, World!" to the screen does not specify the type of screen, nor upon which hardware the program is to run, nor when the program is to run, nor what is already on the screen when the program runs, nor the font and point-size, nor much of anything else.

It has nothing to do with the context, except inasmuch as you would devise the specification with the contexts in mind.

You have that backwards. Correctness will have a great deal to do with context except inasmuch as you would explicitly constrain the context in the correctness specification.

If you constrain context via specification, you are saying: 'the program may be undefined outside these constraints'. This can be useful, but my experience is that it is rarely specified in practice.

what changes is your idea of its purpose and consequentially its fitness, yes.

Not generally. Even if we hold the specification as constant, a program's fitness to accomplish that specified tasks can change with varying context.

Still, the program is not changing.

The corpus of source-code doesn't change (usually). But important relations, such as correctness against a specification and even the interpretation of the code, can (and do) change.

Is a program just a blob of 'source-code' entirely independent of its interpreter, Frank? Or is the thing we call a 'program' properly a relationship between source-code and interpreter? Where does the specification of the interpreter get involved?

I say: X does not change; Y does; X+Y is different, but X stays the same. You say: X does not change; Y does; X+Y is different, therefore X changed. No. Incorrect. Contradiction. You violated your hypothesis. Either X changes or not. Make up your mind.

I infer that you mean 'X = Source Code' and 'Y = Context' in very broad strokes. We could break Y down a bit to include IX - a specific interpreter for X. And SX - a specification for X. And SIX - a specification for IX.

We can speak of the correctness of IX, and the correctness of X. We can note that the two are orthogonal: that is, X might have been developed to work correctly despite bugs in IX (or possibly even to leverage bugs in IX).

We can note that X might depend upon features of Y that are hidden to IX - i.e. for the "Hello, World!" program we depend often upon the notion that there exists a 'console' that will print characters encoded in ASCII. If not, though, we could perhaps switch to depending upon VGA and drawing the glyphs out individually.

What we call a 'program' is not independent of the relationships between X, IX, SIX, and Y. That is, I think you make the mistake of saying 'X is the program' without fully considering what 'program' really involves.

Why you're so dead-set on finding a way to justify the term "bit rot" is beyond me.

I object to your use of the word 'program'. (As noted earlier, if you said 'algorithm' I might be more agreeable.) And I consider unfounded the assumptions behind your earlier claims. Program-results are often neither reproducible nor representative enough to settle a dispute about 'fact' (especially not for comparative facts). And programs that do settle arguments will often require maintenance. And there exists no Grand Theory for programming that is 'Unified' (in Quantum vs. Special Relativity sense: supporting both in-the-small and in-the-large programming aspects).

There is a not uncommon notion that expressions to accomplish a task reduce always to mathematics. This view works quite well in-the-small. The hand-waving starts when people start saying it therefore also works quite well even in-the-large.

Consider even a trivial example of the "Hello, World!" program. Let's say I claimed you couldn't give me a program that will print "Hello, World!" to your screen. And so you hammer out a quick Haskell program that will do the job when you run it, and give it to me. Now, I'm quick to point out that you gave me text that prints "Hello, World!" to my screen, which is not your screen. So you correct it. You add a TCP/IP connection, develop a client-server. Then you find out that there's a firewall on your end that prevents me from talking to you, so you need to either open a port (outside of the program) or somehow integrate with a proxy service. So you invent IRC. Ultimately, you hand me this small mudball of code - basically, an IRC client that performs the necessary handshakes to the IRC server that will allow me to send "Hello, World" over to your machine, where it will display on screen.

That mess is relatively simple compared to others involving interactions with open systems, complex systems, and in-the-large programming. With regards to the program's suitability for settling an argument about facts, the question is: what can I prove with the code you gave me in the end? Can I even prove it 'correct'? What does it mean for just one component of a larger program to be 'correct', anyway?

Also, calling me 'dead-set' against anything is not a proper characterization. Rather I make effort to adhere to advice from Ralph Waldo Emerson: "Speak what you think now in hard words, and to-morrow speak what to-morrow thinks in hard words again, though it contradict every thing you said to-day." If your argument convinces me that 'rot' shouldn't apply to software, I would later speak to that effect in equally hard words.

I think it is also a simplification to call deterministic programs "trivial"; people fuck them up all the time.

Granted.

Programs don't rot, but they can become obsolete

Physical hardware and software both can be viewed as blackbox components that produce signals according to a specification. A 'hello world' program is correct if it signals (print "hello world") and then signals termination. When this program doesn't work in your latest GUI environment, that's more akin to having an AGP video card that doesn't fit into your new PCIe slot than it is to having a water damaged video card.

The only results we programmers can establish about our code are hypotheticals of the form "If the program is used in such and such way, then such and such happens." It doesn't make sense to talk about a program that washes your hair - only about a program that controls a hair washing machine.

Specification vs. Assumption

A 'hello world' program is only correct if it meets the specification I gave you: when run, prints "Hello, World" to screen. A program that signals (print "hello world") then terminates would certainly be incorrect unless we either (a) change the specification, or (b) make a lot of assumptions about context of this signal.

Consider instead: software is often responsible for coordinating sensor and effector devices. A program to drive me from point A to point B meets its specified requirement only if it drives me from point A to point B. We cannot plausibly write this program by signaling (drive from A to B) then signaling termination - and doing so regresses all the relevant issues to whichever program component receives the signal. Saying the software is correct because it produces this signal would be laughable.

The reason (print "hello world") appears more credible than (drive from A to B) is the many contexts where a well-targeted signal of (print "hello world") would, in fact, produce the desired effect. Ultimately, though, that is still regressing relevant issues - this time to the context. You cannot call the program 'correct' because it produces the signal. You can only call the program 'correct' because the signal causes the specified effect.

And, since you are depending upon an assumed context, your program may become incorrect (relative to the specification) should that context change.

It doesn't make sense to talk about a program that washes your hair - only about a program that controls a hair washing machine.

It wouldn't make sense to talk about a program that washes your hair in the absence of effectors to achieve this, certainly.

But it makes plenty sense to talk about a program that will control a machine to the effect of washing your hair, as opposed to some other effect (such as accidentally breaking your neck). That is, one can talk about correctness in terms of hair becoming clean and necks remaining unbroken (let's not set our standards too high, such as keeping the hair attached to the head).

While a particular machine might be good with just a (wash hair) signal, the more interesting examples would require a great deal of coordination: minute details of pressure, temperature, detergents and fragrances, brushes, and possibly kinesthetics and visual coordination. Even if one did have the easy-button (wash hair) signal, it would ultimately be regressing these issues into the program that receives the signal.

The only results we programmers can establish about our code are hypotheticals of the form "If the program is used in such and such way, then such and such happens."

If only program specifications were written in the same way, eh? But they aren't. And in practice the 'such and such happens' is not often guaranteed by any black box with its own specifications, but is rather the desired effect achieved empirically by tuned and tweaking, then maintained indefinitely.

A program is only part of a system

You cannot call the program 'correct' because it produces the signal.

But I do!

We cannot plausibly write this program by signaling (drive from A to B)

Of course not. The program would presumably alternate between signaling finer grained controls and gathering input.

But it makes plenty sense to talk about a program that will control a machine to the effect of washing your hair, as opposed to some other effect (such as accidentally breaking your neck).

Let's you and I go into the robotic hair washing machine business. I'll build the hardware and you write the software. When the inevitable first customer decapitation occurs (my design involves a table saw and a restraining chair), we'll apply your criteria to determine whether to blame the program.

Hair-washing decapitation chairs

Lol. I guess I should substitute 'cannot' for 'should not' in that first statement there.

The program would presumably alternate between signaling finer grained controls and gathering input.

I suspect doing both at the same time would be ideal, given the continuous and real-time nature of vehicle control and information gathering.

Anyhow, I'm not seeing how signals to effectors and from sensors fit into your earlier 'view' involving signals, black-boxes, and specifications. If you meant your 'of course not' to be reflective of your earlier position, you'll need to elucidate further.

When the inevitable first customer decapitation occurs (my design involves a table saw and a restraining chair), we'll apply your criteria to determine whether to blame the program.

But I'd rather start with a T-1000 model hair-washing companion!

I suspect I'd find it rather difficult to write a viable hair-washing program for your choice of hardware.

Oh yes, I should correct...

In the black box analogy, software needs to accept input as well as signal output.

Anyway, we're just arguing definitions here. I just don't think it's fruitful to think of specifications as time varying. For example, I don't think it's intuitive to declare that my 'hello world' program is no longer correct because the power went out.

Power and Responsibility

If the power going out is what you'd consider an example of a 'time varying program specification', we will indeed be arguing definitions.

Can a program be considered incorrect if it fails when the power does? Not generally, no. Conversely to what Ben Parker said: Without power, there can be no responsibility :-).

A given program can only be responsible within the limits of its information and authority. Only after you establish these limits can you usefully discuss negligence and state-of-the-art, willful ignorance, incompetence, malice, and responsibility for failure. If relevant information or authority are compromised, then so is responsibility.

But within the very broad domain of responsibility, there is still plenty of opportunity to write an incorrect program - i.e. a program that causes the machine to break necks while washing hair, or a program that causes a vehicle to drive in circles rather than from point A to point B.

Since a question of responsibility is likely to be coupled with concerns about liability and accountability, you might wish to CYA legally by clarifying and constraining the spec.

Still differ

If the specification was for a program that works with the power out, then the program never met it's specification. Similarly, if there is a requirement to drive a vehicle from A to B when the roads are icy, then a program which fails under such conditions doesn't cease to meet spec when the roads are icy - it never met spec (though you may not have tested it well enough to notice).

Two notions of specification

If the specification was for a program that works with the power out, then the program never met it's specification.

I think this sub-thread is illuminating some of the issues with the approach described in the original post.

The formal notion of a program specification is complete: anything not specified is free from constraint and outside scope. In this sense there is no "context" that can change, because every factor is either in the spec or it's not.

On the other hand, the usual notion of a specification for software is open and underspecified: the software is required to work under any "normal" conditions whether those conditions are explicitly defined or not (and they generally aren't explicit, and they do change). These variable conditions form a context, and learning how to build software that is robust in the face of the unknown unknowns of its context is central to the art of software development.

Both notions of specification are useful and important, and they share a core of overlap, but they aren't really interchangeable for each other: formalists aren't interested in the latter, and customers aren't interested in the former, though developers should probably be interested in both.

Terminology

the software is required to work under any "normal" conditions whether those conditions are explicitly defined or not (and they generally aren't explicit, and they do change)

If the software I'm developing is to work in an undefined context, then its specification is ambiguous, but I wouldn't say that the specification has changed when you show me the deployment context.

Time Varying Specifications

I honestly haven't a clue why you think I've been talking about 'changing' or 'time varying' specifications (though those would be interesting, indeed, in a system with executable specifications). Frank made an assumption that I was speaking of variable specifications, and I corrected him on that a while ago. I mentioned 'moving goal-posts and productivity' in passing among a list of in-the-large programming issues that Frank was ignoring with his various claims about 'settling facts' wrgt in-the-large programming. But that wasn't part of the argument about bit rot.

The argument isn't about time-varying specifications. That is, nothing in my argument requires the text of the specifications to change or vary over time. I wouldn't say the specification has changed - unless its text has changed.

Nor does my argument hinge upon 'incomplete' specifications - though I've never seen a complete spec, I'm willing to accept that both parties are agreed on the requirements.

My argument is based on the fact that specifications are generally 'open': that, even if their text doesn't change over time, they still name or reference environments and components that will change over time.

You're implicitly making

You're implicitly making specs time-varying by having them dependent on the state of various indexicals. Under your definition a specification is not a predicate on programs, but on (program, world) pairs. I'm arguing that a proper spec doesn't do that. It may include language such as "works on all machines produced by Dell on the current and next versions of Windows" but not "works on all machines available today."

A specification of a unicorn

I'm arguing that a proper spec doesn't do that.

And hence you have in the terms I've been using a firmly "formal" attitude to specification.

Sadly, most of the world doesn't share your position, which is too bad, since it would be so much easier to produce correct software if it did.

I disagree

If you get sued for breach of contract, the court will be interested in figuring out whether your delivered program meets the agreed upon specifications - yes or no. I suspect any reference to "today - the day you are reading this" in a written document would be treated as nonsense.

Amicus curiae

If you get sued for breach of contract, the court will be interested in figuring out whether your delivered program meets the agreed upon specifications

I think you just shot your argument in the foot, because courts are par excellence non-formalists: everything legal is subject to "reasonable" interpretation of "reasonable" circumstances.

Let's say that your specification says nothing about what should happen in a power outage. If your software causes a death due to a power outage, are you safe from liability, since it met the spec? Even if the spec specifies that the software must only be used when the power is on, are you safe?

With a good lawyer against you and the right unpredictable circumstances, I wouldn't be too confident that a strict formal interpretation of the spec will save you.

I'm not arging for formality

I'm not arguing that specifications are necessarily formal. I'm arguing that a specification is a yes/no classification of a program, even to laypeople. Specifications can be ambiguous - "the program will run fast". Specifications can be defined in terms of the world - "this program will run on Windows 95, XP, and Vista." Specifications don't, however, depend on the day that you read them - "this program shall run on the most recent Windows OS."

If you come to court with this latter language, the court will either interpret it to mean "this program shall run on the most recent windows OS at the time of the agreement" or "this program shall run on every new windows OS", both of which are (informal, but) proper specifications.

Non-distinction

Sure, a court that treats a specification as a civil contract will do its best to apply some sensible and legally enforceable meanings to the specification. In some cases, these interpretations might set precedent for particular phrases in a contract, thus formalizing snippets of English with respect to liability, and thus creating more work for lawyers.

But a court also won't insist on nonsense such as 'the program must "run on every new windows OS" without ever changing a line of source-code'. If you're going to insist on the court as your standard for understanding a specification, you should be prepared to accept the court's judgement regarding whether a program meets specification, whether an effort in-good-faith is being made to achieve support for the newest Windows OS, and so on.

In any case, I do not see a valid distinction with regards to temporal variation between "runs on Windows XP" vs. "most recent Windows OS", especially once you account for service packs and such.

Further, I do not see any practical way, in reality, to verify a program against a specification involving past and future conditions. That is, a practical specification must apply to present conditions, as 'the present' is the only possible time for verification. A program specified only against historical or future conditions can be compiled to a no-op and still meet present requirements.

Further, I do not see any

Further, I do not see any practical way, in reality, to validate a specification against past and future conditions.

There may be no way to validate the spec now, if it references future unknown conditions, but we can still agree it now. And later, when the conditions arise, we may learn that it failed to meet the agreed upon spec. Note the wording: we may later learn that the program does not - and never did - satisfy the spec, when we learn it doesn't do the things we agreed it would do at a then-later date.

Boolean Reliability?

You attempt to redefine past observations based upon a hypothetical future. I just can't see this as a sensible (practical, productive) thing to do. I'm not even certain what past and present are (outside of imagination).

Terms such as 'reliability' describe probability of systems operating within specification. I note the wording from of your post might be re-framed as an assertion that reliability is either 1 or 0, with no intermediate values, and that a failure in the future should unmake all records of historical correctness.

Great Scott!

Redefining past observations? Reliability? Anyway, I think you're using the term 'specification' wrong. When hardware meets spec, it may later fail to meet spec, due to wear or damage. When software doesn't wear or damage. When it meets a spec, it meets spec forever. 'Works on my machine at time X' is family of specs.

Specification

It is possible to write a spec and a program, as a pair, such that a program either meets the spec at all times or no times. But I suspect it requires a narrow view of 'program' (excluding, as one example, self-modifying or learning programs). And I don't believe your position much affects the specs written in practice, even if you go around dutifully telling everyone that they're "using the term 'specification' wrong".

You step in the wrong direction entirely by use of the word 'software'. The narrow class of programs most likely to share the properties you desire of them are 'algorithms' or 'functions' - especially of the pure, terminating, and deterministic sort.

I don't need to spread the

I don't need to spread the gospel, since, AFAICT most everyone else uses 'spec' the way I do. You're the only one I've ever heard talk about software meeting a spec one day and then failing to meet it the next day. Maybe you can clear this up with some citations to the contrary.

And this definition applies to all software, with sensors, effectors, nondeterminism, coconut shrimp, shrimp creole, shrimp gumbo, etc.

Reliability

Reliability was already mentioned - we call software that meets specs only part of the time 'unreliable'. I suppose you could argue that 'reliability' itself could be part of the specification, but even so it seriously muddies up your claims. Specifications often have tolerances, including tolerance for partial or temporary failure to meet specification.

Anyhow, you've some selection bias in your problem scenario: it is true that people don't often speak of software 'meeting specs' one day then failing another, but this is because saying software 'meets specs' is synonymous with 'it's done'. We don't use the phrase when software requires maintenance just to remain within specifications. Instead, we use phrases such as 'tuning' or 'configuration' or other maintenance terms to support the software in achieving its specified functional requirements. (We also have plenty of euphemisms for software that has slipped outside specified tolerances. And the 'configuration' issue offers plenty of opportunity for playing political blame-games to avoid any appearance of fault.) We have whole industries of IT personnel and DBAs whose responsibility it is to perform this maintenance.

dijkstra

The nice thing of this simple change of vocabulary is that it has such a profound effect: while, before, a program with only one bug used to be "almost correct", afterwards a program with an error is just "wrong" (because in error).

Court

When a specification is unclear, point goes to the defendant, barring various considerations for state-of-the-art.

I doubt you'll find a jury of peers that groks your concept of formal specifications.

I'm not arging for formality

See my response to Marc

Formal Specs and Productivity

Sadly, most of the world doesn't share [Matt's "formal" attitude to specification], which is too bad, since it would be so much easier to produce correct software if it did.

Assume we make our specifications formal enough to produce executable specifications. Then the software written in this manner would be reflexively correct by our current operating definition of 'correct software'. But you must ask: who would become responsible for developing these specifications? Chances are, it would be the same guy who is currently developing the client software. So what really changes? I posit: only the words we use. 'Software' is easier to produce because we aren't writing software, we're writing 'specifications'. 'Correct' software is easier to produce because we no longer have a separate standard for correctness.

I suspect that we would start to grow concerned about the correctness of our specifications, and so we would write specifications for validating and verifying our specifications. Wonderful, eh? And just how formal would these latter specs be?

Admittedly, that's a slippery slope argument, but it illustrates that arguments relying upon formal specifications may just be yet another form of infinite regression in disguise. Specifications for software are only useful insofar as they are partial and achieve a separation between specifier and implementor - while still allowing sufficient communication of purpose.

There is a balance to be had:

We cannot eliminate essential complexity, only shift it around. We can eliminate plenty forms of accidental complexity, such as that arising from syntactic and semantic noise. Development and analysis of requirements are a form of essential complexity in the software life-cycle, part of in-the-large programming. The language for specifying those requirements, however, would do well to avoid introducing ambiguities and support analysis for inconsistency or incompleteness. (Queries, abstractions, and modularity would be nice, too.) Use of English for specifications introduces far more accidental complexity than is desirable.

So I'll agree there's a lot of room for improvement, and for a more formal specification language, even when considering open and incomplete specifications with various informal attributes.

More terminology

then its specification is ambiguous...
I wouldn't say that the specification has changed when you show me the deployment context.

Entirely within my point. The difference between the two notions of specification is how any ambiguities are resolved with respect to correctness.

The formal approach says that ambiguities don't affect correctness. The development approach requires that all "reasonable" resolutions of ambiguities do affect correctness.

If you prefer, rather than think about there being a difference between two interpretations of the word "specification", think instead that there are two distinct notions about what it means to "meet a specification".

From a development point of view, a program that fails in some "reasonable" deployment context has failed to meet its specification.

A technical way of saying this is that formal approach disallows defeasible reasoning, while the development one requires it.

Interesting

I would say that I can do defeasible reasoning formally, and have no problem discussing/creating/implementing defeasible formal specifications.

Just really difficult

I would say that I can do defeasible reasoning formally

You can, it is just really hard (and generally impractical) to come up with a formal model for all the defeasible condtions that are possible.

Never met a formal spec

To meet specification is to meet a set of requirements that are written down in a document called the "specification", not to meet a set of requirements in general.

I use the term 'open' specification to describe those that name or reference something outside the specification itself and the component being specified. Terms such as "the roads" name such a thing. Today, the roads can be dry. Tomorrow, they can be wet. The day after tomorrow, they can be icy. And then they may become dry again, but with large potholes. Some time later, they might be wet with potholes, and icy with potholes. Later they can be fixed. Sometimes they'll be extended, repaved, or re-routed. The meaning of "the roads" changes from one day to another, and yet roads are mentioned as part of a specification.

Thus, the meaning of a specification can change from one day to another, even if the text of the specification does not change. And thus, a program or system may meet the specification at some times and not at others.

AFAICT, talk about closed and complete formal specifications is wishful thinking; one might as well be talk about culinary preferences of pink invisible unicorns. I've never seen either of them, and I doubt such things exist - at least not for programs that coordinate sensors and effectors and UI and other real-world interactions.

You said earlier that 'hello world' is good once it signals (print "hello world") then signals termination. So, to where shall it signal these? And from where (any location transparency requirements)? When shall it signal them? How shall it signal? What are the time limits? How should we handle shutdown in middle of signal (should we be developing the program in a language with orthogonal persistence)? Do we need to integrate with transactions? How are we to handle disruption between the signaling program and the recipient? What are the reliable-transmission requirements? Where does the spec say one way or another about these conditions?

Last remark

I'm not going to write any more about this because I don't think it's productive, but just one last remark:

The reason (print "hello world") appears more credible than (drive from A to B) is the many contexts where a well-targeted signal of (print "hello world") would, in fact, produce the desired effect. Ultimately, though, that is still regressing relevant issues - this time to the context. You cannot call the program 'correct' because it produces the signal. You can only call the program 'correct' because the signal causes the specified effect.

No, in that case you can call the system correct. The program is correct w.r.t. its own spec when it produces the signal; the spec of the signal handling system says that the signal causes the effect; the signal handler is correct when it satisfies its own spec. The spec of the system composes these two specs to ensure that the program's signal causes the effect. The system is correct if each component is correct and possibly some additional conditions. (Of course, you could make the program spec depend on the rest of the system and you would be right, but that would be a very poor way to write a spec.)

This brings into relief a problem with your "holistic" notion of specification: it does not let you assign blame anywhere. If the signal handler fails to produce the effect, I can blame it. If the program fails to produce the signal, I can blame it. But I do not blame the program if the signal handler fails to produce the effect, or the signal handler if the signal is not raised.

But you cannot assign blame anywhere, or rather you assign it everywhere, because you demand every component to be responsible for the entire system; calling them "components" at all is probably wrong. "Holistic" is after all practically the same as "monolithic", which is the opposite of "modular".

Ignore client's spec and substitute your own

The program is correct w.r.t. its own spec when it produces the signal

The spec I provided was for a program that prints 'Hello, World' to screen. What good does me a program that signals (print "hello world") to an unspecified location then exits? What Matt did was ignore the client's spec and substitute his own.

the spec of the signal handling system says that the signal causes the effect; the signal handler is correct when it satisfies its own spec. The spec of the system composes these two specs to ensure that the program's signal causes the effect

I suppose this is the crux of the argument. It seems to me that you're pulling this alleged "spec of the system" out of thin air. In the general case, as when dealing with 'the real world' through sensors and effectors, foreign function interfaces, and third-party services, no such spec exists.

your "holistic" notion of specification does not let you assign blame anywhere [...] "Holistic" is after all practically the same as "monolithic", which is the opposite of "modular".

That isn't true. Holistic systems can still be modular - via pluggable architectures, publish-subscribe services, application servers, and so on. Also, use of robust/secure composition techniques, object capability model, etc. can control distribution of authority and information (and thereby control distribution of responsibility). And my understanding is that blame doesn't fall on program components; it falls instead upon organizations and individuals. You might read Horton's Whodunnit to learn about tracing responsibility in-the-large.

What you want is for correctness to be composable, in the sense that you can take two correct components and compose them into a larger correct component. That is a very nice ideal.

But one doesn't need to look very far to find counter-examples, such as issues of concurrency control and resource accounting. Nor is your ideal especially applicable to systems and program goals as a whole. Consider, for example, a program whose specified task is to scrape images from a particular website; you DO NOT have a spec for how images are arranged on that remote website - and they might even be adversarial to your screen-scraping (intentionally changing their layout to stymie you).

I believe you could reasonably apply your ideals to particular aspects of 'algorithms' and 'functions' and various other in-the-small components internal to a program. And that's about the extent of it. 'Programs' are often specified holistically (even over your objections) and thus require maintenance to remain within specification in a changing environment. And 'software' is an ephemeral, amorphous mess that lacks even holistic specification.

Decomposition

There are a number of good points here, IMHO, but all it really reminds me of is that client specs first need to be decomposed, and one of the reasons for that is precisely to identify the boundaries between that which can be specified in such a way as to support compositional correctness and that which can't.

With that said, I do have some specific issues...

What you want is for correctness to be composable, in the sense that you can take two correct components and compose them into a larger correct component. That is a very nice ideal.

But one doesn't need to look very far to find counter-examples, such as issues of concurrency control and resource accounting.

That's funny: concurrency control and resource accounting are two of the areas in which compositional correctness has made the greatest strides!

Consider, for example, a program whose specified task is to scrape images from a particular website; you DO NOT have a spec for how images are arranged on that remote website - and they might even be adversarial to your screen-scraping (intentionally changing their layout to stymie you).

That exact example is among the things that we do at $DAYJOB. We don't have a spec for how images are arranged on that remote website. We do have a spec for how our system must behave given that lack of information, for measuring how successful we are at the scraping, and for revising our process(es) accordingly.

It seems to me that you're interpreting "spec" to mean some static thing that makes no account of variable input/output phenomena. One of the things that I've tried to emphasize is that that has never been what I have been referring to, because as you and others have pointed out, it can't be. But the fact that we lack enough information about the outside world that our software interacts with doesn't mean, at all, that that outside world can't be modeled and contended with by a formal specification. It just means that that formal specification will need to rely on temporal logic or probability theory or some other defeasible/non-monotonic form of formal description.

Great Strides and Initial Specifications

concurrency control and resource accounting are two of the areas in which compositional correctness has made the greatest strides!

That's true. But on the other side of "great strides!" is "look how far we needed to move!", "just how long is it going to take for everyone else to catch up?!", and "how much further do we need to go?"

Many challenges remain in these spaces.

the fact that we lack enough information about the outside world that our software interacts with doesn't mean, at all, that that outside world can't be modeled and contended with by a formal specification

Sure. If a specification provides a model for you, you are free to program against that model and call your program 'correct' (independently of whether the specification is 'valid'). I've never said anything contrariwise, other than to suggest that the model being provided to me is not something I see in practice (outside of a few lucky domains, such as avionics software).

However, as I noted earlier (in an apparently forgotten post) pushing these issues into the specification is ultimately a form of regression:

As we shift essential complexity from a program to a formal specification, we improve our ability to automate processing of the specification and attribute absolute 'correctness' to a program. But, at the same time, we lose our ability to say that the specification is 'obviously correct', and we end up needing specifications for our specifications. The line between specification and program is a fuzzy one - and, paradoxically, it only grows more fuzzy as we formalize the specification.

That said, formalization of specification isn't a bad thing - it would be convenient if we could avoid seeing specifications in English and other natural languages with their ambiguities, metaphors, analogies, anthropomorphisms.

The 'bad thing' might well be our artificial mental divisions between program and spec, and our use of hand-waving whiteboard and powerpoint and English discussions as the ultimate 'foundation' of (nearly) every program. If we're going to improve our lives of programmers, perhaps we need to improve that foundation - i.e. by getting meeting-sized rapid-prototyping tools into the hands of our pointy-haired bosses, clients, and stakeholders.

Two brief remarks. Frank

Two brief remarks.

Frank made a normative statement. A specification is a mathematical object. Otherwise you are talking nonsense and he certainly doesn't want to be engaged with people who do. It is like debating the definition of a differentiable manifold with a plumber. This is "unproductive" something a true hero-of-work needs to defeat. Although he might present a somewhat cartoonish version of a positivist, it is good to see one is showing up. Apparently not all of them have been either extinct or gone ironic.

In practice an informal specification is translated into a set of test cases, which are sometimes linked to the original requirements documents for review purposes using specialized tools like DOORS. This is also true for avionics software, you mentioned, which standardizes trust levels. Those correspond to testing adequacy criteria which are commonly expressed using code coverage measures.

Ideally, specifications

Ideally, specifications would contain much greater mathematical structure and semantic content than type String = [Char]. Something closer to CafeOBJ would be nice. But accusing a plumber of speaking nonsense just because she doesn't speak your preferred specification language is not nice.

My observation has been that most programs are attempting to achieve something in-the-large - to perform a task, or accomplish a goal, that is fundamentally outside of themselves. No amount of 'normative' ideal will change that (normative-idealize in one hand...). Attempting to redefine the program's fundamental goal - for example, to translate the specification strictly in terms of input-output relationships - is so much unproductive, arrogant hand-waving. A program is correct while it is accomplishing its specified goal, not when it is accomplishing what a mathematical-minded developer wishes were its specified goal.

That said, the developer is free to break down a large program into component parts, specify each of those in a mathematical manner, validate each against spec. This can help manage complexity within a program, or break it into modules that can be developed and maintained independently. Yet, it is unfortunate that all this zealous effort does little to support reasoning about whether the program is, holistically, achieving its specified goals.

For in-the-large programming - of the sort necessary to accomplish client goals - I believe we need programming models that focus upon local reasoning about in-the-large system properties (scalability, safe concurrent composition and resource acquisition, graceful degradation and resilience, disruption tolerance and persistence, efficiency, utilization, distributed data-fusion and command-and-control, eventual consistency, termination, isolation, authority and confidentiality, cause-and-effect). Yet the mathematically inclined seem often to pursue the opposite: programming models that support in-the-large reasoning about local system properties (substructural typing, small-step type-safety, typeful programming, determinism). I find this rather frustrating.

I agree with Frank's comment that we muck up often enough in-the-small to use all the help we can get, but I think that does excuse us to dismiss the relevance of holistic or in-the-large programming questions: which data is available, which data to gather, when and how to gather it, how to understand it, identifying which external actions are available, under which conditions to take them, how to recover from partial failure. It would be convenient to live in a world where all such details arrive with a specification. But that world is fiction, and I suspect it always will be. We need better tools for supporting us in answering these in-the-large questions.

Self-Made Frustration

I believe we need programming models that focus upon local reasoning about in-the-large system properties (scalability, safe concurrent composition and resource acquisition, graceful degradation and resilience, disruption tolerance and persistence, efficiency, utilization, distributed data-fusion and command-and-control, eventual consistency, termination, isolation, authority and confidentiality, cause-and-effect). Yet the mathematically inclined seem often to pursue the opposite: programming models that support in-the-large reasoning about local system properties (substructural typing, small-step type-safety, typeful programming, determinism). I find this rather frustrating.

Of course you do, with the size and scope of that false dichotomy you've constructed.

Yeah, it's a false dichotomy

Yeah, it's a false dichotomy in principle. But I'm talking about practice, here.

Someday we might have tools and models and 'Grand Unified Theories' that support both sorts of reasoning. That day is not today.

(And frustration is always self-made; it can be expediently avoided by favoring ignorance. Not sure what's so profound about that.)

Practice Makes Perfect

Yeah, it's a false dichotomy in principle. But I'm talking about practice, here.

And frustration is always self-made; it can be expediently avoided by favoring ignorance. Not sure what's so profound about that.)

There you go, being offensive for no reason again. Believe it or not, you aren't the only one engaging in practice. Think about it. Harder.

Dichotomy of Practice

Believe it or not, you aren't the only one engaging in practice. Think about it. Harder.

You misunderstand.

What I said (or meant to say) is that there is a dichotomy in the practices - in the automated tools and executable models and the world-views they encourage. There is no consensus on any 'unified' theory for reasoning about both holistic or in-the-large properties (the impact of a program on the larger system, effects and side-effects and requirements in fulfilling a purpose, coordination and competition and emergent behavior with other programs) and for reasoning about in-the-small properties (the impact of the larger system on the program, understanding programs as a functional mapping of inputs to outputs).

I do not claim to be the only one engaging in practice.

My opinion won't change by 'thinking harder'. (It might change by 'thinking differently', but so might yours.) Perhaps there is some key information I'm missing that would change my opinion. I've seen plenty of efforts that combine the practices to some extent - in computer vision, object recognition, automated path and mission planning, for example. I've seen efforts at combined theories, such as those embedded in CafeOBJ and Bayesian logic. But we've a long way to go.

Alternative solution

I thought about this a bit more, and you could argue that over time the specification of a program changes, and it therefore becomes incorrect. I grant you, if you take this view, you might be able to substantiate the term "bit rot" literally.

However, and pardon me for saying this but, this is a really bad way to look at things. A specification is, like a program, a formal object with a mathematical existence. It does not mutate suddenly just because you change your mind or because you don't bother to write it down or because a new version of Windows comes out. Similarly, its relationship with a program does not change.

The better way to look at things is: In 200whatever, I wrote a program P which satisfies spec X. Time has passed, and now I have a different specification Y because my new OS environment or whatever does not work with X. P does not satisfy Y. Therefore, I need a new program Q that satisfies Y.

So: P has not changed. X has not changed. Instead you have a two new objects, Q and Y.

Why is this is a more productive perspective? Because it is not fuzzy about denotations because things don't spontaneously mutate.

If you really want to bring state into this, then you can have a variable (mutable cell) V which holds the spec as contents and say you changed the contents of V, or Microsoft changed the contents of V or something silly like that.

(Are you an imperative programmer? Is that what this is all about? :)

A specification is, like a

A specification is, like a program, a formal object with a mathematical existence.

That might be true of a few, rare specifications. Perhaps they keep one made of platinum-iridium under a thick plate of glass over at Bureau international des poids et mesures? I can't say I've ever seen a wholly formal and complete specification in the wild.

More relevantly: a specification for a program is often independent of the programming language, and often will involve accomplishing a task that involves interacting with elements whose models are opaque to the implementation. Thus, when one specifies a task to accomplish in any open system (including the real world, such as getting from point A to point B), a correct program will include assumptions about the open system that cannot be fully verified or validated based upon inputs from outside the program. Thus, hidden models are fundamental even when they are not part of the specification.

We might call those assumptions by such words as 'tunable parameters' and 'configuration' and 'hard-wired assumptions'. Even if the specification never changes, a change in context might invalidate these assumptions and thus leave the resulting program incorrect or unsafe. Maintenance of the program, to update these embedded assumptions, allows the program to continue functioning correctly. There are strategies to minimize the invasiveness of these maintenance changes.

One might discuss the 'stability' of a program as its ability to operate correctly in many different contexts. Stability is another of those 'facts' you can't prove by just writing a block of source code.

If you really want to bring state into this [...] (Are you an imperative programmer? Is that what this is all about? :)

I write declarative code when I can, and imperative code when I must.

State is not a uniquely imperative concept. Declarative techniques that involve shared state include concurrent constraint programming, programming with temporal logic, and various forms of reactive or dataflow programming. I'm developing another along these lines: an Agent-oriented dual to the Actors model, called 'reactive demand programming', that is highly declarative and thus far seems quite promising for in-the-large and open systems programming [1][2].

The most common reason for bit rot

Is a changing compiler. I guess we agree, most people in the field know this. A relative of mine got stuck for a few weeks because his high-energy physics simulation didn't compile on new Fortran, which was just because the floating point representation became wider, and I remember even KPN being somewhat embarassed for having lost the compiler for the software of some old telco switches.

Similarly, C programs are pretty prone to bit rot too. In embedded systems, a lot of properties are assumed and simple C compilers are often an advantage. Change the compiler, and something which cleanly compiled to a 4Kb system suddenly doesn't fit anymore.

If you don't want bit-rot, you need to keep the whole development environment around somewhere, which sometimes means maintaining the development machines, compilers, and OS-es until all products in the field are phased out. (Just imagine the problems Siemens has with this.)

objective topics

Most of the "opinions and polemic" about PLs which can be found in the mainstream seems to involve pragmatic factors which cannot easily be made "objective", or implementation which, while being "objective" in some sense is expressly outside LtU's scope.

I'd argue that PL research needs to take pragmatic and social considerations into account, since this is the only feasible approach towards extracting some real-world relevance out of the clouds of "nice and formal" math it has been buried in. A limited amount of opinion and polemic--duly informed by objective facts--is conducive to this goal.

Hold yourself to the same standard.

OK, let me try a different tack.

If you want to take "social considerations" into account, then hold them to the same standard as your technical remarks: prove them. Ask yourself if you could publish them in a sociology journal or a psychology journal. Can you hold yourself to that standard of rigor?

Psychologists don't just print opinions anymore. They back up their claims with statistical evidence, and reasoning that draws on knowledge gained from rigorous studies, peer-reviewed research and, oh yes, a degree in psychology.

I often think that programmers so frequently inject social commentary into their technical commentary because, paradoxically, they have such a low opinion of "soft" sciences.

rigorous standards

Re: comment 60356:

Ask yourself if you could publish them in a sociology journal or a psychology journal. Can you hold yourself to that standard of rigor?

Psychologists don't just print opinions anymore. They back up their claims with statistical evidence, and reasoning that draws on knowledge gained from rigorous studies, peer-reviewed research and, oh yes, a degree in psychology.

(See also a much earlier post by Frank.)

This makes for some interesting reading:

  1. Publication Decisions and Their Possible Effects on Inferences Drawn from Tests of Significance — Or Vice Versa by Theodore D. Sterling, Journal of the American Statistical Association, Vol. 54, No. 285 (Mar., 1959), pp. 30-34.
  2. Why Most Published Research Findings Are False by John P.A. Ioannidis (2005), PLoS Med 2(8): e124.

from the horse's mouth

For the sake of completeness, here's a quick xref to a comment by Gerwin Klein.