Opposing Hierarchies of Complexity

Here is a thought for discussion: there are two models in programming:

1. The machine/CPU model, with pointers, addressing, manual memory allocation, modular arithmetic, floating-point numbers, etc.

2. The mathematical model with values, integers, real numbers etc.

Code built up from machine code is simplest when closest to (1), and abstractions are built on top of it with ever-increasing complexity. For example, a GC is a fairly complex piece of code; adding two stack-allocated machine-word ints is not.

The mathematical model is built from type-theory. We can define mathematics constructively (I am thinking more of Twelf than Coq here). It seems the type system is at its simplest when expressing the fundamental concepts of mathematics, whereas modelling a real CPU is a complex application for type theory (and mathematics).

So a function with a simple type has a complex implementation (involving garbage collection and layers of abstraction), and a simple function would have a complex type (possibly something involving linear refinement types, monads etc.). You can see this in the difference between a simple functional language, which requires a large runtime, and typed assembly language, which requires a complex type system.
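
To make the asymmetry concrete, here is a small sketch in Haskell notation (the low-level signature at the end is hypothetical, just to show its shape):

-- Simple type, complex implementation: evaluating this builds heap-allocated
-- list cells and closures, and relies on the garbage collector underneath.
total :: [Int] -> Int
total = foldr (+) 0

-- The opposite direction: adding two machine words is a single instruction,
-- but a faithful low-level type would have to mention registers or stack
-- slots, wrap-around arithmetic and effects, something like
--   addWord :: Reg %1 -> Reg %1 -> Machine Reg
-- (a made-up linear/monadic signature, not valid as written).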

TAL, whilst an interesting idea, seems too low level, being fixed to a single CPU instruction set architecture. So the idea is a language like 'C' with a complex type system like TAL's, which can implement a pure functional DSL whose purity would be enforced by the type system. You could have a pure, strict functional language, but one where a simple monadic program would compile directly to 'C' code.

There should also be some proofs: for example, the garbage collector implementation should result in the correct simple types for objects managed by the GC. The complex linear/effect types expressing the manual management of memory will only disappear from the type signatures if the GC code is correct. As discussed in another topic, the GC would require the language to be capable of reflecting memory layout, stack context and concurrency control.

I am interested in any work or papers that present a similar language concept, or that develop a suitable type system or implementation for this.


transient acronym lacuna

By TAL you mean Typed Assembly Language, as opposed to Tandem Application Language?

Typed Assembly Language

Yes, I did write it in full the first time, but must have lost it in an edit.

You might look at the

You might look at the methodologies used in large-scale verification efforts for low-level software (e.g. operating systems, maybe seL4).

So a function with a simple type has a complex implementation (involving garbage collection and layers of abstraction), and a simple function would have a complex type (possibly something involving linear refinement types, monads etc.).

I had a little trouble following the role of this sentence in the rest. This sounds like you're talking about keeping all of the complexity of low level code but adding the rigor of sound type theory everywhere? Sounds like a recipe for not ever getting anything done.

But then you write this:

There should also be some proofs: for example, the garbage collector implementation should result in the correct simple types for objects managed by the GC. The complex linear/effect types expressing the manual management of memory will only disappear from the type signatures if the GC code is correct.

So it sounds like you expect much of the complexity to go away, but I'm having trouble imagining what you're imagining here.

Thinking along the same lines as you're doing here led me to a design involving a separate representation selection mechanism which I've mentioned in other threads with you. I suspect something along those lines will be the right answer to questions like this, but I have not much more than suspicions for now.

Type Theory

This is meant to be a reply to Matt's comment above, but I got it in the wrong place.

The type system needs to be a sound basis for mathematics, so something like Intensional Intuitionistic Type Theory (ITT) or Extensional Intuitionistic Type Theory (ETT), or Homotopy Type Theory (HoTT), depending on the treatment of equality. In proof-relevant logic, the type is the theorem, and if there is a program that inhabits that type then the theorem is proved true. Then we define the primitive functions for the language such that they provide something like the functionality of 'C--', but correctly typed in ITT/ETT/HoTT.
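
A tiny propositions-as-types example, in Haskell notation: the type below is the theorem "A and (A implies B) implies B", and the program inhabiting it is its proof.

modusPonens :: (a, a -> b) -> b
modusPonens (x, f) = f x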

We can express a theorem that the monadic language can implement a pure function. We know this is true because we can write such a program. So we must be able to implement an abstraction that behaves like a pure functional language, and we have at the same time proved the implementation correct.

There are of course two problems:

1. What would a theorem in type theory stating that you can implement a functional language in an imperative language look like?

2. What would a proof of (1) look like? It would look something like the implementation of the runtime for a functional language.

One problem we've discussed

One problem we've discussed before is the problem of bounded memory. If you fix a pointer width, then your C functions no longer serve as inhabitants of nontrivial theorems (they will not be typed by any of the type systems you mentioned). For this and other reasons, I think it's better to be proving theorems of the form "complicated gadget X represents nice simple function Y in __ precise sense." So you have the nice function with monadic or whatever semantics and then you extract a representation for that function in a way that provably works on some target (possibly abstract) machine. The proof step should be entirely optional because it's often highly non-trivial, and "we're not shipping this product that otherwise works because we haven't finished proving that it works" is usually a business plan that will be naturally selected for extinction.

Path Dependent Types and Exceptions

We don't know that a program will be executed on the machine that compiled it, so the amount of memory must be a runtime property. This implies that the memory allocation function must return a 'Maybe Pointer'.

We can use path dependent types to get a 'pure' type out of this:

ptr = new Object :: Maybe (Pointer Object)
if (ptr /= Nothing) {
    -- within this branch the path-dependent type of ptr is refined:
    assert(typeof(ptr) == Pointer Object)
}

Or, to get the pointer out of the nested block, we can use exceptions: new throws if the allocation fails, so any pointer it returns is always valid. In this way the compile-time assumption of infinite memory remains valid.
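
A minimal sketch of the two allocation disciplines in Haskell (allocMaybe and allocOrThrow are hypothetical names, and a bounded counter stands in for real memory):

import Control.Exception (Exception, throwIO)
import Data.IORef

data OutOfMemory = OutOfMemory deriving Show
instance Exception OutOfMemory

newtype Ptr a = Ptr Int deriving Show

-- Variant 1: failure is visible in the type, so every caller must branch.
allocMaybe :: IORef Int -> IO (Maybe (Ptr a))
allocMaybe free = do
  n <- readIORef free
  if n <= 0
    then return Nothing
    else do writeIORef free (n - 1)
            return (Just (Ptr n))

-- Variant 2: failure becomes an exception, so a returned pointer is always
-- valid and the 'infinite memory' view survives in the types.
allocOrThrow :: IORef Int -> IO (Ptr a)
allocOrThrow free = allocMaybe free >>= maybe (throwIO OutOfMemory) return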

What we have proved is that the program will produce a correct result or fail with an error; it will not produce an incorrect result. Pragmatically I don't see how you can do better than that with a finite and unknowable amount of memory.

built from type-theory

The mathematical model is built from type-theory.

I tend to think type theory is an afterthought to mathematical theory. I approve of mathematical elegance, but I don't like complicated type systems, I'm skeptical about monads, and I'm not sold on "pure" languages.

Mathematics is type-theory

Type theory is a much simpler foundation for mathematics than ZFC set theory. So if you dislike complicated systems you should prefer ITT to ZFC as the way to construct the whole of mathematics. Type Theory originates from the attempts to formalise the foundations of mathematics, going back to Russell's "Principia Mathematica".

Russell and Whitehead

I'm familiar with this claim, that type theory goes back to Russell and Whitehead. Frankly, I find this claim pretty unconvincing. Type theory justifies itself by harking back to Russell and Whitehead, yes, but I suspect this is mathematicians not wanting to admit (especially not to themselves) that any of their ideas come from those dirty computer scientists. Rather like Gibbs claiming that vector analysis comes from Grassmann's Ausdehnungslehre rather than from Hamilton's quaternions.

In any case, as a practical matter the use of type theory in programming is patently not simple.

Not simple compared to what?

I like it that you ask those hard questions, which are good to think about once in a while, as a reality check.

In any case, as a practical matter the use of type theory in programming is patently not simple.

Not simple, compared to what? Are there other, simpler ways to reason about programs that provide the same benefits (modularity/compositionality)?

This criticism is interesting in any case (you don't need the existence of a simpler system to say "this is too complex, we ought to look for something simpler"), but I do not feel there is evidence (patently visible or not) that the complexity is due to type systems rather than the inherent complexity of reasoning about programs.

I'm not even sure I share your feeling that type theory is "patently not simple", or at least I would need to quantify it. I find the use of simplified typed lambda-calculi to study a very specific aspect of programming languages rather simple, in fact: I can fit a complete description of the subsystem making novel claims, along with formal correctness proofs, in 20-30 pages of LaTeX, which is a luxury that not many scientific disciplines have anymore.
There is something not-simple, in fact frankly intimidating, about doing a full soundness proof of a real programming language instead of studying specific aspects in isolation. For example, attempting to prove Rust sound is "not simple", but is this about types?

Proof relevant logic

I believe Russell refined classical logic into IPL. It is a matter of fact that IPL and ITT are the same thing, although I am not sure this occurred to Martin-Löf; it had to wait for the Curry-Howard isomorphism to be applied for this to be understood. Type theory does not have to justify itself; it is just proof-relevant IPL.

It has been said that without proof-relevant logic (intuitionistic propositional logic interpreted as type theory) there are no proofs in mathematics. Every other proof works by assuming simple arithmetic is sound and complete, or is misleading. For example, people claim simple arithmetic can be proved sound and complete in ZFC, but imagine the opposite had been proved: would we abandon simple arithmetic as useless, or abandon ZFC? The answer would obviously be to abandon ZFC, so the idea of proving an intuitively simpler thing inside a more complex one is flawed. With ITT a proof is true because you can run a program.

Dangerous grounds

I think you're walking into dangerous territory here. You can make practical use of type theory to design clean programming languages without:
- necessarily using a type system that is sound as a logic
- making foundational claims about the meta-level reasoning system (ZFC, Principia, ITT or whatever)

There is a place for arguing about issues related to the foundation of mathematics, but it need not be tied to programming language design.

My Position

I think programming languages should be using type systems that are sound as a logic. Of course other people are free to develop languages as they see fit.

The meta-level theory seems relevant to programs as proofs, and this seems influential on the direction I think programming language development should take. It's actually essential to understand the difference between ITT and ETT in order to understand what equality is, which is a pretty fundamental programming concept. The fact that this has also produced new useful things like proof assistants (Twelf and Coq) should be a hint that something fundamental is going on. This is all related to proof search, which is relevant to logic languages. A logic language can be defined as a goal-oriented proof search, which takes us full circle to a type system that is sound as a logic.

My personal interest is programming languages, and I would not be mentioning mathematics if I thought it could be easily separated from programming. Hopefully it is clear that I am not making the foundational claims myself (hence introducing it as 'it has been said'), but to be clear, one of the people who has said that is Prof. Robert Harper.

With ITT a proof is true

With ITT a proof is true because you can run a program.

I see you've tried the Kool Aid.

Is this controversial?

Modulo bugs in the type checker, of course, but there are plenty of unit tests available, like the "Thousands of Problems for Theorem Provers" :-) I would have to call myself a constructivist, if that is what you mean. A formal proof system has to be better than the hand-wavy appeals to intuition that seemed to pass as proofs when I last studied maths.

As for not being able to prove simple arithmetic complete/consistent, didn't Gödel actually prove that you cannot for any proof-irrelevant formal system? So a proof-relevant system is the only way we know of that can?

How about the consistency of

How about the consistency of System F? I personally understand the consistency of System F only in terms of a set theoretic modeling of what the terms could mean. Is it obvious to you that type:type leads to contradiction in certain systems even though you can view proofs as programs? Why?

Consistency

You are free to understand the possible consistency of System-F however you want, and I guess it has been proved that System-F is at least as consistent as simple arithmetic using set-theory.

I think the simplest understanding of the inconsistency of type:type is just that it is non-terminating at the type level.

Too much philosophy

I was going to reply that you didn't take my point, but I'm not sure now that I've been reading you correctly. I'm not sure, but I think gasche is correct that this discussion probably doesn't belong here.

Consistency of System-F

Here's a proof of the consistency of System F in Coq. As far as I understand it, this means you can prove its consistency without needing sets, proving it directly in type theory. I would prefer a proof in Twelf though.

http://www.andrew.cmu.edu/user/croux/coq/sysF.html

Hm

Proving the consistency of System F by using an impredicative type-theory as meta-level logic is interesting, but not very good as far as foundational strength is concerned: if System F was inconsistent, the impredicative fragment of Coq, which is used in this proof, would be inconsistent as well.

You're deep in the dangerous grounds you created for yourself.

As for not being able to prove simple arithmetic complete/consistent, didn't Gödel actually prove that you cannot for any proof-irrelevant formal system? So a proof-relevant system is the only way we know of that can?

Gödel more generally let us understand that sufficiently expressive proof systems (predicative or not) cannot prove themselves consistent. The only way to prove a system consistent is to move to a stronger logic you're more familiar with. The way you go there can be very interesting (and trying to do this is a good way to reject wrong systems), but in the end the proof always relies on some strong meta-level logic which is what you called "the hand-wavy appeals to intuition".

Predicativity

Firstly I'm not too concerned about the danger, as it's of no real consequence. It's not like the compiler will stop working tomorrow, and the benefit is I might learn something :-)

As you can prove arithmetic consistent in predicative type theory (with the addition of transfinite induction), the same should be possible for System-F, although I can't find an example.

With ITT consistency is not a problem; however, you still have the appeal to intuition that transfinite induction is sound in order to prove some things. So the choice is whether to intuitively believe in arithmetic or in transfinite induction.

However the fact that you can still compute useful stuff with ITT without needing transfinite induction means there is a lot of computation you can do without any appeal to intuition.

Assumptions

Great, so in the tradition of logic, can we continue the discussion under the assumption that type theory / Gödel's 'T' is a suitable foundation for some or all of mathematics? Can we also assume that programs as proofs prove things about mathematics?

Getting back on track, a type system that is consistent as a logic will have the property that if a program correctly abstracts away memory, the monads / linear types (the representations of state) should disappear from the type signatures, leaving pure functions.

In other words we should start with simply compiled imperative primitives (a 'C' like portable assembly language) which have complex type signatures involving monads / linear types, and be able to write a functional DSL embedded in the language that has the simple and obvious type signatures we would expect for a functional language (something like simply typed lambda calculus would be a good first example). You should get a proof that the DSL implementation is correct 'for free'.
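
A minimal sketch of that shape in Haskell (assuming the standard State monad from mtl; Mem, alloc, peek and the Map-based heap are hypothetical stand-ins for the 'C'-level primitives):

import qualified Data.Map as M
import Control.Monad.State

type Addr = Int
type Mem a = State (M.Map Addr Int) a    -- the heap is threaded explicitly

-- Imperative primitives: the effect shows up in the type.
alloc :: Int -> Mem Addr
alloc v = do
  h <- get
  let a = M.size h
  put (M.insert a v h)
  return a

peek :: Addr -> Mem Int
peek a = gets (\h -> h M.! a)

-- The pure function the embedded DSL should expose: a simple type.
add :: Int -> Int -> Int
add = (+)

-- Its implementation via the imperative primitives; the proof obligation is
-- that this agrees with 'add', at which point Mem can vanish from the
-- exported signature.
addViaHeap :: Int -> Int -> Int
addViaHeap x y = evalState prog M.empty
  where
    prog = do
      ax <- alloc x
      ay <- alloc y
      vx <- peek ax
      vy <- peek ay
      return (vx + vy)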

I suspect that a simple enough functional language would just be a subset of the host language. For example, primitive types that fit on the stack would be fine, as would any regular type (an object that behaves as if it were a primitive type, copied not referenced).

Consistency?

I still don't understand the part about your type system being "consistent as a logic". Besides your own stated personal taste, do you see any relation between excluding non-terminating terms and having "the property that if a program correctly abstracts away memory ..." ?

The rest seems not-unreasonable, in the line of Bedrock for example, but I disagree with your goal of being as close to the CPU as possible, because CPUs are ugly and not necessarily well-designed (I'd rather my low-level language have a good design than a perfect mapping to legacy systems), and CPU design changes over time in ways that you can't always predict. I'd rather have a clean, expressive low-level language, and possibly adaptation layers here and there (to support the CPU of yesterday or the GPU of today that work in a slightly different way), whether they are presented as compilation steps (with possibly some optimization) or runtime services (for example, virtual memory).

Clarification

By close to the CPU, I mean an idealised CPU like a register machine or a stack machine. I am thinking of something like a subset of 'C'. I don't really see that changing in the short-to-medium term. Hardware architectures need to respect the physical properties of digital electronics: registers, memory, buses etc.

By consistent as a logic I probably mean a fragment of ITT, as it is the proof-relevant equivalent of IPL. I could also say the type system is a Heyting algebra, if that's any clearer? It's not just some ad-hoc collection of 'types' and rules like, say, the C++ type system. System-F is consistent as a logic, for example. As consistency is not relevant to ITT, I guess I mean sound?

As for termination and correct abstraction of memory, I think at compile time you have to assume infinite memory, as you don't know how much memory a real machine will have. Out-of-memory would still be a runtime error, and therefore allocation would have side effects (an exception) and needs to be in some monad.

Correct abstraction of memory would be the guarantee that the lifetime of the object is greater than or equal to the lifetime of any references to it, so I don't see the relevance of termination. Perhaps I am missing something?

Hm

I suspect you may be confusing "consistent as a logic" (which still looks irrelevant to your broader point) with "sound" which, well, mostly everyone wants out of a type system. Sound only means that well-typed programs do not fail at runtime (for a notion of failure that has to be defined for each language, but generally corresponds to absence of dynamic typing exceptions).

Sound and complete

I'll riff on this a bit.

When you have a system that's "correct" by definition, and a formal system that's supposed to model it, the formal system is "sound" when everything it models is correct, and "complete" when it models everything that is correct. In areas involving full-powered logics or full-powered programming languages, you rarely encounter a model that is both sound and complete.

In small-step semantics, for instance, you have an operational semantics (which is correct by definition), and a rewriting calculus (which is meant to model it). When you say it's "sound", what you usually mean is that the equational theory of the calculus (the reflexive symmetric transitive compatible closure of the rewriting relation) doesn't equate any two terms that aren't operationally equivalent in the semantics. When you say it's "complete", what you usually mean is that the calculus rewriting step includes every step that's allowed in the operational semantics. You wouldn't want the calculus step to be sound, because that would mean the calculus doesn't allow any rewriting step that isn't in the operational semantics (and the operational semantics is usually deterministic and non-compatible).

It's rare for the equational theory of the calculus to be complete relative to the operational equivalence of the semantics, though that can happen; the operational equivalence and the calculus theory are usually both undecidable, so Gödel doesn't prohibit their being the same relation. It is however very common for your semantics to be "sound and complete", because the part of it that's sound is different from the part of it that's complete.
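
In symbols (my notation, compressing the two properties just described): the calculus is sound when $s =_{\mathrm{calc}} t \Rightarrow s \simeq_{\mathrm{op}} t$, and its step relation is complete when $s \mapsto_{\mathrm{op}} t \Rightarrow s \rightarrow_{\mathrm{calc}} t$.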

Now, a type system is a formal system that's modeling something, but there's a problem with what "correct" system it's modeling. When you say the type system is "sound", you mean that every program admitted by the type system is... "correct" — but not all program errors are "type errors"; if they were, then typing a program would be a guarantee that the program would run correctly. No type system can tell you whether you want the program to do something else. So, really, your type system has to be sound relative to some particular class of errors it's meant to address. To say that the type system is "complete" would mean that the type system admits every program that does not contain the class of errors you wanted to prohibit; and for a Turing-powerful programming language, it's usually formally undecidable whether or not any given program can produce a dynamic type error. So if your type system were sound and complete, it would also be impossible for a compiler to typecheck an arbitrary program in finite time. Decidability is closely related to logical consistency, which is perhaps why consistency starts creeping into discussions of this sort.

The more kinds of errors you want your type system to prohibit, the more you feel the pressure not to prohibit more than necessary, so the closer you try to get to completeness, and thus the closer it all comes to tangling with Gödel.

My own experience is that having a few, primitive types is immensely useful — as one may discover when working with a markup language that tends to treat everything as text — but it seems to me that as soon as you start introducing structured types you're on a slippery slope to a dark place. (Surely record types are good, right? Array types? Function types? And by then, you're already mired in the quag.)

Quagmires

So my language lives in a cabin in the middle of the swamp :-)

What happens if the operational semantics are precisely that which is required to prove the theorem in ITT? The pure lambda term has no 'errors'. It may be untypeable, but that is the same as it not proving (inhabiting) the type/theorem being proved.

Curry-Howard

I've long felt (and occasionally opined here) that Curry-Howard is overrated. I really do wish luck to those who choose to pursue it, but for my part, I'm exploring a very different take on the relationship between logic and computation; I maintain that logic and computation are different in essence (fwiw).

Proof Assistants

Proof assistants seem to be doing useful things without needing any luck. Isn't the significance of Curry-Howard defined by how much useful new maths and computation it enables?

Overrating is relative

It can be significant and overrated at the same time. It's practically worshipped in some circles these days.

Proof Theory

The blog you linked to looks interesting. Perhaps we can discuss it a bit, as I would like to understand the parts relevant to computing.

The reality is there is no law of the excluded middle in proof theory. Just because you do not have a proof of P does not mean you have disproved it. We also have propositions whose truth is unknown: there is neither a proof nor a disproof; these would be the open questions, like the Riemann hypothesis.

Sympatico

It seems you and I are on the same wavelength on many of these issues. You've said many things over the years on LtU that resonated with my own thinking. For example, I think the Curry-Howard correspondence is neat, but don't think it necessarily serves as the best foundation for studying computation. Untyped computation doesn't correspond to anything on the logic side, and, on the logic side, truth doesn't correspond to anything from the lambda side.

Some of the stuff you've written over at the link above strikes a chord with my own logical system work. (And I'm pretty far down the rabbit hole with it ... I think I have the model... fingers crossed... I'm gonna be bummed out if the details of the proof don't go through...) It properly separates computation from logic, though ends up having much of the flavor of type theory in some ways.

I'm not sure about your attempt to bypass Gödel, though. It's a really robust result, as I'm sure you're aware, with some variants ruling out languages that self-interpret. Did you ever look at the thread I posted a while back? (http://lambda-the-ultimate.org/node/4785) That's as close as I was able to get to bypassing Gödel.

Finally, how does one keep the law of the excluded middle and get rid of reductio ad absurdum? Does that just mean that you want to allow that expressions might be nonsense, rather than Boolean propositions? I think that's a good way to keep Curry's paradox away. You shouldn't be able to suppose nonsense, prove X, and conclude 'nonsense --> X'. Prove you have a valid proposition before you get to use it as the hypothesis of an implication. This is how my system works.

And I'm pretty far down the ...

And I'm pretty far down the rabbit hole with it ... I think I have the model... fingers crossed... I'm gonna be bummed out if the details of the proof don't go through...

You're such a tease!

:)

Sorry.

Separation.

An untyped program might be considered to inhabit a single type Program, interpreted as the simple hypothesis that there exists some program. You prove this by example: any valid program will satisfy it. However you can still prove it (with a valid terminating program), or leave it unproved (with a nonterminating program), even though you cannot disprove it (as the type system only has "Program" and not "not Program"). So this still fits the IPL model, with no law of the excluded middle, as failing to prove "Program" is not the same as disproving "Program".

Untyped computation

Untyped computation doesn't correspond to anything on the logic side, and, on the logic side, truth doesn't correspond to anything from the lambda side.

Nicely put.

I'm not sure about your attempt to bypass Gödel, though. It's a really robust result, as I'm sure you're aware, with some variants ruling out languages that self-interpret.

A very robust result, yes. That's the beauty of the bypass strategy: the word "insurmountable" means "impossible to go over", which says nothing about whether one can simply go around.

In one of my various attempts to distill the difference between logic and computation, I observed that mathematical proof is essentially timeless, whereas computation is inherently embedded in time. Once you prove a theorem, you know it is true now, always will be true, and always was. That's why paradoxes tend to have a collapsing hierarchy in them; to take the usual example, Russell's paradox has set A simultaneously belonging and not belonging to itself, whereas infinitely recursive function A builds an infinite stack of continuations. In Russell and Whitehead each proposition has an explicit nonnegative-integer type, and then to found analysis they introduce an explicit axiom that for any proposition involving propositions of type n there exists a proposition involving only propositions of type 0.

Unprovable

Maybe there are things that cannot be proved true (as it would take an infinite amount of time) and yet cannot be proved false either? You would require a proof that all proofs terminate in finite time to enforce the law of the excluded middle. Does such a proof exist?

Maybe we can intuitively see that such proofs must exist. If a system is complete and can express all theorems, then we might suspect there are an infinite number of theorems of increasing length, including theorems of infinite length. I would suspect that the proof of an infinitely long theorem would not be possible in finite time.

Does such a proof exist?

Maybe there are things that cannot be proved true (as it would take an infinite amount of time) and yet cannot be proved false either? You would require a proof that all proofs terminate in finite time to enforce the law of the excluded middle. Does such a proof exist?

Well... proofs don't halt, just as computations aren't true or false. However, sliding past that, it seems to me your question "Does such a proof exist?" should have the answer no, provided you're trying to prove it using a consistent logic.

This is the essence of Gödel's theorem (one of his theorems): Suppose M is a system of logic. If M is sufficiently powerful, then it contains a proposition A that is true if and only if it cannot be proven within M. If M were able to prove A, then A would be false yet provable, making M inconsistent; therefore, if M is consistent then M cannot prove A; therefore, if M is consistent then A is true. So, if M were able to prove its own consistency, that would prove A is true, which would mean that M is inconsistent. Thus, any sufficiently powerful logic that can prove its own consistency is, in fact, inconsistent; but also, in any sufficiently powerful logic some true propositions are unprovable. Hence my expectation that there is no proof such as you ask for (unless your logic is inconsistent).
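
In symbols (my compression of the argument just sketched, writing $\mathrm{Prov}_M$ for provability in $M$ and $\mathrm{Con}(M)$ for its consistency): the fixed point gives $M \vdash A \leftrightarrow \neg\mathrm{Prov}_M(\ulcorner A \urcorner)$; hence $\mathrm{Con}(M) \Rightarrow M \nvdash A \Rightarrow A$ is true; and $M \vdash \mathrm{Con}(M) \Rightarrow M \vdash A \Rightarrow M$ is inconsistent.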

Proofs have to be computed

I think proofs do not exist without a computer to run them on, whether that computer is electronic or biological. Without the computer the proof is just a bunch of random scratches on a piece of paper with no intrinsic meaning at all. I think the idea that a proof is true or false, without the context of a computer to evaluate it, puts the cart before the horse. Mathematics is a product of the human mind, not the other way around.

A non-terminating proof is one that you cannot ever write down because it just goes on forever, so proofs do halt.

...and computations proved?

You can, if you want, take a computational view of proof, adding along the way a notion of "halting", though I don't find this a particularly natural way to think of a classical proof. You can, if you want, take a logical view of computation, adding a notion of "truth"; probably, "correctness", which you somehow have to furnish from outside since a program doesn't inherently have a purpose. Each of these projections, in either direction between logic and computation, requires you to add something that isn't native to the codomain of the projection. Neither of them comes "before" the other, they're simply different. Likewise, mathematics and the human mind needn't be placed in an order, with either before the other.

Computations prove theorems

Computations are not proved, they prove something. Every computation proves something at some level. For example running a Newton-Raphson iteration can prove convergence of a system with given initial conditions. Every computation proves something, although what it proves may not be useful or interesting. It could simply be proving that this machine does not have an Intel CPU with the FDIV bug :-)

Just to be clear about my

Just to be clear about my position on this (reiterating): Computation doesn't prove anything. You can set up an injective projection of computations onto proofs, so that in that peculiar sense each computation "proves" something, but the projection itself has added something that isn't native to computation, in order to produce a proof. It is also possible to provide computations with a property of "truth", specifically a property of "correctness", which is then subject to being proved in a logic.

It seems your arguments

It seems your arguments implicitly assume that truth has some status as a coherent, meaningful concept. The nature of truth is far from settled, and some even consider our notions of truth fundamentally incoherent. So to simply declare that proofs don't compute and computations don't prove would seem to be assuming the conclusion. It very well could be that the only sufficiently general, coherent conception of truth is computational. CH is then an important hint on the nature of truth.

Finally, you once agreed that hypothetically, computation could produce a mind. This seems to contradict your position here. The notion of truth as present in proofs and minds, as opposed to raw computation as expressed by machines and rules, is Searle's classic "semantics cannot follow from syntax". Except computational minds do imply that semantics follow from purely syntactic manipulation.

The simplest position would seem to be that computation inherently contains what we would recognise as semantic content, and computations multiply and shape this content. Or you could change your mind on computational minds, and continue to hold the division between logic and computation, but this comes with an additional burden of proof in philosophical logic on the nature of truth.

That which is most useful...

It seems your arguments implicitly assume that truth has some status as a coherent, meaningful concept. The nature of truth is far from settled, and some even consider our notions of truth fundamentally incoherent. So to simply declare that proofs don't compute and computations don't prove would seem to be assuming the conclusion.

Metaphysical truth doesn't even come into it. I maintain that proofs have natively a notion of truth, and that the notion is alien to computation, but that doesn't dictate any position on the metaphysical status of that notion.

you once agreed that hypothetically, computation could produce a mind. This seems to contradict your position here.

This [the proposition that computation could produce a mind] is unrelated to my position here. I've maintained throughout that thinking is qualitatively different from computation. Saying that computation can give rise to thought is like saying that chemicals can give rise to a tree. Studying chemistry won't prepare you for what a tree is like.

Here it is

I think this is where our disagreement originates:

"I've maintained throughout that thinking is qualitatively different from computation."

Do you have a proof of that (perhaps an experiment to test the hypothesis)?
:-)

The meaning of meaning

I have no confidence that we've the same understanding of the term "qualitatively different".

I maintain that proofs have

I maintain that proofs have natively a notion of truth, and that the notion is alien to computation, but that doesn't dictate any position on the metaphysical status of that notion.

That thread is discussing a metalogical notion of truth, which seems relevant, but let's get right to the point. It seems that you are now making claims about what it is qualitatively like for us to engage with logic vs. computation, not what their fundamental nature is. This is not the claim Keean is arguing; he is arguing this claim, which you agreed with:

Untyped computation doesn't correspond to anything on the logic side, and, on the logic side, truth doesn't correspond to anything from the lambda side.

This claim is about the fundamental nature of logic and computation, not about their qualitative properties given a human mind.

Studying chemistry may not prepare you for what a tree is like, but I don't see how that's relevant to the question of whether chemistry actually produces trees.

This is not going well

I'm all in favor of discussions that promote understanding, and in this venue the understanding should be related to programming languages. I keep hoping this discussion will eventually promote some understanding somewhere, and it all seems at least loosely related to programming languages, but I'm not seeing any sign of convergence here. It may be approaching time to cut bait.

I am not making "qualitative" claims about logic and computation; I'm making, afaict, substantially the same claim that you quote from Matt M. I agree that it's about the fundamental nature of logic and computation, in the sense of how they differ from each other; the nature of truth is irrelevant to that contrast. None of this has anything whatever to do with the obvious qualitative disconnect between computation and thought. And, btw, I wasn't joking when I suggested we might have different understandings of what "qualitative" means.

    [note: if this software supported the <s> tag I'd strike the word "obvious", since it appears to be not obvious to some]

Dislike of Types

Qualitative means they differ in "qualities", often opposed to quantitative, which is about the amount.

I think the difference between thought and computation is quantitative, i.e. given enough computing power a human brain can be simulated (IBM have already simulated smaller brains, a cat's I think, but it's not real-time). This is currently done by simulating the chemical reactions, because we don't understand which factors are important to optimise the model.

To me mathematics is clearly that which is common between thought and computation.

The above, whilst interesting, will eventually be confirmed by experiment (or not), and therefore there seems little point in speculating; we may as well work towards conducting that experiment.

I guess it depends on what you mean by 'thought'. I will admit that human emotional response is qualitatively different from computation, but I really don't think this has much bearing on the discussion. Yes you may take longer to find a proof when you are grumpy, or maybe a smell reminds you of something that triggers an associative recall that presents a solution to the proof - I would argue, however, that this is just an optimisation; the fundamental nature of proof search is the same.

As for the CH issue, it seems to me that it is a dislike of typed programming languages that is at the root of this. Those who favour types in programming find CH a useful tool. Those who prefer untyped (Lisp) languages seem to feel left out that CH has apparently given something fundamental to typing advocates. What I have yet to see is anyone who advocates typed programming over untyped reject the fundamental importance of CH.

As for my position, I want a typed language, therefore CH is useful to me because I can use it to express logic axioms in the type system that constrain the implementation. I need the logic for the type system to be constructive, that is, to represent things that are computable, hence intuitionistic (excluded middle can always be added as an axiom where needed). I also need linearity. I am currently heading towards something like LIX, a fragment of intuitionistic linear logic that supports goal-oriented proof search.

As for the point of the discussion, for me the important thing is not to reach total agreement, but to find out which parts can be agreed, and to find the minimal statement of disagreement. If you can precisely identify the minimum point of disagreement it can often lead to an experiment to decide, or the agreement that it comes down to a personal preference on a particular point.

for me the important thing

for me the important thing is not to reach total agreement, but to find...

I wished, when rereading my last post after the fact, I'd thought to clarify it by saying mutual understanding rather than simply understanding. Discussion leading to mutual understanding does not require agreement. On that, I think perhaps we agree.

Qualitative means they differ in "qualities", often opposed to quantitative, which is about the amount.

I don't consider them mutually exclusive. Cf. continuum fallacy, sorites paradox.

To me mathematics is clearly that which is common between thought and computation.

The word mathematics, like the word logic, is very hard to pin down. (I did spend one paragraph on this in my dissertation, section 8.1.1 if you're morbidly curious.) I don't see mathematics as common to either of those two, though; I'd actually say it's separate from both. Of course, I'm something of a... well, the modern term is Platonist, though I've found philosophers may object to the use of that term for the modern school of thought.

...The above [about computation producing thought], whilst interesting, will eventually be confirmed by experiment (or not), and therefore there seems little point in speculating; we may as well work towards conducting that experiment.

Perhaps; if it happens in our lifetimes (it could, one suspects, easily be a year from now or a century from now), it'll be interesting to see whether there's general agreement on what does and doesn't constitute a successful demonstration — will people get tripped up by asking whether a true AI has a soul? How about whether it has the right to vote?

However, our disagreement here and now is not about whether computation may give rise to thought; I tend to think it may; you seem to think it probably can; but even if we'd both seen it demonstrated we still wouldn't necessarily agree on the matter of qualitative. If that's subject to experiment at all, one would perhaps do well to go back to the remark by Ehud that, as I recall, started the whole thought/computation discussion here.

I guess it depends on what you mean by 'thought'. I will admit that human emotional response is qualitatively different from computation, but I really don't think this has much bearing on the discussion.

I agree, it doesn't have much bearing.

As for the CH issue, it seems to me that it is a dislike of typed programming languages that is at the root of this.

I don't think it's at the root. One would expect a statistical correlation, but correlation does not imply causation (nor, if causation is present, determine its direction).

Those who favour types in programming find CH a useful tool.

That strikes me, also, as a fairly safe assertion.

Those who prefer untyped (Lisp) languages seem to feel left out that CH has apparently given something fundamental to typing advocates.

Lisp languages are not necessarily untyped (or, more precisely, their typing is not necessarily dynamic).

What I have yet to see is anyone who advocates typed programming over untyped reject the fundamental importance of CH.

Is anyone here claiming CH isn't important? We could get tangled up over what we mean by fundamental. See also my remark above about correlation versus causation.

As for my position, I want a typed language, therefore CH is useful to me because I can use it to express logic axioms in the type system that constrain the implementation.

Verily, CH may be of help to you with what you're trying to do. I don't claim it mightn't be useful; I do sense that when one chooses to limit oneself to what CH can handle one is accepting some limits, and there may be rich insights to be had in the corners that CH does not illuminate.

Agreement

I find myself agreeing with all that, which might mean I agree with all that went before; it certainly seems so at the moment. I will have to think more about how CH might be limiting - it's a valid point, but probably not of practical importance to my current project.

Humans Compute

That is like saying: thinking does not prove anything; you can set up an injective projection of thinking onto proofs so that in that particular case each thought "proves" something.

Apples and tesseracts

Thinking and computation are (as I was just remarking in another reply) qualitatively different. I don't find it at all plausible that one could set up an injective projection of thinking onto proofs.

Self Reflection

When I think, I am either intuitive (which is simply a form of associative recall) or logical (where I apply the axioms of a system by working out the reductions by hand).

So I would say that thinking is exactly like computation, in that there is a proof search, but augmented with pattern recognition that guides the proof search by recalling successful proofs from problems that have similar structure; but this is just a speed optimisation of the proof search.

Cogito

When I think, I am either intuitive (which is simply a form of associative recall) or logical (where I apply the axioms of a system by working out the reductions by hand).

That may be what it seems like to you. I don't think a thought process can more than approximate logic (or computation).

Drunken Logic

There are logics for human thought processes.

Best paper I have read in ages.

That's a great paper. Scores highly for clarity and shortness :-)

Proof Search

I normally find that typing the axioms into a logic language and letting it do the proof search is faster, and limit myself to optimising the axioms to restrict branching factor.

Theorems of unusual size

Usually theorems and proofs are taken as finite data. Infinite theorems aren't really theorems, usually.

Seems Arbitrary

I don't understand why you would do that. It seems to be avoiding the problem.

How many theorems are there then? If you claim a finite number, I would expect a proof :-)

Relating Logic and Untyped Computation

There is another link between logic and computation that has nothing to do with types. A computer is made up of combinatorial logic and sequential registers. The logic gates obey the rules of logic, normally boolean. A contradiction in the combinatorial logic would result in a bit that is unsettled.

So it is obvious that the thing that makes the difference is the sequential aspect. A function can only oscillate in value because of the sequence in discrete time. If you remove the registers, it would all become unstable.

In effect the discrete time is creating a stratification like universes in type theory, such that a result at time t can only depend on the results in universes less than t.

To me there appear to be striking parallels between the type level and the value level, both of which we could call computation. Both can be stratified logics; it seems that both universes and discrete time impose a dependency order on functions.
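
A small sketch of that stratification, modelling signals as streams in Haskell (the modelling choices here are mine, not the only way to do it):

type Signal = [Bool]                     -- the value at each discrete tick

nandGate :: Signal -> Signal -> Signal   -- combinational: purely pointwise
nandGate = zipWith (\a b -> not (a && b))

reg :: Bool -> Signal -> Signal          -- sequential: initial value, then a
reg v0 s = v0 : s                        -- one-tick delay

-- Feedback straight through a gate would be an unsettled bit, but feedback
-- through a register is well-defined: the value at tick t depends only on
-- values at ticks less than t, which is exactly the stratification above.
toggle :: Signal
toggle = q where q = reg False (nandGate q q)

main :: IO ()
main = print (take 8 toggle)   -- [False,True,False,True,False,True,False,True]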

Alternate pigeonhole (reply to Matt M)

I don't think CH is interesting as a foundation for studying computation as a physical phenomenon. I do think it is one of the most fertile abstract ideas we have in making bridges between the world of what computing machinery does and the world of human understanding. If you agree with me about that, and you are broadly in sympathy with the Stracheyesque approach to TCS [1], then it sort of follows that you should regard CH as very significant indeed. Maybe I think it is rather undervalued...

I would agree that CH is often used in a way that is not very fruitful. For instance, as a checkbox item, "being CH" is no better an indicator of good PL design than "being OO". Nor does it give us clear answers as to what kind of thing we should learn about a piece of code when we find out what type it has. It's an altogether trickier concept, like the notion of 'proposition' in logic.

[1] IIRC, Scott reports Strachey as saying "The point of programming is to get the computer to do what you want it to do" and regarded programming as the fundamental 'thing' that computer science is studying - this idea you can trace back, I guess, to Grace Hopper and no doubt beyond, but the Scott-Strachey approach [2] has borne very surprising and precious fruits. On this view, then, I suppose this machine-intention bridge looms in importance and irreducibility in something like the same way as the idea of the material world does in physics, or preference in economics.

[2] The work of Peter Landin, a colleague of Strachey, on describing PLs via the lambda calculus is what I'd call the starting point of this investigation, which makes this approach a natural home for CH.

"Overblown" isn't the right term

I do think it is one of the most fertile abstract ideas we have in making bridges between the world of what computing machinery does and the world of human understanding.

I'm not sure. What other abstract ideas are in the running?

My concern with CH is not that it's overblown, exactly. It's that we're reading too much into it and/or missing more useful foundational frameworks because we're drawn in by the elegance of CH. Instead of unifying typed computation with proof, what if we just introduce untyped computation and logic? Maybe we can observe CH equivalences on top of that foundation instead of baking it in. I'm not sure CH deserves the foundational role in our ontology that it's been getting.

Building better bridges

Well, the observation that compiling and interpreting abstract descriptions of algorithms into machine realisations is itself something that can be described by means of an abstract description of an algorithm and realised in a machine has to be something that outruns the formulae-as-types correspondence in the race to be coolest bridge idea, and it predates Landin's work by 30 to 35 years...

I'd say more that most people are rather confused about what is at stake with CH, so we see there are people mesmerised by cargo-cult invocations of 'proofs are programs', and see far fewer people trying to really milk the idea and see how it can be used as a ladder to reach even better bridging ideas (if you will excuse the mixed metaphor of milkable ladders).

Bridging better buildings

the observation that compiling and interpreting abstract descriptions of algorithms into machine realisations is itself something that can be described by means of an abstract description of an algorithm and realised in a machine

That is pretty cool, yes...

predates Landin's work by 30 to 35 years...

Um. What historically do you have in mind there?

I'd say more that most people are rather confused about what is at stake with CH,

It's usually safe to say most people are rather confused.

so we see there are people mesmerised by cargo-cult invocations of 'proofs are programs', and see far fewer people trying to really milk the idea and see how it can be used as a ladder to reach even better bridging ideas (if you will excuse the mixed metaphor of milkable ladders).

Of questions suggested by CH, the one I'm most interested in is, what fundamental insights can we gain by studying those facets of logic and computation that CH doesn't capture?

UTMs

Turing's UTMs are the class of Turing machines that can interpret each other, themselves, and all other Turing machines. This idea is in Turing's 1936 paper [1].

Turing had already been talking to Max Newman, from whom he took lectures at Cambridge in 1935 on the significance of Gödel's theorems, apparently about exactly this concept [2].

[1]. Lots of background at http://www.turing.org.uk/scrapbook/machine.html

[2]. Max Newman talked about these 1935 exchanges in a taped 1976 interview with Christopher Evans for the exhibition, The Pioneers of Computing: an Oral History of Computing London Science Museum, 1976; the tape has archive object number SCM/archive-1997-1628/15. I'd love to have a transcript of this. Jack Copeland cites this interview heavily in his many articles on the history of Turing's ideas, where he says more or less that the idea of universality of a machine came up in discussion between the two in the context of Newman's lectures.

Types

The problem really seems to be to do with typing programs, and not CH.

CH is really obvious: ITT is just proof-relevant IPL; there's nothing magical, it's the same thing. CH is nothing more than an isomorphism... Another way of saying it might be that intuitionistic propositional logic and intensional type theory are both a Heyting algebra with an exponential.

I am not sure what you mean by bridging ideas? What do you want to bridge?

CH not obvious

It's odd, if the idea is obvious, that there were so many near-misses at the idea before 1969. I contend that the idea is not obvious. What is a type, anyway?

The 'isomorphism' bit is maybe a distraction: the word doesn't occur in Howard's paper, and I'm not sure if it is supposed to occur in any of Curry's writings on the topic. The whole idea of morphisms in this context I think is due to Lambek.

The bridges I am interested in are between (i) the world of machines, the things we make that are to do things for us, and which we seem to be able to frame quite well with computer-science concepts, and (ii) the world of intentions, the things that we want to be accomplished, which seem to be rather harder to frame satisfactorily: they might be best approached through language, or might be best approached through psychology, or might best be viewed as a different kind of thing entirely. The nice thing about types is that they are fuzzy enough that we can suppose they apply to both worlds, but not so fuzzy as to be contentless. But what content do they really have?

Obviousness not obvious

Heuristically, the profundity of a truth is proportional to how difficult it was to discover the first time, divided by how obvious it is once successfully explained. While some truths only have to be uncovered to become obvious, others may have a lucid explanation that is even more difficult to find than the truth itself was. If there exists a lucid explanation, the difficulty of finding it should be counted in the numerator of the heuristic.

(footnote in my dissertation, p. 335)

If CH had near misses and is now obvious, that'd give it a fairly high profundity index.

profundity metrics

what a wonderful definition :)

Lucidity and its opposite

I like your profundity metric, although I also like the implicit metric in Morris Kline's remark [1] about Newton's greatness being measured by the 120-year shadow he cast on English mathematics:

The English mathematicians had not only isolated themselves personally from the Continentals as a result of the controversy between Leibniz and Newton, but also suffered by following the geometrical methods of Newton. The English settled down to study Newton instead of nature.

So profundity can obstruct as well as provide clarity.

I suppose giving some sort of account as to what was the difficulty that caused the big time lag between the publication of Church's simple theory of types in 1940 and the circulation of Howard's preprint of 1969 would be of interest. Not feeling up to the task of historical exposition right now...

[1] p. 622 of Kline, 1972, 'Mathematical Thought from Ancient to Modern Times', vol. 2.

yet another pigeonhole

I suspect that a large part of the "unreasonable effectiveness of mathematical logic in informatics" is that logic has a great deal of machinery for dealing with trees of terms.

In most engineering or scientific disciplines, the calculations tend to be associative and distributive, and so in the end one tends to "flatten" the work out into accumulated multiplies.

To the extent that the meaning of a program depends upon its structure (and, even descending to the hardware level, NAND is not associative), it makes sense that we would tend to organize statements about programs in languages which respect that structure.

One comment

If you take an approach like Bedrock, then assumptions about run-time state (including garbage collection) will permeate every function signature. I don't think such annotations are ever going to "go away" until all constraints are satisfied at top level. Is that what you have in mind?

Bedrock seems interesting

Bedrock seems a better starting point than typed assembly language because it's not platform dependent. It is lower level than what I am thinking about, though. I think I am also slightly shifting position, because clearly some part of the 'functional' language will be a subset of the imperative one. It occurs to me that only the garbage collector is not in that subset.

Let's imagine a simple generic function, using Haskell type notation:

f :: AdditiveSemiGroup x => x -> x -> x
f x y = x + y

x and y could be any objects, but we have the implicit requirement that their lifetimes are greater than or equal to that of the function.

If the garbage collector correctly walks the stack, then it should never remove a referenced object, so the type signature of 'f' does not need to express this implicit requirement. But also an allocator that just allocates and never frees would be compliant.

Naively it seems like I want linear types to handle the construction and destruction, which can be treated as ordinary types by functions that neither construct nor destroy them; but this places a requirement that nothing else destroys them (if the GC runs in a different thread).

If I ignore threading for now, and stick with reference counting for simplicity, then all I need is a type system that can prove objects behave as values (externally); then they can be used safely in the functional part of the language.

But what if the garbage

But what if the garbage collector stores data alongside values in memory? For example, a mark-and-sweep collector. Don't the types need to be aware of the mark bits in your scheme?

Proof of unreachability

Again naively, if it can be proved the garbage collector only removes unreachable objects, then it would satisfy the implicit requirements.

It may need linear types to prove the required properties of the GC (or of a given object), but once the proof of 'valueness' for the object has been discharged, it should be usable in a functional subset without any extra type annotations, outside of the object or garbage collector.

Aren't mark bits and

Aren't mark bits and reference counts really just witnesses to the safety of coercion back to a linear type? That is, you start with a linear type, then a coercion to a shared type attaches a value that acts as a witness to the safety of coercing back to a linear type. The returned value from GC allocation is a non-linear abstract value with read and write operations to operate on the encapsulated value, i.e. read/write barriers.
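
As a rough illustration of that reading (my sketch, not anything from the thread), here is the reference-count-as-witness idea in the Haskell notation used earlier. Linearity of Unique is only a convention here; a real design would enforce it in the type system, and all the names (Unique, Shared, share, dup, reclaim) are made up:

import Data.IORef

newtype Unique a = Unique a             -- by convention: exactly one owner
data    Shared a = Shared (IORef Int) a -- the count is the witness

-- Give up unique ownership, attaching a count of 1 as the witness.
share :: Unique a -> IO (Shared a)
share (Unique x) = do
  c <- newIORef 1
  return (Shared c x)

-- Duplicate a shared reference, bumping the witness.
dup :: Shared a -> IO (Shared a)
dup s@(Shared c _) = modifyIORef' c (+ 1) >> return s

-- Coerce back to unique ownership only if the witness says we are the sole
-- holder; otherwise just decrement the count and report failure.
reclaim :: Shared a -> IO (Maybe (Unique a))
reclaim (Shared c x) = do
  n <- readIORef c
  if n == 1
    then return (Just (Unique x))
    else modifyIORef' c (subtract 1) >> return Nothing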

Seems the right kind of thing

This seems to be exactly the lines along which I am thinking.

C uses a runtime system

By close to the CPU, I mean an idealised CPU like a register machine or a stack machine. I am thinking of something like a subset of 'C'.

But C does use a runtime system: your OS kernel is doing a lot of work to make C programs work properly (virtual memory, signals, etc.). Why do those runtime systems find grace in your eyes, whereas you're convinced that GC-as-a-library is a fundamental aspect of a good design?

I think as long as your low-level language is idealized in some sense you'll need a runtime system -- and that means that this runtime must depend on the machine you're running on. Where do you draw the line?

C runtime.

The C runtime, "CRT.o", just sets up the stack. There is no actual runtime beyond that; programs just use the functions in libc, which you don't have to use. The kernel virtualises the resources, but is otherwise not necessary. C can be compiled to run directly on the hardware.

The aim is that there is no runtime beyond what 'C' needs, while still allowing linking with existing libraries. If you wanted garbage collection, it would be implemented in the language as a module you would include in your source code.

Hm

I think we should count most of what the OS does as a runtime system; or is there a fundamental difference in semantics with what a language runtime does?

What is the justification for setting C as the frontier? Is it a practical one (that's easy to do on today's computers) or a scientific one?

Not Necessary

The OS is not necessary. 'C' can run on raw hardware without an OS, the OS functions are just library functions. 'malloc' has no special status in 'C'. You can take a fixed pointer to the first address past the end of the program, and start using the memory, incrementing as you go.
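
Just to make that last point concrete, here is a minimal sketch of such a bump allocator, written in the same Haskell notation as the earlier example rather than in C. The names Arena, newArena and bumpAlloc are made up, and mallocBytes stands in for "memory past the end of the program":

import Data.IORef
import Foreign.Marshal.Alloc (mallocBytes)
import Foreign.Ptr (Ptr, plusPtr)

-- An arena is just a bump pointer plus an upper limit.
data Arena = Arena (IORef (Ptr ())) (Ptr ())

newArena :: Int -> IO Arena
newArena size = do
  base <- mallocBytes size   -- stand-in for the end of the program image
  p    <- newIORef base
  return (Arena p (base `plusPtr` size))

-- Allocate n bytes by incrementing the pointer; nothing is ever freed.
bumpAlloc :: Arena -> Int -> IO (Maybe (Ptr ()))
bumpAlloc (Arena ref limit) n = do
  p <- readIORef ref
  let p' = p `plusPtr` n
  if p' <= limit
    then writeIORef ref p' >> return (Just p)
    else return Nothing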

And the CPU?

I agree with gasche; furthermore, the CPU itself is no different. I don't think the HW-SW frontier is intrinsically privileged. Each level implements a "(virtual) machine" or a "language" on top of previous levels.
A hosted C implementation assumes a library as given; the library implements "hosted C" on top of "freestanding C", and after some levels, you get to the hardware. If you (or the OS) use virtual memory, you're using the MMU as "runtime system" to interpret your page tables. You can even partition the hardware further.

Luckily, given a Turing-complete level, you can ignore everything below (see John McCarthy on this more in general: http://www-formal.stanford.edu/jmc/universality.html). So you're free to start at whatever point you like in the chain. Of course, as long as you keep the choice fixed, you won't study anything below the level you've picked. So the choice is ultimately a function of what you consider interesting (for either technical or personal reasons). If we fix some technical goal, there can be a meaningful discussion of what level to stop at.

For instance, to study memory safety (as I imagine it) I guess you can use the C-like language used by the early separation logic papers (which is close to a C-like assembler). To answer gasche, it's fine that you ignore virtual memory, signals, and everything else, unless you want to distinguish different memory violations and their exact effects, and pay the attendant modeling cost.

Virtual Machines

I agree with this.

Summary so far

In the course of the discussion I have realised a few things:

  • There is a functional subset of the imperative language that is always simpler. So it's not quite opposing hierarchies of complexity, but probably a two-dimensional space:

    Implementation Complexity For Type-Safe Languages

    type-system      language         description
    implementation   implementation
    ---------------  ---------------  -----------------------------------------------
    simple           simple           a functional language with only primitive
                                      (stack-based) values and copying.
    complex          simple           a 'C'-like imperative language with direct
                                      (linear) memory control.
    simple           complex          a functional language with algebraic data types,
                                      references and garbage collection.
    complex          complex          an imperative language with direct (linear)
                                      memory control, algebraic data types, references
                                      and garbage collection.

  • That what I want is probably a linear type system, with some kind of witness (as pointed out by naasking) that the object behaves as a value. This way I can treat the objects simply outside the object definition. So it should look like a simple functional / imperative language at the function level, but defining the objects would be more involved. The kind of objects I am thinking about are C++-like containers: List, Vector, Set, Map, Tuple, Variant etc., but defined with parametric polymorphic types instead of templates. Tuple and Variant together would give the same functionality as algebraic datatypes (see the sketch below).
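
To illustrate that last remark (a sketch of mine, using the Haskell notation from earlier; the names Tuple, Variant and Shape are just for illustration): a product type plus a sum type are enough to encode an algebraic datatype.

type Tuple a b   = (a, b)
type Variant a b = Either a b

-- The ADT  data Shape = Circle Double | Rect Double Double,
-- encoded with only Tuple and Variant:
type Shape = Variant Double (Tuple Double Double)

area :: Shape -> Double
area (Left r)       = pi * r * r
area (Right (w, h)) = w * h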

what's old is new again

In some cases

In some cases you may want a notion that a sub-computation returns a value, even if the internal workings of the computation involve mutation prior to the point of return. An initializer that does not allow the object being initialized to escape can then return an object reference to a frozen object.

More generally, the mathematical notion of functional does not preclude mutation during intermediate computation, and exploiting this is sometimes useful.
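
For what it's worth, Haskell's ST monad is one existing embodiment of this "mutation during intermediate computation, value at the end" idea; a tiny sketch (mine, not part of the discussion):

import Control.Monad.ST
import Data.STRef

-- Internally this mutates an STRef; externally sumTo is an ordinary pure
-- function, because the mutable state cannot escape runST.
sumTo :: Int -> Int
sumTo n = runST $ do
  acc <- newSTRef 0
  mapM_ (\i -> modifySTRef' acc (+ i)) [1 .. n]
  readSTRef acc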

I agree with this.

In fact all object construction involves mutation, as memory does not come pre-initialised, so I think that's a useful concept. Also, as we discussed elsewhere, creating a mutable copy of an immutable object is possible, and if the immutable is never referenced again (for example, returning a mutable copy of a local immutable) then you can effectively unfreeze the object, avoiding the copy.

linearity

It wouldn't be difficult to tie mutability to a linear reference.

The Plaid language already does something similar, but it supports 'atomic' sections to manipulate shared objects. You might eschew the atomic sections, or limit them to a specialized subset of shared objects.

I don't quite follow

Not sure what your "language" column is supposed to measure. An ML-like functional language (like your characterisation in row 3) is certainly far simpler than C by any semantic measure I can think of. What you probably meant is not "language", but "implementation (on top of raw mainstream OS)".

complexity of implementation

I do indeed mean complexity of implementation on top of common hardware. There wasn't space to write that in the heading though. I don't mean semantic complexity - I tend to think semantic complexity follows type system complexity, when we are considering a type-system based on ITT.

You could argue the type-system column is not very clear either, as I don't mean the complexity of the type system, but the complexity of implementing a sound type system for the language on top of ITT.

Maybe

Maybe put "implementation" in the heading? ;) I agree with type system usually being correlated with semantic complexity.

Safe or unsafe?

Relatedly, I think it matters a lot whether you want your language to provide mechanism to check safety or not. It's simple to implement C because the complex underlying semantics (not screwing up with bare memory) is left to the user. Trying to implement a type system (or whatever static analyzer) for this is not simple.

Keean, it's rather clear to the experts that putting "complex" in the type-system column means that you want not only a "C like" language, but a *safe* "C like" language -- Cyclone-like rather than C-like. But I think you should clarify that in your description text.

All Safe

Yes, you are right, that was an assumption that I did not write clearly. I'll add it just in case someone is looking at the table without reading this follow-up.

File-Vector behaves as a value.

I think the File-Vector I have just written is an example of this kind of thing:

https://github.com/keean/File-Vector

Externally it behaves as a value, but internally it has to manage the memory mapping of a file that can grow and shrink. I don't see how you could anticipate all the useful abstractions that might be built on top of mmap and build them into a functional language. An mmapped file-vector is just one useful example. I think a language needs an imperative low-level layer to allow new abstractions to be built.

However, the object as a whole has value semantics and behaves according to a much simpler set of rules. It may not be functional (as it's mutable), but it could be assigned an ML type, I think.

This also raises the possibility of an anonymous mmapped vector for use in inter-process communication, and a private mmapped vector for in-memory-only modification of a read-only file. I could change the constructor to take a file descriptor instead of a filename, and add some flags to enable these use cases.

One interesting feature is that resizing happens by remapping the physical memory rather than copying it, so it could be quite fast. The only drawback is that memory use is always a multiple of the page size.

The most important thing about type theory....

In my estimation anyway, the most important thing about type theory is learning how to stop doing it WRONG!

It is possible to create a language in which there are no types other than binary-blob and all operations work on all values, each treating the bits found there according to assumptions about its type, which may or may not match the assumptions in force when that location was most recently written, and programmers labor mightily to keep track of everything themselves to avoid errors that turn out to be simple; but that is clearly doing it wrong.

It is also possible to create a language in which absolutely every bit of semantics is a feature of the type system, it is nearly impossible to code without a Ph.D level understanding of category theory, and several truly elegant proofs exist that you have excluded certain types of errors given certain sets of assumptions about the program most of which are incomprehensible to ordinary programmers, but these proofs are no help whatsoever to ordinary programmers in making a program actually do what they want it to do and instead they spend most of their time trying to satisfy the type system or trying to understand how they have failed to satisfy it. This is also clearly doing it wrong.

Somewhere in between these two nightmares is a type theory (or possibly something more humble like a mere "type system") that is highly understandable, useful to both beginning and advanced programmers, reasonably correct, and allows programmers to think mostly about algorithms and solving problems rather than laboring to understand and satisfy labyrinthine and opaque requirements. A type system, in fact, that they can completely learn in an afternoon and never thereafter be surprised by.

I think that when we fully understand the interaction between type systems and algorithms, AND the interaction between human minds and program construction, we will wind up working to design type systems that have a high utility-to-complexity ratio and not very much complexity at all, rather than type systems that are labor-intensive and intellectually challenging to satisfy because they place interesting and novel constraints on the set of errors a programmer might make if he or she were untrained, stupid, had no sense of design, or was writing a program completely unrelated to accomplishing something he or she cares about accomplishing.

So, yah. Types are useful. But not so useful that they can ever substitute for thought and design, and not so useful that they deserve more of a programmer's attention than the attention he or she devotes to thought and design. There's a point of diminishing returns in what type theory contributes to programming, and the really important discoveries are not about how much the type system can check. They are about pushing that point of diminishing returns further out. A really groundbreaking discovery would be about how to make something conceptually simple and easy-to-satisfy that checks as much or more than some other type system that's harder to understand or use.

complex inside simple outside.

Do you not see any point in being able to package something complex (like a memory-mapped file) as something with simple, predictable semantics (an array of mutable values), where the developer of the complex thing makes certain guarantees, enforced by the type system, so that the user of the simple object can be sure that it behaves properly?

Sure. Language standard libraries.

Of course it's valuable to be able to package something with simple predictable semantics. For example, I would happily import bindings to it using a standard header, so that developers can say something like

"using #FileMap"

at the top of a file of source code and then use the simple outside. It's even better if changes they make to programs which use it do not require its type proof to be reconstructed, which would delay putting routine changes into use.

But if they have to look at the complicated inside and understand it, or need for any reason to change it even though they don't understand the type proof so those changes are shots in the dark, or if updates to a 'live' system are delayed while the type proof of the file map has to be repeated because anything that uses it changed, then it becomes a loss.

Of course, it's in exactly that kind of "live" system that such a persistence mechanism is very valuable; keeping your live data file-mapped will help you make it persist across program updates when you need to swap out objects higher in the hierarchy than the ones you're saving (which will change their definitions via inheritance and may change their sizes) without losing the runtime data, closures, etc, that they hold.

And depending on how long it takes to construct a type proof, we may have the situation where you need to run a type-tagged (dynamically checked) version of the new code interleaved with type proofs until you finish doing proofs for the new version and discover that you can (or can't) swap out the type-tagged and dynamically checking code vectors for code vectors that skip the tagging and checking because they've been proven in advance not to need it. A persistent file map is excellent for that too, to the extent that you can cache information that helps you construct valid type proofs fast during run time.

I guess I don't live in a world where there's infinite time to check and reject and adjust and fix before the program starts running, nor in a world where the only typed data that the running program needs to deal with is typed data that was created strictly by the current version of the running program. Static checking is a valuable technique but it can never be the only technique I use. I can't always meet the clean-start requirements on which its proofs of correctness rest, and I don't always have the time to wait for the proof-checking to complete.

Trapdoors

I think I agree with all of that. I am sure there need to be some trapdoors that allow the programmer to say "trust me, this complies" or perhaps better "assume the component complies if it passes these unit tests", where the unit tests can be referred to in the type system as an alternative to a proof.

One thing that sometimes

One thing that sometimes drives 'live' sites to put up dynamic versions first (or use very lightly checked languages which are designed for JITting) is that it's usually fairly quick to construct a JITted/tagged/dynamically checking executable -- you may not know for sure that it can be trusted to have the *other* properties you want or hope for until more proof checking completes, and the dynamic checks (or the JIT environment's abbreviated static checks) may trap only the very grossest of errors, but it's a very quick trip from source to running code, and that can be crucial.

The question of how much you can prove about the code (or even, about the code given the restrictions on its set of callers in this particular program and the arguments they can provide it with) is worth pursuing though even after that first-cut version is up and running, because it will definitely enable further optimization. You can get rid of tags and runtime checks at the very least, if you prove that the errors they check for cannot occur. But more generally if you prove something qualitative like "This is purely functional" it means that it joins the set of things whose evaluation order in an argument list doesn't matter, or which are amenable to transforming optimizations like memoization, etc, or if you can prove that there's no state shared between call sites, it can be inlined correctly, etc, and then further optimizations can happen unloading the CPUs on your live servers.

Or maybe it's even something simple and human, like the programmers are allowed to go home once the "statically proven" indications for everything that might require immediate further intervention in the running program come on.

Proofs are Paths?

I may have got this wrong, as I have not been studying Homotopy Type Theory for very long, but a proof is basically a path, the same as an equality. So given a type that expresses some constraint or property, you can infer a type from the function implementation (allowing full type inference), and then look for a path from the inferred type to the constraint type. If you can find such a path, you have a proof. What I wonder is whether the topological structure of the space makes finding a path easier than a simple depth-first or breadth-first search. I am wondering whether some path-finding algorithm like Dijkstra's would work.

I think the idea is that

I think the idea is that paths in appropriate spaces serve as proofs of equality of the endpoints. I don't know that there's a generalization to all proofs being paths.
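
For the equality case the analogy is easy to make concrete. Here is a small sketch in the Haskell notation used earlier, using GHC's propositional equality from Data.Type.Equality (composePath and invertPath are my own names): proofs of equality compose like paths and can be reversed like paths.

{-# LANGUAGE GADTs, TypeOperators #-}
import Data.Type.Equality

-- Transitivity is path composition...
composePath :: a :~: b -> b :~: c -> a :~: c
composePath Refl Refl = Refl

-- ...and symmetry is path reversal.
invertPath :: a :~: b -> b :~: a
invertPath Refl = Refl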

Weaker Statement

Some proofs are paths? Proof search in some cases seems to be finding a path from A to B in a topological space defined by the axioms. I guess inductive proofs seem harder: you have to show there is a path between A0 and B0, and that a path between An and Bn gives a path between An+1 and Bn+1 (is that a 'higher' path?), and then you have a proof.

Isn't that just a question of what analogy you use?

There are many many ways to visualize the process of finding proofs. You can regard it as an induction, or a unification, or a path search, while actually carrying out the same steps. The biggest difference comes about depending on whether you're talking about a language with the type erasure property or not.

In languages with the type erasure property, type annotations (or type inference) do not change the semantics of the program -- they identify cases that can never occur so you don't have to worry about them when optimizing or generating code, identify things that are amenable to certain types of optimization (such as memoization and inlining) without error, and so on. This suits 'live' systems very well, because a straightforward translation (made ignoring annotations and starting with a 'universal monotype' hypothesis) can start running NOW, and fully-analyzed code an order of magnitude smaller or more efficient can be substituted in during runtime as your type work proves it valid. I tend to prefer languages with type erasure, so I think of type information mostly in terms of doing inductive proofs: type information means having axioms, lemmas, productions, and theorems about what can and cannot happen during runtime, and until you derive theorems that eliminate the scenarios you've identified as error cases, you are running without guarantees. If you get type information wrong, not only will the system be unable to prove that it will never hit one of your error cases, but it may actually hit one during runtime while you're still waiting for the proof, and you have a Homer Simpson "Doh!" moment.

In languages without type erasure, you have types that alter the semantics of the program, so there must be a complete type solution before we even know what the program means. Variables and routines can be polymorphic, meaning they have different values depending on the numbers and types of their arguments, continuations, or both. Without type erasure, it's probably more valuable to think of type information in terms of a search for a path or a unification. The path is crucial because you can't even create an executable without finding a complete solution to the type information first. So finding that solution becomes the first problem you have to solve in order to create an executable, and the single most common (in many systems the only possible) reason why creating an executable might fail.

The point is that these type systems are duals of each other; in a system with type erasure, we start with the 'universal' everything-is-every-type hypothesis and then start narrowing it down as we disprove cases. In a system without type erasure, we're starting with the universal nothing-is-any-type hypothesis and building up productively as we prove cases.

The method is the same -- to reach a unique solution that includes the specific types that can happen at runtime and excludes those which cannot. The goal is the same -- to show that our identified error cases are not among those which can occur at runtime. The steps are the same - induction, unification, lemmas, and proofs.

Type erasure allows the existence of an executable in which some cases that cannot actually happen have not yet been disproven; if these cases include the error scenarios, then an admissible type solution for that program has not yet been found. If the error scenarios are in fact among the cases that can occur, then an admissible type solution for that program CANNOT be found, and until you fix it and derive a type solution, you are at risk of actually invoking one of the error scenarios.

The absence of type erasure on the other hand doesn't permit the creation of an executable at all until all cases that can happen are found. If in the course of finding these cases they are discovered to include error scenarios, then there is no sensible semantics that the program can mean because there is no solution to the requirements of its polymorphic elements. Again, this is a matter of having constructed a program for which an admissible type solution cannot be found, and you need to fix it before you can find one. The difference is that without the type erasure property there is nothing that a program without a type solution can mean. So in that case I am not doing proofs about a program with semantics, I am only doing proofs about an input string with syntax.

So, to your direct question: In languages with type erasure, I'm doing proofs about a program both to optimize it and to try to show that the identified error cases cannot arise. In languages without type erasure, there is no program to talk about unless a solution to the type equations can be found in the first place, I typically have no input into what is designated to be an 'error case', and I'm 'pathfinding' or 'solving' or 'unifying' or whatever you want to call it in order to determine what the semantics of the program ought to be in the first place.

The 'path' from input string to program both demonstrates that the input string corresponds to a program that has semantics and shows what semantics that program has. Finding it is THE crucial work in creating an executable. But aside from the crucial and pre-eminent place it has in these languages, it really isn't different work than it is in languages with type erasure.

Type Erasure

I think I want type-erasure for what I am working on. The output of the compiler might well be a C (C--) program, which can obviously exist without any types.

However, I also want complete type inference, so there will not be such a thing as an 'untyped' program.

The proofs as such would be optional annotations where you are specifying a type constraint. If a constraint is specified the compiler must be able to prove the constraint type is equal to the inferred type, which effectively means you can find a path from one to the other using the axioms of the type system.
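
A small example of what that check looks like in practice, in the Haskell notation used earlier (the function and its annotations are mine, just for illustration): the compiler infers the most general type and then has to reconcile any annotation with it.

-- Inferred (most general) type:  swap :: (a, b) -> (b, a)
-- This more specific annotation is accepted, because it can be reached
-- from the inferred type by instantiating a and b:
swap :: (Int, Bool) -> (Bool, Int)
swap (x, y) = (y, x)

-- swap :: (Int, Bool) -> (Int, Bool)   -- would be rejected: there is no
--                                      -- way to derive it from the
--                                      -- inferred type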