Extremely static language?

Hi,
I'm interested in languages which enforce compile-time restrictions on the values of variables. Specifically, does anything like the following exist:

x : int [0, 2, ... 10]
x = 50 //error
x = 3 // error
x = 2 // ok

x = rand() // error, rand() : [0.0 ... 1.0] doesn't match [0, 2, ... 10]
x = floor(rand() * 2) * 2 // ok [0, 2, 4] is a subset of [0, 2, ...10]

Thanks,

Alex

Because someone is bound to say it...

The theory that most fully generalizes this notion is that of "dependent types", which basically means types which can depend on or include values. This is an area of much active research, and a search for "dependent types" or "dependently typed" will produce lots of good stuff...

For a number of reasons, dependent type systems present barriers to practical application, or at least application in a naive way. So there have been a number of attempts to realize the benefits of dependent types without having to dramatically change the way we write programs (indexed types, GADTs + extensible kinds, etc.)

You can find quite a bit of discussion on these topics here on LtU. I'm sorry I don't have time to include links.

On the other hand, I feel a little bad pointing you at one of the more esoteric and experimental areas of PLT in answer to a relatively simple question. You should be warned that none of the dependently typed languages provides anything as straightforward as your example, and without considerable preparation, you might not even recognize them as doing the same sort of thing. If the only thing you're looking for is relatively simple bounds checking and enumerated types, you may find a system that does this stuff using, for example, constraint propagation. Lots of languages provide statically checked enumerated types, for example. However, you'll probably find that you quickly run up against cases that can't be (trivially) statically analyzed, and your language will need to provide a run-time checked way of getting dynamic values into a constrained type.

I believe that you can emulate this type of behavior with C++ templates and Haskell type classes, but it gets a bit esoteric. Perhaps someone else knows of a practical language that does this sort of thing in a more friendly way.
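
For concreteness, here is a minimal Haskell sketch of the non-esoteric fallback (the names are invented for illustration): hide the representation behind a module boundary and admit values only through a checked constructor.

  module EvenTo10 (EvenTo10, mkEvenTo10, fromEvenTo10) where

  -- The data constructor is not exported, so the only way to obtain
  -- an EvenTo10 is through the checked function below.
  newtype EvenTo10 = EvenTo10 Int deriving (Show)

  -- Runtime-checked entry point into the constrained type.
  mkEvenTo10 :: Int -> Maybe EvenTo10
  mkEvenTo10 n
    | n >= 0 && n <= 10 && even n = Just (EvenTo10 n)
    | otherwise                   = Nothing

  fromEvenTo10 :: EvenTo10 -> Int
  fromEvenTo10 (EvenTo10 n) = n

This doesn't reject x = 3 at compile time as the original example asks; it only guarantees that no out-of-range value can ever inhabit the type, which is exactly the "run-time checked way of getting dynamic values into a constrained type" mentioned above.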

Wrong way. Head for a very dynamic language like Ruby.

After all, what is compile time but an early step in run time.

Consider C++ template metaprogramming. It's the worst and most horrid mini-language in the world, right up there with INTERCAL and BrainF*k, but you can use it as a programming language in its own right.

Now consider Ruby (or Lisp or Scheme or...). Classes and functions are actually defined and definable at run time.

What you need, rather, is to implement your strict checking language in Ruby (or Scheme or...): 90% of the work can be done by creating strict checking classes overlaying the standard primitives. 9% of the work can be achieved by directly dropping through to eval or module_eval.

PS: Check http://intervals.rubyforge.org/ for a very nice implementation of interval arithmetic.

No runtime errors though

What about when the code is running in an application where runtime errors are never acceptable (automotive control systems, for example)? The application may contain some complex code that is hard to statically analyse, like the rand() example above.

In the extremely static case, the code may be correct but fail to compile as the compiler is too 'stupid' to work out that the code is correct. The code may also be incorrect. Either way, the programmer is forced to make it a bit simpler, to allow the compiler to figure out what is going on.

In the extremely dynamic case, with all strictness analysis being done on the executing program, there may be cases where runtime errors occur. In some areas that may simply not be acceptable, so wouldn't the extremely static language be better for these areas?

All languages that I know of

All languages that I know of have runtime errors, no matter what kind of typing they have.

A static type system might prove that a program doesn't have any type errors. But it could be that it just moves the errors into the non-type-error field.

edit: OK, I'm not sure if you get any runtime errors in Charity. But I'm not sure I want to program in Charity anyway.

But static checking removes more than no checking

Yes, all languages can have runtime errors (what if the hardware fails, after all?). But static checking can remove a lot of those runtime errors. If your code runs in an application environment where runtime errors can't be tolerated, then static checking would seem to be a better direction to head in.

At least, for these sorts of applications.

Actually the static typing

Actually the static typing doesn't remove those errors, it just finds them. Removing them has to be done manually. And there have been studies showing that, for large programs, there comes a point where removing errors generally causes new errors in proportional fashion. So I just wonder: can we be sure that when we remove static type errors, we don't introduce just as many new non-type errors?

Excellent Question!

This is why it's important to strive for languages in which local changes have local effects. On the other hand, part of the point is that when you correct type errors, you have more assurance that your code is correct, rather than less. Of course, correcting the type error can still reveal an error that was in the code and that isn't addressed by the type system, but it's quite unusual to introduce such errors in the process of correcting type errors.

How tight is your spec and

How tight is your spec and how much of it can you check on a computer (optionally, without requiring an undecidable typechecker)?

And of course, how do you ensure the spec says what it's supposed to? :-)

Robert Eachus had an interesting comment on this

The post on Usenet
"There have been several times I have thought
of creating a bug tracking system that tracks how many bugs are caused
by a previous "bug-fix." The number of such bugs for each earlier bug
is a major determinant of when existing software has become too
"brittle" and has to be replaced. But it also allows you to determine
when, and in fact if, a new development project is ever going to be
complete."

there's always SPARK Ada

Please Be Serious

John Carter: After all, what is compile time but an early step in run time.

The obvious crucial distinction is that compile time occurs before you deliver the software to the user, and any error that the compiler can prove can't occur at runtime won't occur, so by definition you can create more robust software, from the user's point of view, with any statically-typed language than with any dynamically-checked language.

Now, as has been discussed here ad nauseam, if the statically-typed language is essentially Standard ML or O'Caml or Haskell98, then without jumping through various encoding hoops, the static type system probably isn't going to be able to prove anything dramatically more interesting than what a good test suite in a dynamically-checked language is. So you have a couple of options: jump through those encoding hoops (cf. "lightweight dependent typing," the various pieces on phantom types, etc.) or use the GHC extensions to Haskell98 (GADTs and so forth); or use one of the more powerful, but even less mainstream than Haskell, functional languages.
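
To give a flavour of the phantom-type encoding hoop, a minimal sketch in Haskell (the names and the validation rule are invented for illustration):

  {-# LANGUAGE EmptyDataDecls #-}

  import Data.Char (isDigit)

  -- Type-level tags with no runtime representation.
  data Raw
  data Validated

  -- The phantom parameter p records what we know about the string.
  newtype Input p = Input String

  -- The only way to produce a Validated input is to pass the check.
  validate :: Input Raw -> Maybe (Input Validated)
  validate (Input s)
    | not (null s) && all isDigit s = Just (Input s)
    | otherwise                     = Nothing

  -- The compiler statically rejects any attempt to pass raw input here.
  run :: Input Validated -> Int
  run (Input s) = read s

The check itself still happens at runtime, but the fact that it happened is tracked statically, so unvalidated data can never reach run.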

We're also ignoring the issue of scaling these checks. The larger the codebase, the less likely that you're going to have a comprehensive test suite with adequate coverage. Consider Tim Sweeney's POPL '06 presentation, in which he discusses Gears of War, which has some million lines of C++ and UnrealScript. Tim discusses four static language features which he calculates would eradicate approximately 50% of the bugs in Gears of War—that is, the codebase wouldn't compile if those bugs were present and the compiler would tell you where the bugs were. Now imagine test coverage that found every instance of a bug caused by the lack of those four language features in a million-line codebase. You'd spend enough time writing those tests to have implemented a compiler for the language with Tim's four features, a compiler that you could then use on multiple projects.

When you really think about static typing vs. dynamic checking coupled with programming in the large, it rapidly becomes apparent how fundamentally unserious the dynamic checking advocates are.

I'm deadly serious, so let me be more explicit.

In no way am I invoking the "dynamic typing is better than static typing" argument. I'm merely stating that static typing can relatively easily be recreated on top of a dynamic language.

Consider C++ template metaprogramming.

It is a truly horrible language, but it is a programming language whose primitives are static types.

If I were to point to the one great weakness of C++, it is that you need a truly twisted genius to metaprogram in C++ templates.
Andrei Alexandrescu has this twisted genius in spades, but the rest of us don't.

It doesn't need to be so.

The types themselves can just be (yet another) object in a dynamic OO programming language, and you can program with them the same way as you program with any other object.

Thus instead of creating yet another Bondage & Discipline language like C++, you can merely create a thin extension to a very dynamic language like ruby / scheme / ...

All static type checking is, is the evaluation of constant expressions prior to running the program. (A good hint of this is the existence of RTTI in C++. RTTI and dynamic_cast are what you do when you have a type expression that is not constant.)

Static type checking is just constant folding of type related assertions.

Consider this fragment of Ruby...

 def myfunc(an_int)
   raise "Static type check failure!" if an_int.class != Fixnum

   an_int.split(",") # Fixnums do not have such a method, and we have
                     # just asserted an_int IS A Fixnum.
 end

 myfunc("a string")

There is no reason why that fragment, since it involves constants, cannot be evaluated before the program runs.

So I'm suggesting instead of creating a super strict static checking language, merely create the thinnest extension of a language like ruby or scheme or ....

Static type checks merely become assertions of constant boolean expressions involving instances of "type" objects. Where the "type" objects are merely vanilla objects within the base programming languages that have the appropriate attributes and methods to model the types used.

Once the instance of the type object has been created, it can be marked as "frozen" or "constant".

On loading a module, do the obvious optimization... ie. "constant folding", evaluate all expressions involving objects that are "constant" or frozen at the time of loading.

Static type check failures merely appear as assertion failures at module load time.

Even the speed of a static language can be recovered in a dynamic one, in that some dynamic languages permit you to explicitly state, if you know beforehand, which instance/implementation of a virtual function you will invoke. Thus at the constant folding stage, one of the expressions you can constant fold is the lookup in the virtual method table.

Well, since you mention it...

...I do feel that since I can get the advantages of static typing (type safety and speed) via a thin extension to a dynamic language, but not the other way round, it does suggest a certain, umm, ordering in my language preferences.

:-)

Thanks for Clarifying

This is very helpful, and while I am a fan of soft typing systems for dynamically-checked languages, the major point remains that either it's being done after the software has been deployed, or it becomes static typing that doesn't check everything. I mean that last literally: there is code in your program that isn't checked, vs. a statically-typed language in which everything is checked, and that checking prevents you from saying some perfectly valid things that we don't yet have a type system to express. I suppose a soft typing system that told you what it gave up on would be a kind of happy medium. And in the limit, what we end up with, of course, is the inaptly named Dylan, which sits precisely at this intersection of static typing and dynamic checking.

Furthermore, multi-staged programming blurs the distinction between compile-time and runtime in such a way as to preserve the properties you want at each stage, and introduces runtime code generation as an added bonus. Note that this is distinct from, but subsumes, meta-programming, of which C++'s template system is an accidental, and not exemplary, instance. So this is a field worth continuing to watch, but unfortunately, there aren't many multi-stage languages as of yet; they're the most "out there" of the research languages.

So I remain convinced that the future will get us out of the static/dynamic conundrum, and in the meantime I also remain baffled as to just what kinds of things people find themselves able to express in their favorite dynamically-checked language that they find themselves unable to express in their closest-to-favorite statically-typed language. But I suspect, given my experience, that I'll never hear a satisfactory answer to that—that is, given a set of functional requirements, someone could implement those requirements in a dynamically-checked language and I could implement them in O'Caml, and each of us would find the other's solution abhorrent for one reason or another. :-)

I've found myself faking

I've found myself faking first-class modules in Haskell a few times now, and I'm pretty sure that sooner or later I'll find a use case based on them and functors that requires impredicative polymorphism. I've not found an alternative that doesn't sacrifice some typechecking ability - AIUI switching to OCaml would likely do so as well, and I'm rather fond of monads for restricting the range of side-effects.

Right tool for the job and all that

I mean that last literally: there is code in your program that isn't checked, vs. a statically-typed language in which everything is checked, and that checking prevents you from saying some perfectly valid things that we don't yet have a type system to express.

As someone else pointed out, you can always go with something like SPARK Ada, which has another layer of checking above static types which can cover exactly the sort of cases outlined in the original post on this topic. Every argument for static types and their improved checking and correctness can be converted into a perfectly analogous argument for formal specification over static typing, due to improved checking and catching errors that static typing would simply miss.

Personally I would like to see people recognise that the whole thing, from completely dynamic types with no checking at all (even runtime checking) up to full formal specification and theorem proving for checking is simply a continuous spectrum of choices of exactly how much you wish to specify about your system, trading flexibility for assurance. There is no "correct" spot on the spectrum, only a choice as to which point in that trade off provides the sweet spot for the project you are working on. Some projects require a high degree of flexibility and don't require excessive assurance. Some projects require strong assurance, and development flexibility really isn't that important in comparison. Use the tool that suits the job. There are going to be a lot of different tools to fill the different niches. Being familiar with all the possibilities is probably a good idea.

I Think This is Understood...

...and the question basically comes down to the value of the phase distinction. What would you like to be able to say, iron-clad, about your code before it gets deployed? Conversely, what would you like to be able to say, iron-clad, about your code after it gets deployed? There's a big world in which you don't get to change your code after deployment except via replacement. There's another big world in which you can incrementally apply patches as long as the code isn't running, and another big one in which you can't stop running the code. One reason I'm fascinated by systems such as Acute is that they attempt to address safe upgrade without sacrificing either abstraction or type safety. But in the meantime, we continue to be haunted by the phase distinction and the limitations on the current state of the art in allowing us to express what we'd like to express.

In the extreme, static

In the extreme, static typing becomes the verification of a full formal specification.

No. Different problem, so try a different extension: duck typing

Static typing is merely constant folding. It isn't anywhere near so ambitious as program proving.

What would be a more interesting goal for enhancements to static typing is to take duck typing to its logical conclusion.

i.e. accumulate every operation that a function parameter participates in as a list of methods to which it responds.

Then check (statically) every call of that function and verify that the actual arguments all respond to that list of methods.

If some of the actual arguments are also parameters, just feed the list transitively up the call graph.

And lo, you have a statically checked duck typing system.
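
For what it's worth, Haskell's inferred class constraints already behave much like this: the compiler accumulates the operations used on each parameter into the function's type and checks every call site against them, propagating transitively up the call graph. A small sketch:

  -- 'describe' uses only comparison and show on its arguments, so the
  -- inferred signature records exactly those obligations:
  --   describe :: (Ord a, Show a) => a -> a -> String
  describe x y
    | x < y     = show x ++ " < "  ++ show y
    | otherwise = show x ++ " >= " ++ show y

  main :: IO ()
  main = do
    putStrLn (describe (1 :: Int) 2)   -- OK: Int supports both operations
    -- putStrLn (describe id id)       -- rejected at compile time:
    --                                 -- functions support neither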

What is static typing?

Static typing is merely constant folding. It isn't anywhere near so ambitious as program proving.

This is very incorrect. Static typing involves proving theorems about programs. These theorems may not be that exciting to you, but they are theorems. And there are type systems that prove very interesting theorems (race-freedom, or deadlock-freedom, or ...). I don't believe that all static analysis of programs is really type systems, but type systems are a static analysis of a program that proves certain things about that program.

Incidentally, the same is true for constant folding.

To go a step further,

To go a step further, there are some well-understood Turing-complete type systems out there - so the real question is "can a type system get enough input to prove what I'm interested in?"

Structural typing

Duck typing is a form of structural typing (as opposed to nominal typing). Statically-typed languages with structural subtyping have been around for a long time.

Static typing is closed form abstract interpretation

Static typing is merely constant folding.

Static typing (ST) is far more than constant folding (except in trivial type systems).

ST consists of verifying your algorithms against an "embedded logic". ST is a simplified form of abstract interpretation (AI); ST is thus simpler, faster, but less precise than AI in every way (that I can see). Unfortunately, AI seems much more difficult and time-consuming (compilation-wise - correct me if I'm wrong!).

Can you have abstract interpretation to verify algorithms in dynamic languages like Ruby, though? Perhaps.

Others have commented that type systems can prove deadlock-free code, and other interesting properties, so I won't go into that. :-)

"Everything" is checked except the bits you're ignoring

or it becomes static typing that doesn't check everything. I mean that last literally: there is code in your program that isn't checked, vs. a statically-typed language in which everything is checked

It's not as simple as that. When you encode latently-typed data into a statically-checked language, not "everything" is statically checked. You're simply wrapping some things in such a way as to escape the typechecker. The typechecker is satisfied by checking the type of a wrapper, but the contents of the wrapper remain latently typed.

In such programs, there's usually information about the type of the contents of a wrapper which can either only be known at runtime, or which is known to the programmer but not easily expressed to the typesystem (e.g. knowledge about constraints on the input data).

The value of static checking tends to be pretty low in these cases — the only reason to use it is because you care about static checking in the rest of the program, the bits that aren't dealing with latently-typed data.
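
A concrete rendering of the point, as a hedged Haskell sketch: a universal wrapper type satisfies the static checker, while the real discrimination happens at runtime.

  -- A universal type: the typechecker sees only Univ, and the
  -- interesting checks move to runtime pattern matches.
  data Univ = UInt Int
            | UStr String
            | UList [Univ]
    deriving (Show)

  -- "Well-typed" as far as the checker is concerned, but whether the
  -- contents make sense is only discovered dynamically.
  asInt :: Univ -> Maybe Int
  asInt (UInt n) = Just n
  asInt _        = Nothing

  example :: Maybe Int
  example = asInt (UStr "not an int")   -- Nothing, discovered at runtime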

Examples?

While this is a good point—I'm reminded immediately of serializing statically-typed data over a network connection, for example—I wonder to what extent, if any, this isn't addressed by any language that has a good ADT-building system. If I'm using a member of the ML family (loosely defined to include Haskell) and I define a module that exposes an abstract type that can only be manipulated by the functions exposed by the module, in what meaningful sense is the data "latently typed?" Granted, the module's implementation might be very arcane, making use of Obj.magic in O'Caml or unsafe operations in Haskell, but once the module exists, it really does define a type in the type system, and that type really can be inferred, reasoned about, etc. using the concepts that are enforced by the module system.

I suspect strongly that this isn't appreciably different from what you meant, and without specific examples it will prove difficult to say much more. I would only like to point out that in O'Caml, for example, thanks to modules and polymorphic variants, it has proven relatively straightforward to provide a type-safe interface to OpenGL, a C API that is somewhat notorious for essentially embedding a "latently typed" API in such a way that C's type system is too weak to be of any use in enforcing the type distinctions. The question, to me, seems to revolve primarily around what level of granularity you think a module should be defined at as to whether values of the exposed types qualify as "latent" or "static."

constant folding for typechecking

On loading a module, do the obvious optimization... ie. "constant folding", evaluate all expressions involving objects that are "constant" or frozen at the time of loading.

Static type check failures merely appear as assertion failures at module load time.

Can you point to places where this is being done?

I.e. where such assertions are being caught at module-load time.
Not that it looks like a difficult optimization -- I just haven't seen it used thus (other than proofs-of-concept).

Typing is delicate

I have a feeling that things may not work out so cleanly when you really get down to implementing this. Typing is delicate. It's hard to pack everything you want into a type system while still ensuring your checker doesn't wander off into halting problem land. Allowing free-form assertions on types is probably not an option if you want to guarantee that you can check them statically.

Typing constructs are just part of a DSL that's integrated into a statically typed language. It's optimized to match the problem domain, making it concise and efficient to process. Though I've seen some efforts to share constructs between the two, I don't think you can take an existing language like Ruby and make it work.

I'd much rather write:

def myfunc(an_int : Fixnum)
   ...
end

What are the advantages of using the language's "normal" syntax to make assertions on types? I suppose the language's grammar may become a little simpler without the type-related syntax. You even might be able to share code between the type checker and the optimizer. Anything else?

BTW, I'm curious... did you mean "constant folding" in a literal sense? I think I understand how it'll handle the simple cases, but what about trickier features like parametric polymorphism, and subtyping?

A good test suite...

...the static type system probably isn't going to be able to prove anything dramatically more interesting than what a good test suite in a dynamically-checked language is.

I think you can take the "probably" out of that statement: a sufficiently good test suite is going to be able to prove essentially anything provable, while any practical static type system is going to be strictly less powerful. (Are there any type system implementations incorporating a theorem prover and allowing any predicate to be a type description? That might be fun to play with, but I can't actually see myself using it, much less anyone else.)

The primary power of a static type system is in what it allows you to not do (ahem): The type system means you don't have to come up with quite so many tests. I have even more doubts about the existence of a "sufficiently good test suite" than about the theorem-proving type system.

The primary purpose of a type system?

I can't speak for anyone else and I may be stating the obvious but...

Isn't the primary purpose of a type system that it allows us to reason about types? And whether you favour static typing or dynamic typing, types are a fairly important component of programming.

Because it's fresh in my mind, let's take a simple example from SICP 2.1.1, where the authors define a set of functions to operate over Rational types. In my translation to Alice, I did two translations - one that's a fairly literal translation of the Scheme code, and one that takes advantage of ML modules. Specifically, the ML signature for the Rational type gives us:

      signature RATIONAL =
         sig
            type rational
            val make_rat  : int * int -> rational
            val numer     : rational -> int
            val denom     : rational -> int
            val add_rat   : rational * rational -> rational
            val sub_rat   : rational * rational -> rational
            val mul_rat   : rational * rational -> rational
            val div_rat   : rational * rational -> rational
            val equal_rat : rational * rational -> bool
            val print_rat : rational -> unit
         end

Now what does this signature buy me? Well, the Scheme code and the ML code basically give the same functionality, so it couldn't be a matter of computability. The ML code also imposes some hard limits here over the definition and interface to these functions, so I guess one could say the static Rational type in ML is less flexible than the implied types of Scheme. And a parallel universe of Scheme unit tests could impose similar type contracts, so I don't think it's necessarily an advantage of testing.

So why do I still think that the above signature of ML provides me value in my programming endeavors? Specifically because it allows me to reason about the type of Rationals without having to look at the implementation of these functions, the unit test contracts being imposed, or any other comments. I can reason about the type. And reasoning about types is a very valuable abstraction.

Of course, the argument comes back to whether this ability to reason about types outweighs the limitations being imposed that are necessary for this level of reasoning to be meaningful (i.e. if the types are not enforced, then they are no more valuable than comments - being subject to becoming antiquated).

Anyhow, I do think it important to bear in mind that types are an important force in PLs. But let's not get carried away and say that the only purpose of types is to reduce the number of runtime tests.

But aren't you confusing

But aren't you confusing static typing with abstract datatypes now?

Can they really be divested from one another?

Type abstraction is the primary purpose of typing systems. You have types in the variables, types in the expressions, types in the functions, types in the modules, and types of the ADTs. We call them static type systems only because of one behavior associated with these type systems - the ability to check type consistency at compile time.

If you want to restrict this at a lower level, just grab one of the functions within the signature and run with it. We can objectively say that the make_rat function takes two integers and returns a rational. We can say this without resorting to the implementation. How is this contract guaranteed? Static type checking of course. But was static type checking the goal of this exercise? Or is static type checking simply the glue that makes reasoning about the types possible?

To me, the only connection

To me, the only connection between ADTs and ST is that the former makes the latter easier. Making an ST language without ADTs would be difficult, and involve a lot of structural typing. And in some ways it would be a wasted effort, as ADTs are quite a good thing. And this is because ADTs make it easier to reason about your types. Easy enough that it could be done, at least in part, by a machine.

So I think the causality runs the other way: ST demands a type system that is easy to reason about. It doesn't give you one.

Not your grandfather's Scheme

So why do I still think that the above signature of ML provides me value in my programming endeavors? Specifically because it allows me to reason about the type of Rationals without having to look at the implementation of these functions, the unit test contracts being imposed, or any other comments. I can reason about the type. And reasoning about types is a very valuable abstraction.

Of course, the argument comes back to whether this ability to reason about types outweighs the limitations being imposed that are necessary for this level of reasoning to be meaningful (i.e. if the types are not enforced, then they are no more valuable than comments - being subject to becoming antiquated).

I agree that the important thing is the ability to reason about types. But note that this can be achieved without requiring static checks, or the associated "limitations being imposed that are necessary for this level of reasoning to be meaningful".

Something like PLT's provide/contract gives this ability to declare, reason about, and enforce types (as well as other things which most statically-checked systems wouldn't recognize as types). It offers the same sort of capability as an ML signature. However, it doesn't require that typechecking be static. Your example would look something like the following. I've shown it in the context of a module, so it has some extra stuff - the provide/contract expression is the part that's equivalent to the ML signature definition.

(module rational mzscheme
  (require (lib "contract.ss"))

  ; Note: 'rational?' already exists in Scheme, chose not to override
  (define (rat? x)
    ; implement appropriate predicate (likely just use a struct predicate)
  )

  ; name the contract - for convenience, readability & friendly error msgs
  (define rational (flat-named-contract 'rational rat?))

  (provide/contract
    (rat?       (any/c . -> . boolean?))
    (make-rat   (integer? integer?  . -> . rational))
    (numer      (rational . -> . integer?))
    (denom      (rational . -> . integer?))
    (add-rat    (rational rational . -> . rational))
    (sub-rat    (rational rational . -> . rational))
    (mul-rat    (rational rational . -> . rational))
    (div-rat    (rational rational . -> . rational))
    (equal-rat? (rational rational . -> . boolean?))
    (print-rat  (rational . -> . any/c)))

  ; implementation goes here
)

Test it with an erroneous call:

(numer 10)
=> repl-8:1:1: top-level broke the contract (-> rational integer?) it had with rational on numer; expected <rational>, given: 10

Systems that use such contracts share many characteristics with statically-checked systems. Working with such systems demonstrates quite clearly that many of the benefits people tend to associate with static typechecking don't actually have that much to do with the static checking aspect. Plus, of course, these contracts are capable of expressing and enforcing constraints which most typesystems can only dream about.

A nice intro to these contracts is linked from here.

Quite interesting work

I didn't really intend to compare static and dynamic type systems. Just interjecting that sometimes we lose sight of what the type system within a language really aims to achieve. In the case of ML, I view the type system as a sort of DSL that rides on top of and intertwines with the code. In the case of Scheme contracts, it is similar in purpose but uses more of a proxy pattern to define and enforce the types.

Personally, I still consider it easier to work with and reason about types in ML, as the syntax is highly dedicated to types. But then Lisp/Scheme has always proven amenable to any programming style you can possibly throw at it. As with all PL issues, the question back is: having availed itself of libraries that can be used for typeful programming, will the community actually adopt those techniques? Or will they just be used as an argument in the never-ending static/dynamic debate? Whichever side wins the longstanding argument, I'd hope the result is still that types, contracts and separation of concerns are central to any PL and/or its libraries.

Reason

I agree that the important thing is the ability to reason about types.

I think the important thing is the ability to reason about programs, not types. The runtime contract check above does not prove that no module (even the ones not yet written) can violate the contract. If you take that benefit away from static checks, what else is left anyway?

It offers the same sort of capability as an ML signature. However, it doesn't require that typechecking be static.

This almost reads like static checks are something that a type system imposes on people against their will. Let's assume there was a magical analyzer that checked contracts statically in scheme programs. Would you elect to not use it?

What else is left

I think the important thing is the ability to reason about programs, not types.

Yes. To rephrase what I wrote, the important thing about types is the ability they provide to reason about programs. To do that, though, one reasons about types in programs.

As I wrote in Why type systems are interesting, all programmers rely on the ability to reason about static properties of programs, even in dynamically-checked languages. Types provide a framework with which to reason about (some of) those properties. That reasoning can be done by automated systems, as well as by humans. Human reasoning about types is more important than static automated reasoning in the sense that it always has to occur, regardless of whether static automated reasoning is occurring.

The runtime contract check above does not prove that no module (even the ones not yet written) can violate the contract.

That contract expression is agnostic with respect to proof. A static checker could certainly use it to assist with a proof.

If you take that benefit away from static checks, what else is left anyway?

I'm not taking any benefit away from static checks, I'm saying that when you take automated static checks away, the types don't necessarily disappear. The benefits of automated static checks should not be conflated with the benefits of types.

Dynamically-checked languages demonstrate what you get if you remove static checks entirely. Programmers still have to reason about types in those languages. (More pedantically, they still have to reason about type-like static properties, which are close enough to types that the distinction is little more than a quibble in this discussion.) Tools that help deal with types are still useful in that context, whether or not they operate statically. Test suites combined with contracts, or even with less sophisticated assertions, work very well in practice — I wrote about this in The unreasonable effectiveness of testing.

Automated static checking is not essential to get benefits from dealing with and reasoning about types. Such checks can be useful, but they have costs, and doing without such checks doesn't mean doing without types, or many of the benefits of types. It's worth considering type systems as useful constructs independent of the automated static checking of programs that use those systems.

This almost reads like static checks are something that a type system imposes on people against their will.

Type systems don't necessarily impose automated static checks, but statically-checked languages do. In most such languages, you don't get a choice not to statically check a piece of code. So in those languages, the static checks and their ramifications can, indeed, be against one's will.

Let's assume there was a magical analyzer that checked contracts statically in scheme programs. Would you elect to not use it?

I would certainly use it, if it worked well enough, and if I didn't have to globally transform the rest of my program to support the analyzer, in the same sort of way that statically-checked languages require one to do. But the reason we're having this discussion is that such an analyzer doesn't exist. Instead, we have a choice between writing programs in a way which suits the requirements of automated static type checkers, or living without such checkers and dealing with and checking types in other ways.

Contracts

The contracts are checked at runtime, yes? If you wrote a block of code that called (numer 10) and did not write a test that exercised that block of code, the first time you would discover that your block of code violated the contract would be at 2:00 a.m., when you get the phone call telling you that the system had fallen over, right?

If I am right, then that plus your comment, "Plus, of course, these contracts are capable of expressing and enforcing constraints which most typesystems can only dream about," are essentially equivalent to my earlier comments about the relationship between static type systems, dynamic type systems, and testing. Correct?

Contracts are a good basis for a test setup, and are, overall, a really good thing if your system will do something horrible if the conditions embodied by the contract are violated or when you need to figure out what went wrong. In fact, given the strictly weaker nature of a static type checking system, contracts are a win in any language. But I do not see how they can change the fundamental relationship.

The contracts are checked

The contracts are checked at runtime, yes?

That depends on whether you have a "sufficiently smart soft typechecker". My point was mainly that the lack of automated static checking doesn't prevent you from expressing and reasoning about types in a way very similar to the way they would be expressed and reasoned about in a language like ML. The checking aspect is, to a large extent, orthogonal, although it's often not seen that way.

If you wrote a block of code that called (numer 10) and did not write a test that exercised that block of code, the first time you would discover that your block of code violated the contract would be at 2:00 a.m., when you get the phone call telling you that the system had fallen over, right?

Only if you're inclined to install code which could result in 2am phone calls without testing it well. In my experience, in practice, this doesn't happen. If 2am phone calls are going to happen, they could just as well be the result of a bug that's not statically preventable. Testing thorough enough to catch such dynamic bugs will also catch the bugs that would have been caught statically, especially if you're using assertions or contracts so that type bugs aren't likely to go unnoticed when they do occur.

In general, though, I don't think I'm disagreeing with your overall perspective. However, the "fundamental relationship" you mention is perceived by some people in a way which connects types, and the benefits of types, more strongly to automated static typechecking of programs than is actually warranted. There's also a tendency to underrate the effectiveness of testing, in part because of a lack of recognition of the extent to which testing benefits from the same properties of programs which allow type inference to occur.

Contracts as typechecking

That depends on whether you have a "sufficiently smart soft typechecker". My point was mainly that the lack of automated static checking doesn't prevent you from expressing and reasoning about types in a way very similar to the way they would be expressed and reasoned about in a language like ML. The checking aspect is, to a large extent, orthogonal, although it's often not seen that way.

I agree that it's orthogonal, presuming you check every contract every time, but how can this be done efficiently?

That is, given a super-smart soft-typechecker, and contracts that express more than it can statically deduce (as mentioned several times already), won't the contracts need to be checked every time a new value of the given type is computed?

Or put another way, won't using the "smart" soft typechecking load even more complexity on the programmer, since he/she must carefully consider which checks will be removed at compile time, and which runtime checks will be done too often to be practical?



On another note, what are the smartest soft-typecheckers out there? How do their compile-time analyses compare to e.g. Haskell or ML?

Or put another way, won't

Or put another way, won't using the "smart" soft typechecking load even more complexity on the programmer, since he/she must carefully consider which checks will be removed at compile time, and which runtime checks will be done too often to be practical?

The same thing could be said of high-level functional languages with optimising compilers: that they load more complexity on the programmer, since he/she must carefully consider what machine code the program will end up as, and that this is a lot easier in a more low-level language.

The solution is to not worry too much and just trust the compiler.

It's called abstraction.

Testing is better than a

Testing is better than a static type checker because testing will give you more info on WHY it went wrong, instead of just telling you that it does not pass the check and giving you a line number.

Reasoning about types

Given the discipline not to violate the interface ("Doctor, Doctor, it hurts when I hit myself on the head with a hammer...") or a sufficient test setup that will throw a hissy if you do violate the interface, what prevents you from reasoning about the type of Rationals without static type checking?

Now, both discipline and "sufficient test setups" are in rather short supply, and a good static type system should both encourage reasoning about your types and make both the types and the reasoning as pain-free as possible. But I don't see how static type checks are going to allow you to do such reasoning, or how the lack of such checks, by itself, would interfere.

I absolutely agree that reasoning about types is a very valuable abstraction, and that a good static type system makes such reasoning the easiest approach to take. Far too often in my experience, programs in languages without static type checks wind up dealing with representations rather than types. But that is a weakness of the programmer, not the language.

Good Thing, Too!

mcguire: I have even more doubts about the existence of a "sufficiently good test suite" than about the theorem-proving type system.

Me too, because there are several existence proofs of the validity of the theorem-proving type system idea: you can find a brief list of them in my post here.

Um, no

mcguire: a sufficiently good test suite is going to be able to prove essentially anything provable, while any practical static type system is going to be strictly less powerful.

May I kindly remind you of the folklore fact that testing can only prove presence of errors (i.e. disprove certain properties), while typing can prove their absence (i.e. actually prove properties)?

Moreover, there are quite interesting properties that are inherently unapproachable by testing. For example, race and deadlock freedom were already mentioned.

The nature of "sufficiently good"

May I kindly remind you of the folklore fact that testing can only prove presence of errors (i.e. disprove certain properties), while typing can prove their absence (i.e. actually prove properties)?

Hence my qualifier, "sufficiently good." :-)

If memory serves, the statement is actually about full-on, formal program verification, not typing. Formal verification is strictly stronger than any practical type system that I am familiar with.

...properties that are inherently unapproachable by testing. For example, race and deadlock freedom...

Ok, I am not terribly familiar with type systems for a general purpose language that can guarantee freedom from race and deadlocks (although I understand there are languages where race and deadlock situations cannot be expressed---such a language would probably change the relationship between static checking, dynamic checking, and tests). But, I do not see why such a type system could not be replaced by an appropriate test. That test may require a preprocessor that walks over the code using exactly the same algorithm as the type system, but that is just a matter of programming.

Where Types and Semantics Collide

mcguire: Ok, I am not terribly familiar with type systems for a general purpose language that can guarantee freedom from race and deadlocks (although I understand there are languages where race and deadlock situations cannot be expressed---such a language would probably change the relationship between static checking, dynamic checking, and tests). But, I do not see why such a type system could not be replaced by an appropriate test. That test may require a preprocessor that walks over the code using exactly the same algorithm as the type system, but that is just a matter of programming.

You've hit on a crucial point: there comes a point at which you cannot separate the type system from the language's semantics. The given example is a great one: the type system and language in question (presumably TyPiCal) go together, because you can't expose, e.g. the POSIX thread model of concurrency and enforce the rules of the Pi Calculus using them. As Benjamin Pierce points out in TAPL, generally speaking, language design and type system design actually do work hand-in-hand. It's somewhere between very difficult and impossible to retrofit a type system onto a language that was designed without one in mind. I think the line into "impossible" gets crossed when the semantics being enforced by the type system preclude the implementation of alternative models in the language. So it isn't true that you could accomplish the same thing with tests given a "general-purpose" (presumably meaning one with the usual shared-state concurrency semantics) language.

Typing is formal verification

If memory serves, the statement is actually about full-on, formal program verification, not typing. Formal verification is strictly stronger than any practical type system that I am familiar with.

If your type system is sound, then it is a formal verification tool. Unfortunately, most mainstream languages do not have sound type systems. But the ones usually preferred here (e.g. ML) have.

As an aside, this is precisely the reason why I think that soft typing and friends are not worthwhile - they are unsound by definition.

But, I do not see why such a type system could not be replaced by an appropriate test. That test may require a preprocessor that walks over the code using exactly the same algorithm as the type system, but that is just a matter of programming.

Then your test is (a most likely ad-hoc, informally-specified, bug-ridden, slow implementation of) a type system. ;-)

Probabilistic Type Systems?

Speaking of unsound type systems, does anyone know of work done on probabilistic type systems? One in which, instead of relying on exact proofs, you use probabilistic proofs? Let's say you start with a Turing-complete type system. You take a proposition, give it your best shot at proving it in a conventional manner, and if that proof seems to be taking too long (i.e. it might be undecidable), then you switch gears and start trying to find a random counterexample to disprove the conjecture. If after 1 billion tries (or whatever) you can't find a counterexample, you call it good enough and declare it well-typed. Is there anything similar to this out there?
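
The closest existing relative I know of is random property testing in the style of Haskell's QuickCheck, which implements the "search for a random counterexample" half as a library rather than inside the typechecker. A sketch:

  import Test.QuickCheck

  -- The conjecture stated as an executable property. (QuickCheck's
  -- default Int generator stays small, so machine-word overflow does
  -- not come into play here.)
  prop_squareNonNegative :: Int -> Bool
  prop_squareNonNegative x = x * x >= 0

  main :: IO ()
  main = quickCheck prop_squareNonNegative
  -- prints: +++ OK, passed 100 tests.

Passing 100 (or a billion) random tests is evidence rather than proof, which is precisely the "good enough" standard proposed above.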

Call to arms

Tim discusses four static language features which he calculates would eradicate approximately 50% of the bugs in Gears of War—that is, the codebase wouldn't compile if those bugs were present and the compiler would tell you where the bugs were.

Attention all dynamic programmers: please start being more serious about arithmetic overflow and access to uninitialized memory!

Eh, huh?

If the compiler doesn’t beep, my program should work
[...]
- Accessing arrays out-of-bounds
- Dereferencing null pointers
- Integer overflow
- Accessing uninitialized variables

I don't think he's complaining about overflow or uninitialized variables in themselves, only that they can't be discovered until runtime. (Note to the pedantic: yes, testing is runtime. :-) )

You're an idiot it is weak

You're an idiot. It is weak ASM/C type systems that you're talking about, not Lisp/scripting/Smalltalk dynamic type systems.

[Community] Stop

I'm quite confident Luke is not an idiot, and that there is no need for the aggressive nature of your opinionated posts. If you want to add something of actual value, then please do.

Prevention vs. detection

The point I'd really meant to make is that there's more than one way to skin a cat. The classes of programming error under discussion can be attacked either with static analysis or with higher-level languages that make them disappear in a puff of smoke. There's no concept of an uninitialized variable in Erlang, for example.

Occam's razor makes me much prefer removing problems to retaining them and adding extra machinery to detect them. Why wait until compile-time to catch errors that you could have prevented at brain-time, anyway?

Following Church's Church

Saying that types detect and reject erroneous programs follows the Curry view, where you define the semantics of an untyped language and restrict the set of allowable programs with a type system.

In the Church view, semantics comes after typing, so there is no such thing as an ill-typed program, just terms that are invalid because they don't type, like most languages have text that is not a valid program because it doesn't parse.

From that point of view (for a programmer that thinks of types as a part of code), a type system is also a way of making certain errors nonsensical. You can write down ill-typed things that look like they might have certain problems, but you could also look at "Erlang" with statements like "X" or "X := X + 1" and think they might create an uninitialized variable, or use mutation. If you really think of types as part of the language, type errors are equally nonsensical and inconceivable. That's exaggerating how I actually think when coding, but it isn't completely untrue.

Values as types

Hi,
I'm interested in languages which do compile time restrictions on the values of variables. Specifically, does anything like the following exist:
x : int [0, 2, ... 10]
x = 50 //error
x = 3 // error

Ada is a programming language in which one can declare user-defined ordinal types.

Many popular programming languages of today do not treat values as types, although they implicitly use values as types. In my opinion, this is a major mistake that lowers performance of the final code, as well as the performance of developers.

Take the for loop, for example:

for i = 0 to 10
next i

the type of i is [0..10] in the first iteration, [1..10] in the 2nd iteration, [2..10] in the next iteration etc.

Another example is that every pointer operation (dereference, access from an offset, etc.) requires a pointer value of type [1..max_int], yet almost no programming language has a restriction that a pointer must not be null in order to use it.

Yet another example is that of array indices: most programming languages use integer types for array indices, but the real type of an array index is [0...array_size - 1].

All this means that it is values that are types and not categories of values only. Categories of values are logical union types. For example, the set of integers is the logical union of all integer values.

Lots of type information that exists in programs is never used by compilers, thus missing lots of chances for static optimization of code.
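
For what it's worth, the "real type of an array index" can be expressed; here is a hedged sketch in GHC Haskell, using extensions well beyond what mainstream compilers exploit:

  {-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

  -- Type-level naturals: a vector's length becomes part of its type.
  data Nat = Z | S Nat

  data Vec (n :: Nat) a where
    VNil  :: Vec 'Z a
    VCons :: a -> Vec n a -> Vec ('S n) a

  -- Fin n is inhabited by exactly the values 0 .. n-1: the "real type
  -- of an array index" from the post above.
  data Fin (n :: Nat) where
    FZ :: Fin ('S n)
    FS :: Fin n -> Fin ('S n)

  -- Total lookup: an out-of-bounds index is a compile-time type error,
  -- so no runtime bounds check is needed.
  index :: Vec n a -> Fin n -> a
  index (VCons x _)  FZ     = x
  index (VCons _ xs) (FS i) = index xs i

The compiler statically eliminates exactly the bounds checks the post complains about, at the cost of more elaborate types.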

All static type systems put

All static type systems put "compile time restrictions on the values of variables". The real question is, "What limits can you accept on those restrictions?"

Ahh, an important question.

For example, into a Ruby collection object (e.g. a Hash and/or Array and/or Set) I put all kinds of things.

Interestingly enough, 90% of the time in any one instance of a collection I put one and only one class of object into it, and the other 9.99% of the time I put in objects which are all genuine Liskov-certified subclasses of an ancestor class, and on extracting them from the collection I only ever treat them as the ancestor.

The one thing I expected to do when I first learnt OOP was to have "pencil box" collections and be able to say, "With each object in this pencil box, write."

The pencil would then scribble in pencil, the pen would write in ink, the ruler would do nothing, the rubber would rub out,...

I guess part of the reason why I don't have more "pencil box" collections in my programs is I grew up writing in statically typed languages like Pascal.

The other part is that most OOP languages would unhelpfully throw a NoMethodError on telling a ruler to write, instead of just doing nothing...

So again I question the direction: should we add generics to our collections so we can type check them more harshly, or should we relax even more and do nothing on invoking an absent method, using the null object pattern to do nothing in response to any method on nil?

Both can co-exist.

So again I question the direction: should we add generics to our collections so we can type check them more harshly, or should we relax even more and do nothing on invoking an absent method, using the null object pattern to do nothing in response to any method on nil?

I do not see a reason why both solutions can not co-exist in the same programming language. If you want an array of Pens, then the language should allow you to instantiate an array of pens. If you want an array of anything, then the language should allow you to instantiate an array of anything.

The division between static and dynamic languages is an artificial one. Normally, it should not exist: a good programming language should offer both. Why it is not so, I think others more knowledgeable can explain.

Merd!

The Merd programming language has something like that:

http://merd.sourceforge.net/types.html

Automated theorem prover + Compiler

When you start to place constraints on types you are essentially proposing a theorem... (like: all x^2 are non-negative) which would be a function like:

type PositiveInt = x :: Int where x >= 0

f :: Int -> PositiveInt
f x = x * x

So what is needed is to combine an automated theorem prover with a compiler... Then you can statically guarantee really useful things. If the theorem prover cannot prove the theorem the program would not be compiled.
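
In today's Haskell, lacking such a prover, the closest approximation routes the result through a runtime-checked constructor; the names below are invented for illustration:

  newtype PositiveInt = PositiveInt Int deriving (Show)

  -- What a prover-equipped compiler could verify statically, we must
  -- check dynamically at the boundary of the constrained type.
  mkPositive :: Int -> Maybe PositiveInt
  mkPositive n
    | n >= 0    = Just (PositiveInt n)
    | otherwise = Nothing

  f :: Int -> Maybe PositiveInt
  f x = mkPositive (x * x)   -- the check a prover could discharge,
                             -- since x * x >= 0 for every x
                             -- (fixed-width overflow aside)

With the prover, f could return PositiveInt directly, the obligation having been discharged at compile time.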

I am working on a language based on these ideas, originally to be called "Russel", but unfortunately there is already a language (although obscure) called Russel (suggestions for a new name welcome).

I don't see how a dynamic language could offer this kind of sophisticated static type checking... You could of course implement a theorem prover, but nothing would tie the theorem to the code to be executed. You could, for example, in a dynamic language do:

f = "some function"
if prove f then exec f else fail

But there is nothing to ensure you remember to call prove on the function... and besides, you would still need to express the theorem to be proved... and as the theorem _is_ the type signature, this is in fact making things more complex, not simpler.

A possible name

Had you considered Skolem? Thoralf Skolem was a mathematician who did a lot of interesting work in mathematical logic, and some of the foundational work for model theory, among many other things. I did a quick search and it doesn't seem to be used by any other programming language, and it is both distinctive and pronounceable: important features in a language name.

Semantic subtyping

dbaelde@igloo ~ $ ledit cduce
CDuce version 0.4.0
# let i = ref (0|2|4--10) 0 ;;
val i : { get=[ ] -> (0 | 2 | 4--10) set=(0 | 2 | 4--10) -> [ ] } = { get= set= }
# i := 2 ;;
- : [ ] = ""
# i := 3 ;;
Characters 5-6:
This expression should have type:
0 | 2 | 4--10
but its inferred type is:
3
which is not a subtype, as shown by the sample:
3

I'm no CDuce expert and I haven't used it for a long time nor for serious projects, so I won't say more.