Home
Feedback
FAQ
Getting Started
Discussions
Site operation discussions
Recent Posts
(new topic)
Departments
Courses
Research Papers
Design Docs
Quotations
Genealogical Diagrams
Archives
Follow these links.
From Grady Boochs Handbook project page:
The primary goal of the Handbook of Software Architecture is to fill this void in software engineering by codifying the architecture of a large collection of interesting software-intensive systems, presenting them in a manner that exposes their essential patterns and that permits comparisons across domains and architectural styles. Reflecting on his work on patterns, Christopher Alexander notes that he and his colleagues "made observations, looked to see what worked, studied it, tried to distill out the essentials, and wrote them down12." This approach is at the core of all good science.
I'm wondering if there is such a void? It seems to me that Booch tries to contaminate empirical research on software architecture ( striving through bigness, evolution and legacy ) with the more platonistic approach of the pattern community seeking for the pre-images of the good. Building nice, good, beautiful, terrific... software may be one goal, examining what works besides all this classicistic aestheticism and peideia is another issue.
Kay
Let's try to kepp the discussion here focused on programming languages. Thanks.
What I found interesting about Grady's talk was the notion of software as a cultural heritage that is worth studying and preserving. For example, he proposed that museums should collect software and that there should be books where the structure of existing master pieces of software could be studied, similar to other disciplines.
Ehud, I'd be glad if topics like these were ok for LtU - for me, software engineering and programming languages are very closely related and influence each other in profound ways. Programming languages are mainly there to produce software, after all.
[I just came across this thread, which is of relevance here.]
I did not comment about project management or UI design but followed a link to the current work of Grady Booch that was linked by the Lisp-fanpage you linked by yourself. Suffices to say that OO seems to have something to do with SE these days much more than with "the most beautifull algorithm in the world" or the eternal question who is the king of pop?
The OOPSLA reports contain many stories about talks about programming languages (e.g., Sussman about language and thought, Steele about learning new languages).
But naturally what we have is yet another static-vs-dynamic typing debate... ;-)
There is this quote:
Booch spent a next part of his talk comparing software architecture to civil architecture. ... One of his primary distinctions ... involved the different levels of understanding we have about the materials we use. The transformation from vision to execution in civil systems is not different in principle from that in software, but we understand more about the physical materials that a civil architect uses than we do about a software developer's raw material. Hence the need to study existing software systems more deeply.
The development of the understanding of the material of software is unarguably a core goal of programming language theory. While I applaud Booch's project, I think he's only got half the story if he doesn't also address the programming language side as well.
But if it's only a part of the "material of software" we should ask how big this part is. If he's "only got half the story" it would seem that PLs are the other half. But would Booch agree that PLs are half of the "material of software"? I am rather curious to know.
You see, Booch's "SE with Ada" is a textbook we use. It seems to me that when he wrote it, ages ago, he was quite enthusiastic about the role of PLs in improving software, especially about Ada. Much more optimistic about the role PLs can play than I am...
But during his later career (i.e., the move to UML) he seems to worry much more about issues that might seem language agnostic (though how agnostic they really are is arguable). My sense is that he now regards the programming language as much less important for successful SE.
It would be interesting to hear his take on this issue.
You are correct -- I believe process plays a big part as well. As to the proportions, I don't know. I would need some data before I tried to make any estimates.
I also suspect Booch would consider language relatively unimportant (if you know him, it would be interesting to hear his opinion). However my opinion is that the use of programming languages in the vast majority of industry is not indicative of their true power, for the simple fact that industry lags theory so badly in this regard (by about 20 years). In contrast processes seem to get adopted much faster (e.g. XP was about 5 years). If you're working in industry, and you can get process adopted much more easily, it is natural to emphasise this.
Now I gotta do my Saturday chores. ;-)
Yes, we should distinguish clearly between to role PLs paly in current SE, and the role they could (or should) play, i.e. their potential.
Sorry, but I would like to start a rant here. In lingusitics ( theory of natural languages ) it is common to distinct bewteen syntax, semantics and pragmatics or between langue and parole in the work of Saussure. My impression from the discussions here on LtU is that syntax and semantics are studied exthaustively in PLT but pragmatics seems to be an issue for "another department" to quote Ehud ( it might not be much better in NL-lingusitics research practice but they would hardly dismiss their distinctions nevertheless ). While it should be clear that PLs belong to the "raw material" the expression and analysis of Boochs estimated 1 trillion LOC cannot be reduced anyhow to the study of the semantics of a certain group of paradigmatic PLs as well as the work of poets cannot be reduced to study grammar.
What I like about Boochs approachs is that he does not address PLs in particular ( his attempts to abstract from OO-languages by means of UML is well known and has become mainstream ). How to iterate over a sequence or generate a side effect seems not be his issue. Side-effects will be generated, closures will be created and iterators do their work - period. It's like stating nouns, verbs and adjectives will be concatenated to build phrases. Please the next level! It does not seem to be his attempt to create the "100-year language" or other triumphalistic nonsense.
Entering at the pattern level / pragmatics is reasonable and if certain patterns arise only in presence of a particular language paradigm ( what I would expect ) this should be remarked but I would also expect that differences vanish in the big picture. A magifying glass need not be the most adequate instrument for all kind of investigations.
The sad fact is that little quality programming language research is done in these areas. The topic of pl pragmatics, and whether they sould be part of the science of programming languages was discusser here many times. You might find relevant stuff by searching the archives for "pragmatics", "human factors" and the like.
I agree in part with you, and think it is a great shame that pragmatics aren't discussed, and more importantly researched, more often. However, I want to provide a counterpoint in defence of PL. I think from the synthesis of your argument and mine does the truth emerge.
I once went to a talk (on an AI system) where the presenter made the perceptive comment that proofs are better than empirical results, as proofs generalise. An experiment gives you a data point from which you might try to extract a general principle to apply outside the confines of that experiment. A proof is a general principle -- once proved it holds for all situations meeting the preconditions of the proof.
The same situation occurs in PLs. The work on folds (catamorphisms) generalises to all recursively defined types in a mechanical fashion. In contrast the Visitor pattern is some vague waffle (at least I've always found it this way) subject to programmer interpretation. Of course the applicability of PL theory goes much further than relatively small things like patterns of iteration. For example, much server software is designed as a chain of processing steps. Often it is desireable to avoid creating intermediate data structures. This is just lazy evaluation. Once this is realised a huge amount of preexisting work becomes applicable.
So in conclusion I believe that ignoring PL theory hobbles the project just as much as ignoring process would.
From Martin Fowler's summary there are a few choice quotes:
Dynamic types are stronger than static types, as they don't flee the field at runtime.
--Brian Foote
Static types give me the same feeling of safety as the announcement that my seat cushion can be used as a floatation device.
--Don Roberts
Where's Frank when you need him? :-) Ok, I don't think these guys were setting out to advance the debate but the fact people are still whipping this dead horse (anyone for soft-typing?) is pretty sad.
Moving on, Subtext is pretty interesting. It looks like a nice system for exploratory programming. It makes all state visible, unlike a REPL where you have to keep a lot of things in your head. On the other hand I disagree with plenty. Read the manifesto and see how much you (dis)agree with.
Martin Fowler's keynote has a point I'd like to address:
Why isn't the patterns community bigger? Martin thinks it should be bigger, and in particular should be a standard part of every academic's life! We academics should go out, look at big old systems for a couple of months, and try to make sense of them.
PLT academics actually do spend a great deal of their time finding patterns, except they're called something different. For every Pattern there is a Functional Pearl, for every Visitor a fold. What's the pattern equivalent of a Zipper? There certainly could be more cross polination here. The pattern community would benefit from the rigour of the PL community, and the PL guys who gain a better understanding of the tradeoffs between different approaches from the pattern guys.
That's enough for now.
2. The idea of mining patterns for language features comes to mind.
3. You can have static typechecking and still retain the types at runtime, for relfection, genericity etc. (e.g., CLR).
Brian Foote's claim seems correct to me, notwithstanding the interpretability of dynamic typing in static types. The point is that dynamic types give you more power at your fingertips, in exchange for abandoning the assurance of static checks. The interpretation can be seen as a way of getting around the unwanted assurance mechanism when dynamic is what is wanted. This shouldn't be seen as flameworthy.
The problem is that this just isn't correct along at least two readily apparent dimensions:
Dynamic checking doesn't give you more power at your fingertips; it gives you more ways to say incoherent things,
It also gives you more ways to say coherent things, without going through the bureaucracy of a type system that doesn't understand what you're trying to do.
The counterargument, "I know that this code is type-safe at runtime," has a technical term naming it: "hubris."
There are different kinds of hubris: another kind involves broadly applying your preferences and prejudices to other situations and applications which are not subject to the same requirements and constraints you take for granted. The real counterarguments tend to be along the lines that perfect knowledge of runtime type safety isn't essential, in many practical situations, and is balanced against many other considerations.
The practice of mixing unit tests and test-coverage tools is, in the limit, just static typing outside the language, so why not put it in the language?
That's rather disingenuous, since this is exactly where a majority of the arguments on this subject arise. I hardly think it's worth rehashing.
OK, leaving aside the conclusion that correctness vs. "I know what I mean" is the fulcrum upon which the observation that the question is inherently economic rests because we're all willing to stipulate that, let me point out that "Full-coverage unit-testing is just static typing outside the language, so why not put it in the language?" is a sincere question that may have been misplaced amidst my admittedly exasperated reaction to the defense of Foote, about which I'll reply to the other poster.
Let me point out that "Full-coverage unit-testing is just static typing outside the language, so why not put it in the language?" is a sincere question
OK, I'll provide some sketches of (prior) arguments.
1. Statically-checked programs still need tests, so you haven't saved much effort in that area by putting static typechecking in the language.
2. Expanding on point 1, static checking only checks "trivial" properties of programs; see Rice's Theorem for the relevant sense of the term "trivial". The hard stuff is what happens at runtime, and that's not statically checkable, in general. If you write tests to test the runtime properties of programs, the trivial static stuff gets checked as a side effect of that, so point 1 can be restated to say that you haven't saved any effort in that area by putting static typechecking in the language.
3. Tests don't require you to structure the target program in specific ways in order to keep the type system happy. Implicit in this point are all the differences in expressivity between the two approaches — not necessarily greater or lesser expressive power, but differences in the kinds of expressiveness in different dimensions. That's out of scope for this post, but the subsequent points explore some of the consequences as it relates to reliance on tests vs. static checking.
4. Compared to target languages like Haskell, ML, or C++ templates, using tests with dynamically-checked programs doesn't require you to debug type system errors which are not directly relevant to the problem being solved. Such debugging qualifies as a low-level activity, which can be expediently avoided by avoiding static checks.
5. A kind of "type inferencing" works naturally for tests. When you test that a certain input produces the desired output, you've tested all the terms on the evaluation path between input and output, without having had to arrange those terms to satisfy a static type system. I discussed some related points previously in The unreasonable effectiveness of testing.
6. The effect described in the prior point can be exploited by the use of assertions and/or contracts, particularly at module and function boundaries, but also elsewhere. These enhance the power of tests, and ensure that the sources of errors can easily be identified. This mirrors what static systems do in some respects, but has a number of benefits over static checks:
(a) Unlike static checks, they can check non-trivial properties of programs. (b) They're optional, providing a slider-like capability which allows you to change the degree of checking that occurs in a program, e.g. to scale from a prototype to a deployed application. (c) Again, they don't impose any particular requirement on the term structure of functions — they're non-invasive checks which can be used at natural boundaries, or anywhere else, with the point being that you have a choice.
Robby Findler's papers, like Contracts for Higher-Order Functions and others on his publications page, are a good reference on the subject of contracts, and have the benefit of being more than "just" academic work, but describe a contract system that's available today in a real programming language, PLT Scheme.
7. Another consequence of "the unreasonable effectiveness of testing" is that the variety of tests that are required to test a system sufficiently thoroughly is usually far less than might be imagined by someone thinking in terms of perfect static proof. (Certainly, this is true for the cases which are candidates for dynamic checking in the first place.) So, the premise of the question I'm answering, that we need testing and coverage "in the limit" to match static checking, is faulty; in practice, you don't need to get anywhere near that limit. [Edit: Or else the limit is surprisingly low, depending on the definition of "limit" being used.]
Anton, thanks for gathering the points and relevant links; I'm going to bookmark this page because I think it pulls the questions at hand together very nicely.
I'll summarize my responses by saying that I think we've just had very different experiences with respect to where on a spectrum of possible results the projects we've encountered have fallen. I think this will become apparent if we go point by point:
Anton: Statically-checked programs still need tests, so you haven't saved much effort in that area by putting static typechecking in the language.
I find this to be true in the popular languages (C++, Java), but not so much in O'Caml (I'll have to let others talk about Haskell). There seems to me to be some line that gets crossed between the C++/Java's of the world and even relatively straightforward Hindley-Milner languages. In an earlier thread on the subject, the "it just works" issue was raised. Unfortunately, all evidence I have on it is strictly anecdotal, but decades of C++ and Java experience, and now a few years of off-and-on O'Caml experiences, reinforces my sense that HM sits at a very useful sweet spot in this space. All of this, I should add, is relative to Common Lisp and Scheme.
Anton: Expanding on point 1, static checking only checks "trivial" properties of programs; see Rice's Theorem for the relevant sense of the term "trivial". The hard stuff is what happens at runtime, and that's not statically checkable, in general.
I think this is the assertion that I find most problematic. I don't find Rice's Theorem at all useful. It's understood that most type systems' inference procedures are guaranteed to terminate and are therefore not Turing complete, but attitudes towards that are softening thanks to what we might call the unreasonable effectiveness of C++ template metaprogramming :-) and, from a more research-oriented perspective, dependent types. Even short of full dependent types, though, we do, in fact, see type systems performing increasingly complex analyses on aspects of the code that have a material impact on the user's experience of the product. To trot out my favorite examples one more time, what Flow Caml, TyPiCal, and Acute do may be "trivial" in Rice's Theorem's sense, but they're far from trivial in any meaningful software engineering sense. I would actually go so far as to call Rice's Theorem a "trivial" restatement of the Halting Problem or Gödel's Incompleteness Theorem. :-)
Anton: Compared to target languages like Haskell, ML, or C++ templates, using tests with dynamically-checked programs doesn't require you to debug type system errors which are not directly relevant to the problem being solved. Such debugging qualifies as a low-level activity, which can be expediently avoided by avoiding static checks.
This is another one that baffles me, so I just have to chalk it up to very different experiences: I don't spend any time "debugging type system errors which are not directly relevant to the problem being solved." The examples offered when the question has been raised in the past have been of the contrived "how do I make a heterogenous list?" form. I have to conclude that this is purely a matter of language education: ML and Haskell programmers don't wander around throwing their hands up in despair due to the lack of heterogenous lists, nor do they slave over their hot compilers for hours, days, weeks figuring out how to get the &*^$#&*# type system to accept their perfectly reasonable code. Even fewer people would use the languages than is in fact the case if it were so hard to convince them to accept good code!
Anton: A kind of "type inferencing" works naturally for tests. When you test that a certain input produces the desired output, you've tested all the terms on the evaluation path between input and output, without having had to arrange those terms to satisfy a static type system. I discussed some related points previously in The unreasonable effectiveness of testing... The effect described in the prior point can be exploited by the use of assertions and/or contracts, particularly at module and function boundaries, but also elsewhere.
IMHO, this is the most interesting point: you may decide to check something at an inviolable boundary that doesn't hold behind the abstraction. I'm reminded of "A Monadic Framework for Subcontinuations" allowing control effects inside the monad, but outside being observationally purely functional.
I agree that a good module system like PLT Scheme's "units" makes a huge difference to developing safe code in dynamically-checked languages, and think Robby Findler's work on Contracts for Higher-Order Functions is very important.
Anton: So, the premise of the question I'm answering, that we need testing and coverage "in the limit" to match static checking, is faulty; in practice, you don't need to get anywhere near that limit. [Edit: Or else the limit is surprisingly low, depending on the definition of "limit" being used.]
Right—IIRC, it's been demonstrated that a "random walk" approach to testing ends up revealing some 80% (or is it more? I can't seem to find the reference offhand) of the defects found by a carefully-planned testing regime. That's an interesting result, but I suppose more for what it suggests to me about why dynamic checking enthusiasts don't care for, say, HM typing as in ML or Haskell vs. the rather obviously less helpful static typing of a C++ or Java: as several of your points help to clarify, the belief is that there are significant costs to HM type inference in ML or Haskell. Let me add that when I say "ML" I'm referring to ML with popular extensions such as reference cells and O'Caml's object system, so the discomfort isn't with "pure functional programming" or "lazy evaluation" or other things that are distinct from the type system. The belief really does seem to be that there's a large scope of things that are easily expressed in a dynamically checked language that involve wrestling with the type system in, say, O'Caml, coupled with the belief that the type system doesn't check anything interesting. If I found that to be true, believe me, I'd still be doing my recreational programming in Common Lisp or Scheme (or these days, probably Oz).
But I emphatically don't find them to be true. For example, I like that, in O'Caml, the interface to OpenGL is more compile-time type-safe than the C interface is (and by definition, than any Scheme interface is). To me, that's a "non-trivial" result. Also, I don't find myself trying to write applications and getting type errors that have no bearing on my application. Actually, I rarely get type errors at all, but when I do, they're relevant to my application.
So this is why I ask "why not put the checks in the language," and I think why dynamic checking fans don't find that a reasonable idea: if there's a significant cost to doing so, naturally you want to pursue the lower-cost option.
Again, I think it's a familiarity issue more than anything else: there's more to "typeful programming" than just typing in your ML or Haskell (and I know that you know this). The lack of things such as heterogenous lists seems like a real obstacle until you learn the idioms that obviate the need in 90% of real-world cases, and the rest are typically dealt with by sum types.
In any case, I have to thank you again for this list; I think it's greatly helped me identify where the actual points of disconnect are and provided valuable food for thought about some of the others.
I'll summarize my responses by saying that I think we've just had very different experiences with respect to where on a spectrum of possible results the projects we've encountered have fallen.
Perhaps, although keep in mind that I'm arguing the position I am mainly because I don't see the issue as being as one-sided as you seem to. If you were equally one-sided in favor of dynamic checking and against static checks, depending on what you were saying, I might very well be defending static checks, although along different dimensions than you do. I've done that previously, although perhaps not that much in this forum. I see strong benefits on both sides, and I find there are application areas in which each excels.
I'll put it like this: I don't think current static type systems represent a pinnacle of expressiveness and capabilities — there is much more to learn and achieve in that area. If you want to do things that run up against some of the weaknesses of these systems, dynamic checking is a very viable choice. More on this below.
I think this is the assertion that I find most problematic. I don't find Rice's Theorem at all useful.
I raised it specifically in the context of comparing against dynamically-checked systems using tests & contracts, in which case it is relevant, for the reasons I laid out, the summary being that the safety provided by static typechecks can be quite easily approximated without them.
To trot out my favorite examples one more time, what Flow Caml, TyPiCal, and Acute do may be "trivial" in Rice's Theorem's sense, but they're far from trivial in any meaningful software engineering sense.
Oh, I agree, but as I've pointed out before, those are all research or prototype projects. There's nothing like that in wide deployment, even amongst the functional languages. In the meantime, one of the most notable comparable systems that is in wide deployment just happens to be dynamically checked — Erlang. Coincidence? Hmmm...
No question, the future of static checking techniques looks bright, and perhaps one day the case for more static checking will be more compelling, but it's hard to argue definitively that that day is here now for all application types and domains, when you're just looking at traditional typechecking (whether mainstream or formally-inspired).
Anton: Compared to target languages like Haskell, ML, or C++ templates, using tests with dynamically-checked programs doesn't require you to debug type system errors which are not directly relevant to the problem being solved. Such debugging qualifies as a low-level activity, which can be expediently avoided by avoiding static checks. This is another one that baffles me, so I just have to chalk it up to very different experiences: I don't spend any time "debugging type system errors which are not directly relevant to the problem being solved." The examples offered when the question has been raised in the past have been of the contrived "how do I make a heterogenous list?" form.
You didn't have that discussion with me. An example of the sort of thing I'm thinking of are embedded DSL-like systems, whether one's own or libraries. HaskellDB would be a nice example — very powerful, but the type system error messages which all such systems currently produce violate the abstraction of the DSL, and it's difficult to get around this with anything short of the approach which Camlp4 takes. Camlp4 is a nice system, but it's relatively heavyweight compared to what can be done in dynamically-checked languages, especially with (good) macros.
The belief really does seem to be that there's a large scope of things that are easily expressed in a dynamically checked language that involve wrestling with the type system in, say, O'Caml
There are classes of things. One example are database-backed systems. Look at the things Hibernate does to work around Java's static nature (such as it is) — reflection, bytecode generation, a DSL embedded in comments and/or XML, etc. The issues it deals with are still present in the functional languages — they arise anytime you're dealing with an external schema that's not statically present in your program. Of course, you can make the schema statically present, by writing or generating boilerplate, and if you do that well, it can work. Funnily enough, you don't often see this done well. This kind of thing is one of the reasons that so many of the database-backed web systems are implemented using dynamically-checked languages. While those mostly aren't particularly demanding kinds of systems, they do have some requirements which statically-checked languages tend to require more footwork to deal with.
Java could be raised as a counterexample, but when you look at the machinery Java requires to achieve some of the dynamic behaviors it has, it's non-trivial, and in the end the results still aren't as usable as simply using a dynamic language (ignoring some other dimensions of comparison for the moment).
[...] coupled with the belief that the type system doesn't check anything interesting.
I didn't say that, but I did point out that the things which static typechecks check can be checked in other ways.
But I emphatically don't find them to be true. For example, I like that, in O'Caml, the interface to OpenGL is more compile-time type-safe than the C interface is (and by definition, than any Scheme interface is). To me, that's a "non-trivial" result.
That sounds like a classic issue of application domain and programming style: I don't write code for the OpenGL API, but if I did find myself writing such code directly to the API, the first thing I would say to myself is "Sheesh, how can I abstract this a level or three up?" And if you want to use an embedded DSL approach, that's when some of the issues I'm referring to tend to arise.
Also, I don't find myself trying to write applications and getting type errors that have no bearing on my application. Actually, I rarely get type errors at all
If you rarely get type errors at all, why do you need static checking? ;) The serious message here is that if you write code in a type-aware way, it does tend to keep type errors low, but you don't need a statically-checked language in order to do that. We're not discussing whether or not one should use types, but rather how and when they ought to be checked. The implications of that point are often not given enough weight.
Heterogenous lists are a tip-of-the-iceberg symptom of the problem, not the problem itself. I think they tend to get raised because of their value as an indicator, sort of like driving into a town and noticing that there are heavy-duty bars on the windows of all the houses — you immediately know that there's a problem with crime.
Rice's theorem says:
forall nontrivial properties P, there exists no decision method M, such that for all Turing machines T, M decides whether T has property P.
It's that universal quantifier over T that makes Rice's theorem completely irrelevant to the question of whether static analysis or testing "works better". No one, whatever language they program in, cares about the space of all possible programs. The programs that people write are extremely highly structured and are usually of very low computational complexity.
It's that structure, in fact, which makes testing work as a verification methodology. Nearly every program that people write can be expressed as a hylomorphism (a composition of a fold and an unfold), and these functions have very simple inductive correctness proofs where the proof corresponds to the shape of the program. When you write tests for them, what you're doing is exhibiting a few concrete cases of the proof. Every program designed to be easy to test program is ipso-facto well-designed to be provable. Testing seems easier, mainly because you don't have to come up with the full induction hypothesis.
When you write programs that don't have this kind of structure -- for example, programs that use shared-variable concurrency or which rely critically on aliasing mutable objects for their correctness -- then testing works enormously less well at finding bugs. It's no surprise, either, that some of the hottest new techniques for managing concurrency and state, shared transactional memory and linear types, basically mash the primitives into a shape that lets you apply simple inductive divide-and-conquer methods to reasoning about the program. Which, not so coincidentally, makes them more testable....
Related discussion.
No one, whatever language they program in, cares about the space of all possible programs.
Gregory Chaitin might disagree... ;)
I take your point about the applicability of Rice's theorem. However, I used it (without recognizing the inapplicability) because what it says is very close to something I wanted to say, which is that if you restrict yourself to "traditional" typechecking, excluding things like checking of concurrency properties, then the properties being checked are fairly trivial and can be quite easily checked in other ways. You gave a great explanation of that, thank you.
Assuming "traditional" includes HM and related type systems. Then, as you should be aware, "traditional" typechecking can be and is used to enforce quite sophisticated properties. (This is part of the reason for my negative attitude toward soft typing mentioned below.) "Modern" (circa 1970s) type systems can be used for much more than just making sure an integer isn't passed to a function expecting a string. (They also, as you also know, have uses for producing code and not just "restricting" it.)
The belief really does seem to be that there's a large scope of things that are easily expressed in a dynamically checked language that involve wrestling with the type system in, say, O'Caml, coupled with the belief that the type system doesn't check anything interesting. If I found that to be true, believe me, I'd still be doing my recreational programming in Common Lisp or Scheme (or these days, probably Oz).
Remember that most "dynamic languages" advocates are not using Lisp or Scheme. They're using Python or Ruby, and in those languages, there is a huge scope of things that would cause the programmer to wrestle with an HM type system. (They might also cause the programmer to pull all of his hair out, but that's another debate...) In fact, most of these things are also impossible in R5RS Scheme (and in every Scheme implemtation I've ever used). Compared to Ruby, Scheme seems hopelessly "static".
This is one of the reason why I argue (maybe surprisingly) in favor of the phrase "dynamic languages"... The debate is much more about pervasive reflection and introspection than it is about typing. It "just so happens" (let's not unpack that for now) that with the state of the art, these features can't be statically checked, so "we want reflection" implies "we want dynamic checking"... Then the "types" people naturally perceive this as an argument about types, but to the "dynamic languages" people, that's all secondary.
I'm agreed that Scheme is pretty static in comparison to other languages. I haven't used Ruby or Python for anything significant, so I'd like to know what significant tricks people pull with their meta-programming facilities. The only one I can think of is reflecting database tables into classes without any static class definitions. Others? I'm genuinely curious.
1. Statically-checked programs still need tests, so you haven't saved much effort in that area by putting static typechecking in the language. 2. Expanding on point 1, static checking only checks "trivial" properties of programs; see Rice's Theorem for the relevant sense of the term "trivial". The hard stuff is what happens at runtime, and that's not statically checkable, in general. If you write tests to test the runtime properties of programs, the trivial static stuff gets checked as a side effect of that, so point 1 can be restated to say that you haven't saved any effort in that area by putting static typechecking in the language.
By the way, quite a bit can be statically checked.
Yes, you still need tests in statically typed programs, but you don't need to write the same kind of tests.
Agreed, more or less. As I said, I was providing sketches of prior arguments, which meant ignoring nuances to keep the summary short, and my point 2, "haven't saved any effort" was overly strong.
Moreover, the protocol between the caller and the callee is in the actual code instead of in comments.
Assertions or contracts can duplicate this functionality, with the benefits I mentioned: they're optional, they can check more things, and they don't constrain how you write the code any more than you want them to. They're not statically checkable in general, but nothing prevents them from being partially statically checkable, in principle. Also, contracts reduce the need for some of the kinds of tests raised in the prior point.
An important point I take from the link you gave is Conor McBride's quote: "anyone who would argue [...] that work to try to make more advanced type systems and stronger static guarantees more convenient and well-supported is not necessary because it happens to be possible to bang out this or that example in Haskell as it stands if you think about it hard enough, is adopting the position of the ostrich." The position I'm defending is related to this quote, in the sense that the currently prevalent static type systems still leave many things to be desired, and often the easiest way to avoid their limitations is to avoid them altogether. This provides one possible answer to Paul's original question, "why not put [static checking] in the language?" — the answer is, because we're still working on the best ways to do static checks, and while e.g. OCaml's pragmatic combination of Hindler-Milner and structurally-subtyped OO might satisfy some people, it's not the end of the line by any means.
I don't see what the problem is. If the type system is complaining, then the programmer made an error.
The main point was that you can make such things produce better, more usable error messages. Saying, essentially, that we should just put up with poor error messages is not an acceptable defense if you're arguing that we should put static checking in more languages.
More generally, I take some issue with the assumption that "the programmer made an error". Often, it's merely the case that a static type system won't let you express something without being very explicit about it: for example, requiring you to declare and use a new type, which in a dynamically-checked language could be entirely latent. So while it's true that the programmer might have made an error with respect to a particular static type system, he may not have made any error at all with respect to the logic of the program he's trying to express. This goes back to my point about low-levelness: having to take special steps to cater to the type system can be characterized as a low-level activity, it's an action which is needed to fit a program to a particular language, not an action that's required in principle to get a program to work.
Ooh, I'd say that depends on the system. Quote: "These four bugs survived all perviously tried 80 million test vectors!" until their discovery by the theorem prover.
That's why I included qualifiers like like "sufficiently thoroughly". I'm not arguing that dynamic checking and tests is a perfect replacement for static checking, only that it's plenty good enough in many circumstances, and that the tradeoffs for static checking are often just not worth it in those situations — depending, obviously, on a variety of factors, including subjective preference.
Anton is right about point 2. Point 1 is valid to the extent to which point 2 is invalid: most static type systems do flee at run-time, and the theory of keeping type information through to compile time is surprisingly involved (at least it surprised me that it was nontrivial); equally it is *quite* *possible* to have every statically performable type check done statically in a dynamically typed system, but the practice is hairily involved.
I stand by what I said. We should not be flaming claims like the one of Foote that was cited.
Charles Stewart: most static type systems do flee at run-time, and the theory of keeping type information through to compile time is surprisingly involved (at least it surprised me that it was nontrivial)
But Foote didn't say "most static types flee at runtime." He said that they all do, which is false. My problem is that the dynamic checking community consistently puts forth falsehoods, repeatedly, due to their ignorance of the domain. Sorry, but those are the facts on the ground.
Charles Stewart: equally it is *quite* *possible* to have every statically performable type check done statically in a dynamically typed system, but the practice is hairily involved.
I'm confused: a "dynamically typed system that performs every statically performable type check statically" is a statically-typed language without type erasure. Hmmm, I think I see what you mean: taking the only extant examples I can think of, CMU CL and its fork SBCL, the difference is one I've pointed out before: these Lisps just don't refuse to compile code that they can't type. That's fine if you feel the failure risk is affordable (as it often is; peace, OK?) and, since you're programming in Common Lisp in this case, having any kind of type inference at all is a bonus.
But this would seem to fall into the category of treating type-correctness only as an opportunity for performance optimization, another pernicious misapprehension that, per Anton, isn't worth rehashing.
So I guess we'll have to agree to disagree: you claim Foote's comment wasn't flameworthy because it's an observation that holds true for many static type systems; I claim that it is because it's a universal quantification that doesn't hold and that it's rooted firmly in ignorance, and since it's been said in a public forum, it has a deleterious effect on the promulgation of the truth to the general public. If Foote wants to speak such falsehoods among friends over drinks, fine; misleading the public is extremely problematic.
"But this would seem to fall into the category of treating type-correctness only as an opportunity for performance optimization, another pernicious misapprehension that, per Anton, isn't worth rehashing."
It could be but it doesn't have to be. You could view it as the type system trying to prove errors instead of proving correctness (which is generally impossible, considering that the type system can't know that the program is suposed to do.).
Yes, let me stipulate that I love that CMU CL and SBCL make an attempt to do type inferencing. The only real concern I have about that is that if it became a more visible feature, i.e. people came to rely on it, it might lead to a false sense of security when something that didn't check failed at runtime. This probably isn't an enormous concern in the general case, but it is given that these are Common Lisp implementations, because as has already been noted, doing type inference for a language that wasn't explicitly designed for it—especially one as loosely-structured as Common Lisp—is a considerable challenge, to put it mildly. So in Common Lisp, more stuff is likely not to be caught in type inferencing than is the case in the traditional examples of ML or Haskell.
It's interesting that you mention false sense of security, as that is a risk I see with the conventional "better safe than sorry" static type checkers. Becourse they claim to guarantee some sort of correctness then there obviously always will exist some sort of errors it can't catch. (At least in a turing complete language.)
You could say that a static type analys could result in three diffrent judgments (or posibly not terminate) for each type of runtime error that it analyses for. In the first case it can guarantee that the runtime error will show up, in the second it guarantees that the runtime error will not show up. In the third case it can't make any guarantees.
What to do with the first and second case seems quite obvious. Its in the third case it gets interesting. It seems most staticly typed language will rule that the program is illegal*, where softly typed language instead inserts dynamic checks. So it would seem that if you write code in a classic static typed fashion (hardly typed?) you would get just as safe code. The best way to me seems to raise a warnings for case three and let the programer decide what tradeof between saftey and flexibillity is best at the moment.
*This seems in part to be an artefact from a time then the purpose of static typing was to avoid the overhead of runtime typetagging.
But Foote didn't say "most static types flee at runtime." He said that they all do, which is false. My problem is that the dynamic checking community consistently puts forth falsehoods, repeatedly, due to their ignorance of the domain.
I started my research career in the Haskell camp, became disillusioned with it due to repeated difficulties coaxing the type inference system to assign the right type to inituitively well-typed code. I switched camp to Scheme after I became aware of the work on soft typing, and the grounds for that switch are more or less the content of Foote's remark. Foote's remark is false if the missing quantifier you read into the cited quote is "all", it's true if you use "most". Given the principle of charity...
these Lisps just don't refuse to compile code that they can't type.
That's one possibility, also one can roll your own macros to have say (hm-define args . body) which is like a define except that if it can't find a HM principal type, it aborts the compilation. You can have the same guarantees, they're just not there by default.
Do I not understand correctly, or is runtime type polymorphism a form of dynamic typing? Considering that you need to save the type at runtime in order to do a dynamic_cast<> or a Java cast, how can Foote say that types have fled the field? And for types where you know what they will be for the entire program execution, why shouldn't you verify them statically and throw away the type at runtime? Granted, C++ and Java aren't as strongly typed as Haskell or ML, but it seems to me that there is a middle ground.
So...what if we took a statically typed language and added a special type that is explicitly dynamically checked? For instance, suppose we added a special type to Java, call it Dynamic. Dynamic is a subclass of Object, but has the special property that you can call any method on it, and typing is checked at runtime. So whereas this is illegal:
Object o = new Object(); o.read(); // Object has no method called 'read'
Dynamic o = new FileReader("file.txt"); o.read(); // Dynamic has no method called 'read' // But if o has no such method, an exception is // thrown
This would allow coders to mix and match static and dynamic checking in what seems to me a rather natural way. Comments?
See also, Haskell's Data.Dynamic and Dynamic Typing in a Statically Typed Language
Statically typed programming languages allow earlier error checking, better enforcement of disciplined programming styles, and generation of more efficient object code than languages where all type consistency checks are performed at run time. However, even in statically typed languages, there is often the need to deal with data whose type cannot be determined at compile time. To handle such situations safely, we propose to add a type Dynamic whose values are pairs of a value v and a type tag T
Ok, so considering what a hassle it is to implement static checking in a dynamic language, what's the big deal? You can eat your cake and have it too. Why do we still have static-vs-dynamic wars? Obviously, there's places where static checking is better (don't want to carry around a type tag for ints), and places where dynamic checking is better (dynamic db queries). So clearly XOR is inferior to inclusive-OR.
Becourse the hassle of implementing static typing in a dynamic language comes from the flexibell nature of the dynamic nature. And it's really that flexibillity that you want and not dynamic typing in it self. Dynamic types in a static language gives you the worst of both worlds. And to take a trivial example:
[Dynamic 1, Dynamic 2, Dynamic 3]
..is just so ugly.
In Haskell, due to the "overloading" of numeric literals, that can be written [1,2,3]. Furthermore, that is just syntax, it would be trivial to have a language that automatically inserted the Dynamic where appropriate (essentially this is just a very simple case of subtyping). I'm not going to go into it (so don't ask, but perhaps ponder it), but personally I think soft typing doesn't address the aspects that either "camp" is interested in, and is thus a "worst of both worlds" (or rather just not compelling to either). There are some use cases though that it does seem useful (e.g. pedagogy).
it would be trivial to have a language that automatically inserted the Dynamic where appropriate (essentially this is just a very simple case of subtyping).
Are there any examples of this? I wouldn't think it would be trivial in a type inferencing language. And even in a non-inferencing language, given that anything can be successfully typed using a Dynamic type, automatically using Dynamic "when in doubt" is likely to severely reduce the ability of the language to statically report actual type errors. Am I missing something?
You're absolutely correct that you don't want to gratuitously insert coercions to Dynamic. Treating every type as a subtype of Dynamic is not wise, IMO. Alex Aiken and his collaborators have done a great deal of work with type systems like this (to facilitate soft typing, in part), and my feeling is that his work has shown that making this idea work is just too hard and too complicated.
However, you can propagate type information in order to calculate where to insert the appropriate coercions. And in fact, ML and Haskell already do this. For example, in system F, type abstraction and instantiation are explicit -- if you wanted to use the empty list nil as a list of integers and a list of strings, you'd need to write nil [Integer] and nil [String]. In ML/Haskell, the compiler fills those applications in for you -- whenever you use a value with a polymorphic type at a particular instantiation type, the compiler just fills in the type application argument.
(Haskell goes much further than ML does, by extending this mechanism with type classes; you can look at its typeclass mechanism as a way to infer value arguments to functions, where you pass in records of functions implementing particular APIs. Many proof assistants go even farther than that, and treat inferring both type and value arguments completely identically.)
Anyway, you could do something similar with a Dynamic type; you propagate type information to figure out where a Dynamic is expected, and then if you have a concrete type in a context where a Dynamic is wanted, you just insert a coercion to Dynamic.
For example, if foo is known to have type foo : 'a * 'b * 'c -> Dynamic list then fun foo (x,y,z) = [x, y, z, 17, true] could be elaborated to fun foo (x,y,z) = [Dyn x, Dyn y, Dyn z, Dyn 17] with no problems.
The general technique to propagate information like this is called bidirectional type checking, which is sufficiently simple that there don't exist any really good references on how it works! The papers that use it all do lots of other stuff in addition, so just grokking the bidirectional idea is hard. :( I would recommend looking at Pierce and Turner's Local Type Inference, and then at Chlipala, Peterson and Harper's Strict Bidirectional Type Checking. Pierce and Turner look at bounded quantification in addition, and CPL add a form of strictness checking to further reduce the number of annotations.
All that said, personally I don't want Dynamic, because to use a Dynamic value you have to have a typecase in your language; you have to be able to figure out what type the value contained in the Dynamic is. This breaks the parametricity property of the language, which means that type abstraction doesn't work, and that's a showstopper for me. That is, you can write a function of type
is_int : 'a -> bool fun is_int x = typecase (Dyn x) of | Integer n -> true | _ -> false
Type abstraction is an important enough property to me that I'm willing to trade away a lot of apparently-cool features to keep it.
What if there's a way to contain the effects of Dynamic in the way that Monads can contain the effects of, well, effects? Then you can use Dynamic where it's handy, and preserve the purity of the rest of your code.
Haskell already (essentially) does this by restricting things that can be type-cased to be in the class Dynamic. Then the type would be '(Dynamic a) => a -> Bool which is not parametric and therefore doesn't have the issue.
...that the Dynamic class declaration looks something like this?
class Dynamic a where put :: a -> something get :: something -> a
Data.Dynamic
All that said, personally I don't want Dynamic, because to use a Dynamic value you have to have a typecase in your language; you have to be able to figure out what type the value contained in the Dynamic is. This breaks the parametricity property of the language, which means that type abstraction doesn't work, and that's a showstopper for me.
Dynamics do not necessarily break type abstraction - it all depends on how type abstraction is modelled. Sure, simply relying on quantification and static scope restrictions won't be enough anymore. But if you adopt a generative interpretation of type abstraction then everything is fine. For example, this is what Alice ML does to reconcile type abstraction with packages (which are dynamics carrying modules).
First off, I did not say with inferencing etcetera, though it is quite doable. Without type inference we already have examples, e.g. Java's Object. Nor is the subtyping route the only possibilty (or is it essentially?) Note (as neelk points out indirectly) this is more or less what soft typing does! And, yes, gratuitously would severly restrict the ability of the static type system to actually check static type errors. This is one of my issues with soft typing. It essentially only gives you the benefits of static typing if you write your code (and all code you interface to is) statically type checkable anyways! And often such type systems are powerful but not so much powerful for the programmer.
What exactly is this "flexible nature" of which you speak?
For example lisp functions like eval, read and compile. I haven't seen firstclass macros or computation spaces in any staticly typed language eighter. not to mention metacircular overloading (if that is the correct term?). And well the choice to do more things without defining any new types.
To make anything safer you restrict people from doing things you think is dangerous. Restrictions always limits people. There is always a tradeof. Some tradeofs are easy. Few people really want the freedom/flexibillity of being able to crashing program by adding to functions. Some tradeofs are harder, but there is always a tradeof. Every feature that give you more safety gives you less flexibility.
When was the last time you called eval or compile in a production system that wasn't specifically a language tool? Anyway, I don't see how those functions are related to static-vs.-dynamic typing. What would stop me from writing a C function called compile() that does the analogous thing? I mean, isn't that what a C compiler is? Don't they have embedded C compilers? Also, I don't know what "metacircular overloading" means, so it would be interesting to see a simple example of that. And the "more things" in "choice to do more things without defining any new types" is exactly what I'm asking about.
Depends on your value of 'language tool' - I've done a fair amount of thinking about building web applications in Haskell and quickly found a lot of uses for both hs-plugins and Template Haskell, for example. It turns out you can do rather nifty things with the type system in that context as well - monads become a security measure.
Uhm, yes I am an academic, and no I haven't written any non language tool production system. I dont see how that changes the reasons for wich people like dynamicly typed langueages. Well if you compile and typecheck code at runtime, isn't that dynamic typing then? Your doing your checks at runtime aren't you? ;) If you got a language that compiles to a vm without typechecking, and then jit compiles the vm and does typechecking at this point, is that language static or dynamicly typed?
Metacircular overloading is overloading of the internal functions of the evaluator. For example you can change the semantic of application. I can't give any really simple example but you could see it in use in the implementation of aspects for scheme in Pointcuts and Advice in Higher-Order Languages even though it doesn't mention it by the term of metacircular overloading. (I may have made that up.)
Actually, running the type checker at runtime is the direction statically typed languages are going to support the features of "dynamic languages" while still having the benefits of static typing. I consider this still quite distinct from "dynamic" typing, but clearly the terms and distinctions are being stretched and eliminated. This (staged programming) is the end of the static v. dynamic typing debate some believe (except, of course, every now and then for tradition's sake).
A previous LtU article discusses Types and Reflection. As for examples...
hs-plugins is a library for loading code written in Haskell into an application at runtime, in the form of plugins. It also provides a mechanism for (re-)compiling Haskell source at runtime. Thirdly, a combination of runtime compilation and dynamic loading provides a set of eval functions- a form of runtime metaprogramming.
Template Haskell is an extension to Haskell 98 that allows you to do type-safe compile-time meta-programming, with Haskell both as the manipulating language and the language being manipulated. Intuitively Template Haskell provides new language features that allow us to convert back and forth between concrete syntax, i.e. what you would type when you write normal Haskell code, and abstract syntax trees. These abstract syntax trees are represented using Haskell datatypes and, at compile time, they can be manipulated by Haskell code. This allows you to reify (convert from concrete syntax to an abstract syntax tree) some code, transform it and splice it back in (convert back again), or even to produce completely new code and splice that in, while the compiler is compiling your module.
Template Haskell is an extension to Haskell 98 that allows you to do type-safe compile-time meta-programming, with Haskell both as the manipulating language and the language being manipulated.
Intuitively Template Haskell provides new language features that allow us to convert back and forth between concrete syntax, i.e. what you would type when you write normal Haskell code, and abstract syntax trees. These abstract syntax trees are represented using Haskell datatypes and, at compile time, they can be manipulated by Haskell code. This allows you to reify (convert from concrete syntax to an abstract syntax tree) some code, transform it and splice it back in (convert back again), or even to produce completely new code and splice that in, while the compiler is compiling your module.
There are statically typed languages that offer eval and compile (via dynamics).
...does Alice have eval/compile capability?
Been meaning to ask this on the mailing list, but since you mentioned it here. :-)
ComponentManager.Eval?
The interesting functions rather are Compiler.compile and Compiler.eval - ComponentManager.Eval just executes a (already compiled) component.
I don't understand your example. Why would you need to type literals that way? I'm a little slow, so please spell out the rest of your thought.
"Dynamic" in Li's example is actually a type constructor. (At least, in most languages that's what it would be.) You need something like that to tell a statically-typed language that the literal values in question shouldn't just be treated as whatever their natural type is. In type inferencing languages in particular, type constructors are one of the major pieces of information the inference system relies on. (BTW, a nice look at the workings of such systems can be found in the paper Polymorphic Type Inference.)
Also, take a look at the examples in my "Give me back my infinity" post. In the statically typed examples, include Java, all the literals are constructed with a type constructor. The reason for that should be fairly clear in those examples.
Li's example might be made more concise with a construction syntax like Dynamic[1, 2, 3], where a typed array of Dynamic is being defined. But that's a fairly special case. As I've mentioned in another post, there are issues with anything but the most local inferencing using a Dynamic type, because it's difficult (often impossible) to distinguish between uses of Dynamic, and type errors.
Dynamic[1, 2, 3]
I understand that Dynamic is a type constructor. What I don't understand is why he would want the numeric literals to be anything other than their "natural" type. What expression would you use them in that would require them to have type Dynamic?
Better example would be:
Dynamic [Dynamic 1, Dynamic 2.0, Dynamic "foo"]
That's what I was missing. In that case, the defect is not having to declare type constructors. The defect is not having a heterogenous list type. ;> It seems to me that you should be able to declare a list as having type Dynamic, and then all its elements would be Dynamic. Is that unreasonable?
Well Anton gave you an example on that "Dynamic[1, 2, 3]". But as I mentioned it was just a trivial example, still you need a lot of type constructors to combine static and dynamic typing with type infernence, if you don't do it in the style of soft typing.
(and by the way, Im a she not a he.)
I never really check profiles, and I made the embarrassing assumption that you were male. My sincere apologies for the error. Anyway, I think the type constructor objection is a bit shallow, since presumably you would want to segregate the use of Dynamic and static typing, for the most part. I can see where Dynamic typing would be useful in dealing with ad-hoc database queries, for instance, or something like parsing an XML file. There you could use all Dynamic in the part of the code that deals with the DB/XML directly, and only the interface between the dynamic portions and the static portions would require casts (or type constructors, if you prefer) where the type cannot be directly inferred.
It's understandable. It's not like my name is internationally genderd. Actually its a chinese male name but a swedish female name. and well there aren't to many women in cs.
Im not against static typing in general, just the type of static/dynamic hybrid thats based on adding a Dynamic type.
I would like a system that lets you use any of the features from a typical static system and a typical dynamic (and a effect system), and combine them orthogonal and induvidualy for each function, or collectivly for a group of functions.
Those features would include type annotations, type definitions, dependent types, static typing, soft typing, purity, determinism and so on.
I envision this as a system of extended firstclass contracts that can include clausuls stating that some asertions must be decidable att compile time.
Yes, I think a language that seamlessly integrated all those features would indeed be a nice language to use (assuming it integrated them well). However, I'm not convinced that contracts are the best mechanism for implementing it. I think some things are just more naturally stated in more conventional typing syntax. I think a combination of contracts with traditional static types would be most powerful. Making the dynamic portions seamless with the static portions is the biggest challenge.
I don't see why the contracts couldn't include static types and conventional typing syntax, although you still need something to declare if the typing should be staticly or dynamicly checked. Soft typing already does integrate static and dynamic portions seamlessly, the problem is more that it does that to much.
Ok, so considering what a hassle it is to implement static checking in a dynamic language, what's the big deal? You can eat your cake and have it too.
No-one's demonstrated that convincingly so far in an actual language. Including a Dynamic type in a statically checked language doesn't address some of the most interesting aspects of dynamically-checked/latently typed languages. In a dynamically-checked language, you can write a function like this (in Javascript):
function foo(a, b) { if (a == b) return "foo"; else return false }
...without having to tell the language anything about the type of the function foo. However, to most programmers, it's pretty obvious what the general type of foo is, since it always returns either a string, or a boolean (actually, a specific string, and false). A programmer using this function needs to know at least some of these properties of the function — essentially, the programmer has some notion of the static type of the function, even if the language itself doesn't. (I'll refer to this as a latent type, following long precedent.)
foo
As we know, statically-checked languages aren't nearly as laissez-faire as this. Except in trivial cases, statically checked languages require you to explicitly specify the types involved in a function like this. For example, in SML:
datatype SomeType = SomeStr of string | SomeBool of Bool fun foo a b = if a=b then SomeStr "foo" else SomeBool false
I've bolded the explicit type machinery in this definition. Note that SML has type inference, but we still have to be explicit about the types used in a case like this, because they're ambiguous. There are an infinite number of types other than SomeType which could be used in the definition of this function. A statically-typed language forces you to choose and specify the exact type you want to use. It can only infer types in unambiguous cases, where a function can have only one possible type. So the static language forces the programmer to narrow down an infinite set of possible types to a single, specific member of that set.
SomeType
That's quite a requirement! It offers one answer to the question of where the flexibility of dynamically-checked languages lies: it's in that infinity minus one of possible types that you've just been forced to exclude from the program. What if one of those other types was the one you really wanted? You're going to have to go back and change things. What if you want to use a slightly different but more or less compatible type in another function? Defining two separate types and dealing with conversion between them is unnecessary in a dynamically-checked language, but necessary in a static one.
Of course, statically-checked languages often offer escape hatches to allow you to avoid the type system's rigor if you need to. A "Dynamic" type is one such hatch (in Java, it's spelled "Object"). Structural subtyping, such as in OCaml, is another escape hatch. Let's look at a Java version of this function:
Object foo(Object a, Object b) { if (a.equals(b)) return new String("foo"); else return new Boolean(false); }
In some ways, this is quite close to the SML version, although it uses a predefined, and more general type, Object, and a couple of its subclasses, String and Boolean. But the return type of this function is now Object, which doesn't give either the language or the programmer much information. (The same would apply if the type were called "Dynamic".) Assigning a type of Object to that function isn't a very good approximation to what we would normally consider the type of such a function to be. That extra bold text isn't buying us very much — it's not adding anything to the source code that provides useful information to a human reader, it's merely administrative overhead to tell the language not to worry the static types in that case. There are also other consequences which aren't apparent in this micro example, such as the need to cast Object values to more specific types when they're actually used. This approach provides a way to encode latent types in a statically-checked language, but it certainly doesn't qualify as eating your cake and having it too.
Now, to connect this to some of the things I've been saying elsewhere in this thread, I could define this function in (PLT) Scheme with a contract:
(define/contract foo ; read as "define with contract" (any/c any/c . -> . (union (string/len 4) false/c)) (lambda (a b) (if (equal? a b) "foo" #f)))
The contract above is very specific about the value returned from this function: it says that it's either a string with a length less than 4, or false. (The "/c" suffixes are just a convention indicating that a name represents a contract.) Although this is quite an explicit specification, it doesn't impose any limitation on its clients that wasn't there already: for example, the clients don't have to be defined as expecting something like SomeType from the SML example, or Object from the Java example. We don't have to give up any flexibility, and we don't have to commit to a specific type scheme throughout the program. Compared to a statically-checked program, what we do give up is the ability to fully check that contract at compile time; but then again, a statically-checked language can't fully check such contracts either, in general[*], because these contracts can express all sorts of things that can't be statically checked. [* I'm ignoring the fact that this particular example returns constant values, which could be statically checked.]
At the very least, there are tradeoffs galore here. It's fine to say, as many people do, "I'm happy programming in static typesystem X", but implicit in that decision is the acceptance of many restrictions, requirements, and tradeoffs. Dynamically-checked languages are interesting because they allow you to work in a space that is free from most of those restrictions and requirements. That property hasn't been fully duplicated in static languages, with or without a Dynamic type.
How about somthing like:
function foo(a, b) { if (a == b) return "foo"; else return "false" }
Ok, mostly just kidding. :-)
Count me in the camp that believes that Static Typing is less flexible - I mostly consider that a bonus. Also count me in the camp that doesn't necessarily consider catching runtime type errors as the most important benefit of static typing - this is also a bonus. Nope, I like typing systems that allow me to define the information of the types being built and manipulated in my programs. I consider types (whether statically checked or not) to be the most important aspect of my programs. I like it very much when the types are explicitly laid out through meta information. I like to think of types as a schema. For example, show me the schema of a database and I can tell within reason what the database is capable of doing - without having to look at a bunch of muddled triggers, stored procedures or client code in a bunch of different languages. Show me the types and I can tell you a lot about the program without having to look at the code. Anyhow, Types are important and the easier the programming language makes working with, maintaining and self-documenting the types, so much the easier is my life. That's not to say that static typing doesn't have a cost in terms of algorithm implementation, but I consider building viable types as a much harder problem in the long run then the manipulation (and trickery) of getting those types to run through algorithms. Just my 2 cents worth, but in all the static vs. dynamic debate, I rarely hear the benefits of being able to consider types as meta information. In trying to trundle through the translation of CTM and SICP to Alice ML, I've found the hardest part is not the algorithms per se, but rather the explicit definition of the types that the dynamic languages (Oz and Scheme) are manipulating. (Of course, YMMV may vary depending on your application domain and your personal style of learning and programming). (Then again, when I'm doing one-off programming, I really like my languages to be as dynamic as possible).
Lately I've been thinking about something that I happen to find particularly pleasing: if a Function is a relation from a domain to a codomain, then a Type is a classification of such domains and codomains (or elements of the domain when the domain is a tuple space...i.e., multiple arguments). That is to say, Types are simple the sets of possible inputs and outputs to functions, and thus we should not think of types in terms of Data, but rather Functions. None of that is particularly new, but I think the insight is the relevance to static vs. dynamic typing.
Dynamic typing says: "We don't care what these input and output sets are, as long as they overlap sufficiently for the functions we've written. Furthermore, we don't want to commit to predefining these sets because it might make it harder to write a new function."
Static typing says: "We will spend some time inspecting these domains and codomains such that we can classify them according to a scheme which will allow us to verify that only sets defined by us get used as domains and codomains in our functions."
The sets are always there, regardless of whether we name them. But in a statically-checked (SC) language, the sets are explicitly named as types, and in a dynamically-checked (DC) language, the sets (types) are generally anonymous. In general, DC languages will have more types, because there is no force tending to force the domains and codomains into a common mold. SC languages will conversely have fewer types, because the language itself encourages (or forces!) a discipline on the programmer to reduce the number of possible types.
I believe that in this light, the tradeoff implications are clear: The DC world indeed gives you utter freedom, because the cost of creating a new type is as low as the cost of creating a new function. In that sense, a DC language truly is "more powerful" than an SC language. On the other hand, in the SC world, types are *designed*, and so the cost of spelling out those types explicitly is paid back in the static enforcement of those domain/codomain shapes.
And thus I would return to a point that I and others have made many times...DC seems like a good strategy for prototyping, where it's more important to give flesh to ideas than it is to define a rigid structure. The fluidity of the testing/brainstorming process demands an equally flexible language that allows new projects to bootstrap themselves into existence. On the other hand, as programs get large and complex, to the point that no single person can understand how they work, designing the types and having a centralized location where they are defined becomes more and more important. The problem with contracts is that they are defined locally to the function that uses them. There is no central "contract repository", and if there were, would it not be exactly the same as an explicit type? So in a project where ideas must traverse a treacherous source code landscape, concrete definitions for the domains and codomains of functions brings a limiting discipline, but also an order to the development environment.
Furthermore, what are contracts, really? I will tell you. They are a mechanism that enables ad-hoc (sub)typing! In a DC language, a contract allows the programmer to spell out the type explicitly, at the last possible moment (right at the start of function application). In an SC language, a contract allows the programmer to spell out a subtype at the last possible moment. That is, the domain of the function is specified or refined right before the function is called. So really, contracts are merely a form of type definition or refinement (which is why type theorists should be really interested in making them more formalized and powerful).
In some sense, SC languages are more abstract than DC languages because they reify this "latent" structure which exists within the language, and allow you to perform explicit manipulations with that structure. I don't buy the argument that SC is bad because it makes it more difficult to do really high-level programming. What I do accept is that SC and type theory needs to improve to the point that metaprogramming is as easy as programming. Only when type theory matures to that level will Lisp have met its match and it will be possible to do everything in an SC language that is possible in the most powerful DC langs.
In general, DC languages will have more types, because there is no force tending to force the domains and codomains into a common mold.
In general, and in practice, that's true, but there's nothing stopping someone who wants to program in a type-oriented style from doing so in a dynamically checked language, explicitly identify types, and even check them, whether dynamically with DC contracts or statically with some kind of soft typing system.
SC languages will conversely have fewer types, because the language itself encourages (or forces!) a discipline on the programmer to reduce the number of possible types.
The forcing is certainly one of the important issues. It keeps programmers honest about the types in their programs. They can't get away without really thinking about them, which is quite possible in a DC language. But, checking does not necessarily have to be something that is enforced at the level which typical SC languages enforce it — it can be done at other levels, with tools, including in an IDE.
The problem with contracts is that they are defined locally to the function that uses them. There is no central "contract repository", and if there were, would it not be exactly the same as an explicit type?
It can indeed be exactly the same as an explicit type, but that still doesn't mean it has to be statically checked. One of the common Scheme styles is to use abstract data types in the form of records, and those records centralize type definitions. You raised the point that contracts tend to declare anonymous types for functions, but most functions in SC languages also have anonymous types. Both SC and DC languages can have functions with types like foo -> bar. Contracts are only part of the picture - you can think of them as defining anonymous types (where a contract is sufficiently static-type-like), but the checks which a contract implements can be based on named (nominal) types defined in other ways.
foo -> bar
So in a project where ideas must traverse a treacherous source code landscape, concrete definitions for the domains and codomains of functions brings a limiting discipline, but also an order to the development environment.
And that order is provided by the type definitions, not by the static checks.
In a DC language, a contract allows the programmer to spell out the type explicitly, at the last possible moment (right at the start of function application).
More generally, they allow checks to be performed at function boundaries, at module boundaries, and even on individual expressions. Those checks go beyond what any ordinary static type system calls a type — contracts provide a superset of the checks that static types can provide. (I'm ignoring more experimental typing systems such as dependent types, which overlap in different ways.)
What I do accept is that SC and type theory needs to improve to the point that metaprogramming is as easy as programming. Only when type theory matures to that level will Lisp have met its match and it will be possible to do everything in an SC language that is possible in the most powerful DC langs.
I don't think it can be taken for granted that SC languages will mature to a level in which the most powerful DC languages become redundant, though. I think it's more likely that the most powerful DC languages will get better tools to deal explicitly with static types, which will blur the boundaries between the two approaches.
Ok, not always, but I hope I made my point with a shamelessly attention-getting title. The benefit of static checks is that by giving the compiler more information at compile time, you allow it to make all kinds of optimizations that aren't possible at runtime. Again, prototyping vs. production. During prototyping, performance usually isn't an issue. During production, depending on the size of the app, it becomes more of an issue. I know that I personally profile my apps from time to time to make sure that I'm exploiting all the easy speedups that I can. And if the language itself makes my app slow (which Java is known to do with its fat objects bytecode interpreter), I'm unhappy. That means I'm really unhappy with runtime checks that could have been performed at compile time (for little to no cost). Especially in the C++ world, there is a bit of zealotry to keep as much code in the compile-time world as possible, because it makes the resulting runtime smaller and faster. And that's why C++ compilers have to work so dang hard. The template system is pushed to its limits to remove as many runtime checks and operations as possible, while providing as much type safety as a language like C++ allows. Given the success of libraries like Spirit++ and Blitz++, not to mention the BGL, etc., I think it does a pretty good job.
That's not to say that DC languages don't have their place, even in production. But that is to say that it's a compelling reason to choose SC in many cases. As far as DC languages getting more static, I only have this to say...how long has Lisp been around? How static is it right now? ;>
...in many cases. Clearly, we're a match for each other in subject line trolling. :)
That's not to say that DC languages don't have their place, even in production. But that is to say that it's a compelling reason to choose SC in many cases.
You seem to be conflating static typechecking with static optimizations somewhat. That's understandable, because traditional SC languages combine all these things: static typechecking is built into the language, and that's exploited to provide static optimizations at compile time, and to report type errors at compile time. But all of these things can be separated from each other to some extent, and recombined in other ways which give different benefits.
Contracts put static information into code, which can be exploited for static type inference and static optimizations, but still without requiring the semantics of the language to be changed, and without sacrificing the ability to omit type information when appropriate or convenient.
As far as DC languages getting more static, I only have this to say...how long has Lisp been around? How static is it right now? ;>
A number of Lisps support optional type declarations and static optimizations based on those. Scheme is particularly oriented towards a sophisticated take on static knowledge. A number of Scheme compilers do various kinds of static type inferencing, and generate correspondingly fast code. The Scheme compiler "Stalin" gives statically typed languages like C++ and OCaml a run for their money on the performance of e.g. numeric code (see the comp.lang.scheme thread Ray tracer in Stalin, and check out the benchmarks). Bigloo is another Scheme compiler which is strongly geared towards those kinds of static optimizations. PreScheme is a completely statically typed subset of Scheme, using Hindley-Milner-style inference. This last is of course an SC language, but that also provides a kind of answer to your question.
My point is not that all DC languages will become more static, only that it's possible to tease apart the things that SC languages conflate, and apply them in different ways, and that doing this appears to have a great deal of promise. The traditional SC approach of statically checking the types of every term in a program is not necessarily the only ultimate solution, regardless of how powerful SC typesystems get.
BTW, I don't really mean to push Scheme so heavily, but many of the other DC languages don't emphasize these kinds of issues nearly as much. Some notable exceptions are Erlang and Oz. I believe Erlang has a few static optimization capabilities and type inferencing tools available for it, too — after all, it's a real industrial language commonly used in competition with C++ for performance-sensitive applications. In any case, I'm talking about what can be done, not what say your average DC scripting language happens to do.
Types, even in a statically typed language, can carry more information than can be checked at compile time! Consider the password example given in this thread. In Ada, I can declare a "type" like this:
type Password is new String(1..8);
The language is able to check statically that a Password is passed when an object of this type is expeceted, and so on. It cannot, however, always be sure that when I coerce a string into a Password object the length matches. This check is done at runtime.
Technically the language manual will say that the length constraint isn't part of the "type", but rather belongs to "the first subtype", since subtype constraints of this nature are checked at runtime (unless the compiler can deduce a problem during compilatation), while the language remains statically checked.
From a programming perspective, rather than a language lawyer perspective, the disctinction isn't that important. You declare your constraints and the language does "the right thing".
The point is, of course, that you decalre types which give structure to your system, and that many checks are performed at compile time, while many others are auomatically done at run time, unless you supress them for performance reasons, after verifying they can't fail.
Unfortunately, this is just not true in any reasonable sense, at the very least if you want to check them. Static typing tends to be global and pervasive. Implementing "static" type checking with "well-placed" dynamic checks would require non-local changes in general. This would be a quite significant burden, and further would require changing the code of libraries you use to enforce your "types" (assuming they weren't written with your same type checking scheme). The difference is very similar to the difference between manual memory allocation and garbage collection. You could, in principle, rewrite a program that relies on garbage collection to use manual memory allocation, but it would require changes in structure. The analogy further holds in that you may simply make a mistake. You may not have all the dynamic checks necessary to check the types you identify, which is akin to dangling pointers and such. Soft typing doesn't seem to help. If you want to have it help you you must use it's types and work within it's restrictions... exactly like a statically typed language.
I'll be more out with my views on soft typing. First, I do not think the research on soft typing is pointless or a waste of time. What I do think is that it's target demographic is small or non-existent. Soft typing weakens or destroys the properties that "static typers" value, while requiring many of the same "costs" of static typing if "dynamic typers" are to get any of these alleged "benefits of static typing" which can't be as much as they are in a statically typed language; the type checker is working in hostile territory! For more realistic, in between, programmers it seems mostly the worst of both worlds, though there are some perks.
In general, and in practice, that's true, but there's nothing stopping someone who wants to program in a type-oriented style from doing so in a dynamically checked language, explicitly identify types, and even check them, whether dynamically with DC contracts or statically with some kind of soft typing system. Unfortunately, this is just not true in any reasonable sense, at the very least if you want to check them.
What do you mean by "this"? :) I'm not claiming that perfect checking of types can be performed in the way I described, only that testing combined with suitable dynamic checks, such as contracts, can provide a sufficiently strong assurance about the validity of a program's type scheme.
As an example of what I'm getting at, I've converted programs from ML to Scheme, and vice versa, and retained essentially the same type scheme. The checking of types in the Scheme version is obviously not as comprehensive, nor is it static, but it's sufficient.
Static typing tends to be global and pervasive. Implementing "static" type checking with "well-placed" dynamic checks would require non-local changes in general.
One simple, effective strategy is to include a contract along with every named function definition, which encodes that function's type. The implementation of a function remains latently typed, but whenever it invokes another function, that function's contract is checked. This, combined with the usual dynamic checks performed by the language, provides a pretty good assurance that most type errors will be caught during testing.
The analogy further holds in that you may simply make a mistake. You may not have all the dynamic checks necessary to check the types you identify, which is akin to dangling pointers and such.
Yes, that's true. So ideally, you get errors during testing. Even if one of your contracts doesn't generate an error, the language's dynamic checks do. In the worst case, you get the kind of runtime error in a deployed system that static typecheckers are designed to eliminate. That does happen. However, the risk and the consequences of this happening are minimized by testing and contracts.
None of this prevents one from programming in a type-oriented style in a dynamically checked language, and including checks for those types.
What do you mean by "this"? :)
I did see this as a bit ambiguous.
I'm not claiming that perfect checking of types can be performed in the way I described, only that testing combined with suitable dynamic checks, such as contracts, can provide a sufficiently strong assurance about the validity of a program's type scheme.
Admittedly, the inability to "perfectly" check everything (or rather certain things) is part of what I was stating. Static typers do care about this, but not (just) in some misguided attempt to "prove" the "correctness" of the program. There are cases where correct use relies on the static types comprehensiveness and would not (immediately) lead to anything for dynamic type checking to complain about. For example, Haskell's runST function (especially for the Lazy ST monad); if we just strip the types and leave it to dynamic typing, no type errors will occur but bad things will happen. Further, there is just not one place you can put a "contract" to enforce correct behavior.
We do seem to be "getting at" slightly different things. My example above, though, suggests that this example does not generalize (or alternatively that the "not as comprehensive" is not generally acceptable). Incidentally, the "nor is it static" is unnecessary; I clearly have no issue with it not being static, that's a given in this context.
Yes, that's true. So ideally, you get errors during testing. Even if one of your contracts doesn't generate an error, the language's dynamic checks do. In the worst case, you get the kind of runtime error in a deployed system that static typecheckers are designed to eliminate.
Specifically regarding the last two sentence: no they don't! no you don't! And that's the issue! In fact, using the above example, depending on the implementation of runST you might actually get a dangling pointer. Of course, such an implementation wouldn't be used if this were possible, but nevertheless there would be no dynamic type error, the program would just behave incorrectly.
That does happen. However, the risk and the consequences of this happening are minimized by testing and contracts.
I will qualify my earlier post with the fact that a good chunk of static type checks can be handled with these approaches, but just where static typing is proving most valuable, the above approaches will fail.
I guess what should be said is that your "type oriented" style should not be equated to a (sophisticated) "static type oriented style". Presumably you are referring to significant use of abstract data types and "checking" them is verifying their contracts. This is not contentious, but it doesn't correspond to static type checking.
I have no argument in general with points like these:
There are cases where correct use relies on the static types comprehensiveness [...] [...] this example does not generalize (or alternatively that the "not as comprehensive" is not generally acceptable)
There are cases where correct use relies on the static types comprehensiveness [...]
In my first post in this thread, I wrote "perfect knowledge of runtime type safety isn't essential, in many practical situations, and is balanced against many other considerations." (Note that "isn't essential" is qualified by the subsequent phrase. :) That really does summarize my overall perspective on this.
Presumably you are referring to significant use of abstract data types and "checking" them is verifying their contracts. This is not contentious, but it doesn't correspond to static type checking.
That is what I'm referring to, but I don't claim that it corresponds to static type checking. But among other things, I'm pointing out that such abstract data types do constitute static types, and that it's possible (in fact, necessary) to program with an awareness of those types, as well as being possible to check that those types are being used correctly to a useful degree.
For example, Haskell's runST function (especially for the Lazy ST monad); if we just strip the types and leave it to dynamic typing, no type errors will occur but bad things will happen. Further, there is just not one place you can put a "contract" to enforce correct behavior.
Which bad things are you thinking of specifically? Violating the guarantee that ST monad instances can only manipulate their own local state? That's a nice guarantee to have, but it only takes a little discipline to ensure that e.g. lexical scope provides the same sort of guarantee — with such techniques, while "proof" of validity is left to the programmer, enforcement is still performed by the language. So I see that as another example of balancing of considerations.
Also, with that example, the pure, non-strict nature of Haskell is a major factor. In an impure, strict language, conversion of a Haskell program which uses runST would not require monads, and would have much less need for sophisticated types. While that wouldn't help with the safety guarantee, it does tend to simplify the problem from a type perspective. As another example of this sort of thing, to achieve the guarantee I mentioned, runST has a rank-2 polymorphic type, which you get "for free" with dynamic checking. Again, as I see it, we're seeing a tradeoff here between strong static guarantees and a certain kind of simplicity (if you're willing to grant that partially impure systems can be described as simple). The type requirements of the simpler system are not as great.
I will qualify my earlier post with the fact that a good chunk of static type checks can be handled with these approaches, but just where static typing is proving most valuable, the above approaches will fail. I guess what should be said is that your "type oriented" style should not be equated to a (sophisticated) "static type oriented style".
To the extent that there are things that static checking can do that can't reasonably be emulated any other way, I have to agree. But unless those things are an absolute requirement of the system in question, the use of "sophisticated static types" is one of the things I'd want to balance against other factors.
Just some remarks:
The runST example does enforce less than usual properties in less than usual circumstances, but a) that does remove it's validity and b) far from all such examples rely on those "properties and circumstances". The reason I chose that particular example is that it is rather dramatic, undeniably real practical Haskell, and in the Lazy ST case surprisingly difficult to enforce with dynamic checks or at the very least I haven't thought of any reasonable, natural way of doing it without significantly changing the interface. (I haven't thought of any way allowing dramatically different interfaces, but mostly because I haven't thought much about it in that line.) I did consider the use of a rank-2 type unfortunate; however, I include "normal" rank-1 parametric polymorphism in "sophisticated". I will make the following claim, possibly slightly overstated: If the "types" you are "orienting" yourself toward include parametrically polymorphic types, then you cannot check them with local run-time checks in general (and in most practical cases). Where "check" means perfect enforcement. I want to emphasize (again) that in statically typed languages the type system is used among other things to maintain the invariants of a library by requiring certain properties of user code; the point being that relying on unenforced "discipline" is (deemed) unacceptable because it is the library's invariants you are (potentially) violating.
But if you program a softly typed system in the same way you would program a dynamicly typed system, it wouldn't cost you a thing. And you could still benefit from compiler warnings. How is this in any way be worse than dynamic typing?
But is it significantly better?
For example, show me the schema of a database and I can tell within reason what the database is capable of doing - without having to look at a bunch of muddled triggers, stored procedures or client code in a bunch of different languages.
I would mildly disagree, as I view some kind of triggers as constraints, and constraints as first-class part of type information. Funny, I discussed this point just today in a UML and OCL-like setting.
Show me the types and I can tell you a lot about the program without having to look at the code.
I suspect both Curry and Howard would agree with you :)
I would mildly disagree, as I view some kind of triggers as constraints, and constraints as first-class part of type information.
Taking Anton's example of passwords, would we consider the restriction on the length of a string to be part of the type information. And if we go there, wouldn't it be just as possible to consider the restrictions on the patterns of characters within password strings to be part of the type as well (for example, not allowing passwords such as "password", "abc", "123" - or any password from a dictionary of forbidden words, or better yet requiring that the password contain at least one alphabetic and one numeric character - the possibilities are endless depending on how paranoid your system administrator is). From my perspective, first and foremost is simply the proof that the kind of value is a string (or whatever the common type happens to be).
From a Trigger standpoint, I don't usually consider them to be part of the meta-information about types - they are an implementation of restrictions and/or events that occur when an insert/update/delete occurs. They can be useful in terms of figuring out how the common types are used, but they run into problems when the are considered as part of the meta-information about the database. That is, I am not sure that you can use the information about Triggers in an analysis of the database, without either (a) looking at the source code of the Trigger; or (b) performing run time tests to evaluate the characteristics.
On a more general level, some might consider contracts and unit tests to be useful in the analysis of types. Or the types can be implemented within the program (as suggested by Anton in Scheme). But once you do this, you basically make analysis of types become a function of the program - having problems of seperating the concerns between type and program (halting problems, etc...). I know Schemers are usually type minded in the construction of their code - but I can't help but think that types and code run together making the segregation of analysis betwixt the two difficult.
More immediately, I'm left wondering in what circumstances you consider Triggers to be part of the Type schema of a database (and when they're not)? And why couldn't Stored Procedures also be considered part of the schema, as they generally manipulate the range of values that can be seen from the outside world.
In the end, triggers, stored procedures, contracts, and the programs themselves do effect the type of information being manipulated. And there's not a static type system strong enough for inference of these things. But when I talk about types, I'm usually interested in the meta-information - information that can be extracted and analyzed without recourse to the program source. In the database world, I'll usually look at the table definitions, the foreign key, and the constraints placed on column(s) with the Constraint specifications. Only after I look at that, will I dive into the various triggers and stored procedures.
I guess the thing I'm trying to say is that I like my meta-information to be in a different language from that in which I am writing my algorithms (and my workarounds and hacks).
Not sure where you'd draw the line between Types (kind of values) and Constraints (restrictions placed on the range of values - subsets).
My previous post was caused exactly by the fact I was unable to identify such a line in my UML use case, and by you separating DB schema from triggers and ilk. I thought you suggested that schema is in some sense more "type" than triggers, my apologies if I misunderstood.
That depends on the type system being used. In the traditional type theory definition, static checkability is part of the definition: it's all about what you can prove about the terms of a program without actually running it. But if you decide that static checkability isn't your primary concern, then it can be quite natural to draw the line differently.
Or the types can be implemented within the program (as suggested by Anton in Scheme). But once you do this, you basically make analysis of types become a function of the program - having problems of seperating the concerns between type and program (halting problems, etc...). I know Schemers are usually type minded in the construction of their code - but I can't help but think that types and code run together making the segregation of analysis betwixt the two difficult.
The PLT contract language is a domain-specific language that if anything, is more separate from the primary code than type annotations are in most statically-checked languages. It's somewhat similar to the case in Haskell, where the type signature for a function can be provided as a separate statement. It's trivial to read in some Scheme source and extract only the contracts.
Of course, since contracts can include constraints that can't be checked statically, what you can do with extracted contracts, without actually running the program, varies. In principle, though, it would be possible to extract & check certain static aspects of contracts, which ought to end up corresponding fairly closely to the (static) types in a program.
I agree, that's important — or at least, you want to easily be able to extract the meta-information, without having to run the program.
Of course, statically-checked languages often offer escape hatches to allow you to avoid the type system's rigor if you need to. A "Dynamic" type is one such hatch (in Java, it's spelled "Object").
Close, but no cigar. If you had read my original Dynamic post, you would have seen an instance where Object != Dynamic:
Object o = new FileReader("toot.txt"); o.read(); // read() is not a method of Object!
That's where static typing bites you in the rear in Java. You really do need a new Dynamic type to get dynamic checking in Java.
This is also wrong. String literals in Java already have type String, and with JDK 1.5, autoboxing obviates the need to create a Boolean object. However, you have missed the actual problem with Java. The static checking in Java occurs during method call resolution, where a dynamically-checked language would just throw MessageNotUnderstood or whatever. The other problem occurs during binding. When a Dynamic object can only have one possible type in a given context, the compiler can infer that type and perform an automatic cast. When the possible types make the expression ambiguous, you would need to cast to get back into the type system. But that seems like a fairly obvious requirement to maintain type integrity, no?
Also, note that what you are doing in Scheme is defining an explicit anonymous type in the function signature. Using a more verbose syntax, one might write it like so:
type X = ( x <- String | x.length < 4 } + { false } X foo(a, b) ...
I agree with most of your other sentiments, except for the notion that Dynamic cannot be implemented well in a static language. I think the semantics I spelled out for Java would work just fine, and are surprisingly simple.
No. You can invoke the read method on o through the java.lang.reflection API. (Edit: And if you know ahead of time that o can have only a bounded number of different known types, then you don't even need full reflection, only dynamic casts. But doing it that way is not much more flexible than using type unions a la algebraic data types.) This is awfully tedious to do by hand but that seems orthogonal to the point Anton was trying to make.
Yes, I recognize that his point was the power of contracts. And my claim is that the power of contracts is exactly the same as the power of explicit types. And that's why static typing is a Good Thing. It puts a name to those types. In fact, I will make an even stronger claim than before. As long as a contract does not perform I/O, I claim that *all* contracts can, in principle, be statically checked! Ha! That challenges the claim that contracts are more general than static types. I'm just waiting for someone to call me on it with a counter-example... ;>
You say "in principle", so I suppose your proposed type-checking algorithm amounts to "execute the program and declare the program well-typed iff the run-time system does not raise a type error"? :)
Anyway, I think it's clear that contracts are just specifications of types if we take the stance that types are sets of values. However, sticking to practice rather than merely principle, the main issue with static checking of programs with types that have complex specifications is that the checker has to err on the side of caution, i.e. it has to be conservative.
Here's a specific, somewhat dramatic example to illustrate the importance of this point. Let f :: T -> T'' and g :: T' -> T'' be functions and consider the ML-style program "let f x = g x". To statically check this program we need to verify that T is a subtype of T'. This can be very complicated. For instance, suppose the specification for T is {(x,y,z,n) in N^4 | n > 2, x^n + y^n = z^n} and the specification for T' is {(0,0,0,n), (1,0,1,n), (0,1,1,n) for all n in N | n > 2}. Then statically checking this program amounts to proving Fermat's last theorem. On the other hand, it is easy enough to handle this via dynamic/run-time checking since all you have to do is check whether _specific_ quadruples are in T and T'.
(define/contract read-password (-> (string/len 9)) read)
This defines a read-password procedure which will raise a contract violation if the password read is longer than 8 characters (or, as it happens in this simple example, if the password isn't contained in double quotes). It's arguably poor coding style, but it proves the point. You can't statically check all contracts, in principle.
read-password
It's arguably poor coding style, but it proves the point. You can't statically check all contracts, in principle.
You cannot statically check that the function fulfills its contract but the checker can declare that "this function does not seem to fulfill its contract". And in this case that is actually useful information since it indicates that you left something out (perhaps intentionally, but the type checker is forcing you to be explicit).
For instance, in a dependently typed system suppose truncate[n] is a function which takes a value of type str (arbitrary-length string) and returns a value of type str[n] (string containing at most n characters) by retaining only the first n characters. (You could do without dependent types in this specific case but a dependent type framework handles this more elegantly and in greater generality.) Then you could write
(define/contract read-password (-> (string/len 9)) (truncate[8] (read)))
which could be statically proved correct by the type checker. Or, assuming the system supports structural subtyping with str[n]
(define/contract read-password (-> (string/len 9)) (let ((password (read))) (typematch password ((str[8] s) s) (else (read-password)))))
Just for the heck of it, here's the example in Haskell using Oleg Kiselyov's library for Number-parameterized Types.
import FixedDecT main = read_passwd >>= print read_passwd :: IO (Vec D8 Char) -- Here's our contract: fail with a type error -- if read_passwd tries to return anything -- other than a vector of 8 characters read_passwd = do x <- getLine return $ listVec D8 $ take 8 $ x ++ cycle " "
...if you tried to define "read_passwd" with something simple minded like...
read_passwd = getLine
...you'd get the following type error...
passwd.hs:12:14: Couldn't match `Vec D8 Char' against `String' Expected type: IO (Vec D8 Char) Inferred type: IO String In the definition of `read_passwd': read_passwd = getLine
This is an easy one. I simply define a type String8, and make sure that no operations allow a String8 to contain more than 8 characters. In fact, that's exactly what you're doing, only you define the type as a contract. Making such a type in C++ is as simple as using Boost.Array like so:
typedef boost::array String8;
C'mon, make it harder than that, like write a contract that depends on multiple input variables, or global state. ;>
the point isn't that it can't be formulated as a type but that it can't be checked by compile time as you couldn't posibly know by then what string the user would enter.
Suppose I create this Java enumeration:
enum Color { red, green, blue }
And suppose that I want to read in these values from a file, mapping their ordinal values to the enumerated values:
Color c = file.readInt();
In C/C++, this is a perfectly legitimate thing to do. However, it is type unsafe, because there is no static checking that file.readInt() will return a value in [0..2]. On the other hand, the expression is not even legal in Java. The typechecker will *statically check* that codomain(file.readInt()) != Color. To convert ordinal values to enumeration types in Java, you have to do something like this:
Color readColor(FileThingy file) { switch (file.readInt()) { case 0: return red; case 1: return green; case 2: return blue; } }
Not only that, but the compiler will reject this method because there are cases which are not covered. So the Color type in this function is a translation of the contract (to use a liberal syntax):
typeof(x) == int && 0 <= x <= 2
Of course I can't tell you statically what values x will have at runtime. But Java will guarantee that this function never returns something that isn't a Color. Just like C++ will guarantee that the passwd function will never return a password that is longer than 8 characters. That's what static typing means, right?
The reason for my claim is that I have a rough constructive proof for Contract-Type equivalence:
For any function f, let us call the domain of the function D. As long as the contract only defines relations on the arguments of D, I can define a new set set D' which is C(D) (where C is the contract relation). I then simply construct a static type using D'. The only way this construction can fail is if C depends on a runtime parameter. So my original restriction of 'no I/O' was too weak. If you were to use, say, a global parameter in the contract, then C(D) might be different on each invokation. But then, C would not be a function strictly of D. It would also be a function of G, which is the state outside D. So you would really have C(D, G). If G is only fixed at runtime, then obviously I cannot statically construct D'.
So the correct statement is that if the contract relation is strictly a function of the domain of f, then it is statically checkable. On the other hand, is there anyone perverse enough to define a contract that depends on external mutable variables? And if there is, shouldn't we be dragging them out into the street to be shot?
The check that my contract actually performs, in the code as I wrote it, can't be checked statically such that no input length enforcement needs to exist at runtime. You can hide that check in the implementation of a type, but that's not static checking, it still happens at runtime.
Greg gave a fully worked out example, but in that case, he reads an arbitrary length line, and then truncates it, and statically ensures that the function returns the truncated result. However, that's a different program from my example, which will error if the input is longer than 8 characters.
As I said, my example is a questionable program design, but it's simple and proves the point.
It just needs a suitable transformation. Obviously, you can't read an entire line into a String8. That's the point of the type. So instead you have to do something like this:
get_password() { String8 passwd = String8.read() if read_to_eol() != "" throw ContractViolation return passwd }
So when you want runtime behavior for type violations, you obviously need to transform the code to get equivalent behavior. The question is, are there times when you *want* runtime behavior for contracts? I mean, an example like this one that isn't contrived?
And in what way is
if read_to_eol() != "" throw ContractViolation
not a sort of dynamic typechecking?
But only because Anton wanted a runtime effect. So by definition, you can't make a runtime effect static. But if he didn't care about the runtime effect, and only cared about enforcement of the contract, then there wouldn't need to be such a line. So in the sense that "Contracts may produce runtime effects", you cannot make those effects static. But to the extent that "Contracts protect your code", the protection part most certainly can be made static. If contracts are just used to raise exceptions, that's fine. Let's say that contracts are good for combining typechecks with exceptions. But to say that contracts are more powerful than static typechecks because they have runtime effects is a bit disingenuous, if you ask me.
"... you can't make a runtime effect static"
And thats why you want dynamic typing. Becouse you want to be able to deal with things that aren't known att compile time. If you don't do any IO you could just run your whole program att compile time and it wouldn't matter what kind of typing you do.
Yes, sometimes you really do want dynamic typing. But not *all* the time. But in this case, the runtime effect is to raise an error that *could* have and *should have* been dealt with another way. People regularly observe that dynamic typing gives you a lot of power by not restricting your function domain/codomains. This is true. C++ gives you a lot of power by doing arbitrary casts with reinterpret_cast<>. However, the C++ community learned its lesson from C that with power comes responsibility, and you don't want to simply let people perform C-style casts at whim, because they will inevitably destroy a lot of good data that way. So the C++ solution is to be very intentional about doing something powerful. Hence, the very verbose C++-style cast names. It's very easy to search through code and find all the places that reinterpret_cast<> is called.
In the same way, it makes sense to me to make a language safe by default, and powerful by intent. That is why I feel that a Dynamic type is a good compromise. If you are going to do something powerful and dangerous like unrestricting your function interfaces, you should have to do so explicitly. And whenever you come back into the type safety fold, you should have to do so intentionally. That's why I don't mind a little verbosity when it comes to static/dynamic interfaces.
As I have previously stated, obviously you should do everything thats posible at compile time. The question is what we should be allowed to do at runtime.
My opinion is that you shouldn't impose your own views on thats good programing style more than necissary. I think everything should be as easy to do as posible, and that you shouldn't make things more difficult on purpose.
I'm afraid my example was misleading, sorry. Contracts (at least the kind I'm talking about) let you check various properties of program that, in general, can only be known at runtime — things such as that the value of a variable falls in a certain range of values, for example (see e.g. Contracts for Higher-Order Functions). You can't, in general, statically prove that a given function returns a value in a particular range.
As an admittedly circuitous demonstration of this, here's a naive implementation (non-memoized) of the Collatz problem:
(module collatz mzscheme (require (lib "contract.ss")) (provide/contract (collatz ((and/c positive? integer?) . -> . (=/c 1)))) (define collatz (lambda (n) (if (= n 1) 1 (collatz (if (odd? n) (+ (* 3 n) 1) (/ n 2))))))) (require collatz) (collatz 5) ; => 1
Although as far as we know, we can't prove that this function always returns 1, we think it does, and the contract ensures that if it doesn't, that'll be caught at runtime.
There's no particular reason to constrain contracts to only checking statically-verifiable properties. The point is to enforce relevant properties of a program at appropriate boundaries. If it happens to be possible to statically check a contract, that's a bonus, but it has little to do with the purpose of contracts.
If it happens to be possible to statically check a contract, that's a bonus, but it has little to do with the purpose of contracts.
How can you say that with a straight face when the whole point of this subthread was about type checking? You went from type checking to assertions without skipping a beat. Talk about conflation! You have not demonstrated a true "dynamic type" (a function domain/codomain that changes at runtime). All you've demonstrated is a typecheck that gets executed at runtime. But it's not even really a typecheck so much as an assertion.
If it happens to be possible to statically check a contract, that's a bonus, but it has little to do with the purpose of contracts. How can you say that with a straight face when the whole point of this subthread was about type checking? You went from type checking to assertions without skipping a beat. Talk about conflation!
What I said, starting in my "infinity" post, was that contracts could be used as a way to check the same properties that static typecheckers check. I also said that contracts could "express all sorts of things that can't be statically checked".
You later disputed the latter point by claiming "that *all* contracts can, in principle, be statically checked! Ha! That challenges the claim that contracts are more general than static types. I'm just waiting for someone to call me on it with a counter-example... ;>" That seems to be where we got off-track. My counterexample used I/O, but it didn't need to to illustrate the real point.
My statement "if it happens to be possible to statically check a contract..." once again relates to the point that static types can exist in a program, without static checking occurring in the language. Static checking is nice to have, but it's not a requirement for dealing with static types. It's perfectly reasonable for contracts to check static types at runtime. That's not all they can do, but it's one of the things they can do. It nevertheless remains true that static checking has little to do with the purpose of contracts.
You have not demonstrated a true "dynamic type" (a function domain/codomain that changes at runtime). All you've demonstrated is a typecheck that gets executed at runtime. But it's not even really a typecheck so much as an assertion.
I wasn't trying to demonstrate what you call a true dynamic type, with respect to contracts. I was merely responding to the claim that all contracts can be statically checked.
...that the password must be greater than or equal to 8 characters. :-)
I was going for a real-world example... ;oP
If you've got dependent types and polymorphism, then you can typecheck roughly the set of properties that are expressible in constructive higher-order logic. So, for example, if we use a vaguely Haskell-ish notation:
So, we start with the non-dependent type IO (String + Exn) This says that we want an IO action that will either give us a string or a value indicating an exceptional failure of some kind. To refine this into a dependent type, let's think about how to state your contract as a logical proposition. Basically, we want to say that there exists a natural number n, such that if n < 9 then there exists a string s of that length, and otherwise we want to return an exception value. So we've got an existential quantifier, and in dependent type theory, an existential is (by Curry Howard) a sigma-type (a dependent record). In what follows, read "\Sigma" as "there exists". So we can write a type like: IO (\Sigma n: Nat. if n < 9 then (\Sigma s: String. length(s) = n) else Exn) We can then imagine writing some code like this: do s <- read; let n = length(s) in if n < 9 then (n, (s, refl)) else (n, fail "blah") This type would statically verify that the program has the desired property.
Note that in this example, we have to actually pass n along as part of the result, and this is redundant, since the information is implicit in having s. This happens because a term of a dependent type is a proof of that type, and we have to produce the information to reconstruct the reasoning. This is a general feature of dependent types, and one of the reasons that they are relatively rarely used -- passing around proof witnesses can get expensive, particularly if the witnesses are much more complex than the actual computational algorithm.
There's been a fair amount of research on compiling away the redundant proof terms (Google: "program extraction", "reflected proof", and "squash types"), because the theorem proving community wants to be able to verify a fast decision procedure within the theorem prover, and then use the verified version in the underlying prover implementation.
This isn't quite ready for prime time in a FPL yet, but we're very close -- the next few years are going to be REALLY interesting.
"...spelled Object" was an offhand remark. I didn't mean to say that a real Dynamic type wouldn't be useful. I did specifically mention "the need to cast Object values to more specific types when they're actually used". The point I was making about Object in Java is that it does provide an escape hatch, which has similar properties to a Dynamic type.
I agree with most of your other sentiments, except for the notion that Dynamic cannot be implemented well in a static language.
That remains to be demonstrated, when it comes to the convenience and flexibility of dynamically-checked programming. You can certainly use a Dynamic type to encode latently typed code in a statically-checked language, it's just not as convenient.
Also, note that what you are doing in Scheme is defining an explicit anonymous type in the function signature.
Sure, that's part of the point. It makes the latent types a little less latent. However, as I've mentioned, it doesn't constrain your code any more than you want it to.
However, as I've mentioned, it doesn't constrain your code any more than you want it to.
Where, "constrain your code" means: "force you to design the type that your function implicitly defines". So in all important respects, DC and SC languages are the same. The only difference is how easy it is to create new types. DCs make it as easy as theoretically possible (but perhaps that's too much sometimes?), and SCs make it intentionally more difficult so that you spend more time designing them (but perhaps too difficult sometimes?). What I'm looking for is the killer example. The instance (a practical and realistic instance) where dynamic checking is an obvious win, it's safe, and none of the static languages handles it well. Any takers?
Well static typing will allways be safer than dynamic typing. And you always want safety, the question is how much your willing to pay for it. And I think there is at least one simple function there it's generally considerd not worth it: Division. The domain of division doesn't realy include zero for the denominator, but very few language makes static checks for this.
I just want to see an example where a DC language does something nicely that is hard to do in an SC language, where the example is practical (meaning realistic and/or common). Something that makes a DC programmer go "Ha!"
A realistic, practical, common example is dealing with databases, or XML - any kind of externally defined schema. I've mentioned the effort that Hibernate in Java goes to to deal with some of those issues in the presence of static typing: source code generation, bytecode generation, reflection (which is relatively cumbersome in Java), DSLs embedded in comments and XML. An impressive arsenal of techniques, most of which aren't necessary in a DC language.
Seems to me that these all fall under the category of proxies - but I could be seriously be misusing the term.
I'm due for a RAM upgrade soon. Or do I need an EEPROM instead? Anyway, you don't think these applications would benefit from a Dynamic type to the extent that such a language would be comparably nice to a full-blown DC? XML is a great example. If you use the org.w3c.dom package to deal with DOM trees, you end up with DOM Nodes that have any number of types, including Element and Text. It's very annoying to have to check the type and then cast the result to do what you want. Here's a quick-n-dirty function I wrote to emit a DOM tree as an XML file:
private static void write(BufferedWriter out, Element node, int depth) { try { out.write('<' + node.getNodeName()); write(out, node.getAttributes(), depth); Node child = node.getFirstChild(); if (child == null) { out.write("/>"); } else { out.write(">"); while (child != null) { switch (child.getNodeType()) { case Node.ELEMENT_NODE: write(out, (Element) child, depth + 2); break; case Node.TEXT_NODE: out.write(((Text) child).getData()); break; default: throw new RuntimeException( "Don't know how to emit node type " + child.getNodeType() + "\n" + "Node: " + child.getNodeName() + " Value: " + child.getNodeValue() + " Text: " + child.getTextContent() ); } child = child.getNextSibling(); } out.write("</" + node.getNodeName() + ">"); } } catch (IOException e) { throw new RuntimeException(e); } }
What would be nice is to define a function write(BufferedWriter out, Text node, int depth); that just writes out a Text node and returns. Then, it would be nice to make the tree Dynamic instead of Node, and have it dynamically dispatch on the actual type of node, so that I could transform the code to this:
private static void write(BufferedWriter out, Element node, int depth) { try { out.write('<' + node.getNodeName()); write(out, node.getAttributes(), depth); Dynamic child = node.getFirstChild(); if (child == null) { out.write("/>"); } else { out.write(">"); while (child != null) { write(out, child, depth + 2); child = child.getNextSibling(); } out.write("</" + node.getNodeName() + ">"); } } catch (IOException e) { throw new RuntimeException(e); } }
All the casts are gone, and yet the code is still typesafe. If the tree contains a node it doesn't recognize, I expect it to throw an exception at the write() call. But I don't need or want to throw away the static checking baby with this bathwater. Can you convince me that I do?
The approach there is to say that shipping packages around requires runtime type checks in order to get at what they call "Open Programming". The tradeoff made, though is that the level of granularity of packages are Modules, and the type checking is done against Signatures.
Which begs the question of what level of modularization is dynamic run time type information really necessary. Is it at the the individual variable level, the function/procedure level, the class level, or at the module level.
So instead of asking whether RTTI is valuable, perhaps the question is whether it really needs to be carried around for each and every symbol, or whether type erasure is a "Good Thing" in the small, but presents problems with programming in the large.
Previously on LtU Clean Dynamics
After tickling my fancy with that quote about "the greatest code ever written", I can nowhere find the actual implementation of (eval ...) in Lisp. I don't know much about Lisp at all, so I'm sorry if its an FAQ but I'd be really happy if somebody could post it.
Thanks!
Paul Graham's Roots of Lisp might be what you are looking for...
That was exactly it.
Recent comments
31 weeks 4 days ago
31 weeks 5 days ago
31 weeks 5 days ago
1 year 1 week ago
1 year 6 weeks ago
1 year 7 weeks ago
1 year 7 weeks ago
1 year 10 weeks ago
1 year 14 weeks ago
1 year 14 weeks ago