Site operation discussions
101 Ways To Know Your Software Project Is Doomed may be amusing, but item 28 is clearly meant as LtU flamebait...
If they think Subversion's merge algorithm is voodoo black magic, better keep them away from darcs' theory of patches...
Going off topic, that brings up a good point. Does anyone else have problems with the way SVN merges? CVS seemed a bit better. Is merging part of SVN itself, or part of the tools that interface with SVN? I'm annoyed by the way SVN doesn't deal with simple changes to Scala code.
I've also noticed some cases where SVN has apparently silently merged together some code in a weird way. Until now I had assumed that my co-worker had caused the screw-up, but now I'm not so sure.
I read a piece recently about Linus Torvalds and Bram Cohen arguing about version control. Linus argued that merging should just be simple and mark anything remotely complex as a conflict to be resolved by a human. I'm starting to think that this may be the right way to go.
Here's the link to the blog post where I read the discussion.
Actually, merge for me under subclipse does exactly that: anything remotely complicated is not merged but marked as a conflict, even when the code hasn't changed since the last updated revision! Granted, subclipse is a pretty horrid tool, so maybe the problem is there.
Any interest in language specific merge algorithms? E.g., if you are merging Java code and have access to the AST and maybe type information, then you could probably safely merge more often. Strange that I haven't seen much on this in tools literature.
Sure, it would be very neat, but it's probably not worth the effort relative to the work it would save. Plus, you'd still have the possibility of incorrect merges (which aren't flagged as such), only now in more subtle forms.
I don't have any concrete numbers to back this up, but my stance is similar to Linus': If there's even a remote chance of conflict a human should probably look at it anyway just to make sure. I feel the key is making the "look at it + merge/reject/combine/edit" process as smooth as possible for the human.
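Out of curiosity, here's what that conservative rule might look like if it also worked at the granularity a language-aware tool could use (per top-level definition rather than per line). This is just a toy sketch in Rust with made-up types; a real tool would parse an actual AST and would have to handle additions and deletions too:

```rust
// Toy conservative three-way merge: each top-level definition is keyed
// by name (standing in for a real AST node), and a change merges
// cleanly only when exactly one side touched it. Everything else is
// flagged for a human. Not a real diff3; additions and deletions of
// definitions are out of scope for this sketch.

#[derive(Clone, Debug, PartialEq)]
struct Def {
    name: String,
    body: String,
}

fn find<'a>(name: &str, defs: &'a [Def]) -> Option<&'a Def> {
    defs.iter().find(|d| d.name == name)
}

/// For each definition in the base: Ok(merged definition), or
/// Err(name) when a human needs to look at it.
fn merge_defs(base: &[Def], ours: &[Def], theirs: &[Def]) -> Vec<Result<Def, String>> {
    base.iter()
        .map(|b| match (find(&b.name, ours), find(&b.name, theirs)) {
            (Some(o), Some(t)) => {
                if o.body == t.body {
                    Ok(o.clone()) // both sides agree
                } else if o.body == b.body {
                    Ok(t.clone()) // only they changed it
                } else if t.body == b.body {
                    Ok(o.clone()) // only we changed it
                } else {
                    Err(b.name.clone()) // both changed it: conflict
                }
            }
            _ => Err(b.name.clone()), // definition missing on one side
        })
        .collect()
}
```

Because definitions are matched by name rather than by position, merely reordering them never produces a spurious conflict, which is exactly the kind of improvement an AST-aware merge would buy you over a line-based one.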
This was expected after the success of this conference.
Shouldn't #1 be: The programmers have renamed their waterfall process - Agile.
Most bugs are logic errors, the kind not caught by type systems (I am talking about conventional languages here). But even the most advanced languages leave some room for possible bugs at run-time.
Personally I do not dare release code that is not tested (at least functionally), no matter how good the compiler is.
Only last week I was told - "I said that type checking was a form of testing."
Types are theorems; functions having those types are proofs of the theorems.
The interesting question is how expressive your theorems (= types) are.
Last week I wondered if "a form of testing" meant testing in the most general sense, or some specific software testing technique (unit-testing was being discussed) - never did get an answer to that question.
(I think any activity could be described as "a form of testing" as long as the intent was to test something in the most general sense - standing up, biting an apple, ... I don't think that kind of statement tells us much.)
...when a statement is a theorem, and the stuff supporting the theorem is an existence proof of the theorem, that's useful. As Matthias Felleisen pointed out somewhere (but I can't seem to find the reference right now), what's interesting about a function that's declared to take (int n) is that it really is saying:
∀n: int ...
That is, if the function really is total on the integers, then you don't need to test to see if it works on all the integers. Now, of course, that's not a very strong assertion, but it's the basis of everything else that follows from the Curry-De Bruijn-Howard Isomorphism; the more interesting your types get, up to full dependent types, the less you need to actually test (by running the function) that the function does what's intended. In the limit, you have strongly-specified functions, where the return type of the function is the specification for the function. At that point, there's nothing left to test in that function—your failure modes lie elsewhere (the parser that's converting the user input to the type that the function expects, your garbage collector, etc.)
Saying that "compiling is a form of testing" isn't fair, you're right. Compiling is stronger—it's a form of proof. It's just that, in most popular languages, you can't prove anything very interesting. But in the languages we tend to talk about here, you can.
I'll risk being simplistic - is it enough to say that the proof-versus-test distinction comes down to universal versus existential quantification?
...I'm inclined to think so, that testing demonstrates the presence of (presumably desirable) properties, while static typing can demonstrate the absence of undesirable ones. That sounds reasonable in the broad strokes, although a monkey wrench is thrown in by the observation that abstract types have existential quantification, too—so they say "there exists some type, T, with these desirable properties..."
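For what it's worth, here's that existential reading of abstract types in miniature (a Rust sketch with an invented Counter module): the public interface says "there exists some type with these operations", and since the representation stays hidden, clients can rely only on the properties the interface promises:

```rust
// An abstract type as an existential: outside this module, Counter is
// "some type" with zero, incr, and value. The u32 representation is a
// private detail and could change without breaking clients.
mod counter {
    pub struct Counter(u32); // field is private to this module

    impl Counter {
        pub fn zero() -> Counter {
            Counter(0)
        }
        pub fn incr(self) -> Counter {
            Counter(self.0 + 1)
        }
        pub fn value(&self) -> u32 {
            self.0
        }
    }
}
```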
The distinction between desirable and undesirable properties probably hides a lot of important details as well. Formalizing the "un" in "undesirable" might reveal a rat's nest of issues.
In any case, the universal vs. existential quantification distinction is sufficient justification, for me, to support my belief that static typing and testing are dual to each other: software will always need both, because even software developed by extraction from, e.g. Coq, will still have dynamic semantics that you want to be able to make strong statements about.
Design-by-contract assertions lie somewhere on a continuum of software verification techniques, between proof-by-type and unit testing. DBC gives you axioms about a function's behavior, but--as noted above--there is often no way to formally connect the preconditions to the postconditions. "Proof", via successfully running the function on various inputs without tripping an assertion, is left as an exercise for the user. :) (Or for the tester or developer in a more mature environment.) Model checking takes it a step further.
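To make the DBC point concrete, a minimal sketch in Rust (the function and its error bound are made up for illustration): the contract is only "proved" for whichever inputs actually get run, which is exactly the gap between assertions and a formal connection from precondition to postcondition:

```rust
// Design by contract via runtime assertions: each passing call checks
// the contract for that one input only; nothing connects the
// precondition to the postcondition for inputs we never try.
fn checked_sqrt(x: f64) -> f64 {
    assert!(x >= 0.0, "precondition: x must be non-negative");
    let r = x.sqrt();
    // postcondition: r * r equals x up to rounding error
    assert!((r * r - x).abs() <= 1e-9 * (1.0 + x));
    r
}
```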
All of these techniques are valuable. Formal proof that a function satisfies some useful property is, assuming the property is in fact what the end-user needs (a can of worms in itself), the ultimate; but it is still intractable in many cases.
Of course, I suspect that item #28 isn't intended as a dynamic-language-fan slam against static typing or other ways that tools might verify code correctness without executing it on sample datasets--instead, it's a slam on piss-poor developers who think "it compiles" is sufficient grounds to check their code in. If the tools ever get good enough that compilation without error implies correctness (with no semantic concerns to worry about), then we probably won't need the lazy programmer in the first place. :)
...I'm inclined to think so, that testing demonstrates the presence of (presumably desirable) properties, while static typing can demonstrate the absence of undesirable ones.
The question remains: what determines the boundary between the two? Let's assume we deactivate the type checker, specify a set of tests, and they succeed. Let's further assume that we had complete coverage of code paths. Now we activate the type checker and it rejects the program. What shall we do? Change the program so that it passes the type checker, or adapt the type checker / type system so that it won't reject at least the specialized program we ran successfully before (specialized with the concrete input data stemming from our tests)?
If you have 100% code coverage and you ensure that your tests always pass but (some) type system's checker rejects your program, you're probably justified in shipping anyway, depending upon what the type system can guarantee against that tests can't—so once again I'm thinking of something like TyPiCal, where all of your tests can pass but TyPiCal might tell you that your code is subject to deadlock and/or race conditions.
Of course, the problem here remains that you have to continually ensure 100% code coverage and test-passing every time you change the code. 100% code coverage for most software isn't necessary in practice (Anton's On the Unreasonable Effectiveness of Testing is a great exposition of this). A related issue is that you have to write tests that you wouldn't have to write given a relatively expressive static type system. So there's always an interesting set of trade-offs to be made, and most developers do, in fact, make them—that is, the programmer who just ships the product once it compiles without doing any testing doesn't exist, and the programmer who knows and cares nothing about the supported operations on terms in their language before running their program doesn't exist, at least to a first approximation.
...testing demonstrates the presence of (presumably desirable) properties, while static typing can demonstrate the absence of undesirable ones
My pooh-bear brain gets it a little differently - testing can demonstrate the presence of undesirable properties (a successful test shows the existence of an error) and type checking can demonstrate the absence of undesirable properties (a successful type check shows there are no inconsistencies).
Certainly the practice of testing does find undesirable properties. An implementation that passes the tests demonstrates the presence of presumably desirable properties of the implementation.
interesting in terms of language and logic
Interesting and fraught ;-)
... always be systems where the overall cost of implementation ...will be higher than is warranted ...
Wouldn't it be nice just to be able to say what the overall cost of implementation would be in each case, and oh can you imagine if we were able to assess the benefits as well! :-)
Isaac Gouy: Wouldn't it be nice just to be able to say what the overall cost of implementation would be in each case, and oh can you imagine if we were able to assess the benefits as well! :-)
People actually do this, given a range of experience with various technologies. My CTO and I bat the pros and cons of assorted language technologies, static and dynamic typing, etc. back and forth fairly often. We've definitely had occasions that have prompted him, a well-known figure in Lisp circles, to proclaim that we should have used a statically-typed language—and given what we're doing, I've been heard to say that, as much as I prefer static typing, I don't believe that the libraries and web application development infrastructure for any such language other than Java are mature enough for me to recommend them. But assessing what would have to happen on either side to rectify those observations seems quite doable, at least well enough to base business decisions on.
As usual I haven't been as clear as I would like.
Sure, we bat the pros and cons around, as we have for years - we trade opinions - show me the numbers!
... for the obvious reasons: the numbers would just be my and my colleagues' best efforts to quantify our opinions :-) and any such numbers are, of course, proprietary. So let's make up our own numbers! Maybe we could sum up the zlib-compressed size of each of our posts here about static or dynamic typing and compare them. It would mean about as much as any numbers any company came up with to justify their technology choices.
Are you being just a little too cynical? :-)
Cynical? Not at all. I'm just pointing out that, absent the construction of a Bayesian belief net to model the issue, the best we can do is try to come up with some less-understood, satisfactorily-objective measure, of which our cost estimates are a part, but by no means the whole. And that's OK; I actually have no problem with people continuing to use Lisp (or Erlang, or whatever) vs. OCaml (or Scala, or whatever) based on the understanding that they aren't losing enough (by whatever measure) for it to matter. They might even be winning (by whatever measure). My only claim is that, for most applications, static typing leads to losing less/winning more (by whatever measure), because many of the benefits attributed to dynamic typing (interactive, incremental development; the language "not getting in your way") also apply to modern type-inferred statically-typed languages. But we both know that we can both measure things in such a way that the claim can be falsified or verified; all that would happen would be for the argument to shift to whether our choices of what to measure, or our measures thereof, were reasonable or not.
Oftentimes a UI walkthrough with managers turned into a debate on the colour scheme - there's a straightforward explanation - everyone seems to think they know something about colours.
I have the feeling that there's a similarly straightforward explanation for the heat of static checking versus dynamic checking - everyone thinks they know something about a language because they can guess that it uses static checking or dynamic checking - unfortunately that's sometimes all they know. (LtU readers obviously have a far greater depth of knowledge.)
As you say, some of the benefits attributed to dynamic checking also apply to some statically checked languages. And language features which some identify with dynamically checked languages are also available in some statically checked languages (and not available in other dynamically checked languages). Seems to me that the details matter more than dynamic versus static.
My only claim is that, for most applications, static typing leads to ... whether our choices of what to measure, or our measures thereof, were reasonable or not.
Maybe I'm misreading the end of the sentence, but I think it's fine that the argument would shift to whether our choices of what to measure are reasonable for our specific situation.
It would be wonderful to have successfully categorized the measurements that were appropriate for different situations, and then use those to guide our tool choice.
You understood my closing perfectly: I'm bemoaning the fact that there isn't (to the best of my knowledge) a generally agreed-upon framework for categorizing cost/benefits across projects, language choices, framework choices, organizations, team composition... and I was attempting (poorly, I think) to observe that, in the end, it would have to boil down to Bayesian analysis: you start with some (hopefully informed!) prior probabilities of success and update them when you get additional data. Of course, getting actionable intelligence ("business rules") out of such a context presents its own challenges.
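For what it's worth, the "update priors when you get additional data" step is just Bayes' rule; here's a toy numeric version in Rust (the probabilities are entirely made up, and a real belief net would of course track many interdependent variables):

```rust
// One Bayesian update, in odds form:
//   posterior odds = prior odds * likelihood ratio
// `prior` is P(success), `p_data_given_s` is how likely the new
// evidence is if the project succeeds, `p_data_given_f` if it fails.
fn update(prior: f64, p_data_given_s: f64, p_data_given_f: f64) -> f64 {
    let posterior_odds = (prior / (1.0 - prior)) * (p_data_given_s / p_data_given_f);
    posterior_odds / (1.0 + posterior_odds) // back to a probability
}
```

E.g., starting from even odds, evidence that's four times likelier under success than under failure pushes the estimate from 0.5 to 0.8; chain the calls as data arrives.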
One thing that I'm reminded of, thanks to working with some brilliant people: the team composition is probably the most important thing. There are people who absolutely should use Lisp for the job, almost entirely independent of what the job is, and there are people who absolutely should use OCaml for the job, almost entirely independent of what the job is. I think I'm frequently guilty of overlooking this.
This is no good at all - we really need to put more effort into disagreeing!
Let me make another attempt to misunderstand your last paragraph - do you simply mean that those people have used Lisp (or OCaml) for X years and are now enormously skilled in solving problems with those tools?
Yes, that's essentially the crux of it. I used Lisp for decades, but could I have connected to a REPL on an unmanned spacecraft running a large, complex Lisp app and debugged it in situ? I'd like to think so, but honestly, I have my doubts. But my CTO did that. Conversely, I've been using OCaml for almost five years now, but would I ever come up with Oleg Kiselyov's "all the ways to express the polymorphic fixpoint operator in OCaml?" That's even more doubtful than the first example. So if I were to hire my CTO, I'd want him to work in Lisp if he wanted to. If I were to hire Oleg, I'd want to let him work in OCaml if he wanted to. Ob. note: that doesn't mean both gentlemen aren't wizards in other languages as well; they are.
Yes, the strange thing is that I still don't think the differences between programming languages are insignificant, just that skilled people fill in the differences to a significant degree.
And although I've heard that wizards exist, I'm more interested in simpler programming languages for us muggles :-)
Characterizing people as non-linear, first-order components in software development
You've both probably already read it, but I think it fits here for some who might not have.
My function needs primes as parameters. What now?
Clearly, you'd need to be working with a dependent type system that could express that your function takes primes. Thankfully, whether a number is a prime or not is decidable. :-)
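Short of a full dependent type system, you can approximate "takes primes" today with a witness type: precisely because primality is decidable, you can check once at a boundary and let the type system carry the guarantee from there. A Rust sketch with invented names, not a real library:

```rust
// A "poor man's dependent type": the only way to build a Prime is
// through the checked constructor, so any function taking a Prime can
// rely on the invariant without re-checking it.
mod prime {
    #[derive(Clone, Copy, Debug, PartialEq)]
    pub struct Prime(u64); // field private: only `new` can build one

    impl Prime {
        /// The single place where the (decidable) primality check runs.
        pub fn new(n: u64) -> Option<Prime> {
            if n >= 2 && (2u64..).take_while(|d| d * d <= n).all(|d| n % d != 0) {
                Some(Prime(n))
            } else {
                None
            }
        }
        pub fn get(self) -> u64 {
            self.0
        }
    }
}

// "My function needs primes as parameters" now shows up in the type.
fn modulus_of(p: prime::Prime) -> u64 {
    p.get()
}
```

The check happens at runtime rather than at compile time, so it is weaker than a dependent type, but callers of modulus_of can no longer pass a composite by accident.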
Saying that "compiling is a form of testing" isn't fair, you're right. Compiling is stronger—it's a form of proof. It's just that, in most popular languages, you can't prove anything very interesting. But in the languages we tend to talk about here, you can.
Is anything less than a fully-dependent type system really useful in programming? 99% of bugs (number assumed - feel free to point to research papers with different numbers) exist because the type system does not allow exact specifications.
...but I don't. What I do have is decades of Pascal, Object Pascal, C, C++, Java, Common Lisp, and Scheme experience, coupled now with about five years' OCaml experience and about eight months' Python experience. That experience strongly suggests that, yes, there is a great deal of value in, e.g. the OCaml type system vs., e.g. the Java or C++ type systems. There are definitely respects in which either you can't express what you can in a dependent type system, or at least if you can, it's painful (cf. Eliminating Multiple-Array Bound Checking through Non-dependent Types). On the other hand, when you consider the hoops that top C++ programmers are willing to jump through to keep using C++ (see, oh, almost all of Boost), this is actually pretty light stuff—again, in my experience.
I don't believe #28 is flame bait. After all, think of all the people learning to program in C/C++/Java who are just ecstatic to have finally formed a syntactically correct program that compiles, after spending a good while hunting for the little bugs that caused some random compiler error.