Fighting Bit Rot with Types (Scala Collections)

didn't see it mentioned yet. i would excerpt the abstract were it not for the fact that the pdf is horribly broken such that no spaces are copied-and-pasted. yay usability.

the gist: statically typed collections library ends up with duplication, ironically? and new advances let it be excitingly refactored to make everybody happy again. i think that is interesting because i do believe static typing has some gotchyas.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

We report on our experiences

We report on our experiences in redesigning Scala’s collection libraries, focussing on the role that type systems play in keeping software architectures coherent over time. Type systems can make software architecture more explicit but, if they are too weak, can also cause code duplication. We show that code duplication can be avoided using two of Scala’s type constructions: higher-kinded types and implicit parameters and conversions.

PDF works fine for me, using Adobe Acrobat...


didn't work in anything non-AA on this mac, and i tried a few different things. guess adobe is messing with non AA users? ;-)

thanks for posting it.

Works for me

with Preview on the Mac.

works as in

you can see it (i can see it) or you can copy and paste and have whitespace ratherthaneverythingrunningtogether? if you get whitespace then obviously you are /part/ of the conspiracy... :-)


I can copy paste without any problem. Here's the first paragraph:

Bit rot is a persistent problem in most long-running software projects. As software systems evolve, they gain in bulk but lose in coherence and clarity of design. Consequently, main- tenance costs increase and adaptations and fixes become more complicated. At some point, it’s better to redesign the system from scratch (often this is not done and software systems are left to be limping along because the risk of a redesign is deemed to high).

Maybe corrupt font cache problem?


The paper skips over some of the controversy involved in collection library evolution like laziness vs. eagerness. In an OO language like Java, in Josh Bloch's Java Collection library, operations tend to create views/aliases, while in Scala they tend to create copies (at least this was the case before). Either choice has important implications. For mutable collections, copying is hardly ever the right thing to do, while for immutable collections, you would always expect another immutable collection. Java chose lazy, Scala chose eager, while the initial decisions were probably made in some ancestor libraries.

Does it matter? If I have a list of all integers, that better be lazy, then map should also be lazy. I think everyone agrees on that. What about calling "length" on an infinite list? Should length be a member of Seq if Seq's can have infinite instances? Is that sensical? Now, when we wrap mutable Java collection library, what semantics should be used? Lazy as intended by Java, or eager as intended by Scala? Will Java programmers porting their code to Scala care? And really, the type system plays no role in that, its almost useless.

In the end, that bit rot occurred had little to do with types and much to do about philosophy and dogma. Types aren't everything, but we already knew that.

Code duplication

I haven't read the paper (just skimmed it), but I take their comments on code duplication to be looking at how Scala's type system supports scalable component abstraction, but that there is some implementation advice for other Scala library designers.

If that's the case, then my major problem with this paper is the same problem I have with other papers like it, including ones on Haskell library design: (1) to reuse code it must be modular (2) adapting existing code requires parameterization (3) finally, continuously improving reuse requires bondage-and-discipline Standardization (Software Engineering stuff)

For (1), this is a problem domain issue; For (2), this is an issue with manipulating the static production rules of the language; Odersky's book on Scala with Lex Spoon and Bill Venners fundamentally shows how mapping design requirements into Scala code is more expressive than Java code, but does not settle the question as to what Scala cannot parameterize easily. I attribute problems with paramterizing and thus re-using Scala code to the fact that Scala is based on the traditional metamodel of reuse, where you write starting off based on the language's predefined metamodel (e.g., "classes", "traits") that is strongly enforced by the compiler (in Scala, it is also statically enforced). In Scala, you can extend the compiler, but you cannot substitute the metamodel; one of Scala's "achievements" is actually how well it targets the JVM.