I Seek a Reasonable Survey on the Concept of "Module System"

Regarding research for the design of a module system, my efforts seem to mostly conclude that the term "MODULE" means just about *anything* to *anyone*, therefore next to *nothing* ;)

To keep this message terse, just think yourself of all of the really weird terminology applied to language constructs otherwise all called "module systems" in different programming languages!!!!

I've more or less settled on Scheme's r6rs relatively simple module system, but I have low confidence that this is the right design choice. It *really* has been extremely frustrating.

Is there anything that provides some logical and/or historical overview or survey on the myriad concepts of "module system" in computer programming language design? (The major languages, the historical design "milestones" or whatever).

Best layman description

Best layman description would be available through reading Gilad Bracha's blog, B.S. and A.S. (before Sun and after Sun). He worked on the superpackages concept for Java, and ultimately left Sun during the project.

A different look at modularity is found in Constantine and Yourdon's 1975 book Structured Design. However, they use the word "structure" in that book to describe modules. Meilir Page-Jones coined the term "connascence" to describe various kinds of modularity, but this term never caught on, probably in part because:
(a) nobody was really thinking of the problems he was thinking of back in the '80s and '90s;
(b) the name is hijacked jargon from non-software literature and is not nearly as good as "coupling" and "cohesion";
(c) his basic idea was to define modularity in terms of categories that could possibly be checked by a tool and always made perfect sense from the perspective of program maintenance (e.g., "connascence of position" is a re-statement of the Stable Interposition problem), rather than tons of system design principles you had to remember;
(d) he published these ideas in extremely pretentiously titled books, like What Every Programmer Should Know About Object-Oriented Programming (note to authors: sentences are not titles);
(e) academics don't read crap like that.

Gilad's ideas are even more general than syntax, and extend to the semantics of the entire toolchain: compiling, linking, loading. Gilad will point the way to all the stuff that came before Newspeak, since he is an academic and good about that stuff. He'll cite Scheme, etc.

I've more or less settled on Scheme's r6rs relatively simple module system, but I have low confidence that this is the right design choice.

Copying that would be an implementation choice, not a design choice. Design criteria would state what is orthogonal to what, not how you actually achieve it.

Taking your broader point,

Taking your broader point, I'm not sure that "What Every Programmer Should Know About Object-Oriented Programming" is a sentence.

Too big a topic

There's no such survey, and no survey like that is possible - there's just too much to survey. You can find a survey of module systems for ML-like languages in Pierce's Advanced Topics in Types and Programming Languages book. The various Java module JSRs probably provide prior art of Java module systems. You can read Matthew Flatt's 'Composable and Compilable Macros' for discussion of module systems as they relate to macros. How module systems relate to typeclasses (in all senses of relate) is yet another topic discussed in other places. Then there's lots of work on components in the SE community. Plus all kinds of work on dynamic linking, which is relevant too. Plus Modula-{1,2,3}. Plus the CL package system.

I think you get the idea.

If you're interested in module systems like the one in R6RS, it's based on the design in Flatt's paper, which is discussed more in Culpepper et al., 'Advanced Macrology and the Implementation of Typed Scheme'. But it's really only relevant to languages with macros.

The various Java module JSRs

The various Java module JSRs probably provide prior art of Java module systems.

Not really. Only one spec is worth anything, mainly due to the fact there were some less-than-capable people handling Sun's technical lead position on the various JSRs. (This is one way of saying Scheme's standards have better review processes than Sun's Java Community Process.)

So there are two kinds of Java modules. One is aimed at integrating with the VM, and thus is very PLT. The other is aimed at hacking the classloader ("the usual Enterprise systems Java approach") and class configuration files (the manifest file), as seen in OSGi implementations... and thus is very "software engineering"-ish.

You could also check out my

You could also check out my OOPSLA 2001 Jiazzi paper, which is a Java version of Flatt et al.'s unit module system.

Module is an overloaded term, given that there are many aspects to modularity in general. There is compile-time modularity vs. run-time modularity, as well as SE-focused modularity (focusing on deployment) vs. PL-focused modularity (focusing on separate compilation). My advice is to understand the problem that each module system is trying to solve, and classify them accordingly.

What is run-time modularity,

What is run-time modularity, vs. what is deployment in a pluggable architecture? I'm not sure I see a difference.

Run-time modularity, such as OSGi hotswapping, is concerned with zero downtime and minimal reload time -- for upgrading of existing software. These upgrades should not even require the code to be provided by the same vendor. The only requirement should be that they satisfy the object interfaces. In OSGi parlance, the user only has to have a new bundle that replaces an old bundle with equivalent services. In theory, this can be as fine-grained as simulating a dynamic language using compile-time modules. For example, providing strong mobility would be tricky, but providing weak mobility only requires serialization of the object from the old service bundle to the new bundle.
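The hotswap-with-weak-mobility idea can be sketched in a few lines. This is a toy illustration, not OSGi's API: the `Registry` class, the `Counter` interface, and both counter "bundles" are hypothetical names invented here, and JSON stands in for whatever serialization a real system would use.

```python
import json
from typing import Protocol


class Counter(Protocol):
    """The service interface that every bundle version must satisfy."""

    def increment(self) -> int: ...
    def dump_state(self) -> str: ...


class CounterV1:
    """Original bundle (hypothetical vendor A)."""

    def __init__(self, count=0):
        self.count = count

    def increment(self):
        self.count += 1
        return self.count

    def dump_state(self):
        # Weak mobility: only the data is serialized, not the running code.
        return json.dumps({"count": self.count})


class CounterV2:
    """Replacement bundle (hypothetical vendor B) with different internals."""

    def __init__(self, count=0):
        self.total = count

    def increment(self):
        self.total += 1
        return self.total

    def dump_state(self):
        return json.dumps({"count": self.total})


class Registry:
    """Toy service registry; clients look a service up by name on each call."""

    def __init__(self):
        self._services = {}

    def register(self, name, service):
        self._services[name] = service

    def get(self, name):
        return self._services[name]

    def hotswap(self, name, factory):
        # Serialize the old service's state, then rebuild it in the new bundle.
        state = json.loads(self._services[name].dump_state())
        self._services[name] = factory(**state)


registry = Registry()
registry.register("counter", CounterV1())
registry.get("counter").increment()
registry.get("counter").increment()
registry.hotswap("counter", CounterV2)  # no restart; state carries over
after_swap = registry.get("counter").increment()
```

Because clients go through the registry on every call, they never hold a reference to the old implementation, which is what makes the swap invisible to them in this sketch.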

However, this whole hotswapping dance can easily be recast as simply solving a deployment problem. Thus, run-time modularity here is only a means to an end (solving the deployment problem).

Yet, deployment is more general than that, because you want compile-time typechecking in addition to a run-time module system like OSGi bundles. What you want to be able to do is define an XRef (cross-reference) between a set of module definitions and a set of module configurations; this many-to-many relation specifies a set of valid compile-time module instances. Support for this is not that good right now in existing OO 3GLs. Better support is actually available from OO 4GL vendors focused on model-driven architecture, but that is sort of cheating.

Module actually has a very

Module actually has a very specific meaning as an encapsulated unit of "something." So if you say module, prepend what you are modularizing: a source code module (Java file/package, unit), a binary deployment module (dll, jar), a run-time state module (object, continuation, monad). It is sometimes

Although the contexts are obviously different, the principles and problems of modularity are surprisingly similar: separate compilation, dynamic linking, object composition... they have a similar theme. Some unified construct/theory might be a possibility; however, my committee wisely suggested that this wasn't a good idea.

I agree

My point was simply that, practically speaking, run-time modularity and deployment in a pluggable architecture serve the same goals. This statement is basically a tautology, since in order for something to be pluggable in the hotswap sense it must be modular and done at run-time.

Module actually has a very specific meaning as an encapsulated unit of "something."

I would probably put the air quotes around "encapsulated". ;-)

Overloading

Regarding research for the design of a module system, my efforts seem to mostly conclude that the term "MODULE" means just about *anything* to *anyone*, therefore next to *nothing* ;)

Yeah well. duh. "Abstraction," "unit," "entity," "atom," "type," "generator," "interface," "mapping," ... Most terms are vague descriptions. If you want it to mean anything specific within a context, define it or derive a list of different meanings and pick one.

[Ah well, guess I shouldn't be sloppy and read beyond the first few lines ;)]

All the true objections not

All the true objections notwithstanding, I remember a great set of slides about modules (in the ML sense, I think) that was discussed here several years ago. It is well worth trying to dig it up (sorry, but I can't recall anything more concrete at the moment...)

Maybe Pierce's ICFP talk

Maybe you are referring to Pierce's ICFP'00 talk "Advanced Module Systems (A Guide for the Perplexed)". The slides are available online.

Thanks for all the (divergent) pointers

I *still* think some brilliant but pragmatic comp sci text author (you know the kind) could write a really good history/survey of "module systems" that would rationalize terminology and concepts of "module system functionality" FOR YEARS TO COME IN OUR INDUSTRY. Oh, well.

I guess when I think "module system," whatever else is going on, my first thought is: managing names for things (bindings) via named name spaces. So, if little else:


module buffy (export spike) is giles = 3; spike = giles + 1; end
module angel (export dawn)  is dawn x = 7; end

/* allows us to write, hiding buffy.giles */

module joss (import buffy angel) is 
     darla = buffy.spike + angel.dawn; 
end

Add whatever syntactic sugar and name manipulation/restriction you like (most notably manifestly typed names, I suppose).
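For readers who want to poke at the "named name spaces" idea, here is a minimal Python sketch of the same three modules. The `Module` class and its `lookup` method are hypothetical helpers made up for illustration, and `dawn` is assumed to be a plain value of 7, which may not be what the pseudocode intends.

```python
class Module:
    """A named namespace that exposes only its exported bindings."""

    def __init__(self, name, bindings, exports):
        missing = set(exports) - set(bindings)
        if missing:
            raise NameError(f"{name}: cannot export undefined {sorted(missing)}")
        self.name = name
        self._bindings = dict(bindings)      # every definition, private or not
        self._exports = frozenset(exports)   # the names visible to importers

    def lookup(self, ident):
        if ident not in self._exports:
            raise NameError(f"{self.name}.{ident} is not exported")
        return self._bindings[ident]


giles = 3
buffy = Module("buffy", {"giles": giles, "spike": giles + 1}, exports=["spike"])
angel = Module("angel", {"dawn": 7}, exports=["dawn"])  # assumption: dawn is 7

# Corresponds to: darla = buffy.spike + angel.dawn
darla = buffy.lookup("spike") + angel.lookup("dawn")
```

Asking for `buffy.lookup("giles")` raises `NameError`, which is the whole point: `giles` exists inside `buffy` but is not part of its interface.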

On top of this basic capability, though, as many have pointed out, the sky is barely the limit.

Reading all of the above, the combination of this basic "module" notion with "compilation unit" and typically thus with "file" is so super duper obvious we don't even mention it.

Z-Bo mentioned the OSGI Java folks' so-called "module system" efforts, and in fact, it is precisely exposure to all *THIS OSGI STUFF* which made me finally go: "WTF? What in h*ll is a *module* supposed to be anyway", because the OSGI folks might as well have been from ANOTHER PLANET, given what I was looking for and reading at the time.

My interest in r6rs is really, really prosaic, but I think it gets a lot of things right. First, the more-or-less OS file system independent naming of the module hierarchy, and the inclusion of optional version numbers. It just strikes me as a smart design. [net tcp protocol smtp (2)]. Looks good, maps to file systems easily, lets code easily use (and install, I imagine) the [smtp (1)] version, etc.
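To make the "maps to file systems easily" remark concrete, here is a rough Python sketch. The directory layout, the `-version` filename suffix, the `.sls` extension, and both function names are assumptions chosen for illustration; nothing here is prescribed by the post or taken from any particular implementation.

```python
from pathlib import PurePosixPath


def library_path(name, version=None, root="lib", ext=".sls"):
    """Map a hierarchical library name to a relative file path.

    The layout root/part/part/last-version.ext is a made-up convention.
    """
    *dirs, last = name
    filename = last if version is None else f"{last}-{version}"
    return PurePosixPath(root, *dirs, filename + ext)


def pick_version(installed, wanted=None):
    """Return the requested version, or the highest installed one."""
    if wanted is None:
        return max(installed)
    if wanted not in installed:
        raise LookupError(f"version {wanted} is not installed")
    return wanted


smtp = ("net", "tcp", "protocol", "smtp")
newest = library_path(smtp, pick_version({1, 2}))
older = library_path(smtp, pick_version({1, 2}, wanted=1))
```

Here `newest` and `older` resolve to two different files under the same directory, so both versions can be installed side by side and code can ask for either one.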

Second, the basic keywords fit well together, along with how they can be combined with simple nesting. It all boils down to simple static analysis (as it should), but the lingo has a flexible, dynamic "feel" to it: module, import, export, all, some, except, prefix. (I'm sure I'm recalling some incorrectly and forgetting others.)

Third, the "schemely" concept of prefix based renaming during import of names from a module vs. a "standard" module scope syntax like mod.name, mod:name, mod@name, mod::name, mod/name and on and on. I really like the prefix approach: 'prefix' punts on what is essentially a whole non-issue nicely IMHO.
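One way to see why these import forms nest so cleanly is to model an export table as a dictionary and each form as a function from tables to tables. The sketch below does that in Python; the function names echo the R6RS forms `only`, `except`, and `prefix`, while the sample `smtp_exports` table and the `smtp:` prefix are invented for illustration.

```python
def only(exports, *names):
    """Keep just the listed names (KeyError if one is not exported)."""
    return {name: exports[name] for name in names}


def except_(exports, *names):
    """Drop the listed names; each must actually be exported."""
    unknown = set(names) - set(exports)
    if unknown:
        raise NameError(f"cannot exclude unknown names {sorted(unknown)}")
    return {name: value for name, value in exports.items() if name not in names}


def prefix(exports, pre):
    """Rename every binding by prepending a prefix."""
    return {pre + name: value for name, value in exports.items()}


# Hypothetical export table of an smtp library.
smtp_exports = {"connect": "<connect>", "send": "<send>", "debug": "<debug>"}

# Nesting reads inside-out, like nested import forms.
imported = prefix(except_(smtp_exports, "debug"), "smtp:")
just_send = only(smtp_exports, "send")
```

Since every form takes a table and returns a table, the result of one can feed the next, and all of it can be resolved statically before any code runs. The prefix is an ordinary part of the identifier, so no separate `mod.name` or `mod::name` syntax is needed.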

Fourth, the ability to conveniently re-export names with all the same flexibility with which they are "imported" is super useful, obvious, and now seems to me mysteriously missing from other module systems. It gives the ability to use a typical "file as compilation unit as module" approach, but then import and re-export (some of the) possibly many modules' names in just one or a few modules that are what are eventually imported and used by users.
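The facade pattern described here, where one module imports from many and re-exports a curated subset, can be sketched as follows. Everything in the example is hypothetical: the `reexport` helper, the two source tables, and the `pp-` prefix are made up to illustrate the idea in Python rather than in any real module system.

```python
def only(exports, *names):
    """Keep just the listed names."""
    return {name: exports[name] for name in names}


def prefix(exports, pre):
    """Rename every binding by prepending a prefix."""
    return {pre + name: value for name, value in exports.items()}


def reexport(*import_sets):
    """Merge several import sets into one export table, rejecting clashes."""
    table = {}
    for imports in import_sets:
        clashes = table.keys() & imports.keys()
        if clashes:
            raise NameError(f"conflicting re-exports: {sorted(clashes)}")
        table.update(imports)
    return table


# Two hypothetical "file as compilation unit" modules.
lexer_exports = {"tokenize": str.split, "is_keyword": str.isalpha}
printer_exports = {"show": repr, "width": 80}

# The facade that users actually import: a subset of one module plus a
# prefixed copy of another, using the same forms available on import.
toolkit_exports = reexport(
    only(lexer_exports, "tokenize"),
    prefix(printer_exports, "pp-"),
)
```

Users of `toolkit_exports` see three names and never learn how many files sit behind them, while `is_keyword` stays internal to the implementation.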

Finally, I have need for a couple of kinds of modules: (1) within a source file, pretty much totally abstract "inline modules" for pure name space protection; (2) the good old module/file/compilation unit (so far); and (3) a concept in my language design called a "syntax-module", which is kind of like a really heavyweight macro (think embedding whole systems like syntax/type checked SQL instead of whipping up a ton of (when (test) body) or (loop) style Lisp macros).

The basic r6rs scheme gives me all the flexibility I need (so far as I've reasoned about it) to implement all 3 "module" constructs with basically a single syntax, set of keywords, etc.

So that's my interest in r6rs. I just think it's flexible enough, simple enough and does several stupid prosaic traditional "module" tasks close enough to "the right way."

Final note. I did spend a couple of days to finally "get" ML signatures, modules and functors. Totally brilliant. I just don't think I have the brain power to integrate something like ML's system into everything else that's going on in my language right now. Maybe someday.

Scott

Z-Bo mentioned the OSGI Java

Z-Bo mentioned the OSGI Java folks' so-called "module system" efforts, and in fact, it is precisely exposure to all *THIS OSGI STUFF* which made me finally go: "WTF? What in h*ll is a *module* supposed to be anyway", because the OSGI folks might as well have been from ANOTHER PLANET, given what I was looking for and reading at the time.

This is because OSGi leaves quite a few details about a module system unspecified, by design. This allows everyone to "understand" the OSGi platform and what it is about, while still customizing the notion of "module system" in the context of OSGi. Some good examples elude me at the moment, as I have officially not touched the JVM in two years, but...

Bottom line: OSGi is meant to be neutral, allowing for many kinds of implementations of the spec. That's why some implementations are much smaller than others. Just compare Equinox to Felix, for example. There is at least one other implementation, but its name eludes me also.