Mod sys with external binding of mods to mod sigs, where all mod imports must be sigs only

A module system variation I've been exploring is modules that can ONLY import module signatures (or some other interface or module type).

A separate type of source or configuration file, outside normal module source code files, would then just devote itself to binding these signature imports to concrete module implementations - on a per-importing-module basis. There are some simple (good enough?) constraints designed to ensure that a module satisfies the same interface within some scope (like a library or package, etc.) when this matters.

Plowing ahead, I've been a bit in the weeds thinking about how to implement this with a "stock" assembler/linker via compiler cooperation and some pre-link stage glue code generation.

But really, I've rushed ahead too far too fast without looking around for similar lines of thinking and development efforts, presuming they exist.

I welcome any papers or language implementations or implementation experience reports along these lines. If nothing else, maybe I can avoid pursuit of a bad idea :-)

Mucho thanks.

p.s. Again, the model is that modules ONLY EVER import module signatures, and then concrete module implementations are assigned to these signatures outside the "normal" set of modules source code files.
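To make the model concrete, here is a minimal sketch in Python, not a proposal for the actual mechanism. It assumes `typing.Protocol` as a stand-in for a module signature and a plain dictionary as a stand-in for the separate binding file; every name in it (`LoggerSig`, `Billing`, `Audit`, `BINDINGS`, `link`) is hypothetical.

```python
from typing import Protocol


class LoggerSig(Protocol):
    """A module signature: the only thing an importer is allowed to name."""

    def log(self, msg: str) -> str: ...


# Two concrete "modules" that satisfy LoggerSig. Importers never name these.
class PlainLogger:
    def log(self, msg: str) -> str:
        return msg


class LoudLogger:
    def log(self, msg: str) -> str:
        return msg.upper()


# Importing modules: written only against the signature.
class Billing:
    def __init__(self, logger: LoggerSig) -> None:
        self.logger = logger

    def charge(self) -> str:
        return self.logger.log("charged")


class Audit:
    def __init__(self, logger: LoggerSig) -> None:
        self.logger = logger

    def record(self) -> str:
        return self.logger.log("recorded")


# Stand-in for the separate binding file:
# one entry per (importing module, signature) pair.
BINDINGS = {
    ("Billing", "LoggerSig"): PlainLogger,
    ("Audit", "LoggerSig"): LoudLogger,
}


def link(importer: type, sig_name: str = "LoggerSig"):
    """Instantiate an importer with whatever the binding table assigns it."""
    impl = BINDINGS[(importer.__name__, sig_name)]
    return importer(impl())
```

Here `link(Billing)` and `link(Audit)` receive different implementations of the same signature, and neither importer's source mentions a concrete logger. A real version would resolve this at compile or link time rather than at run time as this sketch does.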

A while ago I asked about a

A while ago I asked about a similar thing here. Two memorable quotes were Andreas Rossberg's

I would call this feature "uncontrolled unconstrained mass overriding", and think it is as bad as it sounds. ;-)

and Philippa Cowderoy's

I've wanted to be able to do something along those lines in Haskell often enough

Advanced Module Systems (A Guide for the Perplexed) says that something similar was discussed early in the design of ML, but dropped.

Not quite

I think you are misquoting me, since that comment was about a different proposal (and the difference to what Scott is asking about is discussed further into that thread).

I second the link to the Pierce/Harper slides, which specifically talk about this topic of "specific" vs "generic" module references.

Mesa worked like that.

Broken link

Broken link

Newspeak

Definitely look at Bracha's Newspeak. It uses classes/objects rather than modules, but in this case it's largely just a difference of terminology. I think it does precisely what you suggest. In Newspeak, there is no way whatsoever to refer directly to another concrete class. All references are through parameters, and it requires an external tool (IDE, build system, whatever) to actually grab a bunch of top-level classes and stitch them together.

As I've said before, I think it's an extremely good idea, although it remains to be seen how cumbersome it will be in practice. With modern IDE support, it may be quite workable, and in fact (again, as I've said before) plenty of contemporary Java programs are already written this way, they just use a dependency injection framework to do it dynamically.

I encourage you to pursue this and see where it takes you, regardless of what Andreas says. ;) See also my comments in the thread linked above by Manuel.

Paper

And I realized I should have posted something more concrete than "check out Newspeak": here is the LtU discussion of the most relevant paper. Lots of good discussion in that thread as well.

Thanks. That's great that I

Thanks. That's great that I have some other work to look at.

My early notes have an idea of module groupings (akin to the modules that would constitute a distinct library) that share a (reusable) set of default (module -> sig) intended bindings. This could well limit much of the burden while still allowing easy, ad hoc custom module -> sig specifications.
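As a rough illustration of shared defaults with ad hoc overrides, the lookup could be layered as below. This is only a sketch: the library name `netlib`, the module names, and the implementation names are all made up, and Python's `collections.ChainMap` merely stands in for whatever resolution rule the binding files would really use.

```python
from collections import ChainMap

# Hypothetical library-wide defaults: signature name -> implementation name.
LIBRARY_DEFAULTS = {
    "netlib": {"TableSig": "HashTable", "LoggerSig": "PlainLogger"},
}

# Hypothetical ad hoc overrides for individual importing modules.
MODULE_OVERRIDES = {
    ("netlib", "dns_cache"): {"TableSig": "AssocList"},
}


def bindings_for(library: str, module: str) -> ChainMap:
    """Overrides are consulted first, then the library's shared defaults."""
    overrides = MODULE_OVERRIDES.get((library, module), {})
    return ChainMap(overrides, LIBRARY_DEFAULTS[library])
```

A module with no override entry simply inherits every default, so only the exceptions need to be written down.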

Also, like primop bindings in some (most) languages, say, there can be user defined permanent binding specifications to help the compiler go to town and optimize away. While I was thinking of performance with permanent bindings, I now see that this would also reduce developer burden.

I have wondered if this feature is important at all given even a simple object system in the language. But while dependency injection is easy with objects, I realized that the signature/module level external binding specification also fulfills (much of) the role of a detailed link specification.

The external specification model has the ability to remove or add object files from the link set because it's a *compile time* (or link? - kinda in the middle) specification (not runtime object value passing) - which might be pretty darn handy :-)
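A small sketch of how the link set could fall out of the binding specification, assuming each module's signature imports are known up front. The module names and the `IMPORTS` table are invented for illustration, and a real tool would work on object files rather than strings.

```python
# Hypothetical data: the signatures each module imports.
IMPORTS = {
    "main": ["TableSig", "LoggerSig"],
    "HashTable": [],
    "AssocList": [],
    "FileLogger": ["TableSig"],
    "NullLogger": [],
}

# Two hypothetical binding specifications:
# (importing module, signature) -> implementation.
DEBUG_BINDINGS = {
    ("main", "TableSig"): "HashTable",
    ("main", "LoggerSig"): "FileLogger",
    ("FileLogger", "TableSig"): "AssocList",
}
RELEASE_BINDINGS = {
    ("main", "TableSig"): "HashTable",
    ("main", "LoggerSig"): "NullLogger",
}


def link_set(root, bindings):
    """Collect every module reachable from root under the given bindings."""
    needed, stack = set(), [root]
    while stack:
        module = stack.pop()
        if module in needed:
            continue
        needed.add(module)
        # Follow each signature import to the implementation bound to it.
        for sig in IMPORTS[module]:
            stack.append(bindings[(module, sig)])
    return needed
```

Switching from `DEBUG_BINDINGS` to `RELEASE_BINDINGS` drops `FileLogger` and `AssocList` from the result without touching any module source.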

S.

Abstract-only Imports

I'm a bit curious as to what we should expect to gain from this. I suspect that, ideally, we would end up with a more flexible and reusable body of source code. OTOH, attempting to 'force' this higher degree of abstraction and independence among source files might well lead to developers duplicating source code and flattening the dependencies (reducing the level of abstraction) in order to avoid pain at the project file.

I suppose another possible benefit regards static typing and especially typeful programming. The abstract imports allow external control over the underlying types used in each module, and thus of any typing effects. I'm of the opinion we should disfavor typeful programming in favor of pluggable and optional type systems, in which case the regular abstraction mechanisms of the language would be no less capable than module parameterization. (Modules become functions returning records... perhaps utilizing sealers/unsealers for first-class ADTs.)

Benefits

I'm a bit curious as to what we should expect to gain from this.

  • Heightened testability (everything can be mocked)
  • Heightened security (no ambient capabilities)
  • Prevention of version clashes (different versions of modules can live together safely, without crosstalk)
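The first and third points can be sketched in a few lines of Python; the security point is not shown. This is an illustration under assumptions, with every class name (`ParserV1`, `ParserV2`, `MockParser`, `Importer`) hypothetical.

```python
class ParserV1:
    version = 1

    def parse(self, text):
        return text.split(",")


class ParserV2:
    version = 2

    def parse(self, text):
        # Version 2 also strips whitespace around each field.
        return [field.strip() for field in text.split(",")]


class MockParser:
    """Test double: records its calls and returns a canned answer."""

    def __init__(self):
        self.calls = []

    def parse(self, text):
        self.calls.append(text)
        return ["canned"]


class Importer:
    """Knows only that it was handed something with a parse method."""

    def __init__(self, parser):
        self.parser = parser

    def first_field(self, text):
        return self.parser.parse(text)[0]


# Two importers in one program, each bound to a different version.
legacy = Importer(ParserV1())
modern = Importer(ParserV2())
```

Because each importer holds its own binding, the two parser versions never see each other, and a test can substitute `MockParser` without patching anything global.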

Relative Benefits

I'm asking for benefits that follow from Scott's position that 'all' imports be externally parameterized. The benefits you describe are available in a module system with concrete dependencies (such as Objects as Modules in Bracha's Newspeak).

The 'heightened testability' might apply, but it might be a double-edged sword: more is testable, but there is more to test (potentially combinatorial). Concrete source dependencies mean you only need to test the specific inputs you developed against.

The security benefits seem orthogonal. [Nothing about Scott's module system prohibits ambient authority in the language - unsafe memory, meta-object protocols and reflection, direct assembly, etc.]

Hmm...

I'm asking for benefits that follow from Scott's position that 'all' imports be externally parameterized. The benefits you describe are available in a module system with concrete dependencies (such as Objects as Modules in Bracha's Newspeak).

Just to be clear, I wonder why you call Newspeak a "module system with concrete dependencies"? My strong impression was that it is precisely the opposite: a module system where all imports are externally parameterized.

My strong impression was

My strong impression was that it is precisely the opposite: a module system where all imports are externally parameterized.

Not quite. In Newspeak, it is possible for the source-code describing a module to parameterize and instantiate other modules. For example, given parameters A and B, I can instantiate and use module A[B]. Further, I can specify concrete parameters; i.e. if nested class or object C is declared in a local scope, I can use A[C].

The concrete references and internal parameterization of modules in Newspeak seem to be a recommended practice: an IDE-layer class serves effectively as a global namespace. It would be non-trivial to parameterize modules externally to that namespace due to the internal coupling and concrete references.

From Modules as Objects in Newspeak (section 2.3):

An application is typically constructed by instantiating a top level class T representing the application as a whole. T will likely depend on a number of separately compiled module definitions; its factory method should take these, and only these, as arguments. The Newspeak IDE provides us with a namespace containing all classes used in development. It is in this namespace that we will instantiate T. Use of the IDE namespace is analogous to how tools like make reference the components of an application utilizing the file system as a namespace.

While Newspeak makes parameterization and very-late-binding feasible, the properties are not fully enforced. [Though convention could certainly discourage bindings within classes, and the IDE could prevent bindings between 'top-level' classes by forbidding dependencies between application classes.]

Scott describes a system to structurally enforce that parameterization be external to the modules, i.e. by restricting it to special link-layer code. As a consequence, the constraints are much more severe than those in Newspeak. I was wondering what we gain from these extra restrictions.

Agree to disagree?

I'm glad I asked, because it turns out that we definitely have different understandings of Newspeak. I'm not sure which of us is right, but my impression is different than what you describe.

I'll start with something we agree on:

Not quite. In Newspeak, it is possible for the source-code describing a module to parameterize and instantiate other modules. For example, given parameters A and B, I can instantiate and use module A[B]. Further, I can specify concrete parameters; i.e. if nested class or object C is declared in a local scope, I can use A[C].

I agree, but I'm not sure that's so important. IMHO, this is just the dual module/class nature of Newspeak classes showing. The key is "if nested class or object C is declared in a local scope." Naturally one can directly refer to concrete things that have just been locally declared; this doesn't seem to me like such a problem (in fact, anything else seems perverse). It seems to me that the following rule is sufficient: one can only refer to names whose bindings are lexically apparent. (One caveat: names inherited from a superclass are considered "lexically apparent" via self/this.)

Informally, I would only describe top-level classes in Newspeak as "modules." These classes have the property that no free names may occur anywhere. (I would refer to nested classes simply as classes, although of course Newspeak blurs this distinction considerably.)

I'm not sure what you mean by this:

... an IDE-layer class serves effectively as a global namespace.

If you mean that there is some code you can write that refers to free names, my strong impression is that this is not true. My understanding is that a Newspeak program consists of a set of closed-form classes, none of which contain free names (and therefore none of which refer to one another). Some mechanism is presumed to exist which is aware of these classes and can compose them, but this mechanism is implementation-dependent and unspecified (maybe IDE, maybe build system, who knows). In other words, it is precisely "[restricted] to special link-layer code."

Anyway, as I said, I'm not sure which of us is right, and it may well be you. But my impression was that Newspeak, at least regarding top-level classes, was almost exactly the system that Scott describes.

Little Pockets of Solitude

the following rule is sufficient: one can only refer to names whose bindings are lexically apparent

This is a useful rule for simplifying the module system, I agree. I'm thinking that the main advantage is a simplified dependency graph. I'm giving it some serious consideration for my language. Keeping the dependencies 'flat' moves a lot of complexity and pain into the project layer (how many modules need numbers or lists?). But keeping the namespace local would simplify the individual modules.

Some mechanism is presumed to exist which is aware of these classes and can compose them, but this mechanism is implementation-dependent and unspecified (maybe IDE, maybe build system, who knows). In other words, it is precisely "[restricted] to special link-layer code."

I place a greater weight on how such things are actually implemented and utilized in practice. When people speak of Haskell's module system, they speak more often of GHC than of the Haskell 98 standard. With regards to Newspeak, the reference implementation will be a de-facto standard.

I'll grant that I might have misunderstood Section 2.3 of that paper when I read it earlier. I was initially under the impression that there was one big 'IDE' class, forming a development namespace, and thus that many other classes would have access to that namespace. But I now understand that this big IDE class is only available to the final 'application' classes. In that case, there is no relationship between development classes - each is an independent pocket of solitude, only brought together by the final application class, and I agree that this would correspond well to Scott's designs.

See Jiazzi (units for Java),

See Jiazzi (units for Java), it does exactly this.

It's safe to say that one,

It's safe to say that one, but only one, of the "themes" or "hypotheses" that motivates my experimental efforts is: maybe people have spent much too much time focusing on "programmer productivity." At the same time, we have developed this large "stack of shoulds" or design/development best practices - but while tools in some small part might help developers comply with a few of them, there's very little enforcement built into language tools.

So, being a born contrarian, I'm officially dis'ing "interactive, experimental programming"; "end user domain expert programming"; the popularity of the ever-increasing plethora of redundant toy scripting languages; the assignment of any value to individual programmer productivity; ....

Hope you get the idea. So let's take just one old design saw. We are told, "a good interface should support at *least* two completely different implementations." Well, importing only abstract interfaces will be one very nice step toward *forcing* this to be a reality in the design.

So, for example, a look-up table will never necessarily be a RB tree or hash dictionary again in my language (should I implement this module system). Now, to any calling code (and the developers who write it) table look-up might as well be a silly little Lisp style association list instead. In fact, some "build master" (could be a group or several library focused groups or whatever) might decide that for tiny tables, an assoc list is in fact the right, low overhead implementation.
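For what it's worth, the look-up table case might look like the sketch below. It is written in Python only for brevity, and the names (`TableSig`, `AssocList`, `HashTable`, `choose_table`) and the size threshold are assumptions made up for the example, not part of any design.

```python
from typing import Optional, Protocol


class TableSig(Protocol):
    """The only thing calling code is allowed to know about a table."""

    def put(self, key: str, value: int) -> None: ...

    def get(self, key: str) -> Optional[int]: ...


class AssocList:
    """Lisp-style association list: a linear scan, cheap for tiny tables."""

    def __init__(self) -> None:
        self._pairs = []

    def put(self, key: str, value: int) -> None:
        # The newest pair goes first so it shadows any older one.
        self._pairs.insert(0, (key, value))

    def get(self, key: str) -> Optional[int]:
        for k, v in self._pairs:
            if k == key:
                return v
        return None


class HashTable:
    """Hash dictionary: better suited to large tables."""

    def __init__(self) -> None:
        self._items = {}

    def put(self, key: str, value: int) -> None:
        self._items[key] = value

    def get(self, key: str) -> Optional[int]:
        return self._items.get(key)


def choose_table(expected_size: int, small_limit: int = 8) -> TableSig:
    """The "build master" decision; the limit of 8 is an arbitrary assumption."""
    return AssocList() if expected_size <= small_limit else HashTable()


def word_lengths(words, table: TableSig):
    """Calling code: cannot tell which implementation it was given."""
    for word in words:
        table.put(word, len(word))
    return [table.get(word) for word in words]
```

`word_lengths` gives the same answer with either table, which is the whole point: the choice between them lives entirely in `choose_table`, outside the calling code.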

In any case, the implementation of interface dependencies will be TRULY selected independently. Hence, if with some inconvenience, the language encourages developers, to say the least, to "do the right thing." Does it slow down the individual programmer? Yes. Who cares? Not I.

So then think of other design and development "good ideas", such as structuring systems via a family of canonical "shapes" that define strict relationships between program artifacts: top down, layered architecture, shared substrate (stdlib), controller and workers, polling and dispatching, and so on.

Great. So, another idea I'm considering is a language defining the internal structure of libraries via selection of a supported "system structure" - then the language enforces the visibility rules (of modules and their contents) of that type of structure to interconnected structures.

Now no extern declaration or just eight more import statements will ever have any effect on getting one's code to compile IF one gets the structure wrong at the start. Instead, one will have to slow down, reexamine one's subsystem structure choices, repartition the system; recognize important missed substructures that require a different shape - or whatever.

But in the end, if it compiles and works, it will in fact have an organized and documented (multipart and nested) structure that can be accurately and precisely described - not just some "rough" design shapes illustrated in some diagrams on a whiteboard. (I guess you could subvert any such system if you tried hard enough, but restricted visibility of formally structured subsystems will make it harder).

Blah blah blah. So, this particular line of inquiry is part of a broader experiment to see how "good design and development ideas" can be *enforced* by the language tools. In this sense, where I have usually thought of language tools as being "for" developers, in this case, the language tool is "for" some much broader enterprise and its interests over some extended period of time much, much longer than the duration of a single project (say a decade or more of a product or system *family* life cycle).

Another major "good idea" I'm banging on is reuse and how to restrain freedom in the language to enforce (or "encourage") writing very nicely reusable libraries whenever possible. One simple idea I have is to allow only "main" to be a standalone compilation unit - ALL other source code must be put into the above mentioned highly structured libraries, even if "reuse" doesn't seem like a goal at the time - does it ever when in the trenches under deadline? So again, I *take away the convenient options* and at least force (as much as the language can) the entire system to be structured as nicely as a well-structured, heavily reused library.

Hope that helps. I could probably enumerate more specific motivations for the "import interfaces only" policy, but ultimately they would all still be informed by this general line of inquiry I've taken up.

S.

nifty

seems akin to using skeletons for making parallel programs: we know that it is easy to shoot ourselves in the foot, and likewise hard to get things right, so try to stick with a library of some off the shelf canned solutions.

i think skeletons allow for composition of skeletons, does that seem like a useful thing in your current scheming?

Sounds a bit like Stephen

Sounds a bit like Stephen Kell's work.

Thanks for the Stephen Kell

Thanks for the Stephen Kell reference. Indeed, from reading the abstracts and one of the shorter papers, he seems to be definitely singing from the same sheet of music.

Kell is even skeptical about relying on stable interfaces at all (!) - so I eagerly anticipate the "punchline" of his overall program. Thanks!

S.