Turning every module into a functor automatically?

IIRC, I've read about this approach in one of the ReadScheme module papers:

Say we have an ML-like higher-order module system, but every time a module M uses another module U, M automatically becomes a functor, that by default uses U, but could also be parameterized post-hoc to use another module V in place of U:

structure M = 
  let f x = U.do_something x

structure U =
  let do_something x = ...

By default, M would use U's implementation of do_something.

But a client could also supply another implementation (structure), as long as it has the same signature as U:

structure V =
  let do_something x = ...

structure M' = M parameterize U = V

The parameterize statement means that V should be used in place of U inside M'.

I think this would be quite convenient in some cases (e.g. supplying a different string implementation for a module – "monkey patching").

Comments?

Update: Martin Gasbichler's Fully-parameterized, first-class modules with hygienic macros calls this feature full parameterization.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Mass overriding

I would call this feature "uncontrolled unconstrained mass overriding", and think it is as bad as it sounds. ;-) ...unless you have an urge to make sure that every client coming along is able to break everybody else's abstractions and implementation invariants.

Also, whether two higher-order modules "have the same signature", will most likely be an undecidable property.

Edit: I also think that this is not the above paper's interpretation of "full parameterization".

Testing

It seems to me that mocking interfaces for testing all but requires that "unconstrained mass overriding" be possible, and for this reason modern design trends in C# and Java are to keep everything interacting using interfaces.

In this scenario the "hooking up" complexity is handled using Inversion of Control (i.e. container-managed object graph construction), so that code is written with less coupling and greater testability.

Full parameterization

I also think that this is not the above paper's interpretation of "full parameterization.

The paper proposes a module system in which every module has to list all the interfaces it uses. Then, in many cases, the linker can choose the single implementation of that module. If there are more implementations of an interface, the linking information has to be specified manually by the programmer.

The system I sketched in the original post is quite similar to that model, the difference being that I left out the explicit specifications of the used interfaces. Other than that, I think they are quite comparable (edit: in a kind of hand-wavish way :).

Modules vs components

You need to distinguish two concepts: modules and components (or compilation units). In many languages, these are equated, but not so in ML. ML modules are just language entities that get evaluated. References to other modules are not necessarily inter-component references.

So what you suggest is entirely different from the paper. AFAICT, the paper simply suggests that all external (= inter-component) references should be treated as real parameters, a priori. That's nothing special, unless you care about macros. Your suggestion, on the other hand, is that we should be able to turn even internal (= intra-component) references into parameters, a posteriori. That is something entirely different, and in fact, does not make any sense. It is equivalent to suggesting that you should be able to write

val x = 4
val y = print x; x + 3
val y' = y parameterize x = 5
print y; print y'

and see 4 5 7 8 as output. If you don't see the analogy, just modularize the above:

module X = {val x = 4}
module Y = {val y = print X.x; X.x + 3}
module Y' = Y parameterize X = {val x = 5}
module Z = {print Y.y; print Y'.y}

The point is, an ML module is not an "unlinked" component. A module is the result of an actual evaluation, it is already "instantiated". Evaluation might create state or have other side effects. Your proposal would amount to being able to take the result of such an evaluation (a module value if you want) and somehow being able to reevaluate its original module expression in a modified ("overridden") environment.

I don't think it's something you seriously would want to have in a language. Neither do I see how it would be the right answer to the testing scenario Matt was describing, which is naturally addressed on the compilation unit/linking level.

Does it matter?

Give or take the sequencing aspect in ML, I've wanted to be able to do something along those lines in Haskell often enough - though being able to write functors without nasty encoding overhead would be by far the most useful part! It is, admittedly, a weird form of overloading, but overloading is all it is, you get to hang on to both the expression and its value. Pretty much akin to being able to twiddle with a (copy of a) thunk's environment.

It may well be a sign of bad taste in module languages, and I suspect in practice I'd just get used to rewriting as a functor and creating a default instance module, but hey. It does strike me as more lispy than schemelike, certainly.

But I'm not convinced about handling it on the compilation unit/linking level, for one simple reason - that's largely extralinguistic in most cases, and trying to pull it into the language leads to... well, MLish modules again. Leaving it extralinguistic may be tolerable for testing (it's rather more annoying if you're using in-language testing tools), but it's all kinds of annoying for other purposes.

I am admittedly ignoring the module vs component distinction, because I'm not much convinced it has to matter, given sufficiently poor taste. Of course, we may well be better off with a module-to-functor refactoring tool and maybe some defaulting!

I am admittedly ignoring the

I am admittedly ignoring the module vs component distinction, because I'm not much convinced it has to matter, given sufficiently poor taste. Of course, we may well be better off with a module-to-functor refactoring tool and maybe some defaulting!

And this is more or less what I suspect the Noop people have in mind with DI-in-the-language. Although they might not describe it exactly that way...

(I love it then threads come together...)

I have something very similar to this...

... in a language I'm working on (err, it's on low heat now). I think it's a nice system, because you get something more modular than type classes but with less explicit plumbing than Scala implicits or a fully functorized style using ML modules.

I think some of Andreas' objections go away if you bind to theories (signatures + equations) rather than just signatures. Then you can always do a substitution that satisfies the theory. Reasoning about the correctness of a piece of code is done relative to the theories satisfied by the symbols used, rather than reasoning about the implementations of those symbols. I think this captures what's good about OO (that code should interact through interfaces) without what I consider bad (that the interfaces are part of the values themselves).

Edit: But of course I should point out that if you're willing to just take the programmers word for it that theories are satisfied, rather than requiring theorem proving, then you're essentially back just having signatures and the equations are just documentation. This is where I am now, and it seems to be a pretty reasonable position, given the high cost of theorem proving.

Order-sorted algebras

I think some of Andreas' objections go away if you bind to theories (signatures + equations) rather than just signatures. Then you can always do a substitution that satisfies the theory. Reasoning about the correctness of a piece of code is done relative to the theories satisfied by the symbols used, rather than reasoning about the implementations of those symbols.

An OBJ fan?

interesting stuff

been a while since i looked at that. i always wonder what if any of it is still going, or how close it ever got to usable-by-industry. e.g. Tatami sounds neat but looks out of date. i really wish more projects would make it explicitly clear if they are still in progress. like, i can't tell if Maude is or not.

Maude

I met Patrick Lincoln three years ago and IIRC, Maude had been in production use at SRI for years by then (so assume it's still in use.)

I agree that an explicit "vapormeter" (and maybe an "abandonmeter") would be useful for software projects. :)

Yes

I actually just recently learned of OBJ earlier this year here on LtU (on a tip from Derik Elkins, I believe). It's very similar in some ways to my project. A couple of the differences are that I don't use initial models for constructions but instead have first class modules and explicit substitution and that I use a more powerful logic with explicit quantification.

Yes, it matters

Philippa: It is, admittedly, a weird form of overloading, but overloading is all it is

Well, no. First of all, overloading you can control. Haskell-style overloading in particular is clearly a form of a priori parameterization, like functors, that you can choose not to use in your definition. With the OP's suggestion, the whole point seems to be that you don't have that choice, which looks questionable at best. Considering that one main purpose of modules is encapsulation, double so.

Also, as soon as your language is impure or strict, you cannot simply overload arbitrary definitions a la Haskell. Because overloading is parameterization, it turns expressions into functions, thereby completely changing their operational meaning. In strict languages, you will typically have to restrict overloading to function values. And in fact, even Haskell does something like that with its monomorphism restriction.

you get to hang on to both the expression and its value. Pretty much akin to being able to twiddle with a (copy of a) thunk's environment.

You may not fully realize the computational generality of ML-style modules. They are a complete, higher-order functional language. It is not even clear what it should mean to retroactively "parameterize" a module. Which parts of the transitive closure of its definition are you considering? What are you referring to by some module variable X outside its scope?

Take an arbitrary functional program that computes some value, say, an int, and then define the meaning of "parameterize x" on that value. How would you do that?

But I'm not convinced about handling it on the compilation unit/linking level...

Well, when you do testing you typically test compilation units/components/whatever you call them. As for the rest of your comment, I fully agree that the parameterized and suspended nature of components should be reflected within the language. In fact, I presented one possible take on this at ICFP'06. :-)

Comments

Can you elaborate on the problem you see with strictness? Are you just talking about the fact that evaluation must be delayed until the last parameter is substituted?

And I agree with you that this isn't really about modules. You can frame this as a mechanism for controlling parameterization of values. I do this by managing a DAG of dependencies, keeping track of what's a parameter where vs. what's treated as a fixed value. It's just easier to write

foo :: Real -> Real

than to write

foo :: Real R => R -> R

Yes

Are you just talking about the fact that evaluation must be delayed until the last parameter is substituted?

Essentially, yes.

I do this by managing a DAG of dependencies, keeping track of what's a parameter where vs. what's treated as a fixed value.

I don't quite understand what you mean. What is "treated as a fixed value"?

Fixed

With lambda calculus, when you have a lambda expression,

f = \x\y. x*x + y*y

it's easier to partially apply the 'x' parameter (f x) than the 'y' parameter (\x. f x y). In informal mathematics we often write something like,

f = x*x + y*y

and then declare, "fix x and consider f as a function of y". This is roughly what I meant by fixed.

And again, the main benefit for this is when a particular implementation would over-specify your needs (ie it's not really intended for substituting another implementation of 4). But, for example, most code written to 'float' should really be parameterized in a float type. Often 'sort' has freedom to choose how to resolve ties. etc. You can get the same effect by making everything a parameter, but that's very verbose.

Item notation

If you care about exposing redexes in the lambda calculus, then you should look at the techniques, e.g., item notation, in Kamareddine & Nederpelt (1995) Refining reduction in the lambda-calculus [citeseer].

Your project sounds very interesting.

Broken link

That link doesn't work (and Googleing just turns up pay-walled versions).

Link fixed

It's still available from citeseer. I've fixed the link in the grandparent post.

Thanks

Odd that google failed to find the citeseer link...

It did

It was on the first page of results for Google Scholar, but the reference turned up in two versions, one with an incorrect title, and that incorrect reference came from citeseer...

Ah

I fully agree that the parameterized and suspended nature of components should be reflected within the language. In fact, I presented one possible take on this at ICFP'06. :-)

You should have mentioned that earlier. ;)

is it real?

any mls (alice?) with that kind of feature?

It is

Yes, it is a core feature of Alice ML.

thanks, that is cool.

now if i had the $ to support a project to get an interactive debugger going for Alice :-)

Interfaces vs overriding

[Edit: Oops, meant as a reply to barrkel.]

I think we are talking about different things. Interfaces have nothing (or should have nothing) to do with overriding. Interfaces are about controlled a priori parameterization, which is good, while overriding as proposed above is about uncontrolled a posteriori code transplants, which isn't quite so.

Not sure

I'm not sure the difference is as clear-cut as you suggest. The Java folks started out wanting a posteriori transplants (for testing, mainly, or occasionally for reuse) and since that isn't really possible in Java, they ended up moving to a priori parameterization as the default for everything. And then they had to invent DI to stitch everything back together.

The fact of the matter is, the bulk of Java code that uses DI frameworks has a zillion interfaces Foo, each of which has a single FooImpl in the main codebase. So all the clients of Foo fully expect that their Foo will always really be a FooImpl. Often, the only other implementations of Foo are dynamic proxies and something like MockFoo or FooStub. The parameterization is extremely shallow, amounting to little more than a level of indirection which can be exploited for testing, instrumentation or dirty tricks.

So I see your point, but I guess I don't really agree.

Ditto

I guess Matt said what I would have said in reply.