## Transparent Persistence

Is anyone doing functional programming research on transparent (aka orthogonal) persistence?

I've been reading a lot about the subject (http://c2.com/cgi/wiki?TransparentPersistence) and I've found that promising projects like Eros, Grasshopper, Texas, and others have either died or removed transparent persistence from their projects.

I'm a contributor to an open-source transparent persistent OS (www.torsion.org) that's in the baby stages; I am interested in transparent persistence because of its potential to remove non-pure I/O from functional platforms.

Geoff

## Comment viewing options

### Orthogonal Persistence

Good question. The E language has spent a LOT of time (decades) thinking about the issues; there's a very good summary overview here. Basically, orthogonal persistence plays havoc with the need to evolve the structures that are being persisted in non-trivial ways. Peter Sewell's group, working on Acute, is striving to bring type theory to bear on the problem, with some level of success, but most definitely not with orthogonal persistence.

### Squeak?

Where do Smalltalk's saved images fit into this picture?

### Snapshot

The image file is a snapshot that includes source code, compiled code, open windows, applications... (everything is an object...)

So the code and data remain in synch - they are together in the snapshot.

Open file handles, database connections, and network connections will probably fail the next time the image is invoked. (OTOH there are ways to handle this when the snapshot is created, and when the image is next invoked.)

Lightweight Smalltalk processes are just objects, so they are saved in the snapshot and continue running the next time the image is invoked :-)

Smalltalk images are usually portable - save on MS Win machine, copy the image file to Unix machine and continue working on Unix machine.

### Consistency

So the code and data remain in synch - they are together in the snapshot.

This is misleading and is exactly the kind of thing the eRights page was talking about. Upgrades are the issue. While fresh code is (likely) in synch, as upgrades are made things will fall out of synch. What happens when two upgrades want to change the same method incompatibly, or how does code that works with upgraded objects (say with an additional field) deal with old ones?

On an unrelated point:

The image file is a snapshot that includes source code, compiled code, open windows, applications... (everything is an object...)

"Everything is an object" is completely irrelevant to the behavior. An image is just a copy of the heap and the current machine state.

Note (to Luke Gorrie): the eRights page linked by Paul Snively explicitly mentions Smalltalk.

### Erlang

Persistent Erlang systems have been built with live upgrades and downgrades. I think a general answer to the problems you stated has to do with designing a system in a modular way using transactions to allow the programmer to define safe points where one module can be upgraded without breaking dependencies.

### Pardon me, really...

Derek, do you really know enough about Smalltalk to state this or that is misleading?

1. edit the definition of a Smalltalk class to add a field
2. accept the new definition
3. any object of that class (that existed in the image before the field was added) will now have the new field

"What happens when two upgrades want to change the same method incompatibly"

Do you have a specific example?

"the eRights page... mentions Smalltalk"

"Smalltalk, with its easy support for live upgrade, is not a counter-example. This support cannot be made reliable, and is instead designed for programmers-as-customers who know how to recover from inconsistencies."

Be nice to know why it cannot be made reliable, wouldn't it.

Isaac Gouy: Be nice to know why it cannot be made reliable, wouldn't it.

Off the cuff, I'd imagine that they mean that Smalltalk's live upgrade allows... well... anything, so there's no protection against leaving the system in an inconsistent state.

Personally, I find EROS' move away from orthogonal persistence more interesting, since it supposedly represented the "holy grail" of orthogonal persistence, i.e. whole-system snapshotting at the OS level with issues like open network connections, file descriptors, etc. dealt with. But I've been unsuccessful in finding a good single explanation of Shap's reasoning in doing so.

Update: There's an interesting thread on the subject here.

Update again: Mark Miller's post here sheds a little more light in the context of E.

### Because...?

"Smalltalk's live upgrade allows... well... anything"

It's Smalltalk - it can be changed to do as much or as little as required.

If there exists a 'live upgrade' technique that protects against leaving the system in an inconsistent state, why would that technique not be usable in Smalltalk?

### Nomic

Isaac Gouy: It's Smalltalk - it can be changed to do as much or as little as required.

It's not clear whether that's a problem statement or a solution statement. :-)

Isaac: If there exists a 'live upgrade' technique that protects against leaving the system in an inconsistent state, why would that technique not be usable in Smalltalk?

I think precisely because it wouldn't be mandatory and some later "upgrade" might revert it, or pervert it, or...

More prosaically, I imagine that they meant "Smalltalk's live upgrade can't be made reliable short of replacing it." If your point is that it's replaceable, I doubt anyone would argue with you, but it leaves open the question as to what, exactly, you could replace it with that would be guaranteed reliable without violating some other property of Smalltalk that's considered fundamental. Of course, we'd both be better off just asking the ERights team what was meant.

First, just to be clear, Smalltalk does have working orthogonal persistence, via the snapshot mechanism, and the comment on the quoted page is not a statement about Smalltalk's support for persistence. It is specifically about live upgrade, i.e., changing an existing class in the context of a running system, and having all outstanding instances now act according to the new class definition.

When such a live upgrade happens, all instance representations are effectively converted at that time, so that the value of each old instance variable is transferred to the location of the new instance variable of the same name, if any, within the new object's representation. Any old instance variables not present in the new class definition are simply dropped. Any new instance variables not present in the old are initialized to null.
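The conversion rule just described can be sketched in Python (a dict standing in for an instance's named variables; `migrate` is a hypothetical name, not anything from an actual Smalltalk implementation):

```python
def migrate(instance_vars, new_fields):
    """Convert an old instance's state to a new class layout:
    same-named variables carry over, variables absent from the new
    layout are dropped, and new variables start out as None
    (Smalltalk's nil)."""
    return {name: instance_vars.get(name) for name in new_fields}

old = {"x": 1, "y": 2}             # old layout: x, y
new = migrate(old, ["x", "z"])     # new layout: x, z
# x is carried over, y is dropped, z is initialized to None
```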

First, the non-fatal issue:

A representation-conversion should be correctness preserving, i.e., it should have the property that if the old instance conformed to its old class' invariants, then the converted instance will conform to its new class' invariants. The above conversion rule does this surprisingly often. Often, the programmer can carefully write the new class so that the above conversion will have this property. In cases where this is too hard, one could imagine enhancing the live upgrade mechanism so that the programmer can provide explicit conversion code as part of the new class definition. (Perhaps Smalltalk even has such a thing? I don't know.)

However, live upgrade can't be made reliable in Smalltalk primarily for two reasons, both of which have to do with changing methods, and neither of which have to do with changing the representation of named classes:

1) Smalltalk is multi-threaded. (This happens to be mostly-cooperative multi-threading, but this makes no difference to the issue here.) There is no moment at which all threads are necessarily quiescent. Mutable objects are only required to satisfy their class' invariants when they're quiescent, i.e., not when they're in the midst of executing a method. Further, once a method is upgraded, how should stack frames in the midst of executing that method proceed? Should we also allow the programmer to provide code for converting old stack-frames in progress into corresponding new stack-frames in progress? Anyone think this is realistic?

2) Named classes are not the only definitions that have instances. A Smalltalk "block" is an anonymous lambda expression. A Smalltalk "block closure" is the closure resulting from evaluating the lambda expression -- it is an instance of its block. These are anonymous, so there's no reliable way to say, for a new version of a method, which of its contained blocks are the successors to which contained block, if any, from the old method. Without this information, Smalltalk can't know which new block the old block closures should be upgraded to be an instance of.

IIUC, what Smalltalk actually does in both of these cases is to not upgrade the impossible representation: it allows old stack frames to run to completion on their old method code, even though this old code will now be interacting with new objects. Likewise, block closures are not upgraded. They continue to behave according to the code of their original blocks. For supporting rapid interactive development, these are probably the right choices. The point though is that, given the overall Smalltalk architecture - no quiescence points and anonymous lambda expressions - neither these nor any other possible choices will always preserve consistency across a live upgrade, no matter how careful the programmer.

Erlang does live upgrade of a sort, separately per-process, at a quiescence point in the logic being upgraded within each process, and where the only state passed forward from the old code to the new are data - terms, not instances of old anonymous lambda expressions.
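A Python sketch of this Erlang-style discipline (all names hypothetical): the handler can only be swapped at a quiescence point, between messages, and the only thing carried across the upgrade boundary is plain data, never a closure from the old code:

```python
def server_loop(handler, messages):
    """An Erlang-style process loop: code is only replaced between
    messages, and only the plain-data state survives the upgrade."""
    state = 0
    for msg in messages:
        if isinstance(msg, tuple) and msg[0] == "upgrade":
            handler = msg[1]           # install the new code
        else:
            state = handler(state, msg)
    return state

def handler_v1(state, msg):            # old code
    return state + msg

def handler_v2(state, msg):            # upgraded code
    return state + 2 * msg

result = server_loop(handler_v1, [1, 2, ("upgrade", handler_v2), 3])
# 1 and 2 handled by v1 (state 3), then 3 handled by v2: 3 + 6 = 9
```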

E does not yet support live upgrade. It never has, and I wouldn't be surprised if it never does. However, two aspects of E make success more likely: like Erlang, E has natural quiescence points - per vat between turns. E does have anonymous closures, but E's syntax is biased towards encouraging the naming of any object definition expressions whose instances might still be live after their creation turn. (Closures needed for conventional immediate-sequential control structures don't survive their creation turn, and so won't need to be upgraded.)

### Non-orthogonal persistence and distributed consistency

EROS itself is now CapRos, and its architecture remains designed to support orthogonal persistence. The EROS successor Coyotos initially differed from EROS primarily by dropping support for orthogonal persistence. But the problem this leads to, secure restart, is one they never solved. Instead, Coyotos is now expected to support orthogonal persistence, in much the same way EROS does.

Regarding E, chapter 17 of my thesis explains more about E's approach to non-orthogonal persistence. The issues involved in recovery of distributed consistency following the crash-revival of a vat may be of interest. Be sure not to miss the discussion of post-crash consistency of KeyKOS device drivers in section 17.5.

It is unclear whether E's revival mechanism lends any insight to the secure restart question that Coyotos would have had to face. The systems are sufficiently different that it's hard to translate these issues between them.

### Personally, I find EROS'

Personally, I find EROS' move away from orthogonal persistence more interesting, since it supposedly represented the "holy grail" of orthogonal persistence

EROS/CapROS will always be persistent. Coyotos, the followup project to EROS, will also very likely have transparent orthogonal persistence. Various papers on that site explain why Shap was considering removing persistence.

### useful, at least anecdotally

Personally, I never used the security features of KeyKOS. (Although now, in the web world, I imagine they'd be generally useful -- being able to deploy a search engine without (a) giving away information about the engine's implementation to the users and without (b) giving away information about the users' searches to the engine's owners was the sort of problem KeyKOS was designed to address.) But I did use the persistence, maybe once or twice a week (when one lives in the same system image as the kernel developers, this sort of thing is bound to happen from time to time), and I was grateful for not losing any work.

One of the more impressive demonstrations (from after I'd left) was after they'd included a hosted UNIX subsystem -- the first part was the usual demonstration that a workstation's attention span is no longer than its power cable, and the second part involved plugging the machine back in, and then moving the mouse around the screen. The desktop would initially restore with empty windows, but as each app was given focus, it'd be swapped right back in, displaying a pre-power failure state.

Moving from that world to Windows 3.0 gave me plenty of opportunity to meditate upon the fact that all systems are fault-tolerant -- it's just that with some systems, it's the end user's job to tolerate the faults.

Sorry for the delay, I've been in New Orleans the past week or so. I'm well aware that you can do this; the issue isn't that you can add the field, the issue is what value you initialize it with. In general, it may well not be possible to make a correct choice.

As for conflicting upgrades, I don't have a specific example*, and don't really see why you need to ask. It's relatively trivial to generate one. The issue is pretty much exactly the same as CVS conflicts and the only general solution is the same, manual intervention to merge (perhaps by choosing one or the other) the changes. This is exactly what the eRights page was talking about. You cannot automate this in general and lay users may well not be competent to "recover from inconsistencies".

Looking at some of your later responses and the way you are presenting your questions, I think you see this as an "attack" against Smalltalk in particular. In one version of a response, I had a comment along the lines that this is not Smalltalk's fault, it is inherent in the problem. This is not a failure of Smalltalk. As I mentioned above, this is about the same problem in RCSes or SCCSes, and the only such system I'm aware of that attempts anything beyond the most basic automated merging is darcs, and of course that gets nowhere near "solving" the problem.

* However, when I did play with Squeak, I certainly had such issues.

### Transparent persistence is not well-defined

I have never believed in transparent persistence because there are too many cases where either it's not clear how it should behave (and the programmer must specify it explicitly), or it can't work at all. Examples:

• If the state is restored in a program which has certain types defined slightly differently, which parts should be saved and restored, and which should acquire the new meaning from the running program? This depends on what we consider a part of the object; for example, if data refers to a function (maybe a global named function, maybe an anonymous closure).
• If we talk about saving a given piece of data, as opposed to freezing the whole session, where are its boundaries? What if it refers to objects considered global?
• How to save open files? Open network connections? It doesn't make sense conceptually because we can't force the rest of the system nor other systems to maintain a state which can be connected back in the same configuration.
• A practical language allows wrapping objects implemented in other languages. These are opaque even to the runtime system, so they can't be serialized transparently without explicitly specifying how to serialize each particular kind of object.

Serialization makes sense when programmers explicitly specify how a given type is serialized. The system can automate that in simple cases, but even then the default definition should be overridable in case it's not appropriate (e.g. part of an object's state is a large cache which is better recomputed than stored, or it has an invariant which should be guaranteed even if the serialized data has been altered). There will always be some unserializable types (except in toy languages).
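Python's pickle protocol is one concrete instance of this design: the system provides a default, and the `__getstate__`/`__setstate__` hooks let a type override it. A sketch (the class and its cache are made up for illustration):

```python
import pickle

class Analyzer:
    """A type whose default serialization would be wrong: part of
    its state is a cache that is cheaper to recompute than to store."""
    def __init__(self, data):
        self.data = data
        self._cache = self._build_cache()

    def _build_cache(self):
        # stand-in for an expensive computation
        return {x: x * x for x in self.data}

    def __getstate__(self):              # override the default
        state = self.__dict__.copy()
        del state["_cache"]              # don't persist the cache
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self._cache = self._build_cache()  # recompute on restore

restored = pickle.loads(pickle.dumps(Analyzer([1, 2, 3])))
# restored.data survives; the cache has been rebuilt, not stored
```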

### Squeak?

How do Smalltalk's saved images deal with these issues?

### Transparent Persistence IS well-defined

I'd like to respectfully disagree, Qrczak. TP may be hard to define if it is done at the application level and must interact with legacy systems, but I think it can be well-defined if it is implemented at the kernel level. Would you agree that it might be well-defined with the following caveats?

1. I am not considering per-application persistence, but rather TP on the entire OS itself. Therefore, closures, function references, etc. would remain intact both before and after a reboot. Also, this means that persistence is always global.

2. Open network connections would, upon a system restore, simply throw an exception, just as if a web server had returned an error during a transfer. Good network code has plenty of error-handling logic. This is more of a distributed-systems-fault-handling problem.

3. Files don't really exist in a TP system, since any block of memory has a guarantee of being persisted to the disk.

4. If a wrapped object had a valid memory representation at one point, why can't that simply be restored back after a reboot, just as if virtual memory had paged that memory out and then back in? The whole point is to bring the system back to the same memory state so that you can program your code as if the system was not rebooted.

5. You do bring up a good point - caches, video memory, etc that should not be persisted. This could be implemented as a special "NoPersist" attribute in the language implementation.

6. Please clarify what you mean by an unserializable type. I understand that hands-free CROSS-MACHINE serialization is not always well-defined, but surely WHOLE-SYSTEM INTRA-MACHINE serialization is trivial? I mean if the entire RAM is brought back to an identical state?

In conclusion, I agree with your statement in general, but not as an absolute - TP can be well defined if it is implemented at the bottom level - the kernel. What do you guys think?

Geoff

### Ok, better defined, but is it useful?

In general it works better when the boundary between what is saved/restored and what is acquired from the new reality is smaller.

Shifting it to the OS level removes some places which are difficult to synchronize (e.g. references other processes running in the system), and adds other such places (hardware configuration which might change in the meantime, which was not a problem for per-application state saving).

It doesn't allow freezing an individual process and restarting it on another machine. This means that application writers must still implement something explicitly if they want their application's state to be saved and restored elsewhere. In particular it's not a substitute for saving documents.

But what I'm really worried about is that it throws away OS concepts that we are used to. It would have to bring tremendous advantages for people to accept that they no longer work with files, and it would have to redesign and redevelop substitutes for the plethora of developer applications and mechanisms: editors, compilers, version control systems, access restrictions on a multiuser system, backups, distributing the application over a network, searching for text in the source, finding differences between two versions of the source, typesetting documentation, temporarily forking a part of the source to try something out and later reverting to the old version (without disrupting the evolution of other parts), moving sources between projects etc.

In fact this is why I reject Smalltalk. Because it creates its own world, separate from the rest of the system, and insists on reimplementing everything in a way which is limited to Smalltalk rather than language-agnostic.

### Richard Gabriel thinks so

Its lack is his first example of what's wrong with the state of technology that he uses to motivate his Feyerabend Project.

### And I don't think so

I don't turn off my computer.

### Cautionary tale

I was thinking about this recently. The Newton was built on a language (NewtonScript) descended from Smalltalk via Self; in particular, everything in the system (including the entire system) was one great big object, and the OS took care of persisting everything. Basically, the thing was a Dynabook, but not in Smalltalk.

It made for some great features on the Newton (e.g., it was possible to extend the apps); but it also meant that you were stuck in a walled garden. It was possible to back up your data to the desktop, but all that meant was that your desktop had a copy of the master NewtonScript object. "Transparent persistence" turned your data into an opaque blob.

### Costs

francis: "Transparent persistence" turned your data into an opaque blob.

Not necessarily—I can't help but note that what you've described is the process of taking the entirety of your Newton environment and embedding it in another environment. What else would you expect but for it to be a blob? You're obviously not claiming that the Newton didn't interact with other systems because it obviously did. Nor, obviously, are you claiming that it was impossible to write Newton software that could marshal NewtonScript objects to and from other formats for serialization and/or persistence.

Unfortunately, though, this does indeed seem to be what people really want in practice: they want to pull their system apart piecemeal into formats that are meaningful in the context of other systems with other design goals and other features, and from that point of view, "orthogonal persistence" is indeed—well—orthogonal: it doesn't solve that problem.

It's been interesting reading the E language take on orthogonal persistence and the EROS/Coyotos take: E observes that it's basically impossible to do orthogonal persistence and upgrade at the language level without OS support; EROS observes that it's possible to do orthogonal persistence at the OS level but it seriously complicates the design and makes it unfamiliar to application developers and users; Coyotos observes that if you abandon orthogonal persistence at the OS level you get a simpler design that's both more amenable to formal verification and more familiar to developers and users, but where does that leave a language like E?

Let's face it: these are hard problems!

### Nor, obviously, are you claim

Nor, obviously, are you claiming that it was impossible to write Newton software that could marshal NewtonScript objects to and from other formats for serialization and/or persistence.

It was possible; but the Newton seduced people into thinking it wasn't necessary. If you want to be able to get your data out of the persistent world, you still need to write an ordinary serializer; so you might as well use that serialized format for data storage, rather than winding up with incompatible representations.

### This is the Conflation of Concepts I Was Describing!

francis: If you want to be able to get your data out of the persistent world, you still need to write an ordinary serializer; so you might as well use that serialized format for data storage, rather than winding up with incompatible representations.

This is where people always seem to go astray: it isn't a value-neutral thing to say "since I need to serialize and export objects anyway, my orthogonal persistence should be done in those terms." It has ramifications in terms of how virtual memory works (if you have virtual memory); it has security ramifications (cf. E and EROS); it has semantic implications with respect to the software (does it even make sense to talk about serializing object X and not object Y?) etc. It's far from obvious that Newton did the wrong thing, but also far from obvious that they did all of the right things. I do maintain that one right thing is treating interoperation, e.g. serializing objects to some transport, and orthogonal persistence, e.g. snapshotting the entire system, as distinct tasks.

### The system must do the type/class bookkeeping

Since the O/S takes over persistence, it is the O/S that must make sure that types match the data. So if a 'file' is 'opened' by an incompatible 'type', the system should throw an exception. It is assumed that the system must not only keep types, but versions of types also.

For example, a Word 95 document should be handled by type/class 'WordDocument' v. 95, and a Word 2003 document should be handled by a type/class 'WordDocument' v. 2003.
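A minimal Python sketch of what such bookkeeping might look like, assuming a hypothetical handler registry keyed by (type, version), with the system refusing to hand a persisted object to an incompatible handler:

```python
# Hypothetical registry: the system records (type, version) for every
# persisted object and only dispatches to a matching handler.
handlers = {}

def register(type_name, version, handler):
    handlers[(type_name, version)] = handler

def open_object(type_name, version, payload):
    handler = handlers.get((type_name, version))
    if handler is None:
        # the "exception" the comment above calls for
        raise TypeError(f"no handler for {type_name} v{version}")
    return handler(payload)

register("WordDocument", 95, lambda p: f"95:{p}")
register("WordDocument", 2003, lambda p: f"2003:{p}")
```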

### Transparent Persistency -- Looking at the World from the Store

For the world's suite of computer systems to be most useful, many of these computer systems must include persistent stores; I don't think this is a controversial statement. Now a conventional way of viewing the interaction between a persistent store and the world is from the viewpoint of the world. For example, as members of the world, you and I might interact with a Unix file system. We know that the state of the file system includes i-nodes, directories, files, etc., and by interacting with the file system, we cause it to evolve the state of those components.

An alternative viewpoint, which I suggest is worthwhile, is to say instead of viewing the state of the persistent store from the point of view of the world, let's view the state of the world from the point of view of the persistent store.

To do this, we can take the persistent store to have some native programming language, and we view the state of the store as being just the state of execution of a program in this language as it interacts with the world.

An example of this is the GemStone database management system (DBMS). GemStone's native language is Smalltalk augmented with an operation to demark transaction boundaries.

Another possibility for the native language of a persistent store would be a concurrent constraint language. This has the advantage of declarativity. There is no need for an added construct to demark transactions, because concurrent constraint languages have natural transactions. That's why we call these languages concurrent.

Which would be more suitable as the native language of a persistent store, a concurrent constraint logic language, or a pure lazy functional language augmented with an infinite supply of oracles?

And if your answer is a concurrent constraint logic language, should it be eager or lazy? Is there even such a thing as a lazy concurrent constraint logic language?

An example eager concurrent constraint logic language is ToonTalk. Another example is the Actor model of Hewitt and Agha.

The Carlsson and Hallgren (1998) thesis on Fudgets discusses oracles.

### lazy concurrent constraint logic language

Is there even such a thing as a lazy concurrent constraint logic language?

Curry or Alice are good starting points.

### And Oz too

And Oz too, of course.

### Oz

No, Oz is an imperative language. It has Cells, which are subject to assignment. This is a primitive in the language, not something built up from a functional base.

### So does Alice

So does Alice. In my experience Oz code limits the use of state just as much as ML code.

### State is only bad if it contaminates everything

No, Oz is an imperative language. It has Cells, which are subject to assignment.

Of course Oz is an imperative language. Haskell is also an imperative language (because of unsafePerformIO). This is not bad because you can easily program without state in both languages if you want. The state is well-factored from the rest. It would be bad if the state were to contaminate everything (like in Java, in which for example all arguments to calls are implicitly stateful).

### State Isolation

I've just barely examined Oz, but it seems that state is allowed to be used anywhere. If so, there's a crucial difference from Haskell, where a programmer can know that a function is pure just by looking at its type, as opposed to Oz, where things are generally pure, but code has to be prepared for impure arguments, or ward them off with documentation. Of course, the distinction is minor compared to that with something like Java or C, where much of the standard library involves state.

I assume by "Haskell is also an imperative language" you mean Haskell isn't referentially transparent, which is as true as "Java is not memory-safe" (thinking of JNI). The convention in Haskell is that by using a function whose name starts with unsafe the programmer is either claiming to possess a proof of some function-dependent obligation, or waiving any right to any of the nice properties of the language, all the way up to memory-safety. The promise around unsafePerformIO is that the result of the argument computation doesn't actually depend on when or how often it is executed (or what is executing concurrently). The ST monad and runST are pretty neat because they actually prove that with the type system for a restricted set of computations using just mutable variables.

### What should a type signature tell you?

a programmer can know that a function is pure just by looking at its type

It is often useful to have a stateful implementation of a function that behaves as if it were stateless. For example, if you want to implement memoization from scratch (i.e., the language doesn't give it to you magically), then you need this. If the type system does not let the implementor of a function hide its statefulness, then I consider it too weak.
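The point translates outside Haskell too; a Python sketch (the `memoize` wrapper is hypothetical), where a mutable cache hides behind an unchanged `a -> b` interface:

```python
def memoize(f):
    """Wrap f in an internal mutable cache without changing its
    interface: callers still see a plain function of one argument,
    even though the implementation is stateful."""
    cache = {}
    def memoized(x):
        if x not in cache:
            cache[x] = f(x)
        return cache[x]
    return memoized

calls = []
def slow_square(n):
    calls.append(n)        # observable effect, for demonstration only
    return n * n

fast_square = memoize(slow_square)
fast_square(4)
fast_square(4)
# slow_square ran only once; the second call hit the cache
```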

### Types and state

If the type system does not let the implementor of a function hide its statefulness, then I consider it too weak.

I would argue the opposite: a type system that lets you hide a function's statefulness is weaker in the sense that you cannot fully trust the type system. Pragmatic, sure; but weak? In any case I don't think this is about a property of the type system as much as it is about how the type system is used in practice by an implementation.

### What Does "Hide" Mean in This Context?

Does Haskell's STRef "hide" the statefulness of a function? If so, in what sense can the type system not be trusted, and in what sense is this not a property of the type system? If not, in what sense does STRef not "hide" the statefulness of a function?

### STRef doesn't hide anything.

STRef doesn't hide anything. The ST monad (and importantly runST) does, though.

### Memoization and unsafePerformIO

Does Haskell's STRef "hide" the statefulness of a function? If so, in what sense can the type system not be trusted, and in what sense is this not a property of the type system? If not, in what sense does STRef not "hide" the statefulness of a function?

The example Peter gave was implementing memoization. A correct implementation of memoization is referentially transparent from the outside even though it is internally side-effecting. Thus you'd like to keep the state monads out of the picture for modularity reasons: I should be able to make a function memoizing without changing its interface. In GHC they work around this "problem" by using unsafePerformIO. I think this is a great pragmatic compromise, but my point was that allowing something like unsafePerformIO into the language weakens the type system in a formal sense.

### There's a lazy version of

There's a lazy version of the ST monad; I imagine that can be used instead.

Isn't the whole point that you don't want the interface to change so the state monadic style is inapplicable here? Essentially we want memo :: (a -> b) -> a -> b instead of memoST :: (a -> b) -> ST Memo a -> ST Memo b. GHC achieves this with unsafePerformIO.

### Depends on what you're

Depends on what you're memoising. In many cases there's a data structure of results from which you can retrieve the one you want, and you can use a lazy invocation of the ST monad to build that structure.

### Misunderstanding ST?

A major reason ST exists is because its use does not need to be apparent in the type of a function. There is a function runST :: (forall s . ST s a) -> a, which lets you do calculations involving mutable state inside a pure function, so long as the calculation is actually referentially transparent, because it only uses mutable structures it creates itself. (The forall cleverly ensures this - essentially, skolemization of s during type checking creates a unique brand for each ST computation - see Oleg's work under the heading Lightweight Static Capabilities for similar enforcement of more interesting properties.)

What you can't do with ST, and perhaps what you meant if I am the one misunderstanding, is share a piece of mutable state between different applications of the function internally using ST. I think proving that a function which does share state like that is still referentially transparent is well beyond the Haskell type system (cue Oleg), thus falling back to iveProvenThisSafeMyselfSoPleasePerformIO.

What you could do with ST would be something like a mapMemo :: (a -> b) -> [a] -> [b], which behaves like map but internally uses a mutable hash-table to memoize between mapping different list elements.
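A sketch of that mapMemo (using a Map rather than a hash table for brevity): the cache is an STRef created inside runST, so the mutation is invisible outside and the result is pure.

```haskell
import Control.Monad.ST (runST)
import Data.STRef (newSTRef, readSTRef, writeSTRef)
import qualified Data.Map as Map

-- Like map, but each distinct list element is computed only once;
-- the cache lives and dies entirely inside this one runST.
mapMemo :: Ord a => (a -> b) -> [a] -> [b]
mapMemo f xs = runST $ do
  ref <- newSTRef Map.empty
  let step x = do
        m <- readSTRef ref
        case Map.lookup x m of
          Just y  -> return y
          Nothing -> do
            let y = f x
            writeSTRef ref (Map.insert x y m)
            return y
  mapM step xs
```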

### runST

What you can't do with ST, and perhaps what you meant if I am the one misunderstanding, is share a piece of mutable state between different applications of the function internally using ST. I think proving that a function which does share state like that is still referentially transparent is well beyond the Haskell type system (cue Oleg), thus falling back to iveProvenThisSafeMyselfSoPleasePerformIO.

Yes, this is what I'm talking about. I realize that you can use runST without changing the external interface when you only need "internal" memoization.

### A good many type systems in

A good many type systems in use today aren't strong enough to type the ST monad or a near-equivalent correctly - and thus couldn't allow you to safely hide the fact a pure function has an internally stateful implementation.

### The ST monad is my friend!

The ST monad is my friend! Though yes, it does leave us with reasons to want impredicative rank-n polymorphism.

### Curry

Curry isn't even referentially transparent. Referential transparency is considered desirable, isn't it? I'm still not clear exactly why, but the FP community seems to think this property brings important benefits.

Well, it gets hard to reason about a function if you call it with the same parameters 100 times and get 100 different results.

Perhaps it's the ultimate refuge of the truly touched: "Insanity: doing the same thing over and over again and expecting different results."

### Reasons I like referential transparency.

Software tests require repeatable results. When I did J2EE for a living, it was difficult to ensure that the state of the J2EE server was exactly the same on different test runs.

With referential transparency,

• The output of a function depends only on the input. There are no surprises.
• A call to a function may be replaced by the result of the call.
• Code can run in multiple threads at the same time.
• Lazy evaluation becomes worthwhile.

Some of this has been written about on WardsWiki; see AdvantagesOfFunctionalProgramming, GraphReducer, and FpVsOo for some useful insights.

The last one is a particular favorite of mine; it got me thinking about the nature of identity in FP vs OO. Alistair Bayley mentioned that pure FP has only value equality, and I was slightly more enlightened when I understood that.

There are two in-progress articles that will hopefully be in the March issue of The Monad.Reader. One is an article on the nature of FP vs the nature of OO by Alistair Bayley, and one is on duality by Esa Pulkkinen. Those two articles fit together in that FP can be viewed as the dual of OO. Now that I've wandered far from the topic, maybe I should stop...

--Shae Erisson - ScannedInAvian.com

### Why RT matters

RT gives you more static guarantees about the behavior of a given piece of code. If a language has referential transparency, code that produces observable (for some definition of observable, YMMV) side-effects will have a distinct type (e.g. IO Bool in Haskell, *World -> (Int, *World) in Clean). AFAIK the same can't be expressed in languages without RT.

### RT in a non-RT language

AFAIK the same can't be expressed in languages without RT.

Using abstract types (if the language supports them) you can express referentially transparent constructions even in a non-referentially transparent language: Pure ML.

Without inspecting the source code of the SKI structure (i.e. just relying on the sig and the compiler), how can one be sure that using $ or S won't have side-effects? Also, how useful is a structure like that if I can't inject any values into it? At the end of the day we have to create a sufficiently large pure language inside the pure structure/functor that is just like a kernel RT language. We could even create an abstract type in the kernel RT language to represent operations with side-effects, with abstract operations for sequencing such operations, injecting values into it, and primitives for opening files, sockets, etc. Perhaps we should call it the IO monad ;).

How far do we need to go until we end up with an interpreter for an RT language inside a non-RT one? As languages are Turing complete, how does this embedding invalidate my earlier point? It just reinforces the idea that you need a static analysis phase to prove static properties of code, and that if the type system is sufficiently expressive we can emulate features of another language/type system in it.

### Missing the point

Without inspecting the source code of the SKI structure (i.e. just relying on the sig and the compiler), how can one be sure that using $ or S won't have side-effects?

That is a red herring and misses the point. But, I'll counter with an analogous question. Without inspecting the implementation, how can one be sure that computations expressed using the IO Monad *will* have the intended side-effects?

Also how useful is a structure like that if I can't inject any values in it?

(I'm surprised. A Haskell programmer should know better. ;-) You could just as well ask how useful is a Monad or an Arrow if you can't just inject new kinds of *computations* [Edit: I meant computations that have new kinds of side-effects] into it.

How far do we need to go until we end with an interpreter to an RT language inside an non-RT one?

You could say that the SKI structure already is such an interpreter. Indeed, that's the idea. You can express RT computations in a non-RT language by creating a combinator library that doesn't allow you to create non-RT computations. If you think about it, RT and non-RT languages are duals:

In an RT language you express non-RT computations as combinations of primitive non-RT computations that are then interpreted. (That's what you do with Monads, Arrows, etc.)

In a non-RT language you express RT computations as combinations of primitive RT computations that are then interpreted. (That's what you'd do with the SKI structure.)

As languages are turing complete how does this embedding invalidate my earlier point?

Who said anything about invalidating points? My point is that you *can* express RT constructs in a non-RT language.

[Edit: Clarified some wordings. No intentional semantic changes.]

### I miss the point because I write in point-free style

That is a red herring and misses the point. But, I'll counter with an analogous question. Without inspecting the implementation, how can one be sure that computations expressed using the IO Monad *will* have the intended side-effects?

I'm surprised. A SML programmer should know better. ;-)
By analyzing the code's transitions using the language's semantics/type system and verifying that the desired side-effects do occur. As far as I'm concerned, the language's type system/semantics is its definition. If a type is int -> int it does matter whether the language offers RT out of the box or not. If the language has RT, I can be sure that no such function will ever (modulo unsafePerformIO and friends) do anything other than diverge or produce a valid int (we could even restrict divergence if we wanted, a la Charity). In a language without RT I need to analyze every function to see if it has any side-effects.

Simon Peyton Jones once wrote that unsafePerformIO shifts the burden of proving that a piece of code doesn't have [visible] side-effects from the compiler to the programmer. In a language without RT the burden is always on the programmer.

In my post I said that RT does give you additional static guarantees. If I start programming in a subset of SML that is proven to produce no side-effects I'm not programming in SML anymore. Also the full proof can't be checked by the type-system. Of course the type checker will verify that the abstract types are properly used but the kernel library that defines the semantics of these abstract operations can't be verified by the compiler.

If you think about, RT and non-RT languages are duals

I agree with you (using your definitions) that they have this dual nature. We could start with a RT language and add a side-effecting layer (i.e. monads, unique/linear types) or start with a non-RT language and add a "pure" layer, but unless the latter is part of the language's semantics/type system we can't have the full guarantees given by the former.

### No, seems like a more fundamental misunderstanding

By analyzing the code's transitions using the language's semantics/type system and verifying that the desired side-effects do occur.

Precisely. You'd do the same with the SKI language (to be sure that side-effects do not occur). The SKI "language" (or any other pure language you'd want to use) can be given (indeed, has already been given) a precise definition. The only difference is that the SKI language is implemented as a combinator library in SML rather than as a Haskell run-time written in Haskell and C (for example).

If a type is int -> int it does matter whether the language offers RT out of the box or not.

Indeed, the meaning of a type is always interpreted with respect to a type system. While both SML and Haskell have a type constructor -> they are not the same, because the underlying type systems are different.

In a language without RT I need to analyze every function to see if it has any side-effects.

Simon Peyton Jones once wrote that unsafePerformIO shifts the burden of proving that a piece of code doesn't have [visible] side-effects from the compiler to the programmer. In a language without RT the burden is always on the programmer.

Is this some kind of attempt at appeal to authority? Again, you can specify and use a combinator library like SKI to define RT functions so you don't have to. I can prove that you can't build non-RT values using the SKI structure and I don't need SPJ to do that.

If I start programming in a subset of SML that is proven to produce no side-effects I'm not programming in SML anymore.

That is just ridiculous. So, assuming that you apply the same definition to Haskell, then when you program in Haskell you always use the full language (including unsafePerformIO) at each point of the program or otherwise you are no longer programming in Haskell.

Also the full proof can't be checked by the type-system. Of course the type checker will verify that the abstract types are properly used but the kernel library that defines the semantics of these abstract operations can't be verified by the compiler.

The run-time system and compiler of a typical Haskell implementation that implements the various operations of the IO Monad is not verified by Haskell's type system to perform the desired side-effects.

We could start with a RT language and add a side-effecting layer (i.e. monads, unique/linear types) or start with a non-RT language and add a "pure" layer, but unless the latter is part of the language's semantics/type system we can't have the full guarantees given by the former.

It is trivial to see that you can't build non-RT functions using the SKI structure and the type system of SML guarantees it (through type abstraction).

### Let's just agree to disagree

I'm not trying to prove that Haskell (or Clean or whatever) is superior, as I'm not trying to prove that any approach is superior. My point was very precise and limited: that RT gives you an additional static guarantee (and I'm not even trying to prove that this guarantee is useful or whatever). If I implement a SKI/Lambda Calculus/Haskell library on top of SML, I will have the no-side-effects guarantee only for code written using this library; all other features of SML still don't have this guarantee. This is a fact because we don't have RT in SML as a whole, just in this subset of SML.
If you still feel that I'm misunderstanding you, please go ahead and explain it. I'll read your post, but I won't bother other LtU regulars with yet another answer (from me), because I can't say anything different from what I already said.

### Neither more nor less

Non-RT and RT languages, as discussed here, are duals of each other. Neither really makes for more guarantees than the other. The difference is that in an RT language everything is RT except the non-RT interpreter on top while in a non-RT language everything is non-RT except the embedded RT interpreter. Using a Haskell like notation for types, you could present this as:

| Kind of language | non-RT computation/value | RT computation/value |
|------------------|--------------------------|----------------------|
| RT               | NonRT a                  | a                    |
| non-RT           | a                        | RT a                 |

In an RT language your program is a value of type NonRT a (for some a - usually ()) that gets interpreted by a non-RT interpreter. In a non-RT language your program can construct values of type RT a and interpret them with the RT interpreter (and the point is that such computations are guaranteed to be RT and be subject to further optimization, for example).
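The first row of that table can be made concrete with a toy effect-description type (all names here are hypothetical): the RT program builds a NonRT a value as pure data, and a single impure interpreter at the edge performs it.

```haskell
-- A toy "non-RT computation" described as pure data in an RT language.
data NonRT a
  = Done a
  | PutLine String (NonRT a)
  | GetLine (String -> NonRT a)

-- The one impure piece: an interpreter that performs the effects.
run :: NonRT a -> IO a
run (Done a)      = return a
run (PutLine s k) = putStrLn s >> run k
run (GetLine k)   = getLine >>= run . k

-- An example program -- itself an ordinary pure value.
greet :: NonRT ()
greet = PutLine "name?" (GetLine (\n -> PutLine ("hi " ++ n) (Done ())))
```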

In practise, languages like Haskell and Clean provide some sugar for the non-RT stuff (e.g. do-notation) and also include one (or more) non-RT interpreter(s) by default (because you couldn't really do anything useful without it) while languages like OCaml and SML do not. I can imagine adding some sugar (syntactic and perhaps more) to an ML like (non-RT) language to support RT, but such additions aren't formally necessary.

### RT could also work with restricted assignment.

Suppose we have a programming language which allows assignment of local variables only, including arguments. Wouldn't such a language pass the test of being referentially transparent? In this language, each subroutine's result would only depend on its input.

Static guarantees can also be given in the context of C as well; tools like Lint do quite effective static analysis.

### On one hand for a language

On one hand, for a language to be RT it is often assumed that every operation should be RT, and an assignment operator is clearly not: it has a side effect, even if just a local one. On the other hand, on the surface, this is exactly what Haskell's state monad does.

### Why restricted assignment operator is not RT?

Isn't the point of RT to have the results of functions depend only on their arguments? So why is a restricted assignment operator not RT?

### The point of RT is that the

The point of RT is that the value of an expression shouldn't depend on any side effects. For functions this means that their value should only depend on their arguments. For other expressions it means that their value should be constant. If a variable or reference could have its value changed by the (side) effect of an assignment operation, then it isn't constant and therefore not RT.

### Almost.

What I mean is to restrict assignment to local variables and arguments - no assignment to variables of the outer scopes. This means that functions' results would depend only on their input. A sequence of functions which modify their arguments seems similar to binding computations using <- in Haskell. It sounds like RT to me, without the limitations of pure FP.
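The parallel with Haskell can be made concrete with the State monad (a sketch using mtl's Control.Monad.State; `sumTo` is a made-up example): the "variable" is local to one run of the computation, so the function as a whole remains RT.

```haskell
import Control.Monad.State (State, evalState, get, modify)

-- Reads like a loop mutating a local accumulator, but the "mutation"
-- is confined to this single run of the State computation.
sumTo :: Int -> Int
sumTo n = evalState (go 1) 0
  where
    go :: Int -> State Int Int
    go i
      | i > n     = get
      | otherwise = do
          modify (+ i)   -- "acc := acc + i", local to this call
          go (i + 1)
```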

### Which "limitation" do you

Which "limitation" do you see pure functional programming having that is not shared by your scheme?

### The ability to do in-place value updates?

For example, in-place quicksort is not possible (perhaps) with pure FP, is it?

### I'm sorry if I'm just being

I'm sorry if I'm just being dense, but I don't see how it's possible with your scheme either.

You can't mutate an array passed to quicksort, so you have to make a copy local to the function and mutate that, then return it (probably copying it if you intend it to be mutable in the caller). Quicksort is recursive, remember, so now each of the two recursive calls returns a separately allocated array, so you have to allocate a third array to copy both of those into and return (which has to be copied yet again, if it is to be mutable in the caller).

I guess you can get "in-place" (for a suitable definition of in-place, i.e. only making two whole copies of the original array ;)) if you implement your own stack to make quicksort iterative. But then I have to ask the dual of my question---what do you expect to gain by referential transparency? You now have a world where function calls and returns are expensive (all the copying), so you end up with huge function bodies which are as imperative as ever. Seems like a net loss to me.

### Is copying necessary?

Getting a compiler to notice that expressions like x=f(x) can be optimized with call by reference seems like it would be pretty easy - much easier than trying to get it to notice that a functional quick sort implementation can be made to operate in place.

### Arguments would be modifiable.

Arguments would be modifiable...therefore there wouldn't be a need for copying the array.

### Greg Buchholz showed you why

Greg Buchholz showed you why it won't work. If you don't copy arguments, they could get stored in a closure and subsequently modified, thus functions no longer depend on only their arguments (this is Greg's example, simplified a bit):

(let* ([f (lambda (m) (lambda (n) (* (car m) n)))]
       [x (list 3)]
       [g (f x)])
  (display (g 7)) (newline)
  (set-car! x 4)
  (display (g 7)) (newline))

It doesn't seem to violate your rules. Likewise if you don't copy return values, they could be aliased in a returned closure, and functions no longer depend on only their arguments:

(let* ([f (lambda (m)
            (let ([m (list m)])
              (cons m (lambda (n) (* (car m) n)))))]
       [x (f 3)]
       [g (cdr x)])
  (display (g 7)) (newline)
  (set-car! (car x) 4)
  (display (g 7)) (newline))

You can copy your arguments and return values (Matt M thinks that a sufficiently smart compiler can sometimes avoid it). You can disallow higher-order functions (or dump lexical scoping)---ouch. Or, you can disallow mutable data structures (but then you can't write quicksort in place anyway....). Am I just missing something obvious?

Closures should only store copies of values in them, not references to values, thus preserving the property that functions depend only on their arguments.

### Now I really am confused

If closures close over copies of values from outer scopes, then why on earth disallow assignment to those copies?

### Details

As long as you have a sufficiently well defined meaning for "no assignment to variables of the outer scopes"...

 
(define (foo number-list)
  (lambda (multiplier) (map (lambda (x) (* multiplier x)) number-list)))

(let* ((n (list 1 2 3))
       (bar (foo n)))
  (begin
    (display (bar 10))
    (set-car! n 99)
    (display (bar 10))))



### RT

I think Felicia answered your question earlier: you asked whether the definition of RT was "functions" that depend only on their arguments. She answered that RT applied to "expressions". Are function calls the only kind of expressions in the language you're thinking of?

### I did point out the

I did point out the similarities with the state monad a bit further up, and sure, it is macro-expressible. I guess it depends on where you view it from: from the outside it's RT, as Achilleas says, but inside the function it is not. This is, by the way, also my view of the state monad: to me it's just an implementation of an imperative language made in a functional one.

### Perhaps...

But perhaps it's a path worth taking, especially for people new to functional programming but with experience in imperative programming.

### Encoding the equivalent,

Encoding the equivalent, using binding rather than assignment, is certainly a valuable learning exercise - and I sometimes use a similar style, though I don't actually shadow the names. There's no need for a new language there though.

### I disagree:

It would be far easier to tell a programmer "here is a programming language like the one you used to code in, with the only exception that assignment is restricted" than "here is a programming language with no assignment".

I can testify to that: I've shown and discussed pure functional programming with my colleagues, and they had great trouble doing what they had been doing so far. On the other hand, a programming language with restricted assignment would work, because they would continue to do something similar to what they did so far, with a small difference, but with a huge decrease in the number of bugs.

### And just as easy to tell

And just as easy to tell them that assignment works like reference values in Java - that is, binding. Talking about /mutable/ assignment comes later, and let/in starts to look rather a lot like C-style blocks with only one non-declaration statement.
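As a small illustration (a made-up example), a chain of let bindings already reads much like a C-style block of declarations followed by a single statement:

```haskell
-- Each name is bound exactly once; the let block plays the role of a
-- C block of declarations, and the `in` expression is its single
-- "statement".
normalize :: Double -> Double -> (Double, Double)
normalize x y =
  let len = sqrt (x * x + y * y)
      nx  = x / len
      ny  = y / len
  in  (nx, ny)
```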

### Lazy declarative concurrency

There is a nice subset of Oz that is both lazy and concurrent (see chapter 4 of CTM). Semantically, it is a lazy concurrent constraint logic language.

### Imagine

What would we software people have done if the hardware people had (don't ask how) provided to us, from the start, only one level of memory? Assume it all cost the same, accessed at the same speed, and persisted until we programmed the CPU to erase it. How would the upgrade problems first have started to manifest themselves? What approaches would have been developed to solve them? After 40 years of evolution in that environment, what would software systems have come out looking like?

### Question

After re-reading the thread, a question comes to mind: why should transparent persistence be a feature of an O/S, going down to the kernel? Couldn't it be implemented as a language runtime feature? In other words, why do you need to build a whole new O/S?

### You Don't

You don't have to. But eventually it might be interesting to re-implement the OS in the new framework.

### By Copy-On-Write

Here's an approach I hadn't thought of (which may or may not have drawbacks, but is interesting):

http://article.gmane.org/gmane.os.unununium.devel/527

OK, looks as though Torsion (cited by the original poster in this thread) is using this same approach. But I found the above-cited explanation helpful anyway for understanding.

### Similar approach

I know at least one system using a similar but higher-level approach to incremental check-pointing. If you have an incremental two-space copying GC you can get this almost for free. The idea is to write objects out to disk as they are moved from one semi-space to another, using the memory barrier (which you already need for an incremental GC) to catch attempted mutations of objects in the old semi-space. When an attempted mutation of such an object is caught, the object is prematurely moved to the new semi-space (thus writing it to disk as a side-effect) before the object is mutated. The invariant (that mutations are never applied to objects in the old semi-space) ensures that the check-point is consistent.
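A minimal sketch of that invariant (all names are hypothetical, with an in-memory list standing in for the disk): evacuating an object writes it to the checkpoint, and the write barrier forces evacuation before any mutation, so the checkpoint never records a post-mutation value.

```haskell
import qualified Data.Map as Map
import Data.IORef (IORef, newIORef, readIORef, modifyIORef)

type ObjId = Int

-- Two semi-spaces plus a checkpoint log standing in for the disk.
data Heap = Heap
  { oldSpace   :: IORef (Map.Map ObjId String)
  , newSpace   :: IORef (Map.Map ObjId String)
  , checkpoint :: IORef [(ObjId, String)]
  }

-- Copy one object old -> new, appending it to the checkpoint as a
-- side-effect of the move (as the incremental GC would).
evacuate :: Heap -> ObjId -> IO ()
evacuate h i = do
  old <- readIORef (oldSpace h)
  case Map.lookup i old of
    Nothing -> return ()                     -- already evacuated
    Just v  -> do
      modifyIORef (oldSpace h) (Map.delete i)
      modifyIORef (newSpace h) (Map.insert i v)
      modifyIORef (checkpoint h) ((i, v) :)

-- The write barrier: a mutation first evacuates the object (writing
-- its pre-mutation value to the checkpoint), then updates new space.
mutate :: Heap -> ObjId -> String -> IO ()
mutate h i v' = do
  evacuate h i
  modifyIORef (newSpace h) (Map.insert i v')
```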