Lambda the Ultimate

Finalization (CLR)
started 2/21/2004; 1:11:02 AM - last post 2/22/2004; 2:18:14 PM
Ehud Lamm - Finalization (CLR)
2/21/2004; 1:11:02 AM (reads: 9712, responses: 9)
Finalization (CLR)
Finalization can be a tricky business, but as usual Chris Brumme provides a detailed and readable explanation.

One wonders about the cost/benefit ratio of confusing language constructs like finalization methods. Surely, there has to be a better way?

This paper may also be of interest.

Posted to cross-language-runtimes by Ehud Lamm on 2/21/04; 1:14:11 AM

Olivier Lefevre - Re: Finalization (CLR)
2/21/2004; 8:32:19 AM (reads: 322, responses: 1)
Finalization has a reputation for being tricky to implement correctly, but if you don't care about implementation issues, how is it confusing?

Andris Birkmanis - Re: Finalization (CLR)
2/21/2004; 8:38:21 AM (reads: 325, responses: 0)
Specifications usually guarantee very little about finalization: the order of invocation is unspecified, already-finalized objects may still be reachable, and in some languages (Java, for example) finalizers are not guaranteed to run at all. And do not forget the problems with repeatability of test cases. All in all, it's tricky not only to implement finalization, but also to use it correctly.

Neel Krishnaswami - Re: Finalization (CLR)
2/21/2004; 9:42:18 AM (reads: 312, responses: 2)
Yeah, everything Andris said is right. But we still need finalization as a feature, since it's the easiest way to link up C libraries with explicit memory management to gc'd languages -- put the C deallocation function in a finalizer.

Andris Birkmanis - Re: Finalization (CLR)
2/21/2004; 10:06:31 AM (reads: 311, responses: 0)
I didn't deny the usefulness of finalizers. Even Haskell has them ;-)

A side note for the audience: an introduction to finalizers, "Destructors, Finalizers, and Synchronization", was left without any discussion. Should we crosslink these threads? :-)

Tim Sweeney - Re: Finalization (CLR)
2/21/2004; 1:02:58 PM (reads: 283, responses: 0)
Supporting finalization in a garbage-collected environment leads to a significant number of issues with performance, non-determinism, and collector inefficiency.

Rather than going down that path, I would much prefer to see significant R&D effort put into the development of imperative-style data structures and algorithms in a way that doesn't require finalization. I have experimented with some of these issues myself.

Here are some thoughts.

File handles: If you are just going to open a file, do a bunch of operations, and close it, you could use a monadic encapsulation similar to the way Haskell's State works. Opening the file, performing the operations, and closing it would then happen in a way that guarantees the file is closed on exit from the lexical scope in which it was opened.
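
As a rough illustration of scope-bound file handling, here is a sketch that uses a Python context manager in place of the monadic encapsulation mentioned above (a swapped-in technique, not the Haskell design itself). The helper name `opened` is an assumption made for this example.

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def opened(path, mode="r"):
    """Open `path`, hand it to the block, and always close it on exit."""
    f = open(path, mode)
    try:
        yield f
    finally:
        f.close()   # runs on normal exit and when the block raises

# Usage: create a scratch file, then read it back inside a scope.
fd, path = tempfile.mkstemp()
os.close(fd)
with opened(path, "w") as out:
    out.write("hello\n")

with opened(path) as inp:
    first_line = inp.readline()
    handle = inp            # kept only so we can inspect it afterwards

closed_after_scope = handle.closed
os.remove(path)
```

Unlike a typed monadic encapsulation, nothing here stops the handle from escaping its scope (as `handle` does above); the scope only guarantees that it is closed by then.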

But much more interesting is the possibility of using memory mapped files, and treating file-opening calls as returning a (possibly mutable) array of bytes which you can operate on. Then the memory mapped file would be closed when the array is eventually garbage collected. The big problem here is that in current file-mapping implementations, holding a file handle isn't referentially transparent: another process can observe whether the file is locked. This could be remedied by adding support in the OS for "tear-off" file mappings which memory map the file and then make it appear to the outside world that the file is closed (for example, so another process could modify it or delete it without a sharing violation). In this case, the OS paging mechanism would need to implement a copy-on-write-or-delete scheme for torn-off file handles, so the referential transparency requirements of both the file-using process and the outside world are satisfied.
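
The "file as an array of bytes" idea can be approximated with Python's standard `mmap` module. This is a sketch under assumptions: `map_file` is a hypothetical helper, and the private copy-on-write mapping (`ACCESS_COPY`) only isolates this process's writes from the file. It does not provide the proposed "tear-off" behaviour, which would need OS support that is not assumed to exist.

```python
import mmap
import os
import tempfile

def map_file(path):
    """Return the file's contents as a private, mutable, array-like mapping."""
    # The file object is closed when the `with` block exits; the mapping
    # holds its own duplicate of the descriptor and keeps working.
    with open(path, "rb") as f:
        # ACCESS_COPY: writes affect memory only, never the file on disk.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY)

fd, path = tempfile.mkstemp()
os.write(fd, b"finalization")
os.close(fd)

data = map_file(path)
prefix = bytes(data[:5])     # read it like a byte array
data[0:1] = b"F"             # mutate the private copy
mutated = bytes(data[:5])
with open(path, "rb") as f:
    on_disk = f.read()       # the file itself is unchanged
data.close()                 # explicit here; otherwise left to the collector
os.remove(path)
```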

Network sockets: Sockets for protocols like TCP are not amenable to finalization-free garbage collectors, because the process of closing a socket has non-referentially-transparent IO effects that ought not occur nondeterministically. But a language could expose higher level protocols that are more suitable, for example creating the illusion of monadic persistent connections between processes.

Note that none of these considerations is terribly relevant in a runtime like .NET, which doesn't have a first-class concept of referential transparency. There, any variable in any object might change at any time, because other functions to which you have passed the object could modify it at any time, from another thread or through reflection. But in an environment with a functional subset, or just a means of specifying and verifying limits on the scope of effects of computations, it becomes very important to avoid observable side-effects of garbage collection.

Dan Shappir - Re: Finalization (CLR)
2/21/2004; 3:33:57 PM (reads: 260, responses: 0)
I always found this finalization issue rather amusing: isn't GC supposed to make things easier? And yet finalization is so much easier in C++ than in, say, Java (I was tempted to say trivial in C++, but there are some complications there as well: partial objects, and recursive exceptions for example).

There are alternatives: in Java for example, you should use the finally clause and a close method. C# makes it even easier with IDisposable and the using statement. But, at the end of the day, the library writer cannot rely on its users to always use it optimally. Hence, she must employ finalizers, and suffer at least some performance degradation as a result. The fact that a rare system resource might not be released for the duration of the application is even worse.
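
The combination described here (explicit release for careful callers, plus a finalizer as a safety net for careless ones) might look like this in Python. It is a sketch, not any particular library's design: `Connection` and its `open_count` counter are hypothetical, and the `with` statement plays the role of Java's `finally` or C#'s `using`.

```python
import gc
import weakref

class Connection:
    """Hypothetical scarce resource with explicit close and a fallback finalizer."""
    open_count = 0   # stands in for a limited system resource

    def __init__(self):
        Connection.open_count += 1
        # Safety net: runs at collection time if close() was never called.
        self._finalizer = weakref.finalize(self, Connection._release)

    @staticmethod
    def _release():
        Connection.open_count -= 1

    def close(self):
        # A finalize object runs its callback at most once, so calling
        # close() twice, or close() followed by collection, is harmless.
        self._finalizer()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False   # do not swallow exceptions

# Preferred: deterministic release at the end of the block.
with Connection() as conn:
    during = Connection.open_count
after_with = Connection.open_count

# Sloppy caller: never closes; the finalizer cleans up at collection time.
leaked = Connection()
del leaked
gc.collect()
after_leak = Connection.open_count
```

The second case is the one that costs something: until the collector gets around to the leaked object, the resource stays held.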

Ehud Lamm - Re: Finalization (CLR)
2/22/2004; 2:14:51 AM (reads: 210, responses: 0)
Languages must dance or die, right?

But in some cases one wonders whether it's worth the price?

Franck Arnaud - Re: Finalization (CLR)
2/22/2004; 2:18:14 PM (reads: 168, responses: 0)
This could be remedied by adding support in the OS for "tear-off" file mappings which memory map the file and then make it appear to the outside world that the file is closed

This is the semantics of 'rm' in Unix: you can remove a file opened by another application. The problem is that if someone leaks such resources, you sometimes find yourself with an empty disk (du = 0) that is full (df = 0)! Well, it makes system administration more challenging.

But, at the end of the day, the library writer cannot rely on its users to always use it optimally.

It's sad when it's due to poor language support. It does not seem very hard to enforce at least some cases of usage/release with the type system, e.g. if the API for reading files was like:

 foldfile : (Line -> Result -> Result) -> Result -> Filename -> Result

The open/close can be safely included within one atomic call.
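
A minimal sketch of such a fold-style API, written in Python for illustration. The name `fold_file` and the argument order follow the signature above; treating a "line" as a string without its trailing newline is an assumption. Python will not enforce the discipline through types, so this only shows the shape of the API, in which the caller never sees a handle.

```python
import os
import tempfile

def fold_file(step, initial, filename):
    """Fold `step(line, acc)` over the lines of `filename`.

    The file is opened and closed inside this call, so callers never
    hold a handle they could forget to release.
    """
    acc = initial
    with open(filename, "r") as f:
        for line in f:
            acc = step(line.rstrip("\n"), acc)
    return acc

# Usage: count lines and find the longest one in a scratch file.
fd, path = tempfile.mkstemp()
os.write(fd, b"one\ntwo\nthree\n")
os.close(fd)

line_count = fold_file(lambda line, acc: acc + 1, 0, path)
longest = fold_file(lambda line, acc: max(acc, len(line)), 0, path)
os.remove(path)
```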

Even some trivial stuff like allowing finalizers only on reference-less objects would make the "freeing C API resource" pattern much safer; and is not incompatible with having a distinct unsafe facility for other things, if needed.