archives

The Transactional Memory / Garbage Collection Analogy

Courtesy of my shiny new commute, I have been listing to various podcasts, including Software Engineering Radio. A while back, they had an interview with Dan Grossman on his OOPSLA 2007 paper, which I have not seen discussed here.

The Transactional Memory / Garbage Collection Analogy is an essay comparing transactional memory with garbage collection based on the analogy:

Transactional memory (TM) is to shared-memory concurrency
as
garbage collection (GC) is to memory management.

Grossman presents the analogy as a word-for-word transliteration of a discussion of each of the technologies. (Hence the "fun" category.)

(As an aside, Grossman does not address message-passing, but says, "Assuming that shared memory is one model we will continue to use for the foreseeable future, it is worth improving," which is probably a correct assumption.)

One point that he does make is that

[This essay] will lead us to the balanced and obvious-once-you-say-it conclusion that transactions make it easy to define critical sections (which is a huge help in writing and maintaining shared-memory programs) but provide no help in identifying where a critical section should begin or end (which remains an enormous challenge).

The one serious weakness of the analogy, to me, is that GC does not require (much) programmer input to work, while TM does.

Although some parts of the analogy are strained, there are some interesting correspondences.

"Very linear" lambda calculus

What portion of linear lambda calculus corresponds to using only the B and I combinators (composition and identity)? I have a hunch that it's all lambda terms that use their input variables exactly once and without permuting them, but I'm finding it hard to prove that.

The standard abstraction elimination algorithm doesn't work to give a combinator:

\abcd.a(b(cd))
=\abc.B a (B b c)
=\ab.B (B a) (B b)
=\a.B (B (B a)) B

and to eliminate the variable "a" here, I need to use the C combinator.

But there is a combinator that is extensionally equivalent that doesn't use C:

\abcd.a(b(cd))
=\abc.B (B a b) c
=\ab.B (B a b)
=\a.B B (B a)
=B (B B) B

It depends on using the associativity of composition.

How can I tell which linear terms are in the span of B and I?

A Java-like formalism for control flow analysis.

I'm looking into defining a flow-sensitive effects system for Java (to check things like the definite assignment of local variables). Does anybody know of a good existing Java-like formalism I could use?

I don't think Featherweight Java is appropriate here because it mostly ignores Java's control-flow mechanisms. Much of what's interesting about a flow-sensitive effects system has to do with loops, conditionals, exceptions. On the other hand, I care less about things like classes and inheritance.

Maybe the simply typed lambda calculus (plus some Java-like exception mechanism) would work better?

Verifiable Functional Purity in Java

Verifiable Functional Purity in Java. Matthew Finifter, Adrian Mettler, Naveen Sastry, and David Wagner. To appear at 15th ACM Conference on Computer and Communication Security (CCS 2008).

Proving that particular methods within a code base are functionally pure - deterministic and side-effect free - would aid verification of security properties including function invertibility, reproducibility of computation, and safety of untrusted code execution. Until now it has not been possible to automatically prove a method is functionally pure within a high-level imperative language in wide use such as Java. We discuss a technique to prove that methods are functionally pure by writing programs in a subset of Java called Joe-E; a static verifier ensures that programs fall within the subset. In Joe-E, pure methods can be trivially recognized from their method signature.

The paper includes a nice discussion of security benefits that can stem from being able to identify pure functions (of course, it is not obvious that guarantees at the programming language level are maintained at the run time level). I am sure many here have opinions about whether it makes more sense to try to graft purity on an imperative language, exposing it as an added feature, or to move programmers to functional languages..

another multi language learning/overview site

possibly interesting, PLACE is something I stumbled across while googling to read up on Oz. not canonical in any way, but advertising polyglotism seems like a nice public service to me.

Another multimedia dataflow programming system

Didn't find it mentioned previously on LtU, although Pd is apparently wizened so I'd figure some folks here would already have checked it out? Considered to be in the "Max" family, I guess.

Even though several computer scientists followed Sutherland's pioneering work, somehow graphical programming didn't really take off in the mainstream computer world. Besides the LABView[3], a tool popular among industrial engineers, there is only one two-dimensional programming language that has found massive use - and that is the paradigm used in the "Max"-family of software which Pd is a member of.

Content Addressable Type Systems

Hello,

I have an interesting problem that this site's audience may have an opinion on.

I'm assuming that people know what content addressable storage is. Let's say it uses SHA1; we put some binary blob in it, get its hash back, which we can later use to retrieve the blob again.

Now imagine I have a primitive type system that looks something like this (pseudo-code):

    struct TypeInfo
    {
        string name;
        MemberInfo [] members;
    }

    struct MemberInfo
    {
        string name;
        Hash type;
    }

I can use the TypeInfo and MemberInfo structures to describe types, and insert them into the content addressable store. The type field in the MemberInfo structure contains the hash of the field type.

This all works fine and dandy, until a type refers to itself. The trivial case is a direct reference, for example:

    struct BinTreeNode
    {
        BinTreeNode left;
        BinTreeNode right;
    }

But one can imagine more complicated type relations where a type only refers to itself indirectly (i.e., Foo refers to Bar refers to Pep refers to Foo).

Suddenly, I can no longer build a type description using the content addressable store. After all, it means generating a SHA1 hash value for a binary blob that is supposed to contain that very hash itself.

Next thing I know, I'm looking at fixed point combinators and I'm trying to crack SHA1. Clearly that isn't going to work - not given the machine power and time available to me... :-)

Does anybody have some creative suggestions on how to solve this problem. So far, my work arounds include various combinations of losing type safety and/or introducing an extra level of indirection.

The former would involve dropping from a typed reference to an untyped one (a void* if you will). The latter would result in breaking the invariant of my content addressable store, namely that the SHA1 of each blob is the hash used to refer back to it.

Another interesting question is how to decide where in a type relation cycle it makes the most sense to insert such a work around. Theoretically, interpreting and inserting types can begin on any point on a cycle, so it seems arbitrary to pick the first time my parser detects a cycle; it would lead to non-deterministic type descriptions.

If the problem description is too vague, I can provide more detail if necessary. It's language agnostic.

I'm suspecting that there won't really be a clean solution; I'm looking for the least ugly work around.

Note that I'm aware we could generate hashes from just the type names instead of the full type description. But the whole point of the exercise is to capture a type's complete content in the system; this has numerous interesting benefits once types start evolving (in the area of database and schema upgrades).

Thanks,

Jaap Suter - http://jaapsuter.com

General admin notes

We are experiencing a surge of new members, and that's great! We always value new members.

Let me remind everyone to pursue the policies document (available through the FAQ page). I want to emphasize two policy items in particular: We discourage nicknames, and when they are used encourage members to provide a url of a home page or related information in their profile. Second, LtU is in general not intended for detailed design discussions. More relevant forums are listed in the policies document.

Genericity over tuples

One of Haskell's less elegant corners is the gaggle of zip functions specialized for tuples of different length (zip, zip2, zip3, etc..). Does anyone know what sort of language features would enable one to elegantly write a tuple agnostic zip function?