OO runtime graphs are scale-free

If you liked Barabási's book, you'll find this interesting. The May issue of CACM has an article called Scale-Free Geometry in OO Programs [PDF, 180K] by Alex Potanin, James Noble, Marcus Frean, and Robert Biddle. They examined 60 heap snapshots from 35 different Java programs and found all of the resulting object graphs to be scale-free. This is true for both incoming and outgoing links with the qualification that objects rich in incoming links are poor in outgoing links and vice versa.

What are the implications of this for PL research in OO languages? The authors point out the following:

  • One useful aspect of scale-free networks is their high resilience to random network damage. Since most nodes are poorly connected, deleting them doesn't affect the remaining nodes all that much.
  • A small number of nodes are highly connected. You may get the best bang for the buck by concentrating early QA effort on the highly connected nodes first.
  • "is is well-known that garbage collectors can improve their performance by assuming that most objects have only one or two outgoing references. The scale-free nature of object graphs explains why making this assumption is worhtwhile."
  • Any other interesting implications?

    Comment viewing options

    Select your preferred way to display the comments and click "Save settings" to activate your changes.

    Pseudo-bipartite graphs

    ...with the qualification that objects rich in incoming links are poor in outgoing links and vice versa.

    This is an interesting property that I've run into in a couple of places. A bipartite graph is a graph where one can group nodes into those with only incoming and only outgoing arcs (directed/undirected carries no information for such graphs). One can add the notion of noise to a directed graph and arrive at graphs that are bipartite but for some arrows going the wrong way, and some nodes that don't really fit under the heading of incoming or outgoing. This notion arises in the analysis of trust metrics, and I've thought about it already in the context of intermediate representations for compilers. Anyone seen something similar elsewhere?


    Anyone seen something similar elsewhere?
    Not quite, but similar: in a language with only base types, products and sums, if products and sums must alternate (I heard there are PLs with such weird limitations) instance graphs are bipartite in a way. Actually, they are a tree, and any tree is bipartite, but the point is products usually have many outgoing references (0 and more), while sums - exactly one. Throw in some reference cells (or tie some nodes) - and I guess you get a general bipartite graph.

    This covers just the outgoing edges, though.