Implementation

Judy Stores

A Judy tree is generally faster than and uses less memory than contemporary forms of trees such as binary (AVL) trees, b-trees, and skip-lists. When used in the "Judy Scalable Hashing" configuration, Judy is generally faster then a hashing method at all populations.

Some of the reasons Judy outperforms binary trees, b-trees, and skip-lists:

  • Judy rarely compromises speed/space performance for simplicity (Judy will never be called simple except at the API).
  • Judy is designed to avoid cache-line fills wherever possible.
  • A b-tree requires a search of each node (branch), resulting in more cache-line fills.
  • A binary-tree has many more levels (about 8X), resulting in more cache-line fills.
  • A skip-list is roughly equivalent to a degree-4 (4-ary) tree, resulting in more cache-line fills.
  • An "expanse"-based digital tree (of which Judy is a variation) never needs balancing as it grows.
  • A portion (8 bits) of the key is used to subdivide an expanse into sub-trees. Only the remainder of the key need exist in the sub-trees, if at all, resulting in key compression.

The Achilles heel of a simple digital tree is very poor memory utilization, especially when the N in N-ary (the degree or fanout of each branch) increases. The Judy tree design was able to solve this problem. In fact a Judy tree is more memory-efficient than almost any other competitive structure (including a simple linked list). A highly populated linear array[] is the notable exception.

Worth a look, considering the open-source LGPL license and range of applications. HP released this library in 2004. From the PDF manual,

Judy arrays are remarkably fast, space-efficient, and simple to use. No initialization, configuration, or tuning is required or even possible, yet Judy works well over a wide dynamic range from zero to billions of indexes, over a wide variety of types of data sets -- sequential, clustered, periodic, random.

dtracing Python

Speed is often used as an excuse for bad programming (premature optimization is the root of all evil and all that). Yet understanding the causes of poor performance of the language system, as opposed to user code, isn't all that easy.

If you are using an open source language implementation, and have access to something clever like dtrace, it might be easier than you think.

alphaWorks: Pattern Modeling and Analysis Tool for Java Garbage Collector

PMAT analyzes IBM verbose GC traces by parsing the traces and building pattern models. PMAT recommends key configurations by executing a diagnosis engine and pattern modeling algorithm. If there are any errors related with Java heap exhaustion or fragmentation in the verbose GC trace, PMAT can diagnose the root cause of failures. PMAT provides rich chart features that graphically display Java heap usage.

Sounds useful. If anyone has the time to download and check this out, it would be interesting to hear what you think.

The Fortress Language Spec v0.618

The Fortress Language Spec v0.618

We've already discussed the Fortress project at Sun's Programming Language Research lab. The project seems to be progressing well: they have now released a preliminary version of the language spec as a tech report. This should help assuage the folks who were grumbling about the hype and market-speak of Sun's promotional article. Their PL research team includes some real Smarty McSmartpants, so I expect really good things from them.

From a cursory glance at the spec, there seem to be many interesting features, including:

  • an advanced component and linking architecture
  • objects and traits (which look similar to Moby and Scala)
  • multiple dispatch
  • type inference
  • parameterized types
  • extensible syntax (with a conspicuous reference to Growing a Language)
  • first-class functions
  • proper tail calls
  • labelled jumps, exceptions
  • contracts
  • some interesting parallelism features (including futures, parallel loops, atomicity primitives, transactions)
  • matrices
  • comprehensions
  • bignums, fixnums (signed and unsigned), floating-point, imaginary and complex numbers
  • dimensions

Future work includes proving type soundness, apparently. :)

It is so much fun when the luminaries start working on new languages. History in the making!

jhc

jhc is a haskell compiler which aims to produce the most efficient programs possible via whole program analysis and other optimizations.

This seems like an interesting project, for example: Region Inferencing, Compilation by transformation with 2 general intermediate languages, very modern design, using rank-n polymorphism, monad transformers, generic programing, and existential types.

Note, howver, that there are quite a few problems (scaling, memory leaks, etc.)

Maybe some of you might want to offer a helping hand...

The Glasgow Haskell Compiler Survey - GHC needs your feedback!

If you're a GHC user, the Glasgow Haskell Compiler HeadQuarters needs your feedback! See Simon Peyton-Jones original message, or go directly to the user survey. Here's a quote from the original message:

We'd like to hear from *absolutely everyone* who uses GHC: whether you
are a first-year undergraduate or a famous professor; whether you use
GHC for industrial applications or recreation; whether you use
higher-rank polymorphism or have only just learned what functional
programming is. Everyone!

We'll use the information to guide our future priorities, and we'll
publish some kind of summary of what we learn in due course. It's all
anonymous, of course, unless you choose to say who you are.

Omega

Ωmega is a new programming language by Tim Sheard which is descended from Haskell and adds new facilities for defining static type constraints, such as allowing "users to write functions at the level of types, and then use those functions in the type of functions at value level". It also has "equality qualified types". See also Programming with Static Invariants in Omega and the manual for more information. Mentioned previously (in passing) on LtU.

Metaphor

Metaphor is a strongly-typed, multi-stage, object-oriented programming language. Metaphor is based on a subset of C# and is extended with multi-stage programming constructs in the style of MetaML or MetaOCaml. Metaphor is implemented as a compiler on the .NET CLR.

Metaphor features a static type system for object-oriented reflection operations (i.e. run-time type analysis). This allows the reflection system to be safely incorporated into the language’s staging constructs and thus allows the generation of code based on the structure of types.

Pugs, Practicing the Theories.

A lot of language theory goes past here on Lambda the Ultimate, but we rarely see that theory directly impacting commercial programmers.

I'm a great fan of theoretical concepts like arrows, but at the same time I'm a self-employed programmer interested in solving my clients' problems.

Pugs is notable in that it profitably uses recent developments such as GADTs and Template Haskell for an implementation of Perl6.

I recently became a regular on the #perl6 irc channel and soon after joined the list of committers.

In just a few days I've seen a lot. I've seen enthusiastic members of the Perl community learning Haskell. I've seen myself learning Perl. I've also seen how daily Perl programmers work with abstractions like monad transformers. I've seen how some structures are easy to extend for programmers new to both the Pugs codebase and Haskell.

The Pugs project was started 64 days ago by Autrijus Tang, as an exercise while reading TaPL. Pugs already includes network and threading primitives. New tests and code are add at an amazing rate, as evidenced by the smoke tests.

I don't know if I'll end up using Perl after Pugs is written, but I am learning how to practice the theory of programming language design and implementation.

UCPy: Reverse Engineering Python

Interesting paper from PyCon 2003 (so yes, it's old news - but new on Lambda, as far as I can see): UCPy: Reverse-Engineering Python.

The project partly entails replacing the "CISC-style" Python VM with a much smaller "RISC-style" VM. The authors' comments on this decision are worth considering in the light of recent discussions about the design of the Parrot VM.

XML feed