archives

Choosing Conferences

I'm trying to choose conferences to attend this summer. In the programming languages department, it looks like PLDI is my best bet. However, being fairly new to the programming languages community, I thought I'd ask a few questions before committing myself.

Hence:

  • Could anyone who has been to PLDI describe a bit about what I might expect if I decide to go?
  • How did the experience compare to any other languages/types/FP conferences you've attended?
  • Are there other conferences that I should be considering (preferably on the North American Eastern Seaboard)?
  • Do any of the language/types conferences have an LtU Birds of a Feather meeting?

Thanks!

Virtual Machines, Language Runtimes, and the Future of Objective C

In a series of three articles last year, John Siracusa at arstechnica argued that Apple has an impending crisis on its hands because it doesn't have "A memory-managed language and API" like Java or Microsoft's CLR. My question is less about whether this is true, and more about what it means to be "managed".

Specifically, there seems an implicit assumption that Java or .NET type architectures represent what "safe", garbaged-collected systems should look like.

In the case of Apple, couldn't some new, hypothetical system language(s) be based on the existing Objective C runtime? Or more generally, do the runtimes for systems written in lower-level languages like GObject/GTK or Objective C (if garbage collected) provide enough services and metadata to build more dynamic languages on top of them, and still provide object compatibility-- to not require an FFI to communicate with the older existing framework?

If that were in fact the case, wouldn't it be possible to gradually turn systems like Objective-C/Cocoa or C/GTK into something closer to Smalltalk or (god forbid) Java?

Half-baked idea: mini-language for vector processing

I've been looking at K. It's a truly marvelous tool for thinking about vectors, with the added bonus that your thoughts are automatically executable. Lots of interesting things can be done with (short) one-liners in K: "standard deviation", "are those two arrays permutations of each other", "transform seconds to hours/min/sec" etc.

And it looks like line noise. Smells just like regular expressions. Aha!

An idea naturally occurred to me: make a C library (for example) that takes C strings as K-style "array processing expressions", and C arrays as arguments. Just like regexes. I even thought of a cool name, "Knot", i.e. "Not K" or "K Notation" :-)

Unfortunately, I'm no Arthur Whitney, and don't have the skills to pull this off. So I decided to set the idea afloat here on LtU, hoping that something might become of it.

Is this feasible? Is this useful? Has this been done?

Contract Verification and Structural Subtyping of Interfaces

The following is a draft of a paper I am working on for OOPSLA. It is my first submission to a conference proceedings (I am doing this without any academic guidance!) any comments, crtiques, or suggestions would be most gratefully acknowedged. I haven't finished the bibliography yet, and marked sections which I plan of referencing using the [xxx] placeholder.

Abstract

Structural subtyping is a straightforward and powerful way of extending the abstraction of Java style interface types, however it is not without its shortcomings. I describe in this paper how, by including the verification of preconditions and postconditions, we can solve the commonly associated problems of naive structural subtyping, leading to more modular and robust software.

Introduction

The eternal question faced by software developers is how to write software which is both more modular and more robust. Interface oriented development has demonstrated itself to be an effective way to develop software which can be more modular than traditional class based OOP development techniques. Interfaces are allow a programmer to decouple the usage of the methods of an object from its actual implementation. In languages like Java there is still exists a coupling, between an interface and a class, in that a class has to declare a-priori its relationship to an interface. Languages like Standard ML [xxx] provide a more powerful and abstract alternative in the form of structurally subtyped interfaces called signatures. Signatures are are arguably too powerful, and can lead to a new class of errors due to structural conformance with incorrect semantics.

Contract verification, specifically the verification pre- and post-conditions, is a well-established and effective way of reducing defects in both procedural and object-oriented programming languages. Contracts also act to document the requirements and obligations of the methods [MeyXX], effectively expressing an object's semantics. One benefit of this, which is the focus of this paper, is the observation that when signatures are used with contracts, accidental structural matchings can be detected as pre- or post-condition violations.

Interfaces

An interface is a set of function signatures. An object or class is said to implement an interface when it provides definitions for functions with signatures corresponding to each of the function signatures in the interface. An interface variable is usually implemented as a reference or pointer to an object which implements as the interface. In many languages (e.g. Java, C#) a class has the responsibility of declaring which interfaces it implements, which is a form of nominal subtyping. An interface is a supertype of the class, and the subtype relationship is explicit. This has the limitation that a class can't be bound to a new interface without rewriting and recompiling the source.

Nominal subtyping of interfaces has another problem, it limits modularity and thus reusability. Consider the case of a class written in Java representing an indexable stack of cats:

    class Cats
    {
      public int size() {
        return m.size();
      }
      public Cat get(int n) {
        return (Cat)m.get(n);
      }
      public void set(int n, Cat x) {
        m.set(n, x);
      }
      public void push(Cat x) {
        m.add(x);
      }
      public Cat pop() {
        Cat ret = (Cat)m.lastElement();
        m.setSize(size() - 1);
        return ret;
      }
      public bool isEmpty() {
        return m.isEmpty();
      }
      private Vector m;
    }
  
The question a programmer needs to ask, is "Which interfaces should this class implement?". The answer is, that depends on how this class is being used. We can easily guess at a couple of likely interfaces such as the following ICatArray or ICatStack interfaces:
    interface ICatArray {
      int size();
      Cat get(int n);
      void set(int n, Cat x);
    }

    interface ICatStack {
      void push(Cat x);
      Cat pop();
      bool isEmpty();
    }
  
Now consider the posssible future case of a producer which only stacks cat and a consumer which only unstacks classes. In the interest of reducing coupling these classes would only accept minimal interfaces such as the following ICatStacker, and ICatUnstacker.
    interface ICatStacker {
      void push(Cat x);
    }

    interface ICatUnstacker {
      Cat pop();
      bool isEmpty();
    }
  
So unless the class explicitly implements these interfaces, the potential for resuability is restricted. Here the programmer faces a dilemma, should they anticipate the possible extra uses of the class, and pre-design numerous interfaces? Anticipating all possible interfaces is impractical since even in this trivial example there are 7! possible interfaces to consider. Even just implementing a handful of extra interfaces in anticipation of such scenarios violates the agile programming principle known as the YAGNI (You Aren't Going to Need It) principle [xxx]. This principle states that a programmer should only code what they need at the current time and no more. Anecdotal experience shows that one of the biggest sinks of programmer time and productivity is writing code for scenarios which don't arrive.

Some languages offer more powerful interface constructs which support structural subtyping, and thus overcome the stated limitations of nominal subtyping. For instance Standard ML provides a signature construct [xxx]. There is also a GCC extension to C++ called a signature[xxx]. In ISO compliant C++ we can emulate structural subtyping using templates [xxx], [xxx].

Semantic Mismatching with Structural Subtyping

Structural subtyping of interfaces introduces a new set of problems. The interface may match different classes with unanticipated semantics.

Consider the following example of a simple last-in, first-out (LIFO) stack interface written using Heron:

  interface IStack<type T>
  {
    public
    {
      def push(T x);
      def pop();
      def top() : T;
      def isEmpty() : bool;
    }
  }

Assuming a pre-existing double ended queue class (CDeque) a simple implementation of a LIFO stack could be as follows:

  class CStack<type T>
  {
    public
    {
      def push(T x) {
        m.push_back(x);
      }
      def pop() {
        result = m.pop_back();
      }
      def isEmpty() : bool {
        result = m.isEmpty();
      }
    }
    fields
    {
      CDeque<T> m;
    }
  }

In Heron an interface variable can refer to an instance of the class even though the class does not explicitly implement the interface:

    CStack<int> stack;
    stack.push(1);
    stack.push(2);
    IStack<int> stack_ref = stack;
    stack_ref.push(3);
    printf("%d\n",stack_ref.top()); // outputs 3
    printf("%d\n",stack.top()); // also outputs 3
  
Notice that in Heron an interface variable binds to the class just like a reference does.

Now consider the following class implementation of a first-in first-out ( FIFO ) stack, commonly known as a queue:

  class CQueue<type T>
  {
    public
    {
      def push(T x) {
        m.push_front(x);
      }
      def pop() {
        result = m.pop_back();
      }
      def isEmpty() : bool {
        result = m.isEmpty();
      }
    }
    fields
    {
      CDeque<T> m;
    }
  }

The problem that arises with structural subtyping is that since the CQueue class also satisfies the IStack interface the programmer can legally bind an IStack variable to an instance of CQueue. This causes problems when a programmer makes natural assumptions about the semantics of the class bound to an interface. For instance consider the innocuous example of a reversal function:

  def reverse(IStack<int> x)
  {
    CQueue<int> q;
    while (!x.isEmpty())
    {
      q.push(x.pop());
    }
    while (!q.isEmpty())
    {
      x.push(q.pop());
    }
  }

  def demo()
  {
    Stack<int> s;
    s.push(1); s.push(2); s.push(3);
    reverse(s);
    // as expected s is now reversed

    CQueue<int> q;
    q.push(1); q.push(2); q.push(3);
    reverse(q);
    // q is not reversed!
  }
  

Solution

One simple and effective solution to the problem of semantic mismatching is to introduce contract verification. The idea of expressing and verifying contracts as part of an interface is not a new idea [xxx], however to the best of the author's knowledge, the proposal of a mechanism for combining contract verification and strucutural subtyping is itself novel.

The approach that was taken with the Heron language is to allow interfaces to be written explicitly, with the help of two reserved keywords: inner and that.

An interface variable can be thought of as syntactic sugar for a class which delegates all the member functions to a single object. Accordingly Heron allows the earlier IStack interface example to be written out equivalently as :

  interface IStack<type T>
  {
    public
    {
      def push(T x) {
        inner;
      }
      def pop() : T {
        result = inner;
      }
      def isEmpty() : bool {
        result = inner;
      }
    }
  }
  
The inner keyword is equivalent to a call to the bound object. Since the bound object is actually a sub-type of the interface, the keyword inner from the BETA programming language lexicon was chosen, to represent the call from super-class to sub-class.

In order to distinguish between the two kinds of stacks, we add assertions to verify the preconditions/postconditions. The pre and post macros are library defined macros which behave like the assert macro in C++ but can be turned on or off independently of each other.

  interface IStack<type T>
  {
    requires
    {
      def push(T x) {
        inner;
        post(!isEmpty());
        post(top() == x);
      }
      def pop() : T {
        pre(!isEmpty());
        result = inner;
      }
      def isEmpty() : bool {
        result = inner;
      }
    }
    extension
    {
      def top() : T {
        pre(!isEmpty());
        result = that.pop();
        that.push(result);
      }
    }
  }
  
The extension section contains function definition which are not required in the implementing class. The keyword that refers to the object referenced by the interface, rather than to the interface reference itself. The importance of the that keyword is that it allows the prevention of infinite recursion (e.g. push calls top which calls push ad infinitum).

With the introduction of the postcondition check in the push function that top() == x any usage of IStack with an instance of a CQueue class will trigger an assertion failure. For example:

  def demo()
  {
    CQueue<int> q;
    IStack< s = q;
    s.push(1); // no problems, yet
    s.push(2); // assertion failure!
  }
  

Implementation

The techniques described in this paper were implemented in the Heron programming language, using a Heron to C++ translation tool called HeronFront [xxx]. The Heron language is closely related to C++ and shares much of the C++ semantics. The more salient differences include:

  • implicit result variables - like Eiffel [xxx], Turbo Pascal [xxx] and other languages, Heron functions have implicit result variables. This makes it easier for the programmer to write post-conditions which examine the result of a function.
  • inner keyword - used in interfaces to call the corresponding function in the bound object.
  • that keyword - used in interfaces as a disambiguator to direct function calls to the bound object instead of the interface function.
  • visibility - class fields are always private, this ensuring that the only manipulation of a class is through its methods, allowing for easier verifications of invariants.
  • runtime polymorphism - the only mechanism for runtime polymorphism in Heron is through the interface dispatch mechanism. Virtual functions are more or less redundant, since every design pattern involving virtual functions can be rewritten using a callback pattern involving interfaces.
  • interface - the Heron interface consists of required functions, and extension functions.
  • guaranteed default construction - all Heron objects support default construction. This is used so that result variables, can be implicitly default generated. This is a design choice which was done more out of convenience than anything else.
  • comparison operators - all Heron objects support comparison using the == and != operator.

Heron Interface Syntax

The grammar for the interfaces in Heron is defined here in relation to the C++ grammar:

    interface ::==
      interface name template_params? { section* }

    section ::==
      inherits_section | requires_section | extension_section

    inherits_section ::==
      inherits { type_expression* }

    requires_section ::==
      requires { (function_def | function_decl)* }

    extension_section ::==
      inherits { function_def* }

    template_params ::==
      < type identifier [ , type identifier ]* >

    function_param ::==
      type_expression identifier

    function_params ::==
      ( function_param* )

    result_type_decl ::==
      : type_expression

    function_sig ::==
      def identifier result_type_decl?

    function_decl ::==
      function_sig ;

    function_def ::==
      function_sig { statement* }
  

Conclusion

The idea of verifying pre- and post-conditions in interfaces is rapidly gaining traction [xxx]. This paper has attempted to point out that not only is it a good idea but one which also works well in conjunction with structural subtyping to make it a much more practical and effective technique.

This work diverges from related work on adding contract verification to interfaces in that it doesn't prescribe how or even whether such a thing should be done, but rather enables it. The goal here is to enable the programmer to implement things as needed, rather than enforce a particular approach. This has a benefit of keeping the language complexity much simpler and maximizes programmer flexibility.

The approach of adding contract verification does not completely solve the problems that have been associated with structural subtyping. However, this is a case of where some medicine is better than none. It is hopefully an inescapable conclusion that the expression of preconditions and postconditions in an interface is a good thing, and when used in conjunction with structural subtyping it becomes particularly powerful.

A logical extension of this work is to use static contract verification techniques, such as the Boogie program verifier [xxx], to perform compile-time checking of contract conditions to avoid the overhead of a having to execute programs to catch incorrect use of structural subtyping.

Acknowledgements

Much thanks to Peter Grogono, Oscar Nierstrasz, Kresimir Cosic, Achilleas Margaritas, Howard Lovatt, Max Lybbert, and Pablo Aguilar for reviewing and commenting on various drafts this paper, and providing useful insights.