User loginNavigation |
Contract Verification and Structural Subtyping of InterfacesThe following is a draft of a paper I am working on for OOPSLA. It is my first submission to a conference proceedings (I am doing this without any academic guidance!) any comments, crtiques, or suggestions would be most gratefully acknowedged. I haven't finished the bibliography yet, and marked sections which I plan of referencing using the [xxx] placeholder. Abstract Structural subtyping is a straightforward and powerful way of extending the abstraction of Java style interface types, however it is not without its shortcomings. I describe in this paper how, by including the verification of preconditions and postconditions, we can solve the commonly associated problems of naive structural subtyping, leading to more modular and robust software. Introduction The eternal question faced by software developers is how to write software which is both more modular and more robust. Interface oriented development has demonstrated itself to be an effective way to develop software which can be more modular than traditional class based OOP development techniques. Interfaces are allow a programmer to decouple the usage of the methods of an object from its actual implementation. In languages like Java there is still exists a coupling, between an interface and a class, in that a class has to declare a-priori its relationship to an interface. Languages like Standard ML [xxx] provide a more powerful and abstract alternative in the form of structurally subtyped interfaces called signatures. Signatures are are arguably too powerful, and can lead to a new class of errors due to structural conformance with incorrect semantics. Contract verification, specifically the verification pre- and post-conditions, is a well-established and effective way of reducing defects in both procedural and object-oriented programming languages. Contracts also act to document the requirements and obligations of the methods [MeyXX], effectively expressing an object's semantics. One benefit of this, which is the focus of this paper, is the observation that when signatures are used with contracts, accidental structural matchings can be detected as pre- or post-condition violations. Interfaces An interface is a set of function signatures. An object or class is said to implement an interface when it provides definitions for functions with signatures corresponding to each of the function signatures in the interface. An interface variable is usually implemented as a reference or pointer to an object which implements as the interface. In many languages (e.g. Java, C#) a class has the responsibility of declaring which interfaces it implements, which is a form of nominal subtyping. An interface is a supertype of the class, and the subtype relationship is explicit. This has the limitation that a class can't be bound to a new interface without rewriting and recompiling the source. Nominal subtyping of interfaces has another problem, it limits modularity and thus reusability. Consider the case of a class written in Java representing an indexable stack of cats: class Cats { public int size() { return m.size(); } public Cat get(int n) { return (Cat)m.get(n); } public void set(int n, Cat x) { m.set(n, x); } public void push(Cat x) { m.add(x); } public Cat pop() { Cat ret = (Cat)m.lastElement(); m.setSize(size() - 1); return ret; } public bool isEmpty() { return m.isEmpty(); } private Vector m; }The question a programmer needs to ask, is "Which interfaces should this class implement?". The answer is, that depends on how this class is being used. We can easily guess at a couple of likely interfaces such as the following ICatArray or ICatStack interfaces: interface ICatArray { int size(); Cat get(int n); void set(int n, Cat x); } interface ICatStack { void push(Cat x); Cat pop(); bool isEmpty(); }Now consider the posssible future case of a producer which only stacks cat and a consumer which only unstacks classes. In the interest of reducing coupling these classes would only accept minimal interfaces such as the following ICatStacker, and ICatUnstacker. interface ICatStacker { void push(Cat x); } interface ICatUnstacker { Cat pop(); bool isEmpty(); }So unless the class explicitly implements these interfaces, the potential for resuability is restricted. Here the programmer faces a dilemma, should they anticipate the possible extra uses of the class, and pre-design numerous interfaces? Anticipating all possible interfaces is impractical since even in this trivial example there are 7! possible interfaces to consider. Even just implementing a handful of extra interfaces in anticipation of such scenarios violates the agile programming principle known as the YAGNI (You Aren't Going to Need It) principle [xxx]. This principle states that a programmer should only code what they need at the current time and no more. Anecdotal experience shows that one of the biggest sinks of programmer time and productivity is writing code for scenarios which don't arrive. Some languages offer more powerful interface constructs which support structural subtyping, and thus overcome the stated limitations of nominal subtyping. For instance Standard ML provides a signature construct [xxx]. There is also a GCC extension to C++ called a signature[xxx]. In ISO compliant C++ we can emulate structural subtyping using templates [xxx], [xxx]. Semantic Mismatching with Structural Subtyping Structural subtyping of interfaces introduces a new set of problems. The interface may match different classes with unanticipated semantics. Consider the following example of a simple last-in, first-out (LIFO) stack interface written using Heron: interface IStack<type T> { public { def push(T x); def pop(); def top() : T; def isEmpty() : bool; } } Assuming a pre-existing double ended queue class (CDeque) a simple implementation of a LIFO stack could be as follows: class CStack<type T> { public { def push(T x) { m.push_back(x); } def pop() { result = m.pop_back(); } def isEmpty() : bool { result = m.isEmpty(); } } fields { CDeque<T> m; } } In Heron an interface variable can refer to an instance of the class even though the class does not explicitly implement the interface: CStack<int> stack; stack.push(1); stack.push(2); IStack<int> stack_ref = stack; stack_ref.push(3); printf("%d\n",stack_ref.top()); // outputs 3 printf("%d\n",stack.top()); // also outputs 3Notice that in Heron an interface variable binds to the class just like a reference does. Now consider the following class implementation of a first-in first-out ( FIFO ) stack, commonly known as a queue: class CQueue<type T> { public { def push(T x) { m.push_front(x); } def pop() { result = m.pop_back(); } def isEmpty() : bool { result = m.isEmpty(); } } fields { CDeque<T> m; } } The problem that arises with structural subtyping is that since the CQueue class also satisfies the IStack interface the programmer can legally bind an IStack variable to an instance of CQueue. This causes problems when a programmer makes natural assumptions about the semantics of the class bound to an interface. For instance consider the innocuous example of a reversal function: def reverse(IStack<int> x) { CQueue<int> q; while (!x.isEmpty()) { q.push(x.pop()); } while (!q.isEmpty()) { x.push(q.pop()); } } def demo() { Stack<int> s; s.push(1); s.push(2); s.push(3); reverse(s); // as expected s is now reversed CQueue<int> q; q.push(1); q.push(2); q.push(3); reverse(q); // q is not reversed! } Solution One simple and effective solution to the problem of semantic mismatching is to introduce contract verification. The idea of expressing and verifying contracts as part of an interface is not a new idea [xxx], however to the best of the author's knowledge, the proposal of a mechanism for combining contract verification and strucutural subtyping is itself novel. The approach that was taken with the Heron language is to allow interfaces to be written explicitly, with the help of two reserved keywords: inner and that. An interface variable can be thought of as syntactic sugar for a class which delegates all the member functions to a single object. Accordingly Heron allows the earlier IStack interface example to be written out equivalently as : interface IStack<type T> { public { def push(T x) { inner; } def pop() : T { result = inner; } def isEmpty() : bool { result = inner; } } }The inner keyword is equivalent to a call to the bound object. Since the bound object is actually a sub-type of the interface, the keyword inner from the BETA programming language lexicon was chosen, to represent the call from super-class to sub-class. In order to distinguish between the two kinds of stacks, we add assertions to verify the preconditions/postconditions. The pre and post macros are library defined macros which behave like the assert macro in C++ but can be turned on or off independently of each other. interface IStack<type T> { requires { def push(T x) { inner; post(!isEmpty()); post(top() == x); } def pop() : T { pre(!isEmpty()); result = inner; } def isEmpty() : bool { result = inner; } } extension { def top() : T { pre(!isEmpty()); result = that.pop(); that.push(result); } } }The extension section contains function definition which are not required in the implementing class. The keyword that refers to the object referenced by the interface, rather than to the interface reference itself. The importance of the that keyword is that it allows the prevention of infinite recursion (e.g. push calls top which calls push ad infinitum). With the introduction of the postcondition check in the push function that top() == x any usage of IStack with an instance of a CQueue class will trigger an assertion failure. For example: def demo() { CQueue<int> q; IStack< s = q; s.push(1); // no problems, yet s.push(2); // assertion failure! } Implementation The techniques described in this paper were implemented in the Heron programming language, using a Heron to C++ translation tool called HeronFront [xxx]. The Heron language is closely related to C++ and shares much of the C++ semantics. The more salient differences include:
Heron Interface Syntax The grammar for the interfaces in Heron is defined here in relation to the C++ grammar: interface ::== interface name template_params? { section* } section ::== inherits_section | requires_section | extension_section inherits_section ::== inherits { type_expression* } requires_section ::== requires { (function_def | function_decl)* } extension_section ::== inherits { function_def* } template_params ::== < type identifier [ , type identifier ]* > function_param ::== type_expression identifier function_params ::== ( function_param* ) result_type_decl ::== : type_expression function_sig ::== def identifier result_type_decl? function_decl ::== function_sig ; function_def ::== function_sig { statement* } Conclusion The idea of verifying pre- and post-conditions in interfaces is rapidly gaining traction [xxx]. This paper has attempted to point out that not only is it a good idea but one which also works well in conjunction with structural subtyping to make it a much more practical and effective technique. This work diverges from related work on adding contract verification to interfaces in that it doesn't prescribe how or even whether such a thing should be done, but rather enables it. The goal here is to enable the programmer to implement things as needed, rather than enforce a particular approach. This has a benefit of keeping the language complexity much simpler and maximizes programmer flexibility. The approach of adding contract verification does not completely solve the problems that have been associated with structural subtyping. However, this is a case of where some medicine is better than none. It is hopefully an inescapable conclusion that the expression of preconditions and postconditions in an interface is a good thing, and when used in conjunction with structural subtyping it becomes particularly powerful. A logical extension of this work is to use static contract verification techniques, such as the Boogie program verifier [xxx], to perform compile-time checking of contract conditions to avoid the overhead of a having to execute programs to catch incorrect use of structural subtyping. Acknowledgements Much thanks to Peter Grogono, Oscar Nierstrasz, Kresimir Cosic, Achilleas Margaritas, Howard Lovatt, Max Lybbert, and Pablo Aguilar for reviewing and commenting on various drafts this paper, and providing useful insights. By cdiggins at 2006-03-12 20:18 | LtU Forum | previous forum topic | next forum topic | other blogs | 8019 reads
|
Browse archives
Active forum topics |
Recent comments
22 weeks 6 days ago
22 weeks 6 days ago
22 weeks 6 days ago
45 weeks 19 hours ago
49 weeks 2 days ago
50 weeks 6 days ago
50 weeks 6 days ago
1 year 1 week ago
1 year 6 weeks ago
1 year 6 weeks ago