Signals in an OOPL

In UML, the boxes used to describe classes have three compartments: attributes, operations, and signals. This raises an interesting question: should an OOPL treat signals (and signal handlers) as primitive constructs, independent of operations (i.e. methods)? Of course we can implement signals as objects and signal handlers as library operations, but perhaps the concept is so fundamentally useful that a language should support it natively, especially given the trend towards concurrency in software. I only have vague ideas of how this could improve things, but I think it could simplify writing concurrent programs, since the compiler could identify the concurrency patterns and verify the code.

I'd love to hear people's thoughts on the subject.

I think languages should support signals natively.

I assume that in this context, 'signals' means signals and slots as in Qt, not C signals from the operating system.

I think it's necessary to support signals natively, not because verification is easier, but because coding signals in a library results in a lot of boilerplate code and many incompatibilities between libraries.

Signals are an important concept that allows many mechanisms to be unified. In my experimental language, expressions do not return values; they emit signals. Results are signals themselves. This allows the language to express the switch statement, pattern matching, exception handling, and callbacks without specific support for those constructs.

For example, if I have a division, I can do the following:

a / b:
    divisionByZero => { print("division by zero"); exit(0);} 
    0 => { print("result was zero"); }
    x : int => { print("result was " + x); }

The above demonstrates the use of signals as an if construct, a try-catch construct, and a pattern-matching construct.
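
Roughly the same idea can be sketched in a mainstream language by treating each possible outcome of the expression as a separate handler. The following Python sketch is only an approximation; the divide_with_handlers helper is invented for illustration:

def divide_with_handlers(a, b, on_division_by_zero, on_zero, on_value):
    # Route each possible outcome of the division to its own handler.
    try:
        result = a / b
    except ZeroDivisionError:
        return on_division_by_zero()
    if result == 0:
        return on_zero()
    return on_value(result)

divide_with_handlers(
    6, 3,
    on_division_by_zero=lambda: print("division by zero"),
    on_zero=lambda: print("result was zero"),
    on_value=lambda x: print("result was", x),
)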

Signal handlers can have parameters, allowing the creation of clever constructs like Python-style iterators, continuation-passing style, etc. For example:

fun iterateList(node) {
    => withNode(node, (){iterateList(node.next)});
}

var list = 1, 2, 3;

fun main () {
    iterateList(list):
        withNode(node, next) => {
            print(node.data);
            next();
        }
}
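
For comparison, here is a rough Python rendering of the same continuation-passing iteration; Node, iterate_list, and the handler are all invented for illustration:

class Node:
    def __init__(self, data, next=None):
        self.data = data
        self.next = next

def iterate_list(node, with_node):
    # Emit the "withNode" signal: pass the node along with a thunk that resumes the walk.
    if node is not None:
        with_node(node, lambda: iterate_list(node.next, with_node))

lst = Node(1, Node(2, Node(3)))

def handler(node, next):
    print(node.data)
    next()  # explicitly resume the iteration

iterate_list(lst, handler)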

But passing the continuation explicitly is not really necessary, because signals carry an implicit parameter representing the continuation at the point where the signal is emitted. The iteration example could be written like this:

fun iterateList(node) {
    for(; node; node = node.next) {
        => node;
    }
}

var list = 1, 2, 3;

fun main() {
    iterateList(list):
        node => {
            print(node.data);
            =>;
        }
}

When no signal is specified, the => symbol continues execution from the hidden continuation argument.
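
The implicit-continuation version is close to what Python generators give you: yield suspends the producer at the emit point, and the consumer's loop body plays the role of the handler, with the end of the body resuming the producer much like the bare =>. A minimal sketch, reusing the Node and lst definitions from the earlier snippet:

def iterate_list(node):
    while node is not None:
        yield node          # emit the node; execution pauses here until the consumer resumes it
        node = node.next

for node in iterate_list(lst):
    print(node.data)        # falling off the end of the loop body resumes the generator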

I assume that in this

I assume that in this context, 'signals' means signals and slots as in Qt, not C signals from the operating system.

I meant executable UML signals, which are asynchronous. How do you handle concurrency in your experimental language? Do you use threads?

I use the actor model.

I use the actor model. Every object is an actor, i.e. a separate thread. When an object waits for a signal from another object, it enters a new message loop until a signal is raised from the other object.
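
As a minimal sketch of that object-as-actor idea, assuming Python's threading and queue modules (the Actor and Printer classes and their message names are invented for illustration):

import threading, queue, time

class Actor:
    # Each object owns a mailbox and a thread that processes one message (signal) at a time.
    def __init__(self):
        self.mailbox = queue.Queue()
        threading.Thread(target=self._loop, daemon=True).start()

    def send(self, message, *args):
        self.mailbox.put((message, args))   # asynchronous: never blocks the sender

    def _loop(self):
        while True:
            message, args = self.mailbox.get()
            getattr(self, message)(*args)   # dispatch to a handler method

class Printer(Actor):
    def on_line(self, text):
        print(text)

p = Printer()
p.send("on_line", "hello from another actor")
time.sleep(0.1)   # toy example only: give the actor thread a moment to drain its mailbox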

What is a "signal"?

Achilleas brings up a good point: That the definition of "signal" varies from place to place. Unix "signals" are asynchronous events handled by the operating system. Qt "signals" are synchronous events. The term has other uses as well.

I'm not sure how UML signals differ from "operations". It's worth noting that the synchronous construct found in Qt and other toolkits can be modelled rather easily with HOFs--making them syntactic sugar in pretty much any language with lambdas. (Of course, if your language doesn't have proper lambdas, e.g. C++, then it becomes trickier.)
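
For what it's worth, that sugar amounts to only a few lines in a language with lambdas. A minimal Python sketch (the names are illustrative, not Qt's API):

class Signal:
    # A synchronous "signal": connect stores callables, emit applies them in order.
    def __init__(self):
        self._slots = []

    def connect(self, slot):
        self._slots.append(slot)

    def emit(self, *args, **kwargs):
        for slot in list(self._slots):
            slot(*args, **kwargs)

clicked = Signal()
clicked.connect(lambda name: print(name, "was clicked"))
clicked.emit("button1")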

Asynchronous methods

I think, but could be wrong, that UML signals are just asynchronous methods with no other semantics assumed (e.g. buses, queues, etc).

Gawk....

I question one's ability to verify a design, if that design contains "asynchronous methods with no other semantics assumed". If UML is simply a means for "blackboarding" a design on paper in order to please the boss (and UML artifacts aren't considered real design constraints), then OK I guess... OTOH, if UML is really this ill-specified, then I'm glad my present employer doesn't bother. :)

An amusing rant on UML, courtesy of Bertrand Meyer, can be found here, although I think some of the criticisms found therein are "good" things--in particular, the criticisms that UML isn't "OO" enough. :)

Semantics appended

Anytime I've ever seen anybody use UML as more than just a fancy sketching tool, semantics have been added. Usually this takes the form of a mapping from UML constructs to language/library constructs.

Semantics in xUML

There are now semantics defined for signal handling in UML as part of executable UML. The semantics are discussed informally in Executable UML: A Foundation for Model-Driven Architecture by Mellor and Balcer. AFAICT though there is still a fair amount of variation between vendors, and little (if any) open source tool support.

AFAIK Signal handlers = asynchronous operations

AFAIK UML signal handlers are more or less asynchronous UML operations. There are some semantics proposed as part of the executable UML profile for UML, but the whole thing is under development and I have yet to find public specifications of the executable UML profile.

Actors

What you're talking about sounds like the OO form of actors. While noodling around E, I found they have a page with links on a lot of actors work.

On Message Passing

Nice link page, thanks. I wonder if I need all of what the actor concurrency model offers? I am thinking that simple message passing (messages == signals AFAICT), which is only part of the actor model, should be sufficient. However my understanding of issues related to concurrent systems is naive. I would like to hear why one needs more than that.

I just realized that this topic is also very closely related to another active thread here at Lambda-the-Ultimate.org. In that thread Allan McInnes points to The Problem with Threads by Ed Lee. Which I think is interesting and relevant.

Might process algebras have insights?

Since they are all about concurrency? I dunno, I only just started in on the overview paper mentioned before on LtU.

Eiffel

You might want to take a look at how Eiffel handles concurrency and message-passing. It's a carefully conceived, fairly minimal extension to standard OO programming. A good description (extracted from Meyer's "Object Oriented Software Construction") can be found here. Meyer discusses the rationale for his approach in a lot of detail, getting into why and how it differs from "processes" and "active objects". The approach to communications is fundamentally asynchronous, but includes some provisions for synchronization. Unfortunately, I can't speak to how easy it is to use Eiffel's concurrency mechanisms in practice, since I've not (yet) actually used them.

Great link

Thanks, that is a very interesting read. When I get to his discussion of processors, I find myself getting a bit lost. There is a lot of terminology that is unfamiliar to me. I suspect he has invented a lot of the theory there, am I correct? I can't help but wonder if he is making things more complicated than they need to be. The executable UML model, for example, seems as complete as needed, and far simpler: it simply extends classes with Moore finite state machines to model asynchronous communication.

Extending a class with a Moore machine

(or some other computational device that both accepts inputs and emits outputs) fails to abstract away several key issues with concurrency that real applications have to be concerned about:

* Scheduling--ensuring progress (avoiding starvation), ensuring scheduling deadlines are met (meeting realtime constraints, if any), and mapping objects/actors/etc. onto lower-level entities (OS processes/threads, CPUs or cores).

* Avoidance of deadlock/livelock.

* The problem of global consensus (something which is difficult in reliable distributed systems, and intractable in unreliable ones).

Meyer has come up with some interesting stuff in the OO context, but as far as "inventing the theory" goes, I would instead refer you to the more fundamental process calculi and models (Actors, CSP/CCS, pi-calculus) as more theoretically sound. Meyer has the unfortunate habit of ignoring theory when he thinks it inconvenient, and trying to hack his way around it (see prior rants concerning the whole covariance mess in Eiffel). It should also be noted that Eiffel's entire concurrency model tries to tame the tempest of shared-state concurrency while hiding the fact that concurrency is there--a course of action I believe to be fundamentally unwise.

Makes sense

This makes a lot of sense. Thanks for the insight into Meyer's writing.

I am wondering, though, how many of the key issues of concurrency can be managed by a compiler if the concurrency demands on the system are not strict (e.g. no realtime constraints; just benefiting from the concurrency without worrying about it). Going forward, many software developers are going to want concurrency for free: without having to get their hands dirty with the details. It seems the executable UML (xUML) approach of using Moore machines would make it easy for software developers to design software that can be compiled to take advantage of available concurrency, and even be run easily on distributed systems.

Pragmatism and problem-avoidance

I wouldn't say that Meyer invented a lot of theory. He did build on some theory (specifically CSP), but I wouldn't say that there's a lot of new theory there - rather a lot of pragmatic choices intended to produce something that looks as similar as possible to sequential OO programming. I agree with Scott that the way that Meyer tries to hide the concurrent nature of what's going on is a little worrying (e.g. the convention of implicit object locking when the object is used as a feature parameter). On the other hand, that approach is somewhat consistent with Meyer's general philosophy of having a "uniform access" style of coding that makes it easy to change implementations later.

I suspect that the reason things seem more complicated than they have to be to you is the extra contortions Meyer goes through to retain all of the benefits of OO. Specifically, as Meyer points out, simply extending classes with state machines (and thus with behavior) can conflict with inheritance, since it isn't obvious how inherited and overridden behaviors should be treated. Which isn't a problem if you're willing to abandon inheritance. But Meyer isn't. Thus the additional complexity: Meyer avoids the problem by avoiding state-machine-like behavior. An alternative approach is to try to reconcile inheritance and behavior. If you're interested in seeing how that might be tackled, you might want to have a look at W.M.P. van der Aalst, Inheritance of Dynamic Behavior in UML.

Like the other posters, I'm

Like the other posters, I'm going to assume that "signals" here mean named slots on objects to which clients can attach callback procedures -- sort of a finer-grained version of the Observer pattern.

The thing is that, in my experience, signals are often the wrong abstraction: side-effects of signal handlers do not compose well, but side-effects are often the preferred way to get a value "out" of a handler. Once you are attaching more than one handler to the same signal, you're in dangerous territory, and things get hard to reason about (particularly if handlers may add or remove handlers, which is normally the case).
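
To make the hazard concrete, here is a tiny sketch (invented names, not any particular library): two handlers that both update shared state leave the observed result dependent on the order in which they were attached.

handlers = []
state = {"value": 0}

def emit(x):
    # Call the handlers in attachment order; copy the list so handlers may add/remove handlers mid-emit.
    for h in list(handlers):
        h(x)

handlers.append(lambda x: state.update(value=state["value"] + x))
handlers.append(lambda x: state.update(value=state["value"] * x))

emit(3)
print(state["value"])   # 9 with this attachment order; 3 if the handlers were attached the other way around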

In most cases, people would be better served by using the actor model or a constraint solver. In general I think constraints are shamefully under-utilized in contemporary programming, and are perfectly suited for the "when this changes, update these things" problems to which signals are most often applied.

Signals Handlers and State Machines

So as I understand it, the semantics of signals in UML resemble message-passing semantics. Each object that can handle signals (called an active object) does so using a conceptual queue. The order in which signals are received is indeterminate, except that the sending order of signals from any particular object is preserved. The queue then hands the signals to different signal handlers.

It seems, though, that this description most closely resembles the actor model of concurrency, and that the UML usage of the term "signal" is perhaps misguided and should be replaced with "message". What do you think?

side-effects of signal handlers do not compose well, but side-effects are often the preferred way to get a value "out" of a handler.

Why is it the preferred way, then? Is it just laziness, or are there other motivations? In executable UML, the only way to get a value out of a signal handler is to wait for it to send another signal back to you.

Clearly systems with lots of signals can be complex to keep track of, but UML provides state machine diagrams and state transition tables for the purpose of understanding the interaction of signals. UML also has a constraint language, called OCL, for expressing constraints. On that note, is anyone aware of general-purpose languages which support state machines as first-class constructs?

state machines

cdiggins: On that note, is anyone aware of general purpose languages which support state machines as first class constructs?

I haven't noticed one mentioned yet. But I'm headed in that direction myself — not that state machines are a particular goal. (I've just had folks ask me "can I create state machines with that?" and I said, yeah sure.)

If you make a language that lets you declare async callback handlers, you can use those to build a state machine by accretion without declaring that's what you're doing. But it might be better to also build into such a language a way of declaring the entire state machine, part of which is the async event handlers, so a system could reason about the whole a little better.

Among other things, it would make a state machine easier for users to comprehend if a language had a way of managing one as a first-class object. If you're designing an async system, you probably want the state machine manifested in some form, if only so you can assert to the folks working on the system that the description is the intended spec. (Then you could also track evidence like stats that agree or disagree with the state machine's model.)
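
As a rough sketch of what declaring the whole machine might look like (all names hypothetical), the transition relation can be plain data that a runtime interprets and that tools can inspect, check for unreachable states, or render as a diagram:

TRANSITIONS = {
    # (current state, event) -> next state
    ("idle",       "connect"): "connecting",
    ("connecting", "success"): "connected",
    ("connecting", "failure"): "idle",
    ("connected",  "close"):   "idle",
}

class Machine:
    def __init__(self, initial="idle"):
        self.state = initial

    def handle(self, event):
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError("no transition for %r in state %r" % (event, self.state))
        self.state = TRANSITIONS[key]

m = Machine()
m.handle("connect")
m.handle("success")
print(m.state)   # connected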

Possibly of interest

You might want to look at the gen_fsm behaviour in Erlang. I haven't used it myself, and it doesn't seem to be as full-featured as what you describe, but it does provide a way to create an FSM and treat it as a first-class managed entity (put it in a supervision tree, shut the entire thing down, etc.).

UnrealScript...

...provides native support for hierarchical finite state machines. Perhaps Tim can comment on that when he can spare a moment.

Why the complexity?

I don't understand why you are asking for such a construct. If it existed, it would only hinder development, in most cases. The simple Actor model is more than enough for all cases of concurrency. What more are you asking for?

UML is just another language, and programs written in it need to be proved correct as well. So why bother?

"If it existed, it would

"If it existed, it would only hinder development, in most cases."

Why do you say that? Finite state machines are a very effective way to express many different kinds of concurrent systems. How would it hinder development to use an FSM to express the behavior of my software?

You are doing the work

You are doing the work twice: you have to write the code at some point, and since tools can never write 100% of the code, it's highly unlikely that you won't have to prove that the code does what is expected.