
PECAN: Persuasive Prediction of Concurrency Access Anomalies

(This group and the diaspora seem to be doing all sorts of interesting things. I got around to this by wondering about static analysis of JavaScript wrt performance.)

As a developer
Who ends up on concurrent systems
I would like to be able to debug them.
(Even before I run them.)

Persuasive Prediction of Concurrency Access Anomalies
Jeff Huang, Charles Zhang
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology
Predictive analysis is a powerful technique that exposes concurrency bugs in unexercised program executions. However, current predictive analysis approaches lack the persuasiveness property as they offer little assistance in helping programmers fully understand the execution history that triggers the predicted bugs. We present a persuasive bug prediction technique as well as a prototype tool, PECAN, for detecting general access anomalies (AAs) in concurrent programs. The main characteristic of PECAN is that, in addition to predicting AAs in a more general way, it generates "bug hatching clips" that deterministically instruct the input program to exercise the predicted AAs. The key ingredient of PECAN is an efficient offline schedule generation algorithm, with a proof of soundness, that guarantees to generate a feasible schedule for every real AA in programs that use locks in a nested way. We evaluate PECAN using twenty-two multi-threaded subjects including six large concurrent systems, and our experiments demonstrate that PECAN is able to effectively predict and deterministically expose real AAs. Several serious and previously unknown bugs in large open source concurrent systems were also revealed in our experiments.

(sotto voce: I guess there's something to be said for using just utterly unprincipled, unrestricted, unconstrained, awful things like rampant concurrency, Java, JavaScript, et al., because it gives the Really Smart People in the world something to attack and improve and show off around. I mean, if we all had the luxury of Doing Things Right from the get-go, I feel like lots of valuable insights (with wider application than their originating research) would never have been discovered or created.)

Coordinated concurrent programming in Syndicate

Coordinated concurrent programming in Syndicate
Tony Garnock-Jones and Matthias Felleisen.
2016

Most programs interact with the world: via graphical user interfaces, networks, etc. This form of interactivity entails concurrency, and concurrent program components must coordinate their computations. This paper presents Syndicate, a novel design for a coordinated, concurrent programming language. Each concurrent component in Syndicate is a functional actor that participates in scoped conversations. The medium of conversation arranges for message exchanges and coordinates access to common knowledge. As such, Syndicate occupies a novel point in this design space, halfway between actors and threads.
If you want to understand the language, I would recommend looking at sections 1 to 2.2 (motivation and introductory examples) and then jumping to section 5, which presents fairly interesting designs for larger programs.

Concurrent program components must coordinate their computations to realize the overall goals of the program. This coordination takes two forms: the exchange of knowledge and the establishment of frame conditions. In addition, coordination must take into account that reactions to events may call for the creation of new concurrent components or that existing components may disappear due to exceptions or partial failures. In short, coordination poses a major problem to the proper design of effective communicating, concurrent components.

This paper presents Syndicate, a novel language for coordinated concurrent programming. A Syndicate program consists of functional actors that participate in precisely scoped conversations. So-called networks coordinate these conversations. When needed, they apply a functional actor to an event and its current state; in turn, they receive a new state plus descriptions of actions. These actions may represent messages for other participants in the conversations or assertions for a common space of knowledge.
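To make the shape of these actors concrete, here is a minimal sketch in TypeScript. It is not Syndicate's actual syntax or API; the names (ActorEvent, ActorAction, Behaviour) and the string-encoded assertions are mine, but the type of a behaviour mirrors the description above: an event plus the current state in, a new state plus a list of actions out.

```typescript
// A minimal sketch (not Syndicate's actual API; these names are mine) of the
// "functional actor" shape described above: a behaviour maps an event and the
// actor's current state to a new state plus a list of actions.

type Assertion = string;   // simplified: assertions are just strings here

type ActorEvent =
  | { kind: "message"; body: Assertion }
  | { kind: "stateChange"; added: Assertion[]; removed: Assertion[] };

type ActorAction =
  | { kind: "message"; body: Assertion }   // send a message to conversation peers
  | { kind: "assert"; body: Assertion }    // place an assertion in the shared dataspace
  | { kind: "retract"; body: Assertion };  // withdraw one of our own assertions

type Behaviour<S> = (event: ActorEvent, state: S) => { state: S; actions: ActorAction[] };

// Example: a "box" actor holding a number. It publishes its current value as
// an assertion and replaces that assertion when told to change the value.
const boxBehaviour: Behaviour<number> = (event, value) => {
  if (event.kind === "message" && event.body.startsWith("set-box ")) {
    const next = Number(event.body.slice("set-box ".length));
    return {
      state: next,
      actions: [
        { kind: "retract", body: `box-value ${value}` },
        { kind: "assert", body: `box-value ${next}` },
      ],
    };
  }
  return { state: value, actions: [] };
};
```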

Precise scoping implies a separation of distinct conversations, and hence existence of multiple networks. At the same time, an actor in one network may have to communicate with an actor in a different network. To accommodate such situations, Syndicate allows the embedding of one network into another as if the first were just an actor within the second. In other words, networks simultaneously scope and compose conversations. The resulting tree-structured shape of networked conversations corresponds both to tree-like arrangements of containers and processes in modern operating systems and to the nesting of layers in network protocols [1]. Syndicate thus unifies the programming techniques of distributed programming with those of coordinated concurrent programming.
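Continuing the sketch (again my own names, not the paper's), the nesting described here can be pictured as a network being just another actor from its parent's point of view: it holds member actors and their assertions, relays events inward, and relays escaping actions back out.

```typescript
// Continuing the sketch above (hypothetical names, not Syndicate's API):
// a network is a container of actors plus the assertions they currently make,
// and it can itself be driven as a single actor inside an enclosing network.

interface ActorEntry<S> {
  behaviour: Behaviour<S>;
  state: S;
  assertions: Set<Assertion>;   // what this actor currently asserts
}

interface NestedNetwork {
  actors: ActorEntry<unknown>[];
  pendingActions: ActorAction[];   // actions queued for delivery inside this network
  // Events arriving from the parent network are dispatched to member actors;
  // actions addressed "outside" are relayed back up, giving the tree structure
  // of conversations described above.
}
```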

By construction, Syndicate networks also manage resources. When a new actor appears in a conversation, a network allocates the necessary resources. When an actor fails, it deallocates the associated resources. In particular, it retracts all shared state associated with the actor, thereby making the failure visible to interested participants. Syndicate thus solves notorious problems of service discovery and resource management in the coordination of communicating components.

In sum, Syndicate occupies a novel point in the design space of coordinated concurrent (functional) components (sec. 2), sitting firmly between a thread-based world with sharing and local-state-only, message-passing actors. Our design for Syndicate includes two additional contributions: an efficient protocol for incrementally maintaining the common knowledge base and a trie-based data structure for efficiently indexing into it (sec. 3). Finally, our paper presents evaluations concerning the fundamental performance characteristics of Syndicate as well as its pragmatics (secs. 4 and 5).


Our examples illustrate the key properties of Syndicate and their unique combination. Firstly, the box and demand-matcher examples show that Syndicate conversations may involve many parties, generalizing the Actor model’s point-to-point conversations. At the same time, the file server example shows that Syndicate conversations are more precisely bounded than those of traditional Actors. Each of its networks crisply delimits its contained conversations, each of which may therefore use a task-appropriate language of discourse.

Secondly, all three examples demonstrate the shared-dataspace aspect of Syndicate. Assertions made by one actor can influence other actors, but cannot directly alter or remove assertions made by others. The box’s content is made visible through an assertion in the dataspace, and any actor that knows the box’s id can retrieve the assertion. The demand-matcher responds to changes in the dataspace that denote the existence of new conversations. The file server makes file contents available through assertions in the (outer) dataspace, in response to clients placing subscriptions in that dataspace.
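A toy dataspace makes this concrete. This continues the earlier sketch (same string-encoded Assertion type, names my own): the dataspace records which actor asserts what, and readers can only observe matching assertions, never delete another actor's.

```typescript
// A toy dataspace, purely illustrative. It records which actor asserts what;
// peers can observe matching assertions but cannot remove another actor's.

class Dataspace {
  private byActor = new Map<string, Set<Assertion>>();

  assert(actorId: string, assertion: Assertion): void {
    if (!this.byActor.has(actorId)) this.byActor.set(actorId, new Set());
    this.byActor.get(actorId)!.add(assertion);
  }

  retract(actorId: string, assertion: Assertion): void {
    this.byActor.get(actorId)?.delete(assertion);
  }

  // Withdraw everything a given actor has asserted (used when it exits or crashes).
  retractAll(actorId: string): Assertion[] {
    const withdrawn = [...(this.byActor.get(actorId) ?? [])];
    this.byActor.delete(actorId);
    return withdrawn;
  }

  // What readers like the demand-matcher see: all assertions matching a prefix.
  query(prefix: string): Assertion[] {
    const out: Assertion[] = [];
    for (const assertions of this.byActor.values())
      for (const a of assertions)
        if (a.startsWith(prefix)) out.push(a);
    return out;
  }
}
```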

Finally, Syndicate places an upper bound on the lifetimes of entries in the shared space. Items may be asserted and retracted by actors at will in response to incoming events, but when an actor crashes, all of its assertions are automatically retracted. If the box actor were to crash during a computation, the assertion describing its content would be visibly withdrawn, and peers could take some compensating action. The demand matcher can be enhanced to monitor supply as well as demand and to take corrective action if some worker instance exits unexpectedly. The combination of this temporal bound on assertions with Syndicate’s state change notifications gives good failure-signalling and fault-tolerance properties, improving on those seen in Erlang [7].
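Tying the toy sketches together, the lifetime bound described above would look like this with the toy Dataspace from the previous block: when the box actor dies, the network withdraws its assertions and peers observe the removal.

```typescript
// Using the toy Dataspace from the previous sketch. The network retracts
// everything the crashed actor asserted, making its failure visible to peers.
const ds = new Dataspace();

ds.assert("box", "box-value 5");
console.log(ds.query("box-value"));   // ["box-value 5"]

// Suppose the box actor crashes: the network withdraws its assertions...
const withdrawn = ds.retractAll("box");

// ...and a peer watching "box-value" now sees nothing, and can compensate.
console.log(withdrawn);               // ["box-value 5"]
console.log(ds.query("box-value"));   // []
```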


Syndicate draws directly on Network Calculus [2], which, in turn, has borrowed elements from Actor models [16,17,18], process calculi [19,20,21,22,23], and actor languages such as Erlang [7], Scala [24], E [25] and AmbientTalk [26].

This work makes a new connection to shared-dataspace coordination models [27], including languages such as Linda [28] and Concurrent ML (CML) [29]. Linda’s tuplespaces correspond to Syndicate’s dataspaces, but Linda is “generative,” meaning that its tuples take on independent existence once created. Syndicate’s assertions instead exist only as long as some actor continues to assert them, which provides a natural mechanism for managing resources and dealing with partial failures (sec. 2). Linda research on failure-handling focuses mostly on atomicity and transactions [30], though Rowstron introduces agent wills [31] and uses them to build a fault-tolerance mechanism. Turning to multiple tuplespaces, the Linda variants Klaim [32] and Lime [33] offer multiple spaces and different forms of mobility. Papadopoulos [34] surveys the many other variations; Syndicate’s non-mobile, hierarchical, nameless actors and networks occupy a hitherto unexplored point in this design space.

Some of the proposed designs were surprising to me. There is a reversal of perspective: instead of the usual application-centric view, where applications come first and lower-level services are hidden under abstraction layers, Syndicate takes a medium-directed perspective that puts the common communication layer first; in the example of the TCP/IP stack, that layer is the OS kernel.