Parallel/Distributed

"Three Things I Wish I Knew When I Started Designing Languages"

The transcript of Three Things I Wish I Knew When I Started Designing Languages, a talk given by Peter Alvaro somewhere or other, is up at Info Q.

Peter Alavaro's main research interest is in taming distributed systems. He starts his talk with the provocative thesis, "In the future, all radical new languages will be domain-specific languages." He talks of the evolution of his ideas about dealing with distributed systems:

Little interest by designers of programming-language designers in filling huge difficulty of debugging in context of distributed systems;
PLs often make handling of data somewhat implicit, even with functional programming, which he says is dangerous in distributed programming;
To talk about the flow of data properly, we need to talk about time;
Two things that influenced him as a grad student: Jeff Ullman's claim that encapsulation and declarativity are in tension, and Fagin's theorem (the existential fragment of second-order logic characterises NP);
Idea that distributed systems can be considered as protocols specified a bit like SQL or Datalog queries;
Triviality with query languages of characterising the idea of place in distributive systems: they are just another relation parameter;
Describing evolution of a system in time can be done with two other things: counters and negation, leading to Bertram Ludäscher's language Statelog. But this way of doing things leads to the kind of low-level overexpressive modelling he was trying to avoid;
"What is it about...protocols that they seem to require negation to express?” Turns out that if you drop negation, you characterise the protocols that deliver messages deterministically.

He summarises by saying the only good reason to design a programming language (I assume he means a radically novel language) is to shape your understanding of the problem. No regrets of being the only user of his first language, Datalist, because the point is that it shaped all his later thought in his research.

By Charles Stewart at 2019-03-19 00:20 | Parallel/Distributed | 4 comments | other blogs | 134007 reads

"C Is Not a Low-level Language"

David Chisnall, "C Is Not a Low-level Language. Your computer is not a fast PDP-11.", ACM Queue, Volume 16, issue 2.

"For a language to be "close to the metal," it must provide an abstract machine that maps easily to the abstractions exposed by the target platform. It's easy to argue that C was a low-level language for the PDP-11.
...
it is possible to make C code run quickly but only by spending thousands of person-years building a sufficiently smart compiler—and even then, only if you violate some of the language rules. Compiler writers let C programmers pretend that they are writing code that is "close to the metal" but must then generate machine code that has very different behavior if they want C programmers to keep believing that they are using a fast language."

Includes a discussion of various ways in which modern processors break the C abstract machine, as well as some interesting speculation on what a "non-C processor" might look like. The latter leads to thinking about what a low-level language for such a processor should look like.

kdb+ 3.5 released last month

kdb+ is a real-time time series database, known in the financial services universe as the fastest tick database on the market. It was first conceived by Arthur Whitney at Morgan Stanley as a prototype, and over the last 35+ years has grown to add many features. The database makes such aggressive usage of mmap() POSIX function for mapping file chunks into main memory, to the point where it has exposed issues with the implementation of mmap itself.

Recently, the company now behind kdb+ has also built Kx for DAAS (Data-as-a-Service), which is basically a cloud-based, massively clustered version of kdb+ that deals with the curious oddity that kdb+ is effectively entirely singly threaded. For those interested in reading more about kdb+'s unique cloud architecture (as compared to "big data" solutions like Hadoop), you can read the following whitepapers as suggestive guidelines for how the q community thinks about truly "big data" several orders of magnitude faster and larger than most Hadoop data sets:

Query Routing: a kdb+ Framework for a Scalable, Load Balanced System (whitepaper) and Kevin Holsgrove: Query Routing, A kdb+ Framework for a Scalable, Load Balanced System (YouTube talk from Kx Meet-up last year)
Common Design Principles for kdb+ Gateways
A Natural Query Interface for Distributed Systems

While I don't suggest these papers are the blueprint for copying/mimicking the DAAS product, it does help the LtU reader imagine a "different world" of data processing than the often cited Map/Reduce paper and other more mainstream approaches. What is particularly striking is how tiny q.exe (the program that runs kdb+ and provides a CLI for q scripting) is. Language researchers are looking at provably correct C compilers, and it is not a huge leap to think about the world soon seeing provably correct real-time time series databases using kdb+ as an inspiration.

Another curiosity, relevant to us here at LtU, is that kdb+ has its own programming language, q. q is a variant of APL with a special library for statistics. Most "big data" solutions don't have native implementations for weighted average, which is a fairly important and frequently used function in quantitative finance, useful for computing volume weighted average price (VWAP) as well as tilt and weighted spread. q is itself implemented in another language, k. The whole language of each is just a couple lines of (terse) code.

Coordinated concurrent programming in Syndicate

Coordinated concurrent programming in Syndicate
Tony Garnock-Jones and Matthias Felleisen.
2016

Most programs interact with the world: via graphical user interfaces, networks, etc. This form of interactivity entails concurrency, and concurrent program components must coordinate their computations. This paper presents Syndicate, a novel design for a coordinated, concurrent programming language. Each concurrent component in Syndicate is a functional actor that participates in scoped conversations. The medium of conversation arranges for message exchanges and coordinates access to common knowledge. As such, Syndicate occupies a novel point in this design space, halfway between actors and threads.

If you want to understand the language, I would recommend looking at sections 1 to 2.2 (motivation and introducory examples) and then jumping at section 5, which presents fairly interesting designs for larger programs.

Concurrent program components must coordinate their computations to realize the overall goals of the program. This coordination takes two forms: the exchange of knowledge and the establishment of frame conditions. In addition, coordination must take into account that reactions to events may call for the creation of new concurrent components or that existing components may disappear due to exceptions or partial failures. In short, coordination poses a major problem to the proper design of effective communicating, concurrent components.

This paper presents Syndicate, a novel language for coordinated concurrent programming. A Syndicate program consists of functional actors that participate in precisely scoped conversations. So-called networks coordinate these conversations. When needed, they apply a functional actor to an event and its current state; in turn, they receive a new state plus descriptions of actions. These actions may represent messages for other participants in the conversations or assertions for a common space of knowledge.

Precise scoping implies a separation of distinct conversations, and hence existence of multiple networks. At the same time, an actor in one network may have to communicate with an actor in a different network. To accommodate such situations, Syndicate allows the embedding of one network into another as if the first were just an actor within the second. In other words, networks simultaneously scope and compose conversations. The resulting tree-structured shape of networked conversations corresponds both to tree-like arrangements of containers and processes in modern operating systems and to the nesting of layers in network protocols [1]. Syndicate thus unifies the programming techniques of distributed programming with those of coordinated concurrent programming.

By construction, Syndicate networks also manage resources. When a new actor appears in a conversation, a network allocates the necessary resources. When an actor fails, it deallocates the associated resources. In particular, it retracts all shared state associated with the actor, thereby making the failure visible to interested participants. Syndicate thus solves notorious problems of service discovery and resource management in the coordination of communicating components.

In sum, Syndicate occupies a novel point in the design space of coordinated concurrent (functional) components (sec. 2), sitting firmly between a thread- based world with sharing and local-state-only, message-passing actors. Our de- sign for Syndicate includes two additional contributions: an efficient protocol for incrementally maintaining the common knowledge base and a trie-based data structure for efficiently indexing into it (sec. 3). Finally, our paper presents eval- uations concerning the fundamental performance characteristics of Syndicate as well as its pragmatics (secs. 4 and 5).

Our examples illustrate the key properties of Syndicate and their unique combination. Firstly, the box and demand-matcher examples show that Syndicate conversations may involve many parties, generalizing the Actor model’s point-to-point conversations. At the same time, the file server example shows that Syndicate conversations are more precisely bounded than those of traditional Actors. Each of its networks crisply delimits its contained conversations, each of which may therefore use a task-appropriate language of discourse.
Secondly, all three examples demonstrate the shared-dataspace aspect of Syndicate. Assertions made by one actor can influence other actors, but cannot directly alter or remove assertions made by others. The box’s content is made visible through an assertion in the dataspace, and any actor that knows id can retrieve the assertion. The demand-matcher responds to changes in the dataspace that denote the existence of new conversations. The file server makes file contents available through assertions in the (outer) dataspace, in response to clients placing subscriptions in that dataspace.
Finally, Syndicate places an upper bound on the lifetimes of entries in the shared space. Items may be asserted and retracted by actors at will in response to incoming events, but when an actor crashes, all of its assertions are automatically retracted. If the box actor were to crash during a computation, the assertion describing its content would be visibly withdrawn, and peers could take some compensating action. The demand matcher can be enhanced to monitor supply as well as demand and to take corrective action if some worker instance exits unexpectedly. The combination of this temporal bound on assertions with Syndicate’s state change notifications gives good failure-signalling and fault-tolerance properties, improving on those seen in Erlang [7].

Syndicate draws directly on Network Calculus [2], which, in turn, has borrowed elements from Actor models [16,17,18], process calculi [19,20,21,22,23], and actor languages such as Erlang [7], Scala [24], E [25] and AmbientTalk [26].

This work makes a new connection to shared-dataspace coordination models [27], including languages such as Linda [28] and Concurrent ML (CML) [29]. Linda’s tuplespaces correspond to Syndicate’s dataspaces, but Linda is “generative,” meaning that its tuples take on independent existence once created. Syndicate’s assertions instead exist only as long as some actor continues to assert them, which provides a natural mechanism for managing resources and dealing with partial failures (sec. 2). Linda research on failure-handling focuses mostly on atomicity and transactions [30], though Rowstron introduces agent wills [31] and uses them to build a fault-tolerance mechanism. Turning to multiple tuplespaces, the Linda variants Klaim [32] and Lime [33] offer multiple spaces and different forms of mobility. Papadopoulos [34] surveys the many other variations; Syndicate’s non-mobile, hierarchical, nameless actors and networks occupy a hitherto unexplored point in this design space.

Some of the proposed designs were surprising to me. There is a reversal of perspective, from the usual application-centric view of applications being first, with lower-level services hidden under abstraction layers, to a more medium-directed perspective that puts the common communication layer first -- in the example of the TCP/IP stack, this is the OS kernel.

By gasche at 2016-01-29 22:33 | Parallel/Distributed | 1 comment | other blogs | 21122 reads

Composite Replicated Data Types: eventually consistent libraries as non-leaky abstractions

Composite Replicated Data Types
Alexey Gotsman and Hongseok Yang
2015

Modern large-scale distributed systems often rely on eventually consistent replicated stores, which achieve scalability in exchange for providing weak semantic guarantees. To compensate for this weakness, researchers have proposed various abstractions for programming on eventual consistency, such as replicated data types for resolving conflicting updates at different replicas and weak forms of transactions for maintaining relationships among objects. However, the subtle semantics of these abstractions makes using them correctly far from trivial.

To address this challenge, we propose composite replicated data types, which formalise a common way of organising applications on top of eventually consistent stores. Similarly to a class or an abstract data type, a composite data type encapsulates objects of replicated data types and operations used to access them, implemented using transactions. We develop a method for reasoning about programs with composite data types that reflects their modularity: the method allows abstracting away the internals of composite data type implementations when reasoning about their clients. We express the method as a denotational semantics for a programming language with composite data types. We demonstrate the effectiveness of our semantics by applying it to verify subtle data type examples and prove that it is sound and complete with respect to a standard non-compositional semantics

By gasche at 2015-05-21 15:29 | Parallel/Distributed | 55 comments | other blogs | 16030 reads

A Next Generation Smart Contract and Decentralized Application Platform

A Next Generation Smart Contract and Decentralized Application Platform, Vitalik Buterin.

When Satoshi Nakamoto first set the Bitcoin blockchain into motion in January 2009, he was simultaneously introducing two radical and untested concepts. The first is the "bitcoin", a decentralized peer-to-peer online currency that maintains a value without any backing, intrinsic value or central issuer. So far, the "bitcoin" as a currency unit has taken up the bulk of the public attention, both in terms of the political aspects of a currency without a central bank and its extreme upward and downward volatility in price. However, there is also another, equally important, part to Satoshi's grand experiment: the concept of a proof of work-based blockchain to allow for public agreement on the order of transactions. Bitcoin as an application can be described as a first-to-file system: if one entity has 50 BTC, and simultaneously sends the same 50 BTC to A and to B, only the transaction that gets confirmed first will process. There is no intrinsic way of determining from two transactions which came earlier, and for decades this stymied the development of decentralized digital currency. Satoshi's blockchain was the first credible decentralized solution. And now, attention is rapidly starting to shift toward this second part of Bitcoin's technology, and how the blockchain concept can be used for more than just money.

Commonly cited applications include using on-blockchain digital assets to represent custom currencies and financial instruments ("colored coins"), the ownership of an underlying physical device ("smart property"), non-fungible assets such as domain names ("Namecoin") as well as more advanced applications such as decentralized exchange, financial derivatives, peer-to-peer gambling and on-blockchain identity and reputation systems. Another important area of inquiry is "smart contracts" - systems which automatically move digital assets according to arbitrary pre-specified rules. For example, one might have a treasury contract of the form "A can withdraw up to X currency units per day, B can withdraw up to Y per day, A and B together can withdraw anything, and A can shut off B's ability to withdraw". The logical extension of this is decentralized autonomous organizations (DAOs) - long-term smart contracts that contain the assets and encode the bylaws of an entire organization. What Ethereum intends to provide is a blockchain with a built-in fully fledged Turing-complete programming language that can be used to create "contracts" that can be used to encode arbitrary state transition functions, allowing users to create any of the systems described above, as well as many others that we have not yet imagined, simply by writing up the logic in a few lines of code.

Includes code samples.

Determinism Is Not Enough: Making Parallel Programs Reliable with Stable Multithreading

Junfeng Yang, Heming Cui, Jingyue Wu, Yang Tang, and Gang Hu, "Determinism Is Not Enough: Making Parallel Programs Reliable with Stable Multithreading", Communications of the ACM, Vol. 57 No. 3, Pages 58-69.

We believe what makes multithreading hard is rather quantitative: multithreaded programs have too many schedules. The number of schedules for each input is already enormous because the parallel threads may interleave in many ways, depending on such factors as hardware timing and operating system scheduling. Aggregated over all inputs, the number is even greater. Finding a few schedules that trigger concurrency errors out of all enormously many schedules (so developers can prevent them) is like finding needles in a haystack. Although Deterministic Multi-Threading reduces schedules for each input, it may map each input to a different schedule, so the total set of schedules for all inputs remains enormous.

We attacked this root cause by asking: are all the enormously many schedules necessary? Our study reveals that many real-world programs can use a small set of schedules to efficiently process a wide range of inputs. Leveraging this insight, we envision a new approach we call stable multithreading (StableMT) that reuses each schedule on a wide range of inputs, mapping all inputs to a dramatically reduced set of schedules. By vastly shrinking the haystack, it makes the needles much easier to find. By mapping many inputs to the same schedule, it stabilizes program behaviors against small input perturbations.

The link above is to a publicly available pre-print of the article that appeared in the most recent CACM. The CACM article is a summary of work by Junfeng Yang's research group. Additional papers related to this research can be found at http://www.cs.columbia.edu/~junfeng/

LVars: monotonic update for deterministic parallel programming

LVars are one outcome of Lindsey Kuper's ongoing PhD work at Indiana University. They generalize existing models for deterministic parallelism by considering a general framework of monotonic read and write operations. They were briefly mentioned on LtU before (along with the strongly related work on Bloom in the distributed systems community), and were recently presented in two distinct and complementary articles.

The first article describes the basic building blocks and ideas of LVars:

LVars: Lattice-Based Data Structures for Deterministic Parallelism
Lindsey Kuper, Ryan R. Newton
2013

Programs written using a deterministic-by-construction model of parallel computation are guaranteed to always produce the same observable results, offering programmers freedom from subtle, hard-to-reproduce nondeterministic bugs that are the scourge of parallel software. We present LVars, a new model for deterministic- by-construction parallel programming that generalizes existing single-assignment models to allow multiple assignments that are monotonically increasing with respect to a user-specified lattice.

LVars ensure determinism by allowing only monotonic writes and "threshold" reads that block until a lower bound is reached. We give a proof of determinism and a prototype implementation for a language with LVars and describe how to extend the LVars model to support a limited form of nondeterminism that admits failures but never wrong answers

The second relaxes the original model by introducing failure, which widens its applicability:

Freeze After Writing: Quasi-Deterministic Parallel Programming with LVars
Lindsey Kuper, Aaron Turon, Neelakantan Krishnaswami, Ryan R. Newton
2014

Deterministic-by-construction parallel programming models offer programmers the promise of freedom from subtle, hard-to-reproduce nondeterministic bugs in parallel code. A principled approach to deterministic-by-construction parallel programming with shared state is offered by LVars: shared memory locations whose semantics are defined in terms of a user-specified lattice. Writes to an LVar take the least upper bound of the old and new values with respect to the lattice, while reads from an LVar can observe only that its contents have crossed a specified threshold in the lattice. Although it guarantees determinism, this interface is quite limited.

We extend LVars in two ways. First, we add the ability to â€œfreezeâ€ and then read the contents of an LVar directly. Second, we add the ability to attach callback functions to an LVar, allowing events to be triggered by writes to it. Together, callbacks and freezing enable an expressive and useful style of parallel programming. We prove that in a language where communication takes place through freezable LVars, programs are at worst quasi-deterministic: on every run, they either produce the same answer or raise an error. We demonstrate the viability of our approach by implementing a library for Haskell supporting a variety of LVar-based data structures, together with two case studies that illustrate the programming model and yield promising parallel speedup.

Something I personally found surprising and impressive about LVars is that, while I was initially interested in the very formal aspects of providing a theoretical framework for deterministic concurrency, it very quickly produced a practical library that people can use to write parallel program -- and competitive with existing high-performance approaches. As described in a recent blog post, a Haskell library is available on Hackage -- but surely LVars-inspired libraries could make sense in a lot of other languages as well.

By gasche at 2013-10-07 08:56 | Parallel/Distributed | 9 comments | other blogs | 36802 reads

Concurrent Revisions

Concurrent Revisions is a Microsoft Research project doing interesting work in making concurrent programming scalable and easier to reason about. These papers work have been mentioned a number of times here on LtU, but none of them seem to have been officially posted as stories.

Concurrent Revisions are a distributed version control-like abstraction [1] for concurrently mutable state that requires clients to specify merge functions that make fork-join deterministic, and so make concurrent programs inherently composable. The library provide default merge behaviour for various familiar objects like numbers and lists, and it seems somewhat straightforward to provide a merge function for many other object types.

They've also extended the work to seamlessly integrate incremental and parallel computation [2] in a fairly intuitive fashion, in my opinion.

Their latest work [3] extends these concurrent revisions to distributed scenarios with disconnected operations, which operate much like distributed version control works with source code, with guarantees of eventual consistency.

All in all, a very promising approach, and deserving of wider coverage.

[1] Sebastian Burckhardt and Daan Leijen, Semantics of Concurrent Revisions, in European Symposium on Programming (ESOP'11), Springer Verlag, Saarbrucken, Germany, March 2011
[2] Sebastian Burckhardt, Daan Leijen, Caitlin Sadowski, Jaeheon Yi, and Thomas Ball, Two for the Price of One: A Model for Parallel and Incremental Computation, in Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA'11), ACM SIGPLAN, Portland, Oregon, 22 October 2011
[3] Sebastian Burckhardt, Manuel Fahndrich, Daan Leijen, and Benjamin P. Wood, Cloud Types for Eventual Consistency, in Proceedings of the 26th European Conference on Object-Oriented Programming (ECOOP), Springer, 15 June 2012

Visi.io

Visi.io comes from David Pollak and aims at revolutionizing building tablet apps, but the main attraction now seems to be in exploring the way data flow and cloud computing can be integrated. The screencast is somewhat underwhelming but at least convinces me that there is a working prototype (I haven't looked further than the website as yet). The vision document has some nice ideas. Visi.io came up recently in the discussion of the future of spreadsheets.

Lambda the Ultimate

User login

Navigation