kdb+ is a real-time time series database, known in the financial services universe as the fastest tick database on the market. It was first conceived by Arthur Whitney at Morgan Stanley as a prototype, and over the last 35+ years has grown to add many features. The database makes such aggressive use of the POSIX mmap() function for mapping file chunks into main memory that it has exposed issues with the implementation of mmap itself.
Recently, the company now behind kdb+ has also built Kx for DAAS (Data-as-a-Service), which is basically a cloud-based, massively clustered version of kdb+ that deals with the curious oddity that kdb+ is effectively entirely single-threaded. For those interested in reading more about kdb+'s unique cloud architecture (as compared to "big data" solutions like Hadoop), you can read the following whitepapers as suggestive guidelines for how the q community thinks about truly "big data" several orders of magnitude faster and larger than most Hadoop data sets:
While I don't suggest these papers are the blueprint for copying/mimicking the DAAS product, they do help the LtU reader imagine a "different world" of data processing than the often-cited Map/Reduce paper and other more mainstream approaches. What is particularly striking is how tiny q.exe (the program that runs kdb+ and provides a CLI for q scripting) is. Language researchers are looking at provably correct C compilers, and it is not a huge leap to think about the world soon seeing provably correct real-time time series databases using kdb+ as an inspiration.
Another curiosity, relevant to us here at LtU, is that kdb+ has its own programming language, q. q is a variant of APL with a special library for statistics. Most "big data" solutions don't have native implementations for weighted average, which is a fairly important and frequently used function in quantitative finance, useful for computing volume weighted average price (VWAP) as well as tilt and weighted spread. q is itself implemented in another language, k. The whole language of each is just a couple lines of (terse) code.
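To make the weighted-average remark concrete, here is a minimal Python sketch of what a VWAP computation boils down to. The trade prices and sizes are made up for illustration, and the function name merely echoes q's wavg keyword; nothing here is taken from kdb+ itself.

```python
def wavg(weights, values):
    """Weighted average: sum(w * v) / sum(w)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights sum to zero")
    return sum(w * v for w, v in zip(weights, values)) / total_weight


# Hypothetical trades: the volume (size) of each trade weights its price.
prices = [10.0, 10.5, 11.0]
sizes = [100, 200, 700]

vwap = wavg(sizes, prices)
print(vwap)
```

The large trade at 11.0 dominates the result, which is the point of weighting by volume rather than taking a plain mean of the prices.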
Coordinated concurrent programming in Syndicate
Tony Garnock-Jones and Matthias Felleisen.
Most programs interact with the world: via graphical user interfaces, networks, etc. This form of interactivity entails concurrency, and concurrent program components must coordinate their computations. This paper presents Syndicate, a novel design for a coordinated, concurrent programming language. Each concurrent component in Syndicate is a functional actor that participates in scoped conversations. The medium of conversation arranges for message exchanges and coordinates access to common knowledge. As such, Syndicate occupies a novel point in this design space, halfway between actors and threads.
If you want to understand the language, I would recommend looking at sections 1 to 2.2 (motivation and introductory examples) and then jumping to section 5, which presents fairly interesting designs for larger programs.
Concurrent program components must coordinate their computations to realize the overall goals of the program. This coordination takes two forms: the exchange of knowledge and the establishment of frame conditions. In addition, coordination must take into account that reactions to events may call for the creation of new concurrent components or that existing components may disappear due to exceptions or partial failures. In short, coordination poses a major problem to the proper design of effective communicating, concurrent components.
This paper presents Syndicate, a novel language for coordinated concurrent
programming. A Syndicate program consists of functional actors that participate in precisely scoped conversations. So-called networks coordinate these conversations. When needed, they apply a functional actor to an event and its current state; in turn, they receive a new state plus descriptions of actions. These
actions may represent messages for other participants in the conversations or
assertions for a common space of knowledge.
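The "functional actor" shape described above can be sketched in a few lines of Python: a behavior is a pure function from an event and the current state to a new state plus a list of actions. This is only an illustration of the idea; the names Message, Assert, counter_behavior and run are hypothetical and are not Syndicate's actual API.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Message:
    """Action: a message for other participants in the conversation."""
    body: Any


@dataclass(frozen=True)
class Assert:
    """Action: an assertion placed in the common space of knowledge."""
    fact: Any


def counter_behavior(event, state):
    """Pure behavior: (event, state) -> (new_state, actions)."""
    if event == "tick":
        new_state = state + 1
        return new_state, [Assert(("count", new_state))]
    if event == "ping":
        return state, [Message("pong")]
    return state, []  # ignore events we do not understand


def run(behavior, state, events):
    """Toy stand-in for a network: feed events to one actor, collect actions."""
    emitted = []
    for event in events:
        state, actions = behavior(event, state)
        emitted.extend(actions)
    return state, emitted


final_state, emitted = run(counter_behavior, 0, ["tick", "ping", "noise", "tick"])
print(final_state, emitted)
```

Because the behavior never mutates anything itself, the surrounding network stays in full control of when state changes and when actions take effect.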
Precise scoping implies a separation of distinct conversations, and hence existence of multiple networks. At the same time, an actor in one network may
have to communicate with an actor in a different network. To accommodate such
situations, Syndicate allows the embedding of one network into another as if
the first were just an actor within the second. In other words, networks simultaneously scope and compose conversations. The resulting tree-structured shape
of networked conversations corresponds both to tree-like arrangements of containers and processes in modern operating systems and to the nesting of layers
in network protocols. Syndicate thus unifies the programming techniques
of distributed programming with those of coordinated concurrent programming.
By construction, Syndicate networks also manage resources. When a new
actor appears in a conversation, a network allocates the necessary resources.
When an actor fails, it deallocates the associated resources. In particular, it
retracts all shared state associated with the actor, thereby making the failure
visible to interested participants. Syndicate thus solves notorious problems of
service discovery and resource management in the coordination of communicating components.
In sum, Syndicate occupies a novel point in the design space of coordinated concurrent (functional) components (sec. 2), sitting firmly between a thread-based world with sharing and local-state-only, message-passing actors. Our design for Syndicate includes two additional contributions: an efficient protocol for incrementally maintaining the common knowledge base and a trie-based data structure for efficiently indexing into it (sec. 3). Finally, our paper presents evaluations concerning the fundamental performance characteristics of Syndicate as well as its pragmatics (secs. 4 and 5).
Our examples illustrate the key properties of Syndicate and their unique
combination. Firstly, the box and demand-matcher examples show that Syndicate conversations may involve many parties, generalizing the Actor model’s
point-to-point conversations. At the same time, the file server example shows
that Syndicate conversations are more precisely bounded than those of traditional Actors. Each of its networks crisply delimits its contained conversations,
each of which may therefore use a task-appropriate language of discourse.
Secondly, all three examples demonstrate the shared-dataspace aspect of
Syndicate. Assertions made by one actor can influence other actors, but cannot
directly alter or remove assertions made by others. The box’s content is made
visible through an assertion in the dataspace, and any actor that knows id can
retrieve the assertion. The demand-matcher responds to changes in the dataspace that denote the existence of new conversations. The file server makes file
contents available through assertions in the (outer) dataspace, in response to
clients placing subscriptions in that dataspace.
Finally, Syndicate places an upper bound on the lifetimes of entries in the
shared space. Items may be asserted and retracted by actors at will in response
to incoming events, but when an actor crashes, all of its assertions are automatically retracted. If the box actor were to crash during a computation, the
assertion describing its content would be visibly withdrawn, and peers could take
some compensating action. The demand matcher can be enhanced to monitor
supply as well as demand and to take corrective action if some worker instance
exits unexpectedly. The combination of this temporal bound on assertions with
Syndicate’s state change notifications gives good failure-signalling and fault-tolerance properties, improving on those seen in Erlang.
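The lifetime rule described above (assertions vanish when their actor crashes, and observers are told) can be sketched as a toy Python dataspace. The class and method names are hypothetical, and the sketch makes a simplifying assumption that no two actors assert the same fact; it is not how Syndicate is implemented.

```python
class Dataspace:
    """Toy shared space: each fact is owned by the actor that asserted it."""

    def __init__(self):
        self._facts = {}      # actor id -> set of facts asserted by that actor
        self._observers = []  # callbacks invoked as callback(kind, fact)

    def subscribe(self, callback):
        self._observers.append(callback)

    def _notify(self, kind, fact):
        for callback in self._observers:
            callback(kind, fact)

    def assert_fact(self, actor, fact):
        self._facts.setdefault(actor, set()).add(fact)
        self._notify("asserted", fact)

    def step(self, actor, handler, event):
        """Run one handler; if it raises, retract everything the actor asserted."""
        try:
            handler(self, event)
        except Exception:
            for fact in self._facts.pop(actor, set()):
                self._notify("retracted", fact)

    def facts(self):
        return set().union(*self._facts.values())


def box_handler(ds, event):
    if event == "start":
        ds.assert_fact("box", ("content", 0))
    elif event == "boom":
        raise RuntimeError("box crashed mid-computation")


log = []
ds = Dataspace()
ds.subscribe(lambda kind, fact: log.append((kind, fact)))
ds.step("box", box_handler, "start")
ds.step("box", box_handler, "boom")
print(log, ds.facts())
```

A peer watching the log sees the retraction without the box actor having to send any explicit "I am dying" message, which is the failure-signalling property the excerpt highlights.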
Syndicate draws directly on Network Calculus, which, in turn, has borrowed
elements from Actor models [16,17,18], process calculi [19,20,21,22,23], and actor
languages such as Erlang, Scala, E and AmbientTalk.
This work makes a new connection to shared-dataspace coordination models, including languages such as Linda and Concurrent ML (CML).
Linda’s tuplespaces correspond to Syndicate’s dataspaces, but Linda is “generative,” meaning that its tuples take on independent existence once created.
Syndicate’s assertions instead exist only as long as some actor continues to
assert them, which provides a natural mechanism for managing resources and
dealing with partial failures (sec. 2). Linda research on failure-handling focuses
mostly on atomicity and transactions, though Rowstron introduces agent
wills and uses them to build a fault-tolerance mechanism. Turning to multiple tuplespaces, the Linda variants Klaim and Lime offer multiple
spaces and different forms of mobility. Papadopoulos surveys the many other
variations; Syndicate’s non-mobile, hierarchical, nameless actors and networks
occupy a hitherto unexplored point in this design space.
Some of the proposed designs were surprising to me. There is a reversal of perspective, from the usual application-centric view, in which applications come first and lower-level services are hidden under abstraction layers, to a more medium-directed perspective that puts the common communication layer first -- in the example of the TCP/IP stack, this is the OS kernel.
Composite Replicated Data Types
Alexey Gotsman and Hongseok Yang
Modern large-scale distributed systems often rely on eventually consistent replicated stores, which achieve scalability in exchange for providing weak semantic guarantees. To compensate for this weakness, researchers have proposed various abstractions for programming on eventual consistency, such as replicated data types for resolving conflicting updates at different replicas and weak forms of transactions for maintaining relationships among objects. However, the subtle semantics of these abstractions makes using them correctly far from trivial.
To address this challenge, we propose composite replicated data types, which formalise a common way of organising applications on top of eventually consistent stores. Similarly to a class or an abstract data type, a composite data type encapsulates objects of replicated data types and operations used to access them, implemented using transactions. We develop a method for reasoning about programs with composite data types that reflects their modularity: the method allows abstracting away the internals of composite data type implementations when reasoning about their clients. We express the method as a denotational semantics for a programming language with composite data types. We demonstrate the effectiveness of our semantics by applying it to verify subtle data type examples and prove that it is sound and complete with respect to a standard non-compositional semantics.
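For readers who have not met a replicated data type before, here is a minimal Python sketch of one: a grow-only counter, a standard example from the wider literature rather than from this paper. Each replica increments only its own slot, and merging takes the per-replica maximum, so conflicting updates resolve the same way on every replica. The class name and replica ids are illustrative.

```python
class GCounter:
    """Grow-only counter: one non-decreasing slot per replica."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica id -> increments observed from that replica

    def increment(self, n=1):
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        """Commutative, associative, idempotent: per-replica maximum."""
        for rid, count in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), count)


a, b = GCounter("a"), GCounter("b")
a.increment(2)   # update applied at replica a
b.increment(3)   # concurrent update applied at replica b
a.merge(b)
b.merge(a)
a.merge(b)       # merging again changes nothing
print(a.value(), b.value())
```

A composite data type in the paper's sense would wrap several such objects and the transactions that access them; the sketch shows only a single building block.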
A Next Generation Smart Contract and Decentralized Application Platform, Vitalik Buterin.
When Satoshi Nakamoto first set the Bitcoin blockchain into motion in January 2009, he was simultaneously introducing two radical and untested concepts. The first is the "bitcoin", a decentralized peer-to-peer online currency that maintains a value without any backing, intrinsic value or central issuer. So far, the "bitcoin" as a currency unit has taken up the bulk of the public attention, both in terms of the political aspects of a currency without a central bank and its extreme upward and downward volatility in price. However, there is also another, equally important, part to Satoshi's grand experiment: the concept of a proof of work-based blockchain to allow for public agreement on the order of transactions. Bitcoin as an application can be described as a first-to-file system: if one entity has 50 BTC, and simultaneously sends the same 50 BTC to A and to B, only the transaction that gets confirmed first will process. There is no intrinsic way of determining from two transactions which came earlier, and for decades this stymied the development of decentralized digital currency. Satoshi's blockchain was the first credible decentralized solution. And now, attention is rapidly starting to shift toward this second part of Bitcoin's technology, and how the blockchain concept can be used for more than just money.
Commonly cited applications include using on-blockchain digital assets to represent custom currencies and financial instruments ("colored coins"), the ownership of an underlying physical device ("smart property"), non-fungible assets such as domain names ("Namecoin") as well as more advanced applications such as decentralized exchange, financial derivatives, peer-to-peer gambling and on-blockchain identity and reputation systems. Another important area of inquiry is "smart contracts" - systems which automatically move digital assets according to arbitrary pre-specified rules. For example, one might have a treasury contract of the form "A can withdraw up to X currency units per day, B can withdraw up to Y per day, A and B together can withdraw anything, and A can shut off B's ability to withdraw". The logical extension of this is decentralized autonomous organizations (DAOs) - long-term smart contracts that contain the assets and encode the bylaws of an entire organization. What Ethereum intends to provide is a blockchain with a built-in fully fledged Turing-complete programming language that can be used to create "contracts" that can be used to encode arbitrary state transition functions, allowing users to create any of the systems described above, as well as many others that we have not yet imagined, simply by writing up the logic in a few lines of code.
Includes code samples.
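The treasury example quoted above can be read as a state transition function, and a small Python sketch makes the rules explicit. This is not Ethereum contract code: the Treasury record, the transaction dictionaries and the "new_day" reset are hypothetical, and the sketch assumes that a joint A and B withdrawal remains possible after A has shut off B's individual withdrawals.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Treasury:
    balance: int
    limit_a: int            # X: A's daily limit
    limit_b: int            # Y: B's daily limit
    b_enabled: bool = True
    spent_a: int = 0        # withdrawn today by A alone
    spent_b: int = 0        # withdrawn today by B alone


def apply_tx(state, tx):
    """Pure transition: return the new state, or raise ValueError if invalid."""
    kind = tx["kind"]
    signers = frozenset(tx.get("signers", ()))
    if kind == "new_day":
        return replace(state, spent_a=0, spent_b=0)
    if kind == "disable_b":
        if "A" not in signers:
            raise ValueError("only A can shut off B")
        return replace(state, b_enabled=False)
    if kind != "withdraw":
        raise ValueError("unknown transaction kind")
    amount = tx["amount"]
    if amount <= 0 or amount > state.balance:
        raise ValueError("invalid amount")
    remaining = state.balance - amount
    if signers == {"A", "B"}:  # joint withdrawal: no limit applies
        return replace(state, balance=remaining)
    if signers == {"A"} and state.spent_a + amount <= state.limit_a:
        return replace(state, balance=remaining, spent_a=state.spent_a + amount)
    if (signers == {"B"} and state.b_enabled
            and state.spent_b + amount <= state.limit_b):
        return replace(state, balance=remaining, spent_b=state.spent_b + amount)
    raise ValueError("withdrawal not authorized")


state = Treasury(balance=1000, limit_a=100, limit_b=50)
for tx in [
    {"kind": "withdraw", "amount": 100, "signers": {"A"}},       # within X
    {"kind": "withdraw", "amount": 50, "signers": {"B"}},        # within Y
    {"kind": "disable_b", "signers": {"A"}},
    {"kind": "withdraw", "amount": 800, "signers": {"A", "B"}},  # joint
]:
    state = apply_tx(state, tx)
print(state)
```

Keeping the function pure (old state in, new state out) mirrors the "arbitrary state transition functions" phrasing of the white paper: every node can replay the same transactions and arrive at the same state.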
Junfeng Yang, Heming Cui, Jingyue Wu, Yang Tang, and Gang Hu, "Determinism Is Not Enough: Making Parallel Programs Reliable with Stable Multithreading", Communications of the ACM, Vol. 57 No. 3, Pages 58-69.
We believe what makes multithreading hard is rather quantitative: multithreaded programs have too many schedules. The number of schedules for each input is already enormous because the parallel threads may interleave in many ways, depending on such factors as hardware timing and operating system scheduling. Aggregated over all inputs, the number is even greater. Finding a few schedules that trigger concurrency errors out of all enormously many schedules (so developers can prevent them) is like finding needles in a haystack. Although Deterministic Multi-Threading reduces schedules for each input, it may map each input to a different schedule, so the total set of schedules for all inputs remains enormous.
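To get a feel for "too many schedules", a back-of-the-envelope count helps. Assuming each thread executes a fixed number of atomic steps and there is no synchronization restricting the order (both simplifications, not claims from the article), the number of interleavings of n threads with k steps each is the multinomial coefficient (n*k)! / (k!)^n. A short Python sketch:

```python
from math import factorial


def interleavings(threads, steps):
    """Orderings of threads*steps atomic steps that preserve each thread's
    own program order: (threads*steps)! / (steps!)**threads."""
    return factorial(threads * steps) // factorial(steps) ** threads


for threads, steps in [(2, 2), (2, 10), (4, 10)]:
    print(threads, steps, interleavings(threads, steps))
```

Two threads of two steps each already admit 6 schedules, and two threads of ten steps admit 184756; the count grows far faster than the program size, and that is for a single input.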
We attacked this root cause by asking: are all the enormously many schedules necessary? Our study reveals that many real-world programs can use a small set of schedules to efficiently process a wide range of inputs. Leveraging this insight, we envision a new approach we call stable multithreading (StableMT) that reuses each schedule on a wide range of inputs, mapping all inputs to a dramatically reduced set of schedules. By vastly shrinking the haystack, it makes the needles much easier to find. By mapping many inputs to the same schedule, it stabilizes program behaviors against small input perturbations.
The link above is to a publicly available pre-print of the article that appeared in the most recent CACM. The CACM article is a summary of work by Junfeng Yang's research group. Additional papers related to this research can be found at http://www.cs.columbia.edu/~junfeng/
LVars are one outcome of Lindsey Kuper's ongoing
PhD work at Indiana University. They generalize existing models for
deterministic parallelism by considering a general framework of
monotonic read and write operations. They were briefly mentioned
on LtU before (along with the strongly related work on Bloom in the distributed
systems community), and were recently presented in two
distinct and complementary articles.
The first article describes the basic building blocks and ideas of LVars:
LVars: Lattice-Based Data Structures for Deterministic Parallelism
Lindsey Kuper, Ryan R. Newton
Programs written using a deterministic-by-construction model of
parallel computation are guaranteed to always produce the same
observable results, offering programmers freedom from subtle,
hard-to-reproduce nondeterministic bugs that are the scourge of
parallel software. We present LVars, a new model for deterministic-
by-construction parallel programming that generalizes existing
single-assignment models to allow multiple assignments that are
monotonically increasing with respect to a user-specified lattice.
LVars ensure determinism by allowing only monotonic writes and
"threshold" reads that block until a lower bound is reached. We
give a proof of determinism and a prototype implementation for a
language with LVars and describe how to extend the LVars model
to support a limited form of nondeterminism that admits failures
but never wrong answers.
The second relaxes the original model by introducing failure, which
widens its applicability:
Freeze After Writing: Quasi-Deterministic Parallel Programming with LVars
Lindsey Kuper, Aaron Turon, Neelakantan Krishnaswami, Ryan R. Newton
Deterministic-by-construction parallel programming models offer
programmers the promise of freedom from subtle, hard-to-reproduce
nondeterministic bugs in parallel code. A principled approach to
deterministic-by-construction parallel programming with shared state
is offered by LVars: shared memory locations whose semantics are
defined in terms of a user-specified lattice. Writes to an LVar take
the least upper bound of the old and new values with respect to the
lattice, while reads from an LVar can observe only that its contents
have crossed a specified threshold in the lattice. Although it
guarantees determinism, this interface is quite limited.
We extend LVars in two ways. First, we add the ability to “freeze”
and then read the contents of an LVar directly. Second, we add the
ability to attach callback functions to an LVar, allowing events to be
triggered by writes to it. Together, callbacks and freezing enable
an expressive and useful style of parallel programming. We prove that
in a language where communication takes place through freezable LVars,
programs are at worst quasi-deterministic: on every run, they either
produce the same answer or raise an error. We demonstrate the
viability of our approach by implementing a library for Haskell
supporting a variety of LVar-based data structures, together with
two case studies that illustrate the programming model and yield
promising parallel speedup.
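The interface described in the two abstracts can be sketched in Python for the simplest possible lattice: integers ordered by <=, where the least upper bound is max. The class below is a hypothetical illustration and not the API of the Haskell library; in particular, a threshold read here simply returns the threshold once it has been crossed, and a write after freezing raises an error only if it would change the value.

```python
import threading


class FrozenError(Exception):
    """Raised when a write after freeze would change the LVar's contents."""


class MaxLVar:
    """LVar over integers ordered by <=, with least upper bound = max."""

    def __init__(self, bottom=0):
        self._value = bottom
        self._frozen = False
        self._cond = threading.Condition()

    def put(self, new):
        """Monotonic write: the contents become lub(old, new)."""
        with self._cond:
            joined = max(self._value, new)
            if self._frozen and joined != self._value:
                raise FrozenError("write after freeze would change the value")
            self._value = joined
            self._cond.notify_all()

    def get(self, threshold):
        """Threshold read: block until contents >= threshold, then reveal
        only that the threshold was crossed, never the exact contents."""
        with self._cond:
            self._cond.wait_for(lambda: self._value >= threshold)
            return threshold

    def freeze(self):
        """Forbid further growth and return the exact contents."""
        with self._cond:
            self._frozen = True
            return self._value


lv = MaxLVar()
writers = [threading.Thread(target=lv.put, args=(n,)) for n in (3, 7, 5)]
for t in writers:
    t.start()
seen = lv.get(5)      # same answer whichever write arrives first
for t in writers:
    t.join()
exact = lv.freeze()   # safe here: every writer has finished
print(seen, exact)
```

Freezing only after joining the writers keeps this run deterministic. Freezing while writers are still running is where the "quasi" comes in: the program either yields the same answer or raises FrozenError, but never a different answer.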
Something I personally found surprising and impressive about LVars is
that, while I was initially interested in the very formal aspects of
providing a theoretical framework for deterministic concurrency, it
very quickly produced a practical library that people can use to write
parallel programs -- one that is competitive with existing high-performance
approaches. As described in a
recent blog post, a Haskell library is available on Hackage -- but
surely LVars-inspired libraries could make sense in a lot of other
languages as well.
Concurrent Revisions is a Microsoft Research project doing interesting work in making concurrent programming scalable and easier to reason about. These papers have been mentioned a number of times here on LtU, but none of them seem to have been officially posted as stories.
Concurrent Revisions are a distributed version control-like abstraction for concurrently mutable state that requires clients to specify merge functions that make fork-join deterministic, and so make concurrent programs inherently composable. The library provides default merge behaviour for various familiar objects like numbers and lists, and it seems somewhat straightforward to provide a merge function for many other object types.
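A rough Python sketch of the fork/join idea with a user-supplied merge function follows. It is sequential and purely illustrative: the Versioned class and merge_sum are hypothetical names, not the library's API, and the real system runs revisions concurrently with isolation. The merge function sees three values, as in a three-way merge in version control: the value at the fork point, the current value in the joining revision, and the final value in the joined revision.

```python
class Versioned:
    """A value that can be forked into a revision and joined back later."""

    def __init__(self, value, merge):
        self.value = value
        self._merge = merge
        self._base = value  # value at the moment this revision was forked

    def fork(self):
        child = Versioned(self.value, self._merge)
        child._base = self.value
        return child

    def join(self, child):
        # Three-way merge: (fork-point value, our value, child's value).
        self.value = self._merge(child._base, self.value, child.value)


def merge_sum(base, main, joined):
    """Additive merge: apply the child's delta on top of our value."""
    return main + (joined - base)


counter = Versioned(10, merge_sum)
r1 = counter.fork()
r2 = counter.fork()
r1.value += 5        # each revision works on its own isolated copy
r2.value += 7
counter.value += 1
counter.join(r1)
counter.join(r2)
print(counter.value)
```

With an additive merge, all three increments survive and the result does not depend on the order of the joins, which is the determinism the post refers to.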
They've also extended the work to seamlessly integrate incremental and parallel computation in a fairly intuitive fashion, in my opinion.
Their latest work extends these concurrent revisions to distributed scenarios with disconnected operations, which work much as distributed version control does with source code, with guarantees of eventual consistency.
All in all, a very promising approach, and deserving of wider coverage.
Sebastian Burckhardt and Daan Leijen, Semantics of Concurrent Revisions, in European Symposium on Programming (ESOP'11), Springer Verlag, Saarbrücken, Germany, March 2011
Sebastian Burckhardt, Daan Leijen, Caitlin Sadowski, Jaeheon Yi, and Thomas Ball, Two for the Price of One: A Model for Parallel and Incremental Computation, in Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA'11), ACM SIGPLAN, Portland, Oregon, 22 October 2011
Sebastian Burckhardt, Manuel Fähndrich, Daan Leijen, and Benjamin P. Wood, Cloud Types for Eventual Consistency, in Proceedings of the 26th European Conference on Object-Oriented Programming (ECOOP), Springer, 15 June 2012
Visi.io comes from David Pollak and aims at revolutionizing building tablet apps, but the main attraction now seems to be in exploring the way data flow and cloud computing can be integrated. The screencast is somewhat underwhelming but at least convinces me that there is a working prototype (I haven't looked further than the website as yet). The vision document has some nice ideas. Visi.io came up recently in the discussion of the future of spreadsheets.
The Milner Symposium 2012 was held in Edinburgh this April in memory of the late Robin Milner.
The Milner Symposium is a celebration of the life and work of one of the world's greatest computer scientists, Robin Milner. The symposium will feature leading researchers whose work is inspired by Robin Milner.
The programme consisted of academic talks by colleagues and past students. The talks and slides are available online.
I particularly liked the interleaving of the personal and human narrative underlying the scientific journey. A particularly good example is Joachim Parrow's talk on the origins of the pi calculus. Of particular interest to LtU members is the panel on the future of functional programming languages, consisting of Phil Wadler, Xavier Leroy, David MacQueen, Martin Odersky, Simon Peyton-Jones, and Don Syme.
Tony Arcieri, author of the Reia Ruby-like language for the Erlang BEAM platform, wrote a piece in July, The Trouble with Erlang (or Erlang is a ghetto), bringing together a long laundry list of complaints about Erlang and the concepts behind it, and arguing at the end that Clojure now provides a better basis for parallel programming in practice.
While the complaints include many points about syntax, data types, and the like, the heart of the critique is two-fold: first, that Erlang has terrible problems managing memory and does not scale as advertised, and second, that these failures partly follow from "Erlang hat[ing] state. It especially hates shared state." He points to the Goetz and Click argument in Concurrency Revolution From a Hardware Perspective (2010) that local state is compatible with the Actors model. He further argues that SSA as it is used in Erlang is less safe than local state.