C++ Futures

The next C++ standard is supposed to include futures in the libraries. Futures allow assignment to a value that is to be calculated at a later time - assigning a promise in lieu of the final value until such time as it becomes available. The current thread carries on until the value is actually needed, at which time the value is made available or a the thread goes into a wait state. Being a library facility, instead of a built-in language facility as in Oz or Alice-ML, the use of futures is not nearly as transparent. Still, one should be able to get a similar effect. In an article on Broken promises–C++0x futures, Bartosz Milewski criticizes the C++ implementation for it's lack of composability.

What was omitted from the standard, however, was the ability to compose futures. Suppose you start several threads to perform calculations or retrieve data in parallel. You want to communicate with those threads using futures. And here’s the problem: you may block on any individual future but not on all of them. While you are blocked on one, other futures might become ready. Instead of spending your precious time servicing those other futures, you might be blocked on the most time-consuming one. The only option to process futures in order of completion is by polling (calling is_ready on consecutive futures in a loop) and burning the processor.

My personal opinion is that what he is really saying is that concurrency through futures (which in the simple case would be a declarative form of concurreny) don't easily do message-based concurrency. I suppose one could look at CTM and describe how to do Erlang type concurrency with nothing more than dataflow variables. But I think the bigger problem is that we assume that there is a single approach to solving the concurrency problem. We might as well say that STM is good, but it fails to deliver declarative or message concurrency. I personally think languages, either through libraries or built-in language facilities, should provide multiple ways of dealing with concurrency and distribution. Although I don't use C++ anymore, I'm glad that they are integrating lightweight concurrency models into the language.

(Surprised we don't have a category dedicated to concurrency, so I'll post this under parallel/distributed.)

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Futures, laziness, OO concurrency abstractions

I suppose the C++ proposal does not allow one to register a continuation. MS's Task Parallal Library has futures as well, but they expose a ContinueWith method which allows you to register a function to execute when the future resolves. This would solve the above issue for C++.

An alternate idea, is to provide a Unix select() type operation, which waits on a set of futures, and returns a list of resolved futures from that set. This requires a slightly different program structure, but it has a long history of working well for server-style systems.

I've played with my own implementations of futures on .NET, and while nice for simple async execution purposes, you quickly run into the standard lazy evaluation problems if they are used more pervasively: you need some way to enforce sequencing when performing asynchronous operations on an effectful resource, like a file stream. I imagine C++ futures have these issues as well.

Any suggestions for how to tame this? Are there any concurrency abstractions that integrate better with typical OO languages? Monads are obvious, but can't be properly typed on .NET. One idea I played with a few years ago was a lazy list structure, which fully encapsulated a resource and exposed only the current value, and the next, unresolved value. This is clearly monadic, but it enforces a particular structure that is easier to statically type.

Microsoft's CCR takes an interesting approach, using .NET's generators to yield synchronization values. Concurrent programs are like actors in fact. It allows you to write sequential code and the CCR runtime takes care of suspending and resuming execution for blocking operations.

The only downsides to the CCR approach that I ran into, are that generators are not serializable, so you can't migrate them to other machines or persist them to disk (well, you can, but you have to write your own serializer and ignore Serializable attributes).

Futures done right

I've talked to people involved in C++ futures design. Simple composability features similar to "select" have been proposed both for the Standard and for Boost, but were voted down.

What I had in mind was the ability to combine futures into higher order futures the way CML events work (granted, they use continuations, but I think a simpler selection mechanism would also work). The relevant paper is John Reppy's Ph.D. thesis, Higher Order Concurrency. The CML event library was also ported to Concurrent Haskell.

perhaps just a new function is necessary

That's a good observation!

If you want to build a new future by composing existing futures, where the composed future is ready only when all its parts are without polling, that is not too hard to do with the existing library functionality. (Although such an implementation would not be optimal.)

However, to capture the other half of WaitForMultipleObjects, i.e. waiting for *some* future to be ready, future "composition" would seem to imply a "sum" type of futures. Compared to the "product" type mentioned above, which waits for all to be ready, I think this sum type would be awkward to use; variants are not the most natural thing in C++.

All I think would be needed is a standard library function, analogous to WaitForMultipleObjects, taking futures instead of file descriptors. This could then be used to build future sum and product futures efficiently.

Sum of futures

It often happens in practice that there is a set of futures all returning the same type. In that case the logical sum could be very useful. One example is spawning two or more threads that do the same calculation using different algorithms or different data sources (e.g., searching multiple files). Whichever finishes first wins. Another example is a server waiting for several conversations.

In CML the variant problem is solved by providing separate continuations for each event using the function wrap. In principle this could be implemented in C++ by wrapping a future with a function object.

good point

So, for the heterogeneous sums using continuations, would you expect the interface to look something like:

class heterogenous_sum {
  heterogeneous_sum();
  template <class T>
    void add_case(future<T>, function<T>);
  void wait();
};

Where you first add all the desired cases and then call wait?

Service combinators

This reminds me of the service combinators concept of Cardelli and Davies. Their application was programming with unreliable and/or redundant network resources, but many of the concepts apply even in the absence of geographic distribution.

I implemented a library to support the service combinator primitives on top of Java threads almost ten years ago -- unfortunately, I don't have the code (and it was work for hire, so I couldn't distribute it in any case); but as I recall it was fairly straightforward (with only a few clever hacks necessary to work around then-limitations in the Java thread model).

Minor comment on waiting for multiple futures

The issue is related to nondeterminism. Deterministic operations on multiple futures, such as barrier synchronization (product), can be done if the only primitive is synchronizing on a single future. Nondeterministic operations such as waiting on the first client to respond (sum) need something more: a nondeterministic wait on multiple futures.

CTM proposes a WaitTwo operation that waits nondeterministically on two futures and can return 1 if the first is ready and 2 if the second is ready. You can implement WaitTwo with multiple threads and exceptions, but it is inefficient. It really should be a primitive. If you want to implement a MapReduce that is fault-tolerant, you need the nondeterministic wait.

Multiple approaches to concurrency

But I think the bigger problem is that we assume that there is a single approach to solving the concurrency problem.

This is IMHO one of the least appreciated features of Clojure: the fact that it has 4 orthogonal, composable, combinable concurrency approaches (5, if you count deeply immutable persistent data structures and pure functions): Vars (hybrid global/thread-local variables with concurrency semantics), Refs (STM variables), Agents (message passing concurrency) and Atoms (atomically updateable variables).

Limited Futures

I'm somewhat wary about futures... it is easy to apply them (entirely by accident) in such a way as you produce dependency cycles - not deadlocks, but cyclic dependencies. Given three actors or threads:

  A queries B receives future F1
  A forwards a request containing F1 to C (drops reply)
  A waits on F1
  B queries C receives future F2
  B replies with X2(F2) (into F1)
  C replies with X1(F1) (into F2)

In this case F1 = X2(X1(F1)). In C++, F1 knows it is a promise for some sort of abstract base class (promise). If X is a class of functor objects or template-driven computation, evaluation could easily end up in an infinite loop.

Of course, the cycle defining this object isn't necessarily malign, but it does depend on race conditions (a slightly different order of operations would have had C reply to B first) that are difficult to predict without knowing in advance the call-graph of the system. It's also worth noting that synchronized message handling and actor transactions don't help a whit. The above is with just three actors or threads... modular composition of concurrent systems really demands that one shouldn't need to know any complete call graph in advance.

I'm wary enough that I've decided in my language to take serious measures in limiting the forwarding of futures through procedures and between actors, limiting decisions regarding forwarding futures (aka promise pipelining) to an optimizer over well-defined call graphs (cliques in actor configurations) in well-defined error handling situations (e.g. only pipeline futures that have no risk of throwing an exception upon wait). Programmers are encouraged send or reply with futures to allow various optimizations, but actual pipelining is out of their control.

An implementation for Common Lisp

I came across Milewski's blog post a couple of days ago and thought it would be fun to do something along the lines for the PCall futures library for CL. You can find my writeup and patches here.

The surprising thing was that I came away with the conclusion that the way C++ will handle exceptions in futures is broken. Here is the relevant excerpt from the writeup (a PCall "task" is a future, and "join" is force):

What is not so obvious is how we handle errors. The current policy of
PCall (and the proposed C++ standard, and Ada) is to catch exceptions
unhandled in the task and re-throw them when join is called on the
task object. This mimics the behavior of futures when used purely for
lazy evaluation, but ignores the fact that futures used for
parallelism may be executed in a different dynamic context with
different global state. I think this is wrong for several reasons:

* It is misleading in terms of designing exception-handling protocols,
since what occurs when you join a task that had an error is not that
particular error, but information that the join failed because the
task experienced an error.

* If you don't have continuations, there is nothing generally
meaningful you can do about handling the exception since all the
dynamic context inside a task is unwound and lost when you join (this
isn't a problem for C++ since there's nothing meaningful you can do
with exceptions in the first place, because it always unwinds). This
can easily lead to unexpected results if your handlers expect certain
restarts to be available based on the type of the condition, or even
if you unwind but make assumptions about your state (which the task
didn't share) based on the condition (which is also why re-throwing is
a bad idea in C++).

The mechanism for non-determinism that I came up with is along the lines of select:

(join-one &rest tasks) => values-list, task

Returns the list of results, and the task object, of the first task
among the given to finish computing.

I didn't bother with composition since I think it would be more straightforward and efficient to construct specifically needed abstractions using closures and macros.

It might not be so bad. If

It might not be so bad.

If there is shared state between tasks/threads (meaning you're doing impure tasks -- I've found this to be natural), you are already doing some sort of synchronization between them. Cilk++ has a nice story on this with hyperobjects: each task gets a local copy (likely, lazily) for fast processing, and you define a (destructive) reduce operation between two instances of the shared data structure, like union for a set-like data structure, which gets called when your task is finished to merge this state with another task's.

What about exceptions? Never call reduce until there is no exception, or provide an unreduce op if it was called. There's a potential space leak, but I bet escape analysis etc. can help here, and there might be special classes of reducers. This would probably go a long way towards the common case.

Inability to resume/reset a

Inability to resume/reset a future after an exception is a sometimes painful limitation, I agree, but given that we can't have multiple parallel contexts each deciding to reset a future in a different way, I expect the limitation is somewhat inherent to the problem.

Wrapping exceptions with 'task error' or whatnot is certainly an option, though not one I much like at first impression. Why not leave wrapping exceptions to the users? ... I need to think about whether I should be doing the same in the language I'm designing.

Perhaps a problem in PCall is the inconsistency in 'join' executing in the caller's thread where "the dynamic environment in which the task runs unpredictable". Concurrency is unpredictable enough; you should aim for as much predictability as possible where it can be achieved. Consider blocking reset of conditions raised from thread-local join join, and ensuring all 'pcall' constructors grab a reference to the environment in which they are constructed as opposed to the environment in which they are joined.

From my own work on this subject, I'd also suggest separating the construction primitive (that captures the dynamic context, sets up the memoization, and captures exceptions) from the parallel call primitive (which simply initiates the evaluation in a parallel thread if not already initiated). This would allow you to use the same system for both lazy and lenient evaluation in a consistent manner.

Inability to resume/reset a

Inability to resume/reset a future after an exception is a sometimes painful limitation, I agree, but given that we can't have multiple parallel contexts each deciding to reset a future in a different way, I expect the limitation is somewhat inherent to the problem.

Couldn't you define the error handler when you create the future? An error when resolving a future thus creates a nested future which accompanies the exception thrown when trying to access the value. Clients can then synchronize on that future instead, which presumably handles any errors.

Couldn't you define the

Couldn't you define the error handler when you create the future?

Certainly, so long as you do so "in" the future - i.e. as part of constructing the future.

But languages with resumable exceptions (or condition/reset) advertise as a primary feature the capability to divide error-handling policy from the exception mechanism... i.e. a decision to resume operations with a result of 0.0 after a div-by-zero error can be wholly separated from the division operation. The use of futures defeats this feature to some degree by forcing one to push much error-handling policy into the future itself.

Admittedly, one could do so with a higher-order parameter that wraps a behavior with some error handling and resumption policy, so it is possible to dodge this problem, but workarounds are not convenient.

Your idea of including a few alternative (future) values or continuations as part of the exception raised is a possible one, but wouldn't integrate with most resumable exception mechanisms (which, for destructor safety, rarely offer first class continuations). Given call/cc one could probably provide access to the resumption points, but most languages don't provide call/cc.

It might not be so bad. If

Edit: moved this up to be a reply to the appropriate article

Similar Haskell module

As for the C++ deficiency, it sounds like a more complicated packaged_task template would be able to accept continuations to run when the main thread is finished. The primitive promise/future infrastructure seems to be there to build on.

To pass the time, I just uploaded the future package for Haskell. This should have similar semantics to promise/future in C++, plus I added continuations.

The action is run in a new thread, and its value or exception get captures. Other threads can wait (block,non-blocking,with a timeout) for the value or exception (either explicitly, or with the exception rethrown in the reading thread).

One can also give the Promise a continuation (function from value to an action). When the original action completes all queued continuations get run. All later continuations can be run immediately. By supplying the right continuations one can efficiently wait on multiple values. For added control I added an 'abort' command to try and stop a promise's execution thread.

Higher-order concurrency in Haskell

The whole CML event library was ported to Haskell by Avik Chaudhuri. Futures are just a special case of events.

Futures vs. Futures

I feel like having to point out once more that "futures" as implemented in libraries by a couple of languages mentioned here are not really futures in the original sense the term was used in literature by people like Baker & Hewitt, Halstead, Flanagan & Felleisen, etc. They lack _the_ central feature of "real" futures, which is data flow synchronisation. "Real" futures cannot be implemented as libraries.

(In a similar vein, "real" futures are _not_ a special case of CML-style events.)

They lack _the_ central

They lack _the_ central feature of "real" futures, which is data flow synchronisation. "Real" futures cannot be implemented as libraries.

You've said this before, but I don't understand what's missing from these pseudo-futures to make them full-fledged futures. Is it that ordinary references and futures are treated identically and are implicitly synchronized?

In Alice ML terms

The type of a future is the type given by the end result of the typing. Taking a simple example, in Alice ML we can have:

val p = Promise.promise()
val x : int = Promise.future(p)
val _ = Promise.fulfill (p, spawn (Thread.sleep(Time.fromMilliseconds(Int.toLarge(5000))); 10))
val n = x + x

What Andreas is saying (at least, what I think he is saying) is that there really is no future, rather the equivalent in other languages that use a library would be something like the following:

val p = Promise.promise()
val _ = Promise.fulfill (p, spawn (Thread.sleep(Time.fromMilliseconds(Int.toLarge(5000))); 10))
val _ = await (Promise.future(p))
val x : int = Promise.future(p)
val n = x + x

So you have promises and those promises can return a result at some point - such as an int, but you can not assign a future to an int. Of course, you could also skirt the issue by using a int future in lieu of an int:

val p = Promise.promise()
val _ = Promise.fulfill (p, spawn (Thread.sleep(Time.fromMilliseconds(Int.toLarge(5000))); 10))
val n = Promise.future(p) + Promise.future(p)

But in neither the second or the third case can you assign a future to a variable of type int and not have a wait involved prior to that assignment.

[Edit note: make that four]

I would put it like this

I would put it like this: library-style futures are almost the same as promises themselves, except that you cannot write them.

Your first example can be simplified to:

val x = spawn (Thread.sleep(...); 10)
val n = x + x

With "fake" futures you can only do the rough equivalent of

val x = promise()
do spawn (Thread.sleep(...); fulfill (x, 10))
val n = sync x + sync x

where "sync" is "await o future", i.e. accesses the content of the promise and blocks if it's not yet available. That means that all synchronization points have to be anticipated upfront.

(Not sure that helps anybody who does not know Alice, though.)

Well, the concrete example

(EDITED TO FIX less than sign TO html escape)

Well, the concrete example of "await" in Haskell library I posted about above is "get :: Promise a -> IO a".

One can hide the blocking in IO by use of "unsafePerformIO :: IO a -> a"

And then write

p <- forkPromise (threadDelay (5*10^6) >> return 10)
let x :: Int
    x = unsafePerformIO (get p)
let n :: Int
    n = x + x

Note that my Promise is constructed and launched by forkPromise.

So the result of the future (assuming no exceptions!) can look just like an immutable and pure "Int" if I bend the rules. Since the value returned by get is immutable this is actually safe (again, assuming no exceptions).

Well, that's the type part of the equation...

...the other question is whether the get operation results in a block if the promise has not been fulfilled yet? If the get blocks, then x does not hold a future... it holds the fulfilled promise. Or to use Andrea's terminology:

x = sync p

where instead of sync being defined as await o future, it would be unsafePerformIO . get

With Haskell's built in laziness, and my lack of knowledge of the language, I'm not sure what the answer is?

Schrödinger's blocking value

I will explain the Haskell-fu which is going on here:

Both my "get" and "wait" operations are blocking. The code is in the IO monad, so p synchronously forks and starts the execution:

p <- forkPromise ...
let x = unsafePerformIO (get p)
    n = x + x

The binding for both x and n are lazy. Only when one of these values is actually demanded will execution of "get" take place. The "get" operation is blocking. The unsafePerformIO has converted the blocking "get :: IO Int" into the lazy "x :: Int".

This should give you the best of both worlds. You can access the promise/future imperatively (check/wait/get/timedWait/...) and you can access it purely and lazily (unsafePerformIO on wait or get).

For my next trick I will need a jug of ordinary water.

Note that if the computation in the promise is "Int" instead of "IO Int" then using the "par" combinator in Haskell is much much easier than the Future library. Pure code is easy to parallelize in Haskell.

I may be swimming in a Turing tarpit but ...

... can you explain why futures implemented with libraries cannot be used for data flow synchronization?

I'd agree with a claim that C++ futures don't make such synchronization convenient (i.e. it can't be used for synchronization in arbitrary expressions) but that still leaves the ability to construct those big template-driven expressions complete with automatic synchronization when needing futures that aren't yet available.

Baker and Hewitt define

Baker and Hewitt define their approach to futures as a new evaluation strategy (call-by-future), and Flanagan and Felleisen essentially just introduce a new rule for futures (future x -> x) to eval. Saying that this can't be done in a library seems like a bit of a tautology to me. It's exactly like arguing that "real" lazyness can't be done in a strict language.

It's exactly like arguing

It's exactly like arguing that "real" lazyness can't be done in a strict language.

I don't get it. What is "real" vs "fake" laziness? Assuming an eagerly evaluated program uses lazy types throughout, how is its evaluation "less lazy" than it would be in a "real" lazy language?

Fake laziness

In the case of O'Caml, laziness is not transparent. Whereas in Alice ML, you can just introduce laziness to any list without effecting the types involved, in O'Caml lazy changes the underlying type.

type 'a node_t =
    | Empty
    | Node of 'a * 'a node_t lazy_t

I'd venture a guess that the objection to futures as a library is along the same line. The type of int and int future are not the same when relegated to libraries.

Spot on

The comparison with laziness is spot on. The issue is that a library approach is not composable with any code or interface that has not explicitly been written with futures in mind.

With "real" futures, every value can potentially be a future. That means that you can introduce futures wherever you want and pass futures to any old function. You rely on data flow synchronization to happen automatically at the latest possible time, thereby maximizing parallelism and minimizing latency. Likewise, as the implementor of a function you never have to worry about futures. That was the main idea behind futures - making parallelism orthogonal, minimally intrusive, and maximally flexible, at least in principle.

For example, f (spawn e1, spawn e2) evaluates f in parallel to both its argument, synchronizing (implicitly) only when absolutely needed.[*] In a sort of dual manner, you can build data structures incrementally or in parallel to using them without having to change their type.

The flipside of all this of course is that every language construct that accesses a value (function invocation, field access, case, primitive operators, you name it) has to be aware of the possibility of encountering futures. Obviously, you cannot retrofit that with a library, at least not in any mainstream language.

[*] Unlike "lenient" evaluation the above does not kill the parallel computations if f ignores its arguments. That requires an additional abstraction.

Monadic semantics, specialization

Right, so the problem is the same as with monadic code, where you have ordinary maps, folds, etc. and also lifted monadic maps, folds, etc. I believe this could all be addressed if the language were given a core monadic semantics, which would enable all the various types of maps, folds, etc. to be derived from a single definition.

I know there has been work on specifying extensible language features using monadic semantics, but has anyone combined that with partial evaluation to see how efficient the resulting code might be? I wonder how much of the extra abstraction could be partially evaluated away.

Since monads are mentioned...

..."real" promises shouldn't require monads, as there are no side effects - similar to lazy evaluation. (Unless you introduce an AwaitTwo or IsDet sort of non-determinism). Although laziness and futures are not synonomous, they are related concepts - especially in the Alice implementation.

..."real" promises shouldn't

..."real" promises shouldn't require monads, as there are no side effects - similar to lazy evaluation.

I think people steeped in advanced PL far too readily dismiss allocation as effect-free... ;-)

Still, monads as a tool for abstracting semantics is more important in this context than their usefulness for taming effects. Still, ensuring proper sequencing is once again important, so monads or equivalent will be essential.

(Unless you introduce an AwaitTwo or IsDet sort of non-determinism)

This is equivalent to Orc's asymmetric pruning combinator, which allows one to basically take the head of a list of promises (what they term "demonic non-determinism" IIRC).

As for futures and laziness, they strike me as almost synonymous. I don't see many important differences myself, except for the fact that futures are often created explicitly, where laziness is often implicit.

A pigeon and a plane both have wings...

As for futures and laziness, they strike me as almost synonymous. I don't see many important differences myself, except for the fact that futures are often created explicitly, where laziness is often implicit.

While I see what you are saying, in that both represent delayed evaluation, I think you have obscured an important difference between them.

With laziness, the evaluation is timed by the consumer, while with futures the evaluation is timed by the producer.

This is a fairly significant from a semantic and architectural point of view.

With laziness, the

With laziness, the evaluation is timed by the consumer, while with futures the evaluation is timed by the producer.

I'm afraid the meaning of "timed" in the above statement escapes me at the moment. I tend to mix up futures and promises however, so I believe a more accurate statement is lazy value == promise. Promises and lazy values are computed on-demand, where I believe futures simply represent a value from a concurrent computation. I believe that classification mirrors Alice ML taxonomy. In pseudo-ML:

(* a concurrently executing computation *)
module Future =
struct
  type 'a t
  (* starts the concurrent computation immediately *)
  val spawn: (() -> 'a) -> 'a t
  (* raises an exception if the computation raised an exception *)
  val value: 'a t -> 'a
  (* continuation invoked when value resolves *)
  val when: ('a t -> 'b) -> 'b t
end
(* a promise for a value == a lazily computed value *)
module Promise =
struct
  type 'a t
  (* construct a lazily computed value, whose computation is started when force is called *)
  val promise: (() -> 'a) -> 'a t
  val force: 'a t -> 'a Future.t
  val when: ('a t -> 'b) -> 'b t
end

Not quite

I tend to mix up futures and promises however

That is understandable, since the term "promise" is used very inconsistently in literature. Often it is synonym for future (and historically, that is how they were actually introduced, i.e. Baker & Hewitt used "future" because they didn't like the term "promise" from previous work).

so I believe a more accurate statement is lazy value == promise. Promises and lazy values are computed on-demand

No, in Alice ML, and apparently also in the C++ lib, promises are rather handles for manually writing futures. Laziness is something else. In Alice ML, concurrent, lazy and promised futures are three different forms of future.

Often it is synonym for

Often it is synonym for future (and historically, that is how they were actually introduced, i.e. Baker & Hewitt used "future" because they didn't like the term "promise" from previous work).

Funny, that's what I thought when I first came across the term "future", but then I thought, "nah, they'd never invent a whole other word for the exact same thing." So I went hunting for some sort of subtle semantic difference which could explain the different terminology. Thanks for the correction!

No, in Alice ML, and apparently also in the C++ lib, promises are rather handles for manually writing futures. Laziness is something else. In Alice ML, concurrent, lazy and promised futures are three different forms of future.

I had read that page recently, and thought it was consistent with my interpretation above, but your description just gave a few of the sentences a different meaning than I read into them before. I shall have to mull this over further.

Ordinary meaning only

I'm afraid the meaning of "timed" in the above statement escapes me at the moment.

I mean this is in the ordinary sense of "decides when something happens". I thought of using the word "trigger" there, but that is ambiguous in a different way in this situation.

A different way to say what is said is "with lazy evaluation the producer waits for the consumer; with futures the consumer (potentially) waits for the producer".

You can see why that makes a non-trivial difference.

with futures the consumer

with futures the consumer (potentially) waits for the producer"

I don't see the clear-cut producer/consumer relationship you're drawing in the concurrent setting. I can see a difference between eager concurrent, where concurrent execution begins immediately, and on-demand/lazy concurrent, where concurrent execution begins only when the value is requested. Is this what you're trying to point out?

This subtlety can make a difference, particularly with ordering of side-effects as I discovered in C#. Eager concurrent should work relatively fine in C#, but lazy concurrent requires some additional machinery which I'm finding hard to figure out how to do in C#.

The futures are so bright, I gotta wear shades

I don't see the clear-cut producer/consumer relationship you're drawing in the concurrent setting.

May I gently suggest that if you are designing a concurrent application and don't have some notion of who your consumers and producers are for various values, you probably have bigger problems than worrying about PL semantics. ;-)

However, let me take another run at this.

Let's define a notion of "semantic time" to describe how you can reason about the sequence of events when you are considering a program. An event in semantic time is earlier than another if there may be a wait (a block) before the later event occurs.

This is meaningful because knowing what might be waiting and when it might be waiting is important for avoiding concurrency problems, and understanding concurrency usually involves understanding logical sequencing.

There are basically three events to think about:

  • assigment - when a label for a value is defined and associated with a means to evaluate it
  • initiation of evaluation - when a process is initiated to begin the evaluation
  • completion of evaluation - when the evaluation is complete and the value is available for use

In non-concurrent execution, the two evaluation steps always happen simultaneously (from a semantic time point of view). Initiation = Completion

Eager evaluation requires assignment and initiation of execution to be simultaneous in semantic time. Assignment = Initiation

Lazy execution makes assignment earlier in semantic time than initiation of evaluation. Assignment ≤ Initiation

Futures are when the initiation of evaluation is earlier in semantic time than the completion of evaluation. Initiation ≤ Completion

(By the gaps in each definition, you can see how some of these might be combined, and some can't.)

As you can see from this set of definitions, futures put the waiting in a different place than laziness, which makes them fundamentally different when you are trying to understand what is going on in a concurrent environment.

Futures are when the

Futures are when the initiation of evaluation is earlier in semantic time than the completion of evaluation. Initiation ≤ Completion

I was following until this statement. How is it possible for evaluation to be completed before it was initiated? It looks to me like it can't, which means lazy evaluation also respects "Initiation ≤ Completion", which means all lazy values are futures, though not all futures are necessarily lazy. This is why I distinguished eager concurrent and lazy concurrent.

"Initiation ≤ Completion"

"Initiation ≤ Completion" means that initiation happens before completion, not the other way round.

Interestingly, with promised futures you have Initiation ≤ Completion ≤ Assignment. But in fact, in a language with "real" futures you can have much more complex patterns, because futures can actually be bound to other futures upon "completion", something that's sometimes called chaining.

"Initiation ≤ Completion"

"Initiation ≤ Completion" means that initiation happens before completion, not the other way round.

I know, I was asking how it could possibly be any other way. I don't see how it's possible, in which case all values respect "Initiation ≤ Completion", so all values are futures (fulfilled or unfulfilled). In particular, lazy values respect "Initiation ≤ Completion", thus lazy values are futures. I just noticed Marc's reply, so I'll take this up there.

But in fact, in a language with "real" futures you can have much more complex patterns, because futures can actually be bound to other futures upon "completion", something that's sometimes called chaining.

I believe this is called "promise pipelining" in E.

Ah, OK

Sorry, I was misreading your post then.

I believe this is called "promise pipelining" in E.

Yes, although their promises are the equivalent to what we call "concurrent futures", and there are no others. (Edit: point being that you cannot construct new patterns with just these alone.)

A contrast in attitudes

in a language with "real" futures you can have much more complex patterns

Thinking about this reminded me of the contrast in attitudes between declarative concurrency aficionados and the C++ guy quoted by Chris in the OP.

When you take fully stateful concurrency as your default, all blocking is frightening because it can quickly lead to deadlocks, etc. and a loss of ability to reason about your program.

When you start with a declarative concurrency baseline, you say "Let them block! Since everything is dataflow synchronized, you can't go wrong unless you do something dumb, such as write a non-terminating process".

Food for thought in comparing the two as design choices...

(Though professional practitioners should probably understand both mindsets. ;-) )

All futures will have their 15 mins of fame

How is it possible for evaluation to be completed before it was initiated?

It can't. However there are still two choices: guaranteed to be simultaneous or not.

which means all lazy values are futures

No. You can have lazy values where both evaluation steps are semantically simultaneous.

Remember that just because you are in an environment that allows concurrency, it isn't required that all values must involve concurrency. In fact, restricting concurrency to be used only for those values where it is necessary may be a trouble-saving design choice.

(It is not entirely clear to me whether Alice ML allows this option; a casual glance at the docs suggests it may introduce concurrency by default into its implementation of lazy values; this may be the source of confusion about the independence of the concepts.)

Concurrent laziness

It is not entirely clear to me whether Alice ML allows this option; a casual glance at the docs suggests it may introduce concurrency by default into its implementation of lazy values

Yes. The reason for that is that several threads might trigger the same lazy suspension quasi-simultaneously. It would be completely arbitrary to compute the suspension in either of the requesting threads. In Alice ML it would also break abstraction, because a computation can discover the identity of its thread using a library primitive.

Unless you disallow sharing lazy suspensions across threads I think it is the only sane design choice to compute them concurrently (at least conceptually).

Remember that just because

Remember that just because you are in an environment that allows concurrency, it isn't required that all values must involve concurrency. In fact, restricting concurrency to be used only for those values where it is necessary may be a trouble-saving design choice.

Ok, so you're saying the criteria are:

1. Futures are necessarily concurrent.
2. Lazy values need not be fulfilled concurrently, so they are not necessarily futures.
3. If an implementation happens to evaluate lazy values concurrently, they are futures.

That sounds fine to me, though I'm not sure it makes much sense in a concurrent language to have lazy values that are not thread-safe. If two threads force a lazy value, you need future-like semantics anyway (or some way to prevent sharing lazy values across threads).

[Edit: just noticed Andreas posted nearly the exactly the same thing. Weird.]

For example

though I'm not sure it makes much sense in a concurrent language to have lazy values that are not thread-safe

It would be equivalent to being able to declare something private or thread-local.

In a declarative-concurrency-defaulting language, like Alice ML, I can see why you wouldn't bother. (as explained by Andreas)

In a language (like the OP C++ example) where the default is higher up the statefulness chain, this might be a useful feature.

"3. If an implementation

"3. If an implementation happens to evaluate lazy values concurrently, they are futures."

I think a helpful insight into the distinction is to consider infinite lazy computations. Infinite futures computations then are "strict" in the sense that another thread/machine is actively performing evaluation, but "lazy" from the point of view of each thread/machine.

But futures aren't

But futures aren't necessarily eager either. Their evaluation can be triggered on-demand just like lazy values. From my reading of this thread, the key distinction between lazy values and futures is concurrency. If your lazy values are thread-safe, they are futures. Eager futures as you describe are also possible of course.

This also seems consistent with the GHC concurrent runtime paper I recently read, where they often pointed out the similarities between lazy values and futures.

The main practical difference between lazy & futures...

...is that laziness requires the assignment at the location in the code that the variable is declared. OTOH, Futures allow you to put the code that fulfills the promise anywhere in the code you want. It means you can divest the declaration from assignment, not only time-wise, but also distance-wise (as measured in code placement).

Not necessarily

That depends on the sort of future. The sort you describe (promised futures in Alice speak) are not necessarily the sort you find in all languages. In particular, Baker & Hewitt as well as Halstead originally only provided futures as the result of a spawned thread (a la concurrent futures in Alice). For these, like for lazy futures, the computation is already "assigned" at the point were you define them.

Coming from E and Waterken,

Coming from E and Waterken, I had never seen explicit future assignment until I came across Alice ML. I haven't looked too deeply into the history of futures, so I'm not sure how fundamental the assignment operation is.

[Edit: once again, Andreas concurrently answered, and so my reply is redundant. We should be using futures here. :-)]

Assignment

Promises in Alice ML, which provide assignable futures, evolved as a controlled form of logic variable. We were coming from Oz, which had added futures earlier and allowed extracting them from logic variables. By binding the logic variable, you also bound the associated future. But the idea of binding futures by an explicit operation is quite obvious, so there might have been other languages that invented it. (Still, I was somewhat surprised to see the C++ proposals using the very same terminology.)

Detour through Oz & CTM

CTM has this to say on the subject (Section 4.5.2 has a lot to say about the distinction between laziness and dataflow variables):

Since laziness and dataflow variables are independent concepts, this means there are three special moments in a variable’s lifetime:

  1. Creation of the variable as an entity in the language, such that it can be placed inside data structures and passed to or from a function or procedure. The variable is not yet bound to its value. We call such a variable a "dataflow variable".
  2. Specification of the function or procedure call that will evaluate the value of the variable (but the evaluation is not done yet).
  3. Evaluation of the function. When the result is available, it is bound to the variable. The evaluation might be done according to a trigger, which may be implicit such as a “need” for the value. Lazy execution uses implicit need.

My translation of the Oz code to Alice ML yielded the following (although it's really more ivolved since I'm not looking at futures here, and promise/futures are implicit in Oz dataflow variables):

val x = 11*11                             (* (1)&(2)&(3) together *)

fun lazy lazyMul (a, b) = a*b
val x = lazyMul(11, 11)                   (* (1)&(2) together *)
val _ = await x                           (* (3) seperate *)

val x = lazy 11 * 11                      (* (1)&(2) together *)
val _ = await x                           (* (3) seperate *)

val x = promise()                         (* (1) seperate *)
val _ = fulfill(x, 11 * 11)               (* (2)&(3) together *)

val x = promise()                         (* (1) seperate *)
val _ = spawn fulfill(x, 11 * 11)         (* (2)&(3) together *)
val _ = spawn if (future x > 100)
                 then inspect "big"
                 else ();

val x = promise()                         (* (1) seperate *)
val _ = fulfill(x, lazy 11 * 11)          (* (2) seperate *)
val _ = await x                           (* (3) seperate *)

val x = promise()                         (* (1) seperate *)
val _ = spawn fulfill(x, lazy 11 * 11)    (* (2) seperate *)
val _ = spawn await x                     (* (3) seperate *)

Ok, so I've conflated the term futures with the futures/promises of Alice ML. Still, the division of these three stages of assignment is that which I find most useful/powerful in both Alice and Oz. Or as CTM says:

In all these examples, X is eventually bound to 121. Allowing the three moments to be separate gives maximum expressiveness within a declarative framework. For example, laziness allows to do declarative calculations with potentially infinite lists. Laziness allows to implement many data structures as efficiently as with explicit state, yet still declaratively (see, e.g., [138]). Dataflow variables allow to write concurrent programs that are still declarative. Using both together allows to write concurrent programs that consist of stream objects communicating through potentially infinite streams. One way to understand the added expressiveness is to realize that dataflow variables and laziness each add a weak form of state to the model. In both cases, restrictions on using the state ensure the model is still declarative.

Weak Madness

So in Alice the thingy (promise/future/add-you-name-here) can be declared separately from being 'assigned'.

That seems like a bit too much rope to hang yourself with. One and only one such 'assignment' should be made. That becomes a dynamic property the compiler cannot enforce. I would guess that most uses could combine the declaration and 'assignment'. Hmmm...I could separate these in the "future" Haskell package.

What semantics are expected if a thingy is 'assigned' twice?

What semantics are expected

What semantics are expected if a thingy is 'assigned' twice?

It's a write-once variable, so I expect either nothing happens, or it throws an exception. I think MS's Task Parallel library also has a WriteOnce<T> type.

Promises

Yes, a promise (and thus its associated future) can only be assigned once. You simply get an exception if you try otherwise. (In theory, we also developed a linear type system to enforce that statically. But it would be too deep a change to (S)ML to integrate that into Alice.)

As I alluded to in an earlier post, the reason we added this kind of futures is that we wanted to be able to apply idioms from concurrent logic programming. Many concurrency primitives (e.g. mutexes, message channels, etc) are straightforward to express in terms of promises and references. Also, they give you the ability to construct plain old functional data structures in a top-down manner (e.g. you can do a tail-recursive append on lists).

Interestingly, on purely semantic level, promises do not add any expressiveness to a language that already has plain concurrent futures and references anyway - you can implement a promise ADT in half a dozen lines using these. However, doing so requires an auxiliary polling thread, so it's not very efficient. But it kind of shows that their presence does not make the language fundamentally "weaker".

Unification

I know that you've indicated that it was a conscious decision to eschew full unification in ML, as it would go against the goals of Alice being a conservative extension to SML. But I've never quite understood why you don't allow a promise to be fulfilled more than once - assuming that an equal value was assigned.

val p = promise();
fulfill(p, 123)
fulfill(p, 123)

I would assume that it has something to do with measuring equality (e.g. it might be assigning an (unevaluated) function) - and thus be a backhanded form of unification.

Multiple assignment

We couldn't think of any good example where you really want to do that. On the other hand, there are likely scenarios were you do it accidentally, and it seems better to get an error early and consistently there.

Also, there is the question of semantics like in cases where either the content or the new value contains (other) futures. Does that block? That would be the only clean semantics, but it looks like another opportunity for subtle bugs. Or what if the promise has been failed with an exception? In general, such an operation would be hyperstrict in both its argument, which actually makes it less expressive.

If you really want such an operation it is trivial to code up:

fun fulfill2(p, x) =
  fulfill(p, x) handle Promise =>
    if x = future p then () else raise Promise

Note that it has type ''a promise * ''a -> unit, i.e., is only applicable to equality types.

(Btw, the reason to eschew built-in (deep) unification was not SML compatibility. It is the questionable complexity/utility tradeoff it has. In practice, it almost always is the wrong tool.)

Precedence and equality

I think you'll find that the difference is not between ≤ and ≥, but between ≤ and =. With futures Initiation ≤ Completion. Without futures Initiation = Completion (as Marc said, in the absence of concurrency initiation and completion are logically simultaneous).

Futures done right

I just published another blog entry on this topic, looking for parallels between CML event library and C++ futures. In particular, How could composability of futures be achieved in C++?

(Surprised we don't have a

(Surprised we don't have a category dedicated to concurrency, so I'll post this under parallel/distributed.)

This department was intended for all things related to concurrency, parallel and distributed programing. I can't find my original message from when I created the department at the moment, but I think I talked about the confused terminology in the field, and explained why I chose "parallel" over "concurrency" [edit: here it is; the link I gave there is to the scope of topics covered by TPDS].

Seeing as concurrency is probably going to be the central topic for the foreseeable future, perhaps we need several department for the various aspects of this topic...

E's Promises (message passing alternative)

E's promises (http://erights.org) are future-like objects that are based on message passing paradigm (AFAIR the concept was borrowed from Concurrent Prolog, but I do not remember history well enough). There is no problem with composing them in any kind of the operation.

I have implemented the same concept in Java in the context of AsyncObjects project (http://asyncobjects.sf.net). Again, no problem with composition of more complex operation from simpler ones.