Semi-implicit batched remote code execution as staging

Oleg Kiselyov has just posted another amazing work: Semi-implicit batched remote code execution as staging.

Batching several remote-procedure or remote-object operations into one request decreases the number of network client/server round-trips, reduces the communication overhead and indeed significantly improves performance of distributed applications. The benefits are offset by the cost of restructuring the code to incite large batches; and by the increase in the difficulty of reasoning about the code, predicting its performance let alone establishing correctness. The overall research goal is to reduce the downside.

We describe a semi-implicit batching mechanism that makes the points of remote server communication explicit yet conceals the proxies, saving the trouble of writing them. The changes to the client code are minimal: mainly, adding calls to force. The type-checker will not let the programmer forget to call force. The remote batch server is simple and generic, with no need to optimize for specific clients.

Our mechanism batches both independent and data-dependent remote calls. Our mechanism is compositional, letting the programmer build nested applications and conditional (and, potentially, iterative) statements using composition, application and naming. Writing a remote program is exactly like writing a typed local program, which is type-checked locally, and can even be executed locally (for debugging).

The key insights are treating remote execution as a form of staging (meta-programming), generalizing mere remote function calls to remote applicative and conditional expressions, and introducing an embedded domain-specific language, Chourai, for such expressions. A batch of dependent remote function calls can then be regarded as a complex applicative expression in the A-normal form. Another key insight is that emulating call-by-value via call-by-need surprisingly makes sense.

Here's an example piece of Chourai code, for deleting albums whose rating is below 5 among the first n albums of an album database (called "large") hosted by the server. get_album, next_album, and similar functions constitute the "RPC" interface to the server.

     let delete_low_rating n =
      let rec loop album i =
        let t = guard (app2 lt (app get_rating album) (int 5)) 
                      (fun () -> app delete_album album) in
        if i >= n then force t else
        loop (app next_album album) (succ i)
      in loop (app get_album (string "large")) 0;;

Amazingly, delete_low_rating 4 requires just one round-trip to the server!

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Generally known as promise

Generally known as promise pipelining in E. Nice to see an explicitly typed implementation, though the Waterken server has had a Java implementation of pipelining for some time.

Also, the story's link is broken.

Thanks

Fixed.

You might also be interested in Cook's Batches

See his blog - A Better Mousetrap (Distributed Objects)

His most recent paper of his work is from ICOODB 2010: Unifying Remote Data, Remote Procedures, and Web Services

I still think there is a long way to go in this problem area. Most of the solutions I've seen vary in applicability. The nice thing about Cook's is that his solution requires no syntactic modification to the Java language, you just implement a Service abstract base class and that does the work of an Iterator and allows you to do foreach.

What I want is a system that can behave as if the client and server are "hidden models". I am having a tough time explaining to people what that looks like, though, but I am making some progress in email conversations.

Desiderata for a remote batch processing language

This topic has me wondering what might be the set of desiderata for a remote batch processing language - i.e. the intermediate language for communication with a remote system. E.g. should it include 'foreach'? 'if'? 'while'? The Chourai language Oleg describes in the link is somewhat limited, though one could certainly add 'foreach' and 'while' and other operations to it.

Oleg would certainly list 'type safety' among the desiderata. But the problem becomes a lot harder once you start considering security, resource consumption, responsiveness (vs. denial-of-service), fairness for multiple clients, composition (what to do when blocking while our Chourai calls start making Chourai calls?), and so on.

Sandro mentioned the relationship between this 'implicit' remote batch processing, and E language's promise pipelining. But there exist at least two very significant differences:

  1. E language used a 'when' construct instead of a 'force' construct. E language doesn't actually allow programs to block while waiting on a response from a remote server; instead, the 'when' construct schedules an event to execute once the promises are fulfilled.
  2. E language promises can easily distribute across multiple servers. E.g. Alice can pass a promise from Bob in a call to Charlie. It is a unclear how this problem of composition or multi-server interaction is handled with Chourai or the intermediate language, other than the possibility of nested Chourai calls.

Ultimately, I expect there will be something of a standoff between expressiveness, security, compositionality. I have some desiderata in mind, but I'm interested in your intuitions.

Distributed GC or Bandwidth costs?

Cook's explicit batching seems to have one major advantage over the implicitly batched Chourai or promise pipelining: the scope of an intermediate variable from an explicit batch is well defined. This means: we don't waste bandwidth sending intermediate values across the network, and we can garbage collect the intermediate values very easily.

With Chourai or promise pipelining, the default scope is unknown, so we must either send back intermediate values or hold onto them until they are explicitly 'forced'. If we send them across the network, we'll be paying a lot of 'conservative' bandwidth costs for values we collect without using. We do avoid unnecessary round-tripping, so latency is low, but sending values just to drop them is an inefficient use of bandwidth.

We could save bandwidth by holding intermediate values on the remote server until individually 'forced' or the references are dynamically collected. However, to achieve this effectively would require integrating with the host language's garbage collection. (I've made something along this lines work using C++ destructors and smart pointers.) Anyhow, this gets complex very quickly unless the language is well designed for mostly-static garbage collection (in which case one can statically mark certain references as 'intermediate' and conservatively accept the rest).

It might be interesting to see something like a Chourai 'let' statement for more explicit batching with well-scoped intermediate variables. Oleg has done a lot of work on well-typed environments, records, and lists, so he could probably figure out how to make that happen.

How does this approach

How does this approach compare to just sending over a program to the server? This should reduce roundtrips most and seems the superior approach to me.

Sending code to the server

Sending code to the server is more general and powerful, but introduces concerns about security (sending malicious or buggy code), resource management (divergent code), and complexity (embedding an interpreter in the server app). By comparison, the promise pipelining style is a lot easier to reason about for security, has a resource cost no different from processing the remote calls individually, and has a simpler implementation that can be quite small and easily embedded in an RPC element without any significant interpretation.

I guess a fair comparison of

I guess a fair comparison of the two approaches includes to what (and if this question even makes sense) kind of programming language their approach corresponds. Malicious and buggy code can be completely excluded (for example, choose a purely functional language). Of course, bugs can occur, but they do not concern the server.

Of course, if the language is just somewhat powerful, divergence can be a problem. But then again, you can deal with the problems on the server, where most of the investment will be, anyway.

Side effects

I'm pretty sure any code you send to a server is going to have side-effects, since otherwise there isn't much point in sending the code to the server.

E supports secure code distribution due to object capability security - the distributed code would carry proof of whatever authorities it may access. However, it still supports promise pipelining because it isn't convenient to explicitly decide which code to package up and deliver.

I've for a long time been interested in designs that would support automated sharding and distribution of code (agents, objects, functions, and even smaller granularity) in both directions (pieces of client migrating or replicating to execute near the server, and vice versa) in order to improve performance and characteristics under disruption. Most of my language design efforts in seven years have been in this direction. But automated code distribution is a much more difficult problem, due to issues of information security and private data, type-system integration and versioning, remote debugging and maintenance, tolerance to disruption and delay and partitioning, consistency of replicated elements during and after network partitioning, distributed garbage collection, and near-transparent migration or replication.

Yeah, there will be

Yeah, there will be side-effects of course, but what I mean is purely functional modulo the get_album, get_next_album stuff etc. In other words, side effects that are official server interface functions are OK.

I see that automatic code distribution would be nice, but I really wouldn't go so far. Just as the paper is about "Semi-implicit" stuff, I just want my language to make it REALLY easy for me to specify how to distribute things. Babel-17 has been designed so that it can be easily extended into this direction.

Self-contained...

Even code-distribution by hand is a very difficult problem. A chunk of code is rarely self-contained - one has references to objects, references to library functions, references to nominative types, and so on.

My working hypothesis is that a language must at least be suitable for automated code distribution, even if it leaves the final decision to the developers, otherwise developers will be forced to manage the packaging by hand and that's too inconvenient to be practical.

However, the semi-implicit promises-are-nearly-code distribution is still feasible.

'Superior' is handwavy

What exactly is superior about it? Just reducing roundtrips?

In some sense what we're talking about here is how to create programs that can create ad-hoc specialized protocols between clients and servers, using as few simple algebraic rules as necessary...

I think in the most general direction, we're really talking about even possibly replacing something like Tabular Data Stream protocol used by MS SQL Server, and other similar database stream protocols used by other vendors. -- This addresses the sort of connection negotiation most driver software doesn't handle well today. I also think generalizing in this direction isn't exactly just sending over a program to the server. You have to return a result and be ready to receive that result as well.

"just reducing roundtrips"

"just reducing roundtrips" is the whole point of their paper ...

When I say "sending over a program" I was implying that you have to put the right framework in place to make this work. But this is not that difficult if you make the right choices. For example, you could extend the "concurrent" expression in Babel-17 to do just that.

Reducing roundtrips doesn't necessarily buy you performance

That's why I was commenting about superiority.

If the evaluation engine on the other side cannot optimize a complex predicate, then your feature is really just a denial of service attack waiting to happen.

I totally agree that there

I totally agree that there is some detail work to be done.

But what's the big difference between defending against a usual denial of service attack (which results from a program you know nothing about) and from a denial of service attack that results from a program in a "safe" language (like Babel-17) issuing the server requests? Yep, in the second case you have even more info to fend it off.

efficiency

The difference is efficiency of the attack.

You don't really have "more info to fend it off". If the cost to perform the analysis isn't a relatively small polynomial relative to the code size (e.g. O(N3), for a program of size N, might be edging on too expensive) then your attempts to 'fend off' the denial of service attack by analyzing that 'extra' information would serve as a vector of attack, and contribute to its efficiency.

Distinguishing malign code from benign code, or cheap code from expensive code, is an undecidable problem, in general... due to various issues such as the Halting problem.

So if you want any really nice properties in distributed code, you must achieve them by construction or some relatively simple substructural typing. To control against 'expensive' code, it might be better to constrain the distributed code to very limited loops, such as restricting to simple 'foreach' loops on immutable container values that will have a cost a fixed polynomial (based on depth of nesting) in the size of the code + size of the result.

Alternatively, you can control costs by controlling incentives. A clean ways to do this is to integrate with a memory and CPU service market. Code is distributed along with a logical 'e-wallet' that must pay for the CPU and space costs on the remote host(s). After all, nobody would mind an expensive virus taking up space and CPU and bandwidth if it had to rent it at a fair rate (at least assuming that is all the damage it could do, which might be ensured by object capability security). And nobody would write a virus in that circumstance.

I really like that idea of

I really like that idea of an 'e-wallet' that pays for CPU and bandwidth time taken. Sounds like the way to go.

Free services could be protected by the same technology that does the CPU time and bandwidth accounting for paid services.

Object capability security

can handle fine-grained resource scheduling, but solving this is only one half of the problem.

The other half is complexity on the OTHER side of the interface: whether a constructed type, evaluated as a program on the server, will exceed allocated resources. Maybe we don't want to allow users to construct types that can't actually be executed without timing out. Speaking from personal experience, this is a very good question to be asking in the context of visual languages that allow ad-hoc program composition. Object capability security helps, but it is not the whole story.

Safety comes in many forms

There are many ways to guarantee and define "safety". One example is Neil Jones's WHILE programming language, which guarantees the complexity of its programs.

A good practical discussion of main stream query languages (based on SQL-92) is given in parts of LtU member Vadim Tropashko's book SQL Design Patterns, where he discusses ways a DBA can write a query to guard against possible DoS attack vectors. In many ways, his points highlight deficiencies in the syntax of SQL and it is not too far of a leap to then consider what syntax could accomodate real world use cases.

Networking Named Content:

Networking Named Content: was just reading this paper, which covers a completely different networking architecture from our current IP-based structure. It's connected with languages in a few ways:

  1. Our current architecture is based around identity, ala objects; we currently take many pains to get around this and distribute objects to spread load. Content-based networking instead provides value semantics which is naturally distributed, and it achieves this by requiring idempotent data packets which provide pervasive caching, and naturally localizes data flow (this would seem to at least severely reduce DDoS vulnerability).
  2. All content is published in a structured hierarchical form, and it can be queried using a restricted language so clients can specify more precisely to each node in the search what fragment of content they're interested in without having to pull a large block of data across the network (Section 3.2, 3.3). This is connected to remote code execution/promise pipelining as discussed in this thread.
  3. The security properties look interesting as well, given its pervasive use of crypto. It provides a natural expression of delegation and so can expression capability as in the Waterken server, and even can express secure introduction ala httpsy (Section 5.2.2).

So while functional languages are expanding into current identity-based, imperative infrastructure, our infrastructure might be moving towards more a functional foundation. Interesting times!

+1

Highly interesting stuff, and I too have found it to be related to PLT.

Relatedly, my favorite quote on the interplay between languages and networking is Dan Connolly's:

Uniform Resource Locator is just the result of squeezing the term object reference through the IETF standardization process.

the related work section caught my eye

nasking: our infrastructure might be moving towards a more functional foundation.

I've been working full time in that infrastructure space the last four years, on dedup for WAN optimization. I only skimmed the paper, but I looked more closely at the section 7 on related work, which mentions both distributed hash tables and caching to reduce redundant content transmission. Yep, that's the right perspective. Actually I've been doing mainly caching for content distribution since 2001, and this has a functional note to it, and requires careful definition of what is meant by identity (since naming and cache invalidation are two classic hard problems).

Note I can't tell you how it works, since one of my roles is defining how it works in addition to implementation. But I'm interested in the language angle here. Naturally everything must be done asynchronously; that's why I make periodic comments on async code style. I'm thinking about a programming language that has async as a main focus, because it would help a lot on two fronts: testing under simulation, and compiling to standalone C modules using static allocation.

Anyway, you can flow a content-based architecture over current IP-based structure transparently given async api. It's really easy to join any async api to another. It's just going async in the first place is an amazing pain, and can benefit from language support. But I think it requires arrogating some of the operationg system's role in a programming language.

Channels

arrogating some of the operationg system's role in a programming language

Van Jacobson, who kicked off the named-data/content-centric networking project is also proposing channels, a replacement for sockets, whereby protocol processing is moved from the kernel into the application endpoints.

channels for what?

A reply seems too short with no explicit point. I infer many on your behalf you may not intend. For example, I assume you meant none of these points: 1) channels are a language feature; 2) leave it to Van; 3) changing Linux is the only solution; 4) boil-the-ocean plans are good; 5) optimize kernels but not apps; 6) channels obviate need for buffers. My best guess is you free associated.

What if you want async code with scheduling behavior working on more than one OS? Say your library must be able to work on any client, if installed. Say it's a product requirement. I think a programming language point of view is consistent with a portable code goal. How can you get similar async scheduling on different platforms? Think about it. It can't be in the OS. I expect folks leave such features out of languages when they fear treading in OS territory, since criticism is cheap.

What if your company has it's own TCP stack? Channels won't help.